11. How do we build Documentation?

11.1. Review from getting unstuck with git

11.1.1. Add a results folder and prevent its content from being committed.

Recall, this is our GitHub inclass repo:

cd github-in-class-brownsarahm
ls
CONTRIBUTING.md	README.md	config.txt	docs		setup.py
LICENSE.md	about.md	config.yml	mymodule	tests

We saw last week to use the .gitignore file to ignore files that match a pattern.

cat .gitignore

we used a very simple pattern

config.*

to ignore any config file in this repository.

If we push

git push
Enumerating objects: 44, done.
Counting objects: 100% (42/42), done.
Delta compression using up to 8 threads
Compressing objects: 100% (32/32), done.
Writing objects: 100% (38/38), 2.96 KiB | 1.48 MiB/s, done.
Total 38 (delta 22), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (22/22), completed with 3 local objects.
To https://github.com/introcompsys/github-in-class-brownsarahm.git
   cea6a93..62b943f  main -> main

and view online, we can see that the two config files we have locally are not pushed, becuase they were not added to the .git

Important

This is another example of how git push does not push the files in the directory, but instead the .git directory (compressed) and then GitHub (and other git hosts) use git operations to show us the contents of the files based on the repository database.

Now, back to our task we first create the directory and then append a new line to it.

mkdir results
echo "results/" >> .gitignore

Note that the two > is needed, not one so tht we append instead of erase.

Try it yourself

What could you do if you had used only one and overwritten the .gitignore file instead of appending to it?

How can we tell that worked? As usual, git status is a good place to start.

git status
On branch main
Your branch is up to date with 'origin/main'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   .gitignore

no changes added to commit (use "git add" and/or "git commit -a")

When we do git status, we see that we changed the .gitignore file but this does not show us that what we put in there actually matches the directory we want to ignore yet.

Important

git does not track empty directories.

If we create another empty directory we can see that this is also not showing as an untracked file.

mkdir res2

git status
On branch main
Your branch is up to date with 'origin/main'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   .gitignore

no changes added to commit (use "git add" and/or "git commit -a")

We can add an empty file to test that our ignore actually worked.

touch results/test1
git status
On branch main
Your branch is up to date with 'origin/main'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   .gitignore

no changes added to commit (use "git add" and/or "git commit -a")

and compare to what hapens if we add an empty file to the directory that does not match our pattern.

touch res2/test1

git status
On branch main
Your branch is up to date with 'origin/main'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   .gitignore

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	res2/

no changes added to commit (use "git add" and/or "git commit -a")

Note

a directory where all files are untracked stays as just the directory no matter how many files it is, until you add at least one.

touch res2/test2

git status
On branch main
Your branch is up to date with 'origin/main'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   .gitignore

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	res2/

no changes added to commit (use "git add" and/or "git commit -a")

Note

merge conflict file explanation to be added later

11.2. What does it mean to build documentation?

Note

For now, see the prismia transcript for an outline here

11.2.1. What is a build?

11.2.2. Why is documentation so important?

documentation types table

ethnography of docuemtnation data science

11.3. Building Documentation with Jupyter book

There are many Documenation Tools because we (programmers) don’t like

linux kernel uses sphinx

why and how it works

Jupyterbook wraps sphinx and uses markdown instead of restructured text

11.3.1. Creating a Jupyterbook from the template

We’ll naviate out of the github in class repo

cd ..

and use jupyterbook to create a new book from the template

jupyter-book create tiny-book

We see this


===============================================================================

Your book template can be found at

    tiny-book/

===============================================================================

and that it created a new directory

ls
github-in-class-brownsarahm	test				tiny-book
nand2tetris			test2

If we navigate to that directory we can create a new repo here as well

cd tiny-book/
git init .
hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint:
hint: 	git config --global init.defaultBranch <name>
hint:
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint:
hint: 	git branch -m <name>
Initialized empty Git repository in /Users/brownsarahm/Documents/sysinclass/tiny-book/.git/

and again we’ll change our default branch to main

git branch -m main

git status
On branch main

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	_config.yml
	_toc.yml
	intro.md
	logo.png
	markdown.md
	notebooks.ipynb
	references.bib
	requirements.txt

nothing added to commit but untracked files present (use "git add" to track)

and we will commit the template.

git add .

git commit -m 'empty template'
[main (root-commit) a8a489e] empty template
 8 files changed, 362 insertions(+)
 create mode 100644 _config.yml
 create mode 100644 _toc.yml
 create mode 100644 intro.md
 create mode 100644 logo.png
 create mode 100644 markdown.md
 create mode 100644 notebooks.ipynb
 create mode 100644 references.bib
 create mode 100644 requirements.txt

11.3.2. Strucutre of a Jupyter book

ls
_config.yml		intro.md		markdown.md		references.bib
_toc.yml		logo.png		notebooks.ipynb		requirements.txt

A jupyter book has two required files (_config.yml and _toc.yml)

the extention (.yml) is yaml, which stands for “YAML Ain’t Markup Language”. It consists of key, value pairs and is deigned to be a human-friendly way to encode data for use in any programming language.

cat _config.yml

The configuration file, tells it basic iformation about the book, it provides all of the settings that jupyterbook and sphinx need to render the content as whatever output format we want.

# Book settings
# Learn more at https://jupyterbook.org/customize/config.html

title: My sample book
author: The Jupyter Book Community
logo: logo.png

# Force re-execution of notebooks on each build.
# See https://jupyterbook.org/content/execute.html
execute:
  execute_notebooks: force

# Define the name of the latex output file for PDF builds
latex:
  latex_documents:
    targetname: book.tex

# Add a bibtex file so that we can create citations
bibtex_bibfiles:
  - references.bib

# Information about where the book exists on the web
repository:
  url: https://github.com/executablebooks/jupyter-book  # Online location of your book
  path_to_book: docs  # Optional path to your book, relative to the repository root
  branch: master  # Which branch of the repository should be used when creating links (optional)

# Add GitHub buttons to your book
# See https://jupyterbook.org/customize/config.html#add-a-link-to-your-repository
html:
  use_issues_button: true
  use_repository_button: true

The table of contents file describe how to put the other files in order.

cat _toc.yml
# Table of contents
# Learn more at https://jupyterbook.org/customize/toc.html

format: jb-book
root: intro
chapters:
- file: markdown
- file: notebooks

Try it yourself

Which files created by the template are not included in the rendered output? How could you tell?

The other files are optional, but common. Requirements.txt is the format for pip to install python depndencies. There are different standards in other languages for how

cat requirements.txt
jupyter-book
matplotlib
numpy

the .bib file allows you to put structured data in bibtex format and then jupyterbook can write a bibliography for you.

11.3.3. How do I get output?

jupyter-book build .
Running Jupyter-Book v0.11.1
Source Folder: /Users/brownsarahm/Documents/sysinclass/tiny-book
Config Path: /Users/brownsarahm/Documents/sysinclass/tiny-book/_config.yml
Output Path: /Users/brownsarahm/Documents/sysinclass/tiny-book/_build/html
Running Sphinx v3.2.1
making output directory... done
[etoc] Changing master_doc to 'intro'
checking for /Users/brownsarahm/Documents/sysinclass/tiny-book/references.bib in bibtex cache... not found
parsing bibtex file /Users/brownsarahm/Documents/sysinclass/tiny-book/references.bib... parsed 5 entries
myst v0.13.7: MdParserConfig(renderer='sphinx', commonmark_only=False, dmath_allow_labels=True, dmath_allow_space=True, dmath_allow_digits=True, update_mathjax=True, enable_extensions=['colon_fence', 'dollarmath', 'linkify', 'substitution'], disable_syntax=[], url_schemes=['mailto', 'http', 'https'], heading_anchors=None, html_meta=[], footnote_transition=True, substitutions=[], sub_delimiters=['{', '}'])
building [mo]: targets for 0 po files that are out of date
building [html]: targets for 3 source files that are out of date
updating environment: [new config] 3 added, 0 changed, 0 removed
Executing: notebooks in: /Users/brownsarahm/Documents/sysinclass/tiny-book                             

looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
writing output... [100%] notebooks                                                                     
generating indices...  genindexdone
writing additional pages...  searchdone
copying images... [100%] _build/jupyter_execute/notebooks_2_0.png                                      
copying static files... ... done
copying extra files... done
dumping search index in English (code: en)... done
dumping object inventory... done
build succeeded.

The HTML pages are in _build/html.
[etoc] missing index.html written as redirect to 'intro.html'

===============================================================================

Finished generating HTML for book.
Your book's HTML pages are here:
    _build/html/
You can look at your book by opening this file in a browser:
    _build/html/index.html
Or paste this line directly into your browser bar:
    file:///Users/brownsarahm/Documents/sysinclass/tiny-book/_build/html/index.html            

===============================================================================

Then we can see in browser what the output looks like.

we can also see that a _build directory was created and examine that

cd _build/
ls

it has some cache (jupyter_execute) and our output (html). Building to html is the default, but there are many types of buils output

html		jupyter_execute

We can see in the html folder

cd html/
ls

that there are html files for each input file, plus some extra (genindex) and ther is javascript for searching the site

_images		_sources	genindex.html	intro.html	notebooks.html	search.html
_panels_static	_static		index.html	markdown.html	objects.inv	searchindex.js

the static folder contains the rest of the formatting

ls _static/
__init__.py
__pycache__
basic.css
clipboard.min.js
copy-button.svg
copybutton.css
copybutton.js
copybutton_funcs.js
css
doctools.js
documentation_options.js
file.png
images
jquery-3.5.1.js
jquery.js
js
language_data.js
logo.png
minus.png
mystnb.css
panels-main.c949a650a448cc0ae9fd3441c0e17fb0.css
panels-variables.06eb56fa6e07937060861dad626602ad.css
plus.png
pygments.css
searchtools.js
sphinx-book-theme.12a9622fbb08dcb3a2a40b2c02b83a57.js
sphinx-book-theme.acff12b8f9c144ce68a297486a2fa670.css
sphinx-book-theme.css
sphinx-thebe.css
sphinx-thebe.js
togglebutton.css
togglebutton.js
underscore-1.3.1.js
underscore.js
vendor
webpack-macros.html

We didn’t have to write any html and we got a responsive site!

If you wanted to change the styling with sphinx you can use built in themes which tell sphinx to put different files in the _static folder when it builds your site, but you don’t have to change any of your content! If you like working on front end things (which is great! it’s just not alwasy the goal) you can even build your own theme that can work with sphinx.

Jupyterbook is pretty opinionated on the styling, but for more general documenation websites the options are good

cd ../../

ls
_build			intro.md		notebooks.ipynb
_config.yml		logo.png		references.bib
_toc.yml		markdown.md		requirements.txt

We’re going to ignore the built files and use this later to automatically build them on GitHub.

echo "_build/" > .gitignore

git status
On branch main
Untracked files:
  (use "git add <file>..." to include in what will be committed)
	.gitignore

nothing added to commit but untracked files present (use "git add" to track)

and then add and commit

git add .

git commit -m 'ignroe build'
[main 3e249ce] ignroe build
 1 file changed, 1 insertion(+)
 create mode 100644 .gitignore

11.4. How do I push a repo that I made locally to GitHub

For today, create an empty github repo shared with me.

More generally, you can create a repo

That default page for an empty repo if you do not initiate it with any files will give you the instructions for what remote to add.

THen we add the remote

git remote add origin https://github.com/introcompsys/tiny-book-brownsarahm.git

and push to origin main.

git push -u origin main
\Enumerating objects: 13, done.
Counting objects: 100% (13/13), done.
Delta compression using up to 8 threads
Compressing objects: 100% (11/11), done.
Writing objects: 100% (13/13), 16.07 KiB | 8.03 MiB/s, done.
Total 13 (delta 1), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (1/1), done.
To https://github.com/introcompsys/tiny-book-brownsarahm.git
 * [new branch]      main -> main
Branch 'main' set up to track remote branch 'main' from 'origin'.
(base) brownsarahm@tiny-book $

11.5. Prepare for next Class

  1. review the notes

  2. (priority) Convert your KWL repo to a jupyter book. Ignore the builds from your reposiotry. Use the documenation to choose your settings: link it to your repo, do not serve it publicly on github. Be sure that it can build and if you encounter problems, create a jupyterbooktroubleshooting.md file and log them with solutions if you find them.

  3. complete Feedback survey if you have not alredy.

11.6. More Practice

  1. build your kwl repo to a pdf or one other format locally.

  2. Add templating.md to your KWL repo and explain templating in your own words using other programming concepts you have learned so far. Include in a markdown (same as HTML <!-- comment --> ) comment the list of CSC courses you have taken for context while we give you feedback.

  3. Learn about the documentation ecosystem in another language that you know. In docs.md include a summary of your findings and compare and contrast it to jupyter book/sphinx.

11.7. Questions after class

11.7.1. Could I use jupyterbook to put a resume online?

In theory, yes. However there are other tools that work similar in the sense of separating content from HTML/CSS/javascript in order to generate a website. Sphinx is more customizable, as noted above, but also, Static site generators are a whole category of software tools.

GitHub pages is free hosting, which by default uses jekyll however, that is no longer very active, due to the death of one of the core maintainers.

Some are centered on blog type sites (like jekyll) others are not. There are even ways to use a javascript slide tool to write in markdown but render for the web.