8. How does git work?

8.1. Review

How can you write the history of a repo to a file?

We’ll do this inside our in-class repo.

cd github-in-class-brownsarahm/

Recall, the git log command

git log

displays it in a text editor:

commit cea6a93d576ecd042823fca24553a58a6cd6565b (HEAD -> main, origin/main, origin/HEAD)
Author: Sarah Brown <brownsarahm@uri.edu>
Date:   Thu Feb 3 16:42:12 2022 -0500

    try to prevernt repeated running

commit c15cf43b6807e172aaba7cf3b57adc7214b91082 (test)
Author: Sarah M Brown <brownsarahm@uri.edu>
Date:   Thu Feb 3 13:38:42 2022 -0500

    insclass 2-3

commit 17320fc6f26806eb7d1ffc62c23b3bf1361b58b2
Author: Sarah M Brown <brownsarahm@uri.edu>
Date:   Tue Feb 1 13:32:46 2022 -0500

    complete about closes #2

commit b81cf1525e96782e868e96a20eacf6eb26e882b7
Author: Sarah M Brown <brownsarahm@uri.edu>
Date:   Tue Feb 1 13:20:05 2022 -0500

use the q key to exit

We can use a redirect > to send that to a file instead of to where it normally goes.

git log > gitlog.txt

We can confirm this is the same as we saw before with cat

cat gitlog.txt
commit cea6a93d576ecd042823fca24553a58a6cd6565b
Author: Sarah Brown <brownsarahm@uri.edu>
Date:   Thu Feb 3 16:42:12 2022 -0500

    try to prevernt repeated running

commit c15cf43b6807e172aaba7cf3b57adc7214b91082
Author: Sarah M Brown <brownsarahm@uri.edu>
Date:   Thu Feb 3 13:38:42 2022 -0500

    insclass 2-3

commit 17320fc6f26806eb7d1ffc62c23b3bf1361b58b2
Author: Sarah M Brown <brownsarahm@uri.edu>
Date:   Tue Feb 1 13:32:46 2022 -0500

    complete about closes #2

commit b81cf1525e96782e868e96a20eacf6eb26e882b7
Author: Sarah M Brown <brownsarahm@uri.edu>
Date:   Tue Feb 1 13:20:05 2022 -0500

    create empty about

commit f707186cd978072d2888b0d2112023b5618b6414
Author: Sarah Brown <brownsarahm@uri.edu>
Date:   Tue Feb 1 12:52:28 2022 -0500

    create readme closes #3

commit 611180dd89ffd8977dd9e5c502bcd4147c4980be
Author: github-classroom[bot] <66690702+github-classroom[bot]@users.noreply.github.com>
Date:   Tue Feb 1 17:45:49 2022 +0000

    Setting up GitHub Classroom Feedback

commit 3fbd9ee19ff64241b102fa61f279e9c4683d0e69
Author: github-classroom[bot] <66690702+github-classroom[bot]@users.noreply.github.com>
Date:   Tue Feb 1 17:45:49 2022 +0000

    GitHub Classroom Feedback

commit 270a43daebe1ba7bb719f141b327ca35e8e187ad
Author: github-classroom[bot] <66690702+github-classroom[bot]@users.noreply.github.com>
Date:   Tue Feb 1 17:45:47 2022 +0000

    Initial commit

8.2. Review: Anatomy of git

Recall that the HEAD file contains a pointer to the place in the git database that matches the current status of the directory

cat .git/HEAD

in this case it points to the head ref called main

ref: refs/heads/main

This is what git status uses, we can see that using our first git “plumbing” command 1

git cat-file -p refs/heads/main

Before we look at the output of this, let’s examine the command:

  • git cat-file: displays git content

  • -p (pretty print) option figure out the type of content and display it appropriately

tree 4cbe76bea90531a07dbc8cc413de26f39994478b
parent c15cf43b6807e172aaba7cf3b57adc7214b91082
author Sarah Brown <brownsarahm@uri.edu> 1643924532 -0500
committer GitHub <noreply@github.com> 1643924532 -0500
gpgsig -----BEGIN PGP SIGNATURE-----

 wsBcBAABCAAQBQJh/Ew0CRBK7hj4Ov3rIwAATFIIAAeEB38i/5hNr1OUfYMJ/1PA
 RKdcQYspkvi70ISXpkLTEvTMwkFfmJO6Y12QqKdy7WJHIeD0qRsigmOv9oUCv2LJ
 YIdd1d6m9gfe6lwrxh5QG3itURfXyhVIff0j8YHYzq0HN7gAjw+s6UHGGjJiR7u4
 Jq01ekrQJ/OrXWmQnr6UWJNqwBx/7rSgeh0yWpx+5f8z+dFp5N57nk7MGZE04YBn
 huJG6lDPRxTHifYwDx1Ib5C0BvrLKQO0j9TvI/86LhcFSyoDBG3/diHokrQO5bdU
 okm+JsNZAx4prZebZ8eKcMEu5YiAX0wllQxrAEBKEUCc3EU2W4MMRFeiCOY3UNI=
 =oLlf
 -----END PGP SIGNATURE-----


try to prevernt repeated running

This object has the hash for the tree that this commit is in, it also has the hash of the parent commit to this commit, then author information and the last commit message

8.3. Git Plumbing in a Fresh repo

Recall we used git init test to create a new repo on Tuesday, let’s got back to that.

cd ../test/

We can use git status to confirm that it is completely blank

git status
On branch main

No commits yet

nothing to commit (create/copy files and use "git add" to track)

The directory is also empty

ls -a

Except for the .git database

.	..	.git

And recall the different files.

ls .git/
HEAD		description	info		refs
config		hooks		objects

Try it yourself

Try to remember what each of those files/directories does, make a table and try to write each one’s role. After, check your answers using Tuesday’s notes

8.4. Git Objects

we noted above that the git cat-file only works to git objects. Lets see what objects there are. Today we will see three types today:

  • blob objects: the content of your files (data)

  • tree objects: stores file names and groups files together (organization)

  • Commit Objects: stores information about the sha values of the snapshots

First, lets examine the objects directory. This time, instead of using ls we’ll use the bash command find. This pattern matches and looks for files or directories and lists the results

find .git/objects/

We see that it finds, the object directory itself and two subdirectories.

.git/objects/
.git/objects//pack
.git/objects//info

Let’s look at another option of this command, the -type option filters the results and only returns the ones of the desired type. We’ll look at only files using f

find .git/objects/ -type f

In this empty repository, there are no files.

8.4.1. Hashing

Let’s create an object. We have mentioned breifely before that git stores data by hashing it, so the plumbing command git hash-object does exactly that.

echo 'text content' | git hash-object -w --stdin

Let’s examine this command:

  • git hash-object by default hashes the return the unique key to that particular content

  • the -w option then tells git to alsow write that object to the database

  • the --stdin option tells git hash-object to get the content to be processed from stdin instead of a file

  • the | is called a pipe (what we saw before was a redirect) it pipes a process output into the next command

  • echo would write to stdout, with the pipe it passes that to stdin for the git hash-object to use

and we see it returns the unique key to us:

c182a93374d6b18a87e1f1e8a9a18812639c58c8

We can then view the file that was added to the database with git cat-file again. Recall the -p option, uses information about the file type to display it in an appropriate manner.

git cat-file -p c182a93374d6b18a87e1f1e8a9a18812639c58c8
text content

So far, we added “text content” to the git database, but nothing in the directory:

ls

shows us that its an empty directory.

Lets create a file and add that to our git database. We will create the file with echo and a redirect:

echo 'version 1' > test.txt

then we can hash it and write to the database from the file this time.

git hash-object -w test.txt
83baae61804e65cc73a7201a7252750c76066a30

we can list the directory so and note that more directories have been made:

ls .git/objects/
83	c1	info	pack

Even better, we can use the find command filtered by type that we saw earlier.

find .git/objects/ -type f
.git/objects//c1/82a93374d6b18a87e1f1e8a9a18812639c58c8
.git/objects//83/baae61804e65cc73a7201a7252750c76066a30

We see that this created an additional file for each of the two has objects that we have created.

Let’s append more text to the file, recall two >> redirects to append instead of overwrite.

echo 'version 2' >> test.txt

We can confirm this worked as we expected:

cat test.txt
version 1
version 2

And then has the file

git hash-object -w test.txt

and view the hashed content using the key that was returned.

git cat-file -p 0c1e7391ca4e59584f8b773ecdbbb9467eba1547
version 1
version 2

So far, we have been adding a new hash of the whole file each time. We will see later that git can be more efficient than this because this would begin to take a lot of space very quickly.

8.4.2. Tree Objects

The next type of object to inspect is the tree, since the tree is a git object, we can use git cat-file again.

git cat-file -p main^{tree}

but this test repo that we are working with does not have a tree yet, so it

fatal: Not a valid object name main^{tree}

We can confirm this using git status

git status
On branch main

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	test.txt

nothing added to commit but untracked files present (use "git add" to track)

In a more complete repo, we can view the tree:

git cat-file -p main^{tree}
040000 tree c91892d93170077fea81f1d8009c4ccc91af1c29	.github
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	CONTRIBUTING.md
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	LICENSE.md
100644 blob 7987a001e70d28376129bfe0538f98f9aa281a55	README.md
100644 blob b5f4fcb8f875fe33781256ebf6fdde7547188d6f	about.md
040000 tree 55651ac0763db741988f02a45842aa020a77c14d	docs
040000 tree 3581dc38ca3929efcbbc6f7ce510ded2c7dd7309	mymodule
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	setup.py
040000 tree 45fcb1dd311e5e45af759cb3627dca5f47f58f04	tests

We see a blob for each file and a tree for each directory. The blob objects refer to the last hashed version of the file added to this branch (main)

cat HEAD
ref: refs/heads/main

8.5. Creating a Commit manually

Warning

this is revised relative to in class

In class we tried to add directly to the commit tree:

echo 'first commit' | git commit-tree c182a9

but this failed:

fatal: c182a93374d6b18a87e1f1e8a9a18812639c58c8 is not a valid 'tree' object

because we had skipped a step. We need to create the tree first.

You use this command to artificially add the earlier version of the test.txt file to a new staging area.

git update-index --add --cacheinfo 100644 \
  83baae61804e65cc73a7201a7252750c76066a30 test.txt

the \ allows us to continue typing on a second visual line in one long command. this command puts the hashed object 83baae with file named test.txt

  • this the plumbing command git update-index updates (or in this case creates an index, the staging area of our repository)

  • the --add option is because the file doesn’t yet exist in your staging area (you don’t even have a staging area set up yet)

  • --cacheinfo because the file you’re adding isn’t in your directory but is in your database. Then, you specify the mode, SHA-1, and filename

n this case, you’re specifying a mode of 100644, which means it’s a normal file.

git status
On branch main

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	test.txt

nothing added to commit but untracked files present (use "git add" to track)
git add test.txt
git status
On branch main

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
	new file:   test.txt

Note

We tried viewing the index but it did not work. the documentation shows the specification for the index, it is not readable

git commit -m 'first commit'
[main (root-commit) 2948028] first commit
 1 file changed, 2 insertions(+)
 create mode 100644 test.txt
find .git/objects/ -type f
.git/objects//0c/1e7391ca4e59584f8b773ecdbbb9467eba1547
.git/objects//c1/82a93374d6b18a87e1f1e8a9a18812639c58c8
.git/objects//29/480288d80e3850d6bf7ae170bd3bea5b3164c7
.git/objects//83/baae61804e65cc73a7201a7252750c76066a30
.git/objects//25/8231c1cee8048eef3a8057cfbdab76261277c6
git cat-file -p main^{tree}
100644 blob 0c1e7391ca4e59584f8b773ecdbbb9467eba1547	test.txt
git cat-file -p 2948028
tree 258231c1cee8048eef3a8057cfbdab76261277c6
author Sarah M Brown <brownsarahm@uri.edu> 1645121714 -0500
committer Sarah M Brown <brownsarahm@uri.edu> 1645121714 -0500

first commit
git cat-file -p 258231
100644 blob 0c1e7391ca4e59584f8b773ecdbbb9467eba1547	test.txt

8.6. How is git efficient?

git gc
Enumerating objects: 3, done.
Counting objects: 100% (3/3), done.
Writing objects: 100% (3/3), done.
Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
ls
test.txt
find .git/objects -type f
.git/objects/c1/82a93374d6b18a87e1f1e8a9a18812639c58c8
.git/objects/pack/pack-07b0c08cc0268d55d02ea62ea880c61f810956d8.idx
.git/objects/pack/pack-07b0c08cc0268d55d02ea62ea880c61f810956d8.pack
.git/objects/info/commit-graph
.git/objects/info/packs
.git/objects/83/baae61804e65cc73a7201a7252750c76066a30
git verify-pack
usage: git verify-pack [-v | --verbose] [-s | --stat-only] <pack>...

    -v, --verbose         verbose
    -s, --stat-only       show statistics only
    --object-format <hash>
                          specify the hash algorithm to use

git verify-pack -s
usage: git verify-pack [-v | --verbose] [-s | --stat-only] <pack>...

    -v, --verbose         verbose
    -s, --stat-only       show statistics only
    --object-format <hash>
                          specify the hash algorithm to use

8.7. Why didn’t git cat-file work on HEAD?

We tried:

git cat-file -p HEAD

and saw an error

fatal: Not a valid object name HEAD

To confirm what it was doing wrong, we checked another repository:

git init test2
hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint:
hint: 	git config --global init.defaultBranch <name>
hint:
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint:
hint: 	git branch -m <name>
Initialized empty Git repository in /Users/brownsarahm/Documents/sysinclass/test2/.git/
git cat-file -p HEAD
fatal: not a git repository (or any of the parent directories): .git
cd test2
git cat-file -p HEAD
fatal: Not a valid object name HEAD

We confirmed that this does not work.

This was because I made a mistake, the HEAD file does not need git cat-file it is a plain text file, not a git blob (type) object.

8.7.1. Could the lack of pushing be why?

cat .git/config

lets compare the results of this across two repos first for the small test repo:

[core]
	repositoryformatversion = 0
	filemode = true
	bare = false
	logallrefupdates = true
	ignorecase = true
	precomposeunicode = true

and for github-inclass:

[core]
	repositoryformatversion = 0
	filemode = true
	bare = false
	logallrefupdates = true
	ignorecase = true
	precomposeunicode = true
[remote "origin"]
	url = https://github.com/introcompsys/github-in-class-brownsarahm.git
	fetch = +refs/heads/*:refs/remotes/origin/*
[branch "main"]
	remote = origin
	merge = refs/heads/main
[branch "reset"]
	remote = origin
	merge = refs/heads/reset

This is all of the difference that is caused by the lack or configuring a remote.

We can view the heads by listing that directory

ls .git/refs/heads
main	reset	test

8.8. Prepare for next Class

  1. Review the notes

  2. For the core "Porcelain" git commands we have used (add, commit), make a table of which git plumbing commands they use in gitplumbing.md in your KWL repo

  3. In a gitunderstanding.md list 3-5 items from the following categories (1) things you have had trouble with in git in the past and how they relate to your new understandign (b) things that your understanding has changed based on today's class (c) things about git you still have questions about

  4. Make notes on how you use IDEs for the next week or so using the template idethoughts.md file in the course notes.

8.9. More Practice

  1. Add to your gitplumbing.md file explanations of the main git operations we have seen (add, commit, push) in your own words in a way that will either help you remember or behow you would explain it to someone else at a high level. This might be analogies or explanations using other progrmaming concepts or concepts from a hobby. Add this under a subheading ## with a descriptive title (for example "Git In terms of ")

  2. For one thing your understanding changed or an open question you, look up or experiment to find the answer

# IDE Thoughts

## Actions Accomplished
<!-- list what things you do: run code/ edit code/ create new files/ etc; no need to comment on what the code you write does -->


## Features Used
<!-- list features of it that you use, like a file explorer, debugger, etc -->

8.10. Questions After Class

8.10.1. when does the grade-free zone end?

It has ended, starting with the feedback given today, there will be grading implications, but remember there are flexibility systems with most of it.

8.10.2. Are there other uses for the hashes besides identifying git files in our repositories

for these particular hashes, no, they are the keys that mathc to values of items in our git database, but we can do other things with other hashes.

8.10.3. what does print pretty / -p mean?

this option of the git cat-file command displays the contents of a blob in a human readable format.

8.10.4. will we ever need to use the plumbing tools in a bind?

Some of them, yes! For example, today, it was helpful to use the plumbing command git update-ref to create a branch from a particular past commit. They could also help you to view a commit in greater detail and some other beyond what we saw today could help you edit a commit or alter the “history” of your repo if you need to.

The porcelain commands will get you through your day to day everything is going right and the most common mistakes.

8.10.5. Questions addressed above integrated into the narrative

• I am still confused about the idea of the objects and what they do exactly/their purpose

• so all of the plumbing commands we used today are all the parts that are wrapped up in the processes like adding/committing/pushing things to GitHub?

8.10.6. Questions to practice with

  • What if we would change the file again (added 3rd line) and committed again. How many hash objects would be created? What connection would there be between the commits?

8.10.7. Questions we will come back to next

  • how are hash number formed?

  • From init, first commit, to first push, what is the exact sequence of events that happens under the hood of git?


1

low level git commands that allow the user to access the inner workings of git.