# Git Deep Dive

A Version 1 Technical Meetup talk covering the deep internals of git.

## Introduction

These are the slide-deck and set-up scripts used to give the *Git Deep Dive* technical meet-up talk on 2018-06-25 in Version 1, by Éibhear Ó hAnluain.

The pack comes with the following:

- This document
- The slide-deck, `GitDeepDive.pdf`
- A script to set up a simple git repository for exploration purposes: `simple-setup.sh`
- A script to mimic a development team and its processes: `gitDemo.sh`
- A script to set up another simple repository to run through the process for completely clearing a file out of a git database: `largeFile-setup.sh`
- A `.gitignore` file

## Setup

1. Run the script `simple-setup.sh`. This will create a git repository in `simpleRepo` and populate it with some commits. The commits in this repository correspond to the diagrams in the slide-deck that relate to merging.
2. Run the script `largeFile-setup.sh`. This will create another git repository in `largeFileRepo` and populate it with the same commits as in `simpleRepo`, then add further commits that create and manipulate a large binary file and then remove it. This repository is used for the demonstration of removing a file completely from the repository.
3. Run the script `gitDemo.sh`. This will replicate a previous git repository as a bare repo in `devTeamDemo/javaBootcampNoEclipse.git`, and then create a number of clones to represent the actions of a team lead and two developers, and mimic collaborative development among them. `gitDemo.sh` takes one optional parameter, `-s`, which pauses the progression for 15 seconds after each "actor's" push so that you can look at the git log.
4. Use the `gpg` utility to generate a key-pair so that you can work through the commit- and tag-signing slide and sample commands.
   If you have access to `gpg2` and not `gpg`, you'll need to set the following `git config` to make sure it's picked up: `git config --global gpg.program gpg2`.
5. Open the file `GitDeepDive.pdf`, and as you go through the slides, refer to the *Sample commands* section below for commands you can execute to get further insights.

## Sample commands

### Slide: Configuration

    # Go to where the GitDeepDive.pdf file is and set your BASE_DIR
    # environment variable.
    cd <path-to-this-pack>
    export BASE_DIR=$(pwd)

    # List your basic git config
    git config --list

    # Go into the simple repo location and look at the various config
    # contexts
    cd ${BASE_DIR}/simpleRepo
    git config --local --list
    git config --global --list
    git config --system --list

    # Set some settings
    git config --global user.name "<your name>"
    git config --global user.email "<your email address>"

    # Use a text editor to edit your config
    git config -e

### Slide: fetch and merge, not pull

    # Go into the clone belonging to one of the developers in the
    # development team demo area
    cd ${BASE_DIR}/devTeamDemo/javaBootcampNoEclipse.dev1

    # Update the clone, but don't merge anything
    git fetch --prune

    # Review the local and remote branches.
    git branch -va

### Slide: Merging approaches: fast-forward

    cd ${BASE_DIR}/simpleRepo

    # Check out the master branch and review its log
    git checkout master
    git log --decorate --graph --oneline --all

    # Merge in the Rel1 branch and review the new log.
    git merge Rel1
    git log --decorate --graph --oneline --all

### Slide: Merging approaches: merging strategies

    cd ${BASE_DIR}/simpleRepo

    # Check out the master branch, merge the Rel2 branch and review the
    # new log.
    git checkout master
    # You'll be prompted for a commit message here.
    git merge Rel2
    git log --decorate --graph --oneline --all

### Slide: Merging approaches: Rebase

    # Remove and refresh the simpleRepo
    ${BASE_DIR}/simple-setup.sh
    cd ${BASE_DIR}/simpleRepo

    # Check out master, review its log, merge Rel1 and review the log
    git checkout master
    git log --decorate --graph --oneline --all
    git merge Rel1
    git log --decorate --graph --oneline --all

    # Check out Rel2
    git checkout Rel2

    # Rebase Rel2 onto the now-new master
    git rebase master

    # Review the log
    git log --decorate --graph --oneline --all

### Slide: refs

    # Remove and refresh the simpleRepo
    ${BASE_DIR}/simple-setup.sh
    cd ${BASE_DIR}/simpleRepo

    # Look at the contents of .git/refs/heads and one of the heads itself
    ls -l .git/refs/heads/
    cat .git/refs/heads/Rel2

    # Tag a branch, look at the branch ref and the tag ref
    git tag Rel1.0 Rel1
    cat .git/refs/heads/Rel1
    cat .git/refs/tags/Rel1.0

    # Look at the HEAD ref
    cat .git/HEAD

    # Go to another clone and look at the refs for the origin remote
    cd ${BASE_DIR}/devTeamDemo/javaBootcampNoEclipse.dev1
    ls -l .git/refs/remotes/origin/

    # .git/packed-refs holds refs in packed form. In a fresh clone, these
    # are the refs that haven't been updated locally yet.
    cat .git/packed-refs

### Slide: Annotated tags

    cd ${BASE_DIR}/simpleRepo

    # Look at a "lightweight" tag
    cat .git/refs/tags/Rel1.0

    # What type of object is it pointing to?
    git cat-file -t $(cat .git/refs/tags/Rel1.0)

    # What's in the object it's pointing to?
    git cat-file -p $(cat .git/refs/tags/Rel1.0)

    # Create an annotated tag and look at *it*
    git tag -a -m "Formal release of 1.0" Rel1.0.prod Rel1
    cat .git/refs/tags/Rel1.0.prod

    # What type of object is it pointing to?
    git cat-file -t $(cat .git/refs/tags/Rel1.0.prod)

    # What's in the object it's pointing to?
    git cat-file -p $(cat .git/refs/tags/Rel1.0.prod)

### Slide: blame

    # Look at the lines of the file on master
    git checkout master
    git blame information.md

    # ... and on Rel1
    git checkout Rel1
    git blame information.md

    # And on master after Rel2 has been merged in.
    git checkout master
    git merge Rel2
    git blame information.md

### Slide: Tag and commit signing

    # Look at your secret keys
    gpg --list-secret-keys

    # Check out master and make a change to information.md
    git checkout master
    cat <<EOF >> information.md
    An additional line for demonstrating commit-signing.
    EOF

    # Add and commit the change, signing the commit.
    git add information.md
    git commit -S -m "Update to information.md"

    # Look at the commit object
    git cat-file -p master

    # Create an annotated tag from the Rel2 branch, signing it.
    git tag -s -u <key-id> -m "Release 2." Rel2.0 Rel2

    # Look at the tag object
    git cat-file -p Rel2.0

    # Verify the signed tag and the signed commit.
    git tag -v Rel2.0
    git log --show-signature -1

### Slide: Git Objects – blobs

    # Generate the SHA1 of the contents of the ~/.bash_history file as
    # git would
    cat ~/.bash_history | git hash-object --stdin

### Slide: Git Objects – trees

    cd ${BASE_DIR}/devTeamDemo/javaBootcampNoEclipse.dev1

    # Get the tree object for the latest version of the top-level of the
    # project
    git cat-file -p master | grep '^tree'

    # Look at the contents of that tree object:
    git cat-file -p $(git cat-file -p master | grep '^tree' | sed 's/^tree //')

### Slide: Git Objects – commits

    cd ${BASE_DIR}/simpleRepo
    git log --graph --all --decorate --oneline
    git cat-file -p HEAD
    git cat-file -p <commit-id>

### Slide: Git Objects – tags

    # Find all the tag objects
    git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize)' --batch-all-objects | grep tag
    git cat-file -p <tag-object-id>

### Slide: "Content Addressable Filesystem"

    # Find all the objects, and select one that refers to a file
    git rev-list --all --objects

    # Look at the contents of the selected object
    git cat-file -p <object-id>

    # Use git to get the SHA1 of the contents of that object
    git cat-file -p <object-id> | git hash-object --stdin

    # The name of the file for that object is based on the SHA1.
    ls -l .git/objects/...

### Slide: The reflog

    # Look at the reflog, then clear it completely and look at it again.
    git reflog
    git reflog expire --expire=now --expire-unreachable=now --verbose --all
    git reflog

### Slide: fsck and gc

    git fsck
    git gc

### Slide: Useful commands

    # Go into the repo where the large file had been created
    cd ${BASE_DIR}/largeFileRepo

    # List all the blob objects in increasing order of object size.
    git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize)' --batch-all-objects | sed -n 's/^blob //p' | sort -n --key=2

    # Look for the file name associated with an object
    git rev-list --objects --all | grep <object-id>

    # Look for the commits that made changes to a specific file
    git log --follow -- "largeInformation.md"

### Slide: Permanently removing a file from your git db

    # Preserve information on the tags, as you may need this later.
    for tag in $(git tag)
    do
        echo "${tag},$(git log --format="%H,\"%cn\",\"%ci\",\"%s\"" ${tag} | head -1)"
    done | tee /tmp/tag_list.csv

    # Create a local branch for all the remote branches. This does nothing
    # in this demo as there is no remote.
    for branch in $(git branch -r | grep -v HEAD | sed 's/\ \ origin\///')
    do
        git branch ${branch} origin/${branch}
    done

    # Check out each branch and determine the amount of space it uses
    for branch in $(git branch | sed 's/^..//')
    do
        git checkout -q ${branch}
        du -sk . \
            | sed "s/\./${branch}/"
    done | tee /tmp/branch_sizes.out

    # Check out the Rel1 branch
    git checkout Rel1

    # Use git-filter-branch to remove the file from all the commits on the
    # branch
    git filter-branch --tree-filter 'rm -f largeInformation.md' --prune-empty HEAD

    # Delete the backup refs that filter-branch leaves under refs/original/
    # (will fail for some branches)
    git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d

    # Do the same for all the other branches
    for branch in Rel2 Rel3 Rel4 master
    do
        git checkout ${branch}
        git filter-branch --tree-filter 'rm -f largeInformation.md' --prune-empty HEAD
        git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
    done | tee /tmp/cleanup.out

    # Clear out the reflog
    git reflog expire --expire-unreachable=now --all

    # Run the garbage collector on the repository
    git gc --prune=now

    # Run an FSCK on the repo
    git fsck --unreachable --no-reflogs

    # Check out each branch again and determine the amount of space it
    # uses
    for branch in $(git branch | sed 's/^..//')
    do
        git checkout -q ${branch}
        du -sk . | sed "s/\./${branch}/"
    done | tee /tmp/branch_sizes_post_process.out
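
The two size listings written by the loops above, `/tmp/branch_sizes.out` and `/tmp/branch_sizes_post_process.out`, can be compared branch by branch to see how much space the clean-up recovered. The following is a minimal sketch and not part of the original pack: `compare_sizes` is a hypothetical helper name, and it assumes each line of both files has the form `size<TAB>branch`, which is what `du -sk . | sed "s/\./${branch}/"` should produce.

```shell
#!/bin/sh
# Hypothetical helper (not part of the pack): compare two listings of the
# form "size<TAB>branch" and print, for each branch present in both,
#   branch  size-before  size-after  difference   (all sizes in KB)
compare_sizes() {
    before="$1"
    after="$2"
    # awk reads the "before" file first (NR == FNR) and remembers each size
    # by branch name, then prints one line per matching branch in "after".
    awk 'NR == FNR { size[$2] = $1; next }
         $2 in size { printf "%s %d %d %d\n", $2, size[$2], $1, size[$2] - $1 }' \
        "$before" "$after"
}

# Only run the comparison if both listings from the walkthrough exist.
if [ -r /tmp/branch_sizes.out ] && [ -r /tmp/branch_sizes_post_process.out ]; then
    compare_sizes /tmp/branch_sizes.out /tmp/branch_sizes_post_process.out
fi
```

A large positive value in the last column for a branch suggests that the working tree for that branch no longer contains the large file.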