#+OPTIONS: ^:{} toc:nil author:nil num:nil
* Git Deep Dive
A Version 1 Technical Meetup talk covering the deep internals of git.
** Introduction
This is the slide-deck and set-up scripts used to give the /Git
Deep Dive/ technical meet-up talk on [2018-06-25 Mon] in Version 1,
by Éibhear Ó hAnluain.

The pack contains the following:
- This document
- The slide-deck, =GitDeepDive.pdf=
- A script to set up a simple git repository for exploration
  purposes: =merge-setup.sh=
- A script that mimics a development team and its processes:
  =gitDemo.sh=
- A script to set up another simple repository for running through
  the process of completely clearing a file out of a git database:
  =largeFile-setup.sh=
- A =.gitignore= file
** Setup
1) Run the script =merge-setup.sh=. This creates a git repository in
   =simpleRepo= and populates it with some commits. The commits in
   this repository correspond to the diagrams in the slide-deck that
   relate to merging.
2) Run the script =largeFile-setup.sh=. This creates another git
   repository in =largeFileRepo=, populates it with the same commits
   as =simpleRepo=, and then adds further commits that create,
   manipulate and finally remove a large binary file. This repository
   is used for the demonstration of removing a file completely from
   a repository.
3) Run the script =gitDemo.sh=. This replicates a previous git
   repository as a bare repo in
   =devTeamDemo/javaBootcampNoEclipse.git=, and then creates a number
   of clones to represent the actions of a team lead and two
   developers, mimicking collaborative development among them.
   =gitDemo.sh= takes one optional parameter, =-s=, which pauses the
   progression for 15 seconds after each "actor's" push, to give you
   an opportunity to look at the git log.
4) Use the =gpg= utility to generate a key-pair so that you can work
   through the commit- and tag-signing slide and sample commands. If
   you have access to =gpg2= and not =gpg=, set the following
   configuration so that git picks it up:
   =git config --global gpg.program gpg2=.
5) Open the file =GitDeepDive.pdf= and, as you go through the slides,
   refer to the /Sample commands/ section below for commands you can
   execute to get further insights.
** Sample commands
*** Slide: Configuration
#+BEGIN_SRC shell
# Go to where the GitDeepDive.pdf file is and set your BASE_DIR
# environment variable.
cd <where-you-have-this-repo-cloned-to>
export BASE_DIR=$(pwd)
# List your basic git config
git config --list
# Go into the simple repo location and look at the various config
# contexts
cd ${BASE_DIR}/simpleRepo
git config --local --list
git config --global --list
git config --system --list
# Set some settings
git config --global user.name "<YourName>"
git config --global user.email "<YourEmailAddress>"
# Use a text editor to edit your config
git config -e
#+END_SRC
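To see which configuration file supplies a given value, =git config
--show-origin= (available in git 2.8 and later) prints the source
alongside each setting. The following is a minimal sketch in a
throw-away repository; the directory =cfgDemo= and the name =Local
Demo Name= are invented for illustration and are not part of the
pack.
#+BEGIN_SRC shell
# Work in a scratch repository so that no real configuration is touched
scratch=$(mktemp -d)
git init -q "${scratch}/cfgDemo"
cd "${scratch}/cfgDemo"
# A --local value takes precedence over any --global or --system value
git config --local user.name "Local Demo Name"
effective_name=$(git config --get user.name)
echo "${effective_name}"
# Show every user.name value git can see, with the file it comes from
git config --show-origin --get-all user.name
#+END_SRC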
*** Slide: fetch and merge, not pull
#+BEGIN_SRC shell
# Go into the clone belonging to one of the developers in the
# development team demo area
cd ${BASE_DIR}/devTeamDemo/javaBootcampNoEclipse.dev1
# Update the clone, but don't merge anything
git fetch --prune
# Review the local and remote branches.
git branch -va
#+END_SRC
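After a fetch, the incoming commits can be reviewed before anything
is merged. The sketch below builds its own bare repository and two
clones rather than relying on =gitDemo.sh=; the names =seed=,
=origin.git=, =dev1= and =dev2= and the demo identity are
assumptions made for illustration.
#+BEGIN_SRC shell
scratch=$(mktemp -d)
# Demo identity, so that commits work without any global configuration
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
# Seed repository with one commit on master
git init -q "${scratch}/seed"
cd "${scratch}/seed"
git symbolic-ref HEAD refs/heads/master
echo "first" > information.md
git add information.md
git commit -q -m "First commit"
# A bare "central" repository and two developer clones
git clone -q --bare "${scratch}/seed" "${scratch}/origin.git"
git clone -q "${scratch}/origin.git" "${scratch}/dev1"
git clone -q "${scratch}/origin.git" "${scratch}/dev2"
# dev2 pushes a new commit
cd "${scratch}/dev2"
echo "second" >> information.md
git commit -q -am "Second commit"
git push -q origin master
# dev1 fetches, reviews what is incoming, and only then merges
cd "${scratch}/dev1"
git fetch -q --prune
incoming=$(git rev-list --count master..origin/master)
echo "Incoming commits: ${incoming}"
git log --oneline master..origin/master
git merge -q --ff-only origin/master
#+END_SRC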
*** Slide: Merging approaches: fast-forward
#+BEGIN_SRC shell
cd ${BASE_DIR}/simpleRepo
# Check out the master branch and review its log
git checkout master
git log --decorate --graph --oneline --all
# Merge in the Rel1 branch and review the new log.
git merge Rel1
git log --decorate --graph --oneline --all
#+END_SRC
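A fast-forward is only possible when the current branch is an
ancestor of the branch being merged. One way to check this up front
is =git merge-base --is-ancestor=. This is a sketch in a scratch
repository (=ffDemo= is a made-up name) whose branch names merely
echo those used in =simpleRepo=.
#+BEGIN_SRC shell
scratch=$(mktemp -d)
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
git init -q "${scratch}/ffDemo"
cd "${scratch}/ffDemo"
git symbolic-ref HEAD refs/heads/master
echo "one" > information.md
git add information.md
git commit -q -m "Initial commit"
# Rel1 moves ahead while master stays where it is
git checkout -q -b Rel1
echo "two" >> information.md
git commit -q -am "Rel1 work"
git checkout -q master
# Exit status 0 means master is an ancestor of Rel1
if git merge-base --is-ancestor master Rel1
then
  ff_possible=yes
else
  ff_possible=no
fi
echo "Fast-forward possible: ${ff_possible}"
# --ff-only refuses to merge if a merge commit would be needed
git merge -q --ff-only Rel1
#+END_SRC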
*** Slide: Merging approaches: merging strategies
#+BEGIN_SRC shell
cd ${BASE_DIR}/simpleRepo
# Check out the master branch, merge the Rel2 branch and review the
# new log.
git checkout master
# You'll be prompted for a commit message here.
git merge Rel2
git log --decorate --graph --oneline --all
#+END_SRC
*** Slide: Merging approaches: Rebase
#+BEGIN_SRC shell
# Remove and refresh the simpleRepo
${BASE_DIR}/merge-setup.sh
cd ${BASE_DIR}/simpleRepo
# Check out master, review its log, merge Rel1 and review the log
git checkout master
git log --decorate --graph --oneline --all
git merge Rel1
git log --decorate --graph --oneline --all
# Check out Rel2
git checkout Rel2
# Rebase Rel2 onto the now-new master
git rebase master
# Review the log
git log --decorate --graph --oneline --all
#+END_SRC
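Rebasing does not move commits; it creates new ones with new IDs on
top of the target branch. The following sketch records the tip of a
branch before and after a rebase to show this. It uses a scratch
repository (=rebaseDemo=, an invented name) so that =simpleRepo= is
left alone.
#+BEGIN_SRC shell
scratch=$(mktemp -d)
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
git init -q "${scratch}/rebaseDemo"
cd "${scratch}/rebaseDemo"
git symbolic-ref HEAD refs/heads/master
echo "base" > information.md
git add information.md
git commit -q -m "Base commit"
# Rel2 and master diverge, touching different files to avoid conflicts
git checkout -q -b Rel2
echo "rel2" > rel2.md
git add rel2.md
git commit -q -m "Rel2 work"
git checkout -q master
echo "master" >> information.md
git commit -q -am "Master work"
# Rebase Rel2 onto master and compare the branch tip before and after
git checkout -q Rel2
before=$(git rev-parse Rel2)
git rebase -q master
after=$(git rev-parse Rel2)
echo "Rel2 before rebase: ${before}"
echo "Rel2 after rebase:  ${after}"
git log --decorate --graph --oneline --all
#+END_SRC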
*** Slide: refs
#+BEGIN_SRC shell
# Remove and refresh the simpleRepo
${BASE_DIR}/merge-setup.sh
cd ${BASE_DIR}/simpleRepo
# Look at the contents of .git/refs/heads and one of the heads itself
ls -l .git/refs/heads/
cat .git/refs/heads/Rel2
# Tag a branch, look at the branch ref and the tag ref
git tag Rel1.0 Rel1
cat .git/refs/heads/Rel1
cat .git/refs/tags/Rel1.0
# Look at the HEAD ref
cat .git/HEAD
# Go to another clone and look at the refs for the origin remote
cd ${BASE_DIR}/devTeamDemo/javaBootcampNoEclipse.dev1
ls -l .git/refs/remotes/origin/
# .git/packed-refs holds refs in packed form. A ref typically only gets
# its own file under .git/refs once it has been updated locally.
cat .git/packed-refs
#+END_SRC
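Because a ref may live either in its own file or in
=.git/packed-refs=, commands such as =git rev-parse= are a more
reliable way to resolve one than =cat=. The sketch below assumes
git's default file-based ref storage and uses a scratch repository
(=refsDemo=, a made-up name) to show a loose ref being packed.
#+BEGIN_SRC shell
scratch=$(mktemp -d)
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
git init -q "${scratch}/refsDemo"
cd "${scratch}/refsDemo"
git symbolic-ref HEAD refs/heads/master
echo "one" > information.md
git add information.md
git commit -q -m "Initial commit"
git branch Rel2
# The new branch starts life as a loose ref, one file per ref
ls -l .git/refs/heads/
before=$(git rev-parse Rel2)
# Pack all refs; the loose files are removed in the process
git pack-refs --all
ls -l .git/refs/heads/
cat .git/packed-refs
# The ref still resolves to the same commit
after=$(git rev-parse Rel2)
echo "${before} ${after}"
#+END_SRC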
*** Slide: Annotated tags
#+BEGIN_SRC shell
cd ${BASE_DIR}/simpleRepo
# Look at a "lightweight" tag
cat .git/refs/tags/Rel1.0
# What type of object is it pointing to?
git cat-file -t $(cat .git/refs/tags/Rel1.0)
# What's in the object it's pointing to
git cat-file -p $(cat .git/refs/tags/Rel1.0)
# Create an annotated tag and look at *it*
git tag -a -m "Formal release of 1.0" Rel1.0.prod Rel1
cat .git/refs/tags/Rel1.0.prod
# What type of object is it pointing to?
git cat-file -t $(cat .git/refs/tags/Rel1.0.prod)
# What's in the object it's pointing to
git cat-file -p $(cat .git/refs/tags/Rel1.0.prod)
#+END_SRC
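The same comparison can be made without reading the ref files
directly, and an annotated tag can be "peeled" to the commit it
ultimately names using the =^{commit}= suffix. A minimal sketch in a
scratch repository (=tagDemo= is an invented name):
#+BEGIN_SRC shell
scratch=$(mktemp -d)
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
git init -q "${scratch}/tagDemo"
cd "${scratch}/tagDemo"
git symbolic-ref HEAD refs/heads/master
echo "one" > information.md
git add information.md
git commit -q -m "Initial commit"
# One lightweight and one annotated tag on the same commit
git tag Rel1.0
git tag -a -m "Formal release of 1.0" Rel1.0.prod
# A lightweight tag names the commit itself; an annotated tag names a
# tag object
light_type=$(git cat-file -t Rel1.0)
annotated_type=$(git cat-file -t Rel1.0.prod)
echo "Rel1.0 -> ${light_type}, Rel1.0.prod -> ${annotated_type}"
# Peel the annotated tag down to the commit it refers to
peeled=$(git rev-parse 'Rel1.0.prod^{commit}')
echo "${peeled}"
#+END_SRC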
*** Slide: blame
#+BEGIN_SRC shell
# Look at the lines of the file on master
git checkout master
git blame information.md
# .. and on Rel1
git checkout Rel1
git blame information.md
# And on master after Rel2 has been merged in.
git merge Rel2
git blame information.md
#+END_SRC
*** Slide: Tag and commit signing
#+BEGIN_SRC shell
# Look at your secret keys
gpg --list-secret-keys
# Check out master and make a change to information.md
git checkout master
cat <<EOF >> information.md
An additional line for demonstrating commit-signing.
EOF
# Add and commit the change, signing the commit.
git add information.md
git commit -S<secretKeyID> -m "Update to information.md"
# Look at the commit object
git cat-file -p master
# Create an annotated tag from the Rel2 branch, signing it.
git tag -s -u <secretKeyID> -m "Release 2." Rel2.0 Rel2
# Look at the tag object
git cat-file -p Rel2.0
# Verify the signed tag and the signed commit.
git tag -v Rel2.0
git log --show-signature -1
#+END_SRC
*** Slide: Git Objects -- blobs
#+BEGIN_SRC shell
# Generate the SHA1 of the contents of the ~/.bash_history file, as git
# would compute it for a blob
cat ~/.bash_history | git hash-object --stdin
#+END_SRC
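The ID that =git hash-object= reports is not the SHA1 of the raw
content alone: git first prepends a header of the form =blob <size>=
followed by a NUL byte. The sketch below reproduces the ID by hand.
It assumes the default SHA1 object format and that the =sha1sum=
utility from GNU coreutils is available (on macOS, =shasum= plays
the same role); the sample content is invented.
#+BEGIN_SRC shell
# Work outside any existing repository
cd "$(mktemp -d)"
content="hello git"
# The object ID that git itself computes for this content
git_id=$(printf '%s\n' "${content}" | git hash-object --stdin)
# Reconstruct it by hand: "blob <size in bytes>", a NUL byte, then the
# content
size=$(printf '%s\n' "${content}" | wc -c | tr -d ' ')
manual_id=$( { printf 'blob %s\0' "${size}"; printf '%s\n' "${content}"; } | sha1sum | cut -d' ' -f1 )
echo "git:    ${git_id}"
echo "manual: ${manual_id}"
#+END_SRC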
*** Slide: Git Objects -- trees
#+BEGIN_SRC shell
cd ${BASE_DIR}/devTeamDemo/javaBootcampNoEclipse.dev1
# Get the tree object for the latest version of the top-level of the
# project
git cat-file -p master | grep '^tree'
# Look at the contents of that tree object:
git cat-file -p $(git cat-file -p master | grep '^tree' | sed 's/^tree //')
#+END_SRC
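The =grep= and =sed= pipeline above can be shortened with the
=^{tree}= revision suffix, and =git ls-tree= prints a tree's entries
directly. A minimal sketch in a scratch repository (=treeDemo= is an
invented name), showing that both routes give the same tree ID:
#+BEGIN_SRC shell
scratch=$(mktemp -d)
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
git init -q "${scratch}/treeDemo"
cd "${scratch}/treeDemo"
git symbolic-ref HEAD refs/heads/master
echo "one" > information.md
git add information.md
git commit -q -m "Initial commit"
# The long way: read the tree line out of the commit object
via_commit=$(git cat-file -p master | sed -n 's/^tree //p')
# The short way: ask git to peel the commit to its tree
via_peel=$(git rev-parse 'master^{tree}')
echo "${via_commit} ${via_peel}"
# List the entries of the top-level tree
git ls-tree master
#+END_SRC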
*** Slide: Git Objects -- commits
#+BEGIN_SRC shell
cd ${BASE_DIR}/simpleRepo
git log --graph --all --decorate --oneline
git cat-file -p HEAD
git cat-file -p <anyOtherCommitId>
#+END_SRC
*** Slide: Git Objects -- tags
#+BEGIN_SRC shell
# Find all the tag objects
git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize)' --batch-all-objects | grep tag
git cat-file -p <anyOfTheTagObjects>
#+END_SRC
*** Slide: "Content Addressable Filesystem"
#+BEGIN_SRC shell
# Find all the objects, and select one that refers to a file
git rev-list --all --objects
# Look at the contents of the selected object
git cat-file -p <selectedBlobObject>
# Use git to get the SHA1 of the contents of that object
git cat-file -p <selectedBlobObject> | git hash-object --stdin
# The name of the file for that object is based on the SHA1.
ls -l .git/objects/...
#+END_SRC
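For a loose (not yet packed) object, the path under =.git/objects=
is formed from the first two characters of the ID as a directory name
and the remaining characters as the file name. The sketch below
writes a blob into a scratch repository (=casDemo=, a made-up name)
and locates its file; objects that have been packed by =git gc= live
in pack files instead and will not be found this way.
#+BEGIN_SRC shell
scratch=$(mktemp -d)
git init -q "${scratch}/casDemo"
cd "${scratch}/casDemo"
# -w writes the object into the database as well as printing its ID
blob_id=$(echo "content addressable" | git hash-object -w --stdin)
# Split the ID into directory (2 characters) and file name (the rest)
obj_dir=$(echo "${blob_id}" | cut -c1-2)
obj_file=$(echo "${blob_id}" | cut -c3-)
ls -l ".git/objects/${obj_dir}/${obj_file}"
# The object's content is retrievable by its ID alone
git cat-file -p "${blob_id}"
#+END_SRC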
*** Slide: The reflog
#+BEGIN_SRC shell
# Look at the reflog, then clear it completely and look at it again.
git reflog
git reflog expire --expire=now --expire-unreachable=now --verbose --all
git reflog
#+END_SRC
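The reflog is also what makes it possible to recover from a mistaken
reset, which is why expiring it is destructive. A sketch of such a
recovery in a scratch repository (=reflogDemo= is an invented name):
#+BEGIN_SRC shell
scratch=$(mktemp -d)
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
git init -q "${scratch}/reflogDemo"
cd "${scratch}/reflogDemo"
git symbolic-ref HEAD refs/heads/master
echo "one" > information.md
git add information.md
git commit -q -m "First commit"
echo "two" >> information.md
git commit -q -am "Second commit"
tip=$(git rev-parse HEAD)
# "Lose" the second commit with a hard reset
git reset -q --hard HEAD~1
git reflog
# HEAD@{1} is where HEAD pointed before the reset
git reset -q --hard 'HEAD@{1}'
recovered=$(git rev-parse HEAD)
echo "${tip} ${recovered}"
#+END_SRC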
*** Slide: fsck and gc
#+BEGIN_SRC shell
git fsck
git gc
#+END_SRC
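To see the two commands work together, the following sketch creates
an object that nothing refers to, lets =git fsck= report it as
dangling, and then has =git gc --prune=now= remove it. The scratch
repository name =gcDemo= and the blob content are invented for
illustration.
#+BEGIN_SRC shell
scratch=$(mktemp -d)
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
git init -q "${scratch}/gcDemo"
cd "${scratch}/gcDemo"
git symbolic-ref HEAD refs/heads/master
echo "one" > information.md
git add information.md
git commit -q -m "Initial commit"
# Write a blob that no tree, commit or ref refers to
blob_id=$(echo "never committed" | git hash-object -w --stdin)
# fsck lists it as a dangling blob
fsck_report=$(git fsck 2> /dev/null)
echo "${fsck_report}"
# gc with --prune=now deletes unreachable objects immediately
git gc -q --prune=now
if git cat-file -e "${blob_id}" 2> /dev/null
then
  blob_state=present
else
  blob_state=removed
fi
echo "Dangling blob is ${blob_state}"
#+END_SRC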
*** Slide: Useful commands
#+BEGIN_SRC shell
# Go into the repo where the large file had been created
cd ${BASE_DIR}/largeFileRepo
# List all the blob objects (ID and size) in increasing order of size.
git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize)' --batch-all-objects | sed -n 's/^blob //p' | sort -n --key=2
# Look for the file name associated with an object
git rev-list --objects --all | grep <blobID>
# Look for the commits that made changes to a specific file
git log --follow -- "largeInformation.md"
#+END_SRC
*** Slide: Permanently removing a file from your git db [1/2]
#+BEGIN_SRC shell
# Preserve information on the tags, as you may need this later.
for tag in $(git tag)
do
  echo "${tag},$(git log --format="%H,\"%cn\",\"%ci\",\"%s\"" ${tag} | head -1)"
done | tee /tmp/tag_list.csv
# Create a local branch for all the remote branches. This does nothing
# in this demo as there is no remote.
for branch in $(git branch -r | grep -v HEAD | sed 's/\ \ origin\///')
do
  git branch ${branch} origin/${branch}
done
# Check out each branch and determine the amount of space it uses
for branch in $(git branch | sed 's/^..//')
do
  git checkout -q ${branch}
  du -sk . | sed "s/\./${branch}/"
done | tee /tmp/branch_sizes.out
# Check out the Rel1 branch
git checkout Rel1
# Use git-filter-branch to remove the file from all the commits on the
# branch
git filter-branch --tree-filter 'rm -f largeInformation.md' --prune-empty HEAD
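# Delete the backup refs that filter-branch leaves under refs/original/
# (this fails harmlessly where filter-branch left a branch unchanged, as
# there is then no backup ref to delete)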
git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
# Do the same for all the other branches
for branch in Rel2 Rel3 Rel4 master
do
  git checkout ${branch}
  git filter-branch --tree-filter 'rm -f largeInformation.md' --prune-empty HEAD
  git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
done | tee /tmp/cleanup.out
# Expire the reflog entries that still refer to the old, now-unreachable
# commits
git reflog expire --expire-unreachable=now --all
# Run the garbage collector on the repository
git gc --prune=now
# Run an FSCK on the repo
git fsck --unreachable --no-reflogs
# Check out each branch again and determine the amount of space it
# uses
for branch in $(git branch | sed 's/^..//')
do
  git checkout -q ${branch}
  du -sk . | sed "s/\./${branch}/"
done | tee /tmp/branch_sizes_post_process.out
#+END_SRC
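Once the clean-up has run, it is worth confirming that the file's
blob has really left the object database. The sketch below repeats
the same sequence on a tiny scratch repository with a single branch
and then checks for the blob by ID. The repository name =removeDemo=
and the file contents are invented; =FILTER_BRANCH_SQUELCH_WARNING=
only has an effect on newer git versions, which otherwise print a
warning and pause before running =filter-branch=.
#+BEGIN_SRC shell
scratch=$(mktemp -d)
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
export FILTER_BRANCH_SQUELCH_WARNING=1
git init -q "${scratch}/removeDemo"
cd "${scratch}/removeDemo"
git symbolic-ref HEAD refs/heads/master
echo "small" > information.md
git add information.md
git commit -q -m "Add information.md"
echo "pretend this is large" > largeInformation.md
git add largeInformation.md
git commit -q -m "Add largeInformation.md"
# Remember the ID of the blob to be removed
blob_id=$(git rev-parse HEAD:largeInformation.md)
# The same clean-up sequence, for the one branch in this repository
git filter-branch --tree-filter 'rm -f largeInformation.md' --prune-empty HEAD > /dev/null
git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
git reflog expire --expire-unreachable=now --all
git gc -q --prune=now
# The blob should no longer exist in the object database
if git cat-file -e "${blob_id}" 2> /dev/null
then
  blob_state=present
else
  blob_state=removed
fi
echo "largeInformation.md blob is ${blob_state}"
# No commit on any branch should touch the file any more
commits_touching_file=$(git log --all --oneline -- largeInformation.md | wc -l | tr -d ' ')
echo "Commits touching the file: ${commits_touching_file}"
#+END_SRC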