MARY:  Is that better?  How about that?  Okay?  Medium?
Terrible?  Just raise your hand if you can't hear me at any
point and I will talk louder.  I'm going to talk about git from
the inside out.  A quick overview of the talk structure: git
ultimately is a graph, and it's this graph that dictates git's
behavior, so if you understand this graph, you understand git.
           So we're going to run a bunch of commands on a git
repository and then observe how those commands change the git
graph.  So let's start by creating a project.  I'm going to make
a directory called alpha to hold the project and then cd into
that directory.  I'm going to make a directory called data
inside that directory and then inside that create a file called
letter.txt that contains the letter A.  So, quick overview: I've
got an alpha directory that contains our whole repo, and inside
that, a data directory, and then inside that, a letter.txt that
has the letter A.  So let's initialize this git repository.  So
we run git init.  So here's what we had before, the working
copy, and the working copy is the letter.txt file.  And then
there's this .git directory; that's not our stuff, that's git's
stuff, and it contains everything that git needs to do its job.
So let's add files to git.  So I'm going to add letter.txt.  So
stuff has changed.  So we still have the .git directory, but
inside its objects directory we've now got a new folder called
2e and inside that we've got a file called 65, so what the hell
is that?  So a quick digression about hashes in general.  So a
hash is a way of taking a large piece of information and boiling
it down to a short identifier that is, in practice, unique.  So
for example you could take a novel like Anna Karenina, hash it,
and get a 40-character string.  And git uses hashes extensively
to identify pieces of information.  So for example if we use git
to hash the contents of the letter.txt file, then we get the
hash 2e65.  So like I said, hashes are 40 characters long, but
I've just shortened them to four characters for the clarity of
this presentation.
So this is making a little more sense now: I've got the 2e
folder, which is the first part of the hash, and then the 65
file, which is the second part, inside that directory.  If we
cat that file, the output is not helpful.  It's compressed, but
we can use this super cool command called git cat-file: you just
give it the hash of something in your objects directory, for
example 2e65, the hash we saw a moment ago, and it will give you
a human-readable form of whatever you've saved away.
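
Editor's sketch, not part of the talk: the hashing and storage
scheme described above can be modelled in a few lines of Python.
The blob header, SHA-1 and zlib compression follow git's object
format; the function names and the dict standing in for the disk
are assumptions made for illustration.

```python
import hashlib
import zlib


def blob_bytes(content: bytes) -> bytes:
    """Git stores a blob as a small header followed by the raw content."""
    return b"blob " + str(len(content)).encode() + b"\0" + content


def blob_hash(content: bytes) -> str:
    """The object's name is the SHA-1 of header plus content (40 hex chars)."""
    return hashlib.sha1(blob_bytes(content)).hexdigest()


def object_path(sha: str) -> str:
    """The first two hex characters name the folder, the rest name the file."""
    return f".git/objects/{sha[:2]}/{sha[2:]}"


def store(objects: dict, content: bytes) -> str:
    """Save the compressed object under its path (a dict stands in for disk)."""
    sha = blob_hash(content)
    objects[object_path(sha)] = zlib.compress(blob_bytes(content))
    return sha


def cat_file(objects: dict, sha: str) -> bytes:
    """Decompress an object and drop its header to get readable content."""
    raw = zlib.decompress(objects[object_path(sha)])
    return raw.split(b"\0", 1)[1]


objects = {}
sha = store(objects, b"a")          # the content of letter.txt in the talk
print(object_path(sha), cat_file(objects, sha))
```

The compressed bytes are why reading the object file directly
is unhelpful, and why a decompressing reader is needed to see
the content again.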
AUDIENCE MEMBER:  Whoa...
MARY:  So, the second part of adding a file, and remember this
all started with git add.  In the first part we added a blob
object that stores the contents of the letter.txt file.  Now we
make an entry in the index.  And so here's the index file, and
unfortunately when you cat it, you don't get anything super
helpful, but you can use the ls-files command to give you a
human-readable version of the index.  So you can see our index
has one entry, and the first part of the entry is the file path,
the file that we added.  And then the second part is the hash of
the content at the moment that we added it, so there's our old
friend, 2e65.  So we've added a file to our repository.  So this
is a sort of notional representation of the git graph.  So on
the left we've got the working copy, which has one thing in it,
the letter.txt file, and that points at its content, A.  And
that gray square represents the content of the letter.txt file
in our working copy.  And then on the right there's the
representation of the index, so again there's one entry and it
points at that A blob that's in the objects directory.  So the
blue blob represents the blob inside the objects directory that
we've created.  So let's create a number.txt and put 1234 in it.
Now let's git add number.txt.  And a new entry is added to the
index which points at that new blob, okay?  Now let's change
number to be 1, where it did read 1234.  And now in the working
copy on the bottom left, we've updated that to 1.  Of course we
can re-add it to update the existing entry.  So the old 1234
blob: nobody's pointing at that now.  The index entry for
number.txt has been repointed at the new 1 blob.  So now let's
make a commit.
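
Editor's sketch, not part of the talk: before the commit, the
two-part add described above can be modelled with two Python
dicts.  The names add, objects and index are hypothetical, and
the index is reduced to a mapping from file path to blob hash.

```python
import hashlib


def blob_hash(content: bytes) -> str:
    """SHA-1 over git's blob header plus the content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def add(objects: dict, index: dict, path: str, content: bytes) -> str:
    """Model of an add: store a blob, then point the index entry at it."""
    sha = blob_hash(content)
    objects[sha] = content   # part one: the blob goes into the object store
    index[path] = sha        # part two: the index entry is (re)pointed
    return sha


objects, index = {}, {}
add(objects, index, "data/letter.txt", b"a")
old = add(objects, index, "data/number.txt", b"1234")
new = add(objects, index, "data/number.txt", b"1")   # re-add after editing

# The 1234 blob is still stored, but no index entry points at it.
unreferenced = set(objects) - set(index.values())
print(len(index), unreferenced == {old})
```

Re-adding never rewrites the old blob; it only creates a new one
and moves the index entry, which is why the 1234 blob is left
behind with nothing pointing at it.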
So I'm going to do a commit with the message A1, and all the
messages will look like this.  So I'll have a letter, A, which
represents the contents of the letter file, and a number, one,
which represents the contents of the number file.  So here's our
graph again without the working copy or the index.  So we've got
these two blobs, ultimately, the A blob and the 1 blob.  And
when we committed, a bunch of things happened.  A bunch of
objects got created, and here's our first customer.  The first
object has the hash 0eed.  It's got two entries.  Entry number
one says, hey, it's a blob, it's got this hash, and it
represents this file.  So this of course represents the
letter.txt file inside our working copy.  So what we're trying
to do here is represent the contents of our working copy and its
structure.
So we've successfully saved the contents of those files, which
is great, but there's no indication that those two pieces of
content existed in files called letter.txt and number.txt, and
there's no indication that they were inside a data directory or
anything like that.  So these new objects represent all that.
So we can see this first object represents the contents of the
data directory, okay, so it's got two entries, one for letter
and one for number, and now that's in the objects directory.
Now, there's another new object which has one entry in it, which
is a tree, which is called data.  So this, of course, represents
the contents of the alpha directory, the top level of the tree.
So let's add those to our graph.  So what we're doing, like I
said, is representing the structure as objects.  Now, there's
another new object, and that's the commit object.  So we can
just look at that; it's inside the objects directory just like
everything else, and it's got a few interesting things.  Number
one, it's got a pointer to the top of that tree graph that we've
just created.  It's got the author and the time stamp, and it's
got the commit message.  So let's add that to our git graph.
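
Editor's sketch, not part of the talk: the tree and commit
objects described above can be modelled as follows.  The text
encoding is deliberately simplified, so these hashes will not
match real git; put, write_tree and write_commit are
hypothetical names, and the optional parent argument is an
assumption that anticipates commits after the first.

```python
import hashlib


def put(objects: dict, kind: str, body: str) -> str:
    """Store an object under the hash of its type and body (simplified)."""
    sha = hashlib.sha1(f"{kind}\0{body}".encode()).hexdigest()
    objects[sha] = (kind, body)
    return sha


def write_tree(objects: dict, index: dict, prefix: str = "") -> str:
    """Turn the index entries under prefix into nested tree objects."""
    entries = {}
    for path, sha in index.items():
        if not path.startswith(prefix):
            continue
        name, _, rest = path[len(prefix):].partition("/")
        if rest:   # a subdirectory: build its tree and point at that
            sub = write_tree(objects, index, prefix + name + "/")
            entries[name] = ("tree", sub)
        else:      # a file: point straight at its blob
            entries[name] = ("blob", sha)
    body = "\n".join(f"{kind} {sha} {name}"
                     for name, (kind, sha) in sorted(entries.items()))
    return put(objects, "tree", body)


def write_commit(objects, tree, message, author, when, parent=None):
    """A commit points at the root tree and carries author, time, message."""
    lines = [f"tree {tree}"]
    if parent:
        lines.append(f"parent {parent}")
    lines += [f"author {author} {when}", "", message]
    return put(objects, "commit", "\n".join(lines))


objects = {}
index = {
    "data/letter.txt": put(objects, "blob", "a"),
    "data/number.txt": put(objects, "blob", "1"),
}
root = write_tree(objects, index)
a1 = write_commit(objects, root, "a1", "Hypothetical Author", 0)
print(objects[root][1])
```

The root tree has a single entry, the data tree, and the data
tree has one entry per file, which mirrors the directory layout
of the working copy.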
So now we've got things shaping up: there's the pink commit
object, A1, which points at its content.  Next, step three of
committing.  We cat HEAD.  Now, that's just a file.  It's inside
the .git directory, and it's called HEAD.  We can see that it
contains refs/heads/master.  So let's cat that file.  So we cat
.git/refs/heads/master.  So okay, this is a file on disk.
And it represents the master branch.  So now we can add a
pointer for the master branch to the graph.  So we've got
master, which points at the A1 commit, and then the A1 commit
points at its content, okay?  Let's pop back to HEAD for a
second, and now HEAD is making a little more sense.  HEAD just
tells us what's checked out.  Therefore, HEAD points at master,
and master points at the commit, and so on.  Okay?  Now, let's
make a commit that is not the first commit.  So here's where we
were.  Just a quick summary: I've added back in the working copy
and the index, so there's the working copy with the two separate
files.  Notice how the index points at the two blobs that we've
got, okay.  Now let's change number to be two.  And let's do git
add.  So when we change number to be two, then the entry in the
working copy in the bottom left there, that gray box, changes
from one to two.  Now, when we add that file to git then, of
course, we're going to update the index again.  Now the index
points at that new 2 blob that got created as part of the add
step.  So here's where we are.  Now let's do another commit.
Let's do the A2 commit.  So again, let's follow through those
commit steps.  Number one, make a tree graph of the contents of
the index.  It takes the index and turns it into tree objects,
okay, so here it looks pretty similar.  It totally is pretty
similar, except this time the data tree object points at the
original A blob and this new 2 blob.  Okay?  Now step two,
create the commit object.
The main interesting thing here is that it's got a parent.  So
the A2 commit on the bottom right there points at its parent,
the A1 commit, and we'll see why that's handy in a second.  Next
step, point HEAD at the new commit.  Now, we're not actually
repointing HEAD.  HEAD just stays pointing at master because we
stay on the master branch, but what gets changed is master.
Master is just a file that contains a hash.  That used to be the
A1 hash; now it's the A2 hash.  So what have we learned from
this?  Number one, content is stored as a tree of objects.  And
this means that the objects database effectively stores only
diffs.  So we were able to reuse that A blob from the A1 commit
in the content of the A2 commit, and you would imagine that in a
real project the reuse would be way more pervasive from commit
to commit.  Next, each commit has a parent, which means that the
repository stores the history of the project.  Next, refs point
at commits, which means that commits can be given meaningful
names.
So the A1 message or the A2 message, that's a fine message.
It's helpful.  It tells us what that commit does.  It doesn't
really tell us what that commit's place is in the whole
repository, whereas master, it's great.  It's telling us, this
is the state of the art of this repository right now.  Next,
objects are immutable, meaning content is edited, not deleted.
Those objects mostly just stay inside the objects directory, and
so you can always find them eventually, usually.
           Next, the refs are, unlike objects, mutable.  Which
means that refs can change.  So master points at A2 now, but in
the future it might point at A5, maybe, and that would be the
new state of the art of our repo.  Now, let's check out a
commit.  Normally one checks out branches.  We're going to check
out a commit.  So we give it the hash we want it to check out.
So this is the hash of the A2 commit.  This is weird because
we've already got the A2 commit checked out, right?  So what's
the point of this?  We were on A2 via the master branch, and now
we're on A2 directly.  That has some funny ramifications.
Checkout step one: we write the commit's tree to the working
copy, so we just take whatever the content of the commit we're
trying to check out is and we just write it to the working copy.
Now, there are no changes, because we're just staying on the A2
commit, so the working copy stays on A and two.
Next, write the commit's tree to the index.  No changes here
because we're still on the A2 commit, so it's still A and two.
Next, we point HEAD at the thing that was checked out.  You see,
when I cat HEAD it has a hash in it.  It doesn't have
refs/heads/master anymore.  And this is what it means to have a
detached HEAD.  Which means that your HEAD doesn't point at a
branch; it points directly at a commit.  So we can see in our
git graph that master is still pointing at A2, and HEAD points
directly at A2 rather than at A2 via master.  Okay, that's cool.
Let's change number to be three and commit that.  We're still in
the detached HEAD state, which makes sense when you think about
it.  So look how HEAD now points directly at A3 rather than A2.
Now, let's create a branch, because that A3 commit, though it's
fine work, is not on a branch, which means it's easy to lose
track of it.  So we don't want to do that.  So we can create a
branch to solve this problem, and so we create a branch called
deputy.  And deputy is just another file on disk, just like
master.  So it just contains a commit hash.  It's super easy.
So that is the hash of the A3 commit.  So now in our git graph
we've got deputy pointing at A3 as well as HEAD, which means
that A3 is safely on a branch.  We're still in the detached HEAD
state, though.  So what have we learned from this?  Well,
branches are just refs and refs are just files.  And that's why
people say git branches are lightweight, because whenever you
want to branch, essentially a new file is created and a hash is
put inside that file and that's it.  So that's insanely fast.
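
Editor's sketch, not part of the talk: the idea that refs are
just files, and that HEAD either names a branch or holds a hash
directly, can be modelled with one Python dict standing in for
the .git directory.  The "ref: " prefix is what git writes in
HEAD; the function names and the short a2 and a3 stand-ins for
hashes are assumptions.

```python
def resolve_head(files: dict) -> str:
    """Follow HEAD to a commit, whether it names a branch or holds a hash."""
    head = files["HEAD"]
    if head.startswith("ref: "):
        return files[head[len("ref: "):]]
    return head   # detached: HEAD holds the commit hash itself


def is_detached(files: dict) -> bool:
    return not files["HEAD"].startswith("ref: ")


def commit_made(files: dict, new_hash: str) -> None:
    """After a commit, move the current branch if any, else HEAD itself."""
    head = files["HEAD"]
    if head.startswith("ref: "):
        files[head[len("ref: "):]] = new_hash
    else:
        files["HEAD"] = new_hash


def create_branch(files: dict, name: str) -> None:
    """A new branch is one new file holding the current commit hash."""
    files[f"refs/heads/{name}"] = resolve_head(files)


files = {"HEAD": "ref: refs/heads/master", "refs/heads/master": "a2"}
files["HEAD"] = resolve_head(files)   # check out the a2 commit directly
commit_made(files, "a3")              # committed while detached
create_branch(files, "deputy")        # a3 is now reachable from a branch
print(files)
```

Creating deputy writes one small entry and touches nothing else,
which is the sense in which branches are cheap; note that HEAD
stays detached until a branch is checked out.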
Now let's check out a branch.  So let's check out master.  So
before we checked out a commit; now we're checking out a branch,
and let's follow through those checkout steps again.  So here
there is a change to the working copy.  It did read A and three;
now it reads A and two, because we're moving to the A2 commit,
which is where master is.  And then we write the commit's tree
to the index, and again we've got a change here from A and three
to A and two.  Next, point HEAD at the thing that was checked
out.  We just point HEAD at master, which means we're back on a
branch, which means that our HEAD is no longer detached.  Now
let's do some merging.  Let's merge an ancestor.  So I'm going
to check out deputy just as prep for this.  So the important
point here is that we're on the A3 commit and we're going to try
to merge A2.
And so we're on A3, we're going to merge A2.  It's already
up-to-date, so nothing happens.  So there's been no change at
all to the git repo, and the reason for this is that a commit is
a set of changes.  So if you cast your mind back to when we were
on the A1 commit, we made some nice changes to the working copy,
and then we committed them to produce the A2 commit.  So we used
a set of changes to produce a new commit.  Now, given that we're
on A3 and we're trying to bring in A2, what does that really
mean?  It means we're trying to bring in some sets of changes,
the A1 and A2 changes, but they're already in our history.  So
it doesn't make sense to make any changes.  We've already got
those changes incorporated into our graph.  So why bother?  And
that's why, if an ancestor is merged into a descendant, git does
nothing.  Now let's merge a descendant.  So let's check out
master just as prep.  So now we're on the A2 commit.
That's the important point.  And we're going to merge A3.  So
there was a change there, did anyone see that?  So what's
happened is a fast-forward, and ultimately a fast-forward means
that there's already a commit that has the changes I want;
therefore, I can just fast-forward my ref to that new place.  So
master used to point at A2; now it points at A3.  Now, let's
just think through why that is.  So we're still on A2, we're on
master, and we're about to merge in A3.  So, a commit is a set
of changes.
So we were on A2 and we made some changes back in the day and
produced the A3 commit.  That all happened before, and now
we're trying to merge the A3 commit.  But the thing is, though,
there's already a commit that embodies the changes that we want
to make, it's just the A3 commit, so we can just repoint our
master at that commit and say, "This is where master is now."
There's no need to meddle around with the commit history or
create any new commits or anything like that.  So that's why
this is a fast-forward merge, which is to say, if a descendant
is merged into an ancestor, history isn't changed, no commits
are created, but HEAD is changed.  So HEAD got repointed to this
new commit, to the existing commit, I'm sorry.
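
Editor's sketch, not part of the talk: the two cases above,
merging an ancestor and merging a descendant, come down to a
reachability check over parent pointers.  The parents dict and
the names ancestors and merge_kind are hypothetical; the third
outcome is included only so the function is total.

```python
def ancestors(parents: dict, commit: str) -> set:
    """All commits reachable from commit by following parent pointers."""
    seen, stack = set(), list(parents.get(commit, []))
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def merge_kind(parents: dict, current: str, other: str) -> str:
    """Classify what merging other into current has to do."""
    if other == current or other in ancestors(parents, current):
        return "already up to date"     # its changes are in our history
    if current in ancestors(parents, other):
        return "fast-forward"           # just move the ref to other
    return "merge commit needed"        # the histories have diverged


# Each commit maps to the list of its parents.
parents = {"a1": [], "a2": ["a1"], "a3": ["a2"]}
print(merge_kind(parents, "a3", "a2"))   # on a3, merging a2
print(merge_kind(parents, "a2", "a3"))   # on a2, merging a3
```

In both of the first two outcomes no new commit is created; the
only thing that can change is which commit the branch ref points
at.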
Now, let's just change number to be four and commit that.  And
then check out deputy and change -- sorry, letter to be B and
commit that.  So I know we've whizzed through this fast, but
that's okay, we'll step through.  So we were on A3 before; now
we've got two new commits.  There's B3, where letter is B and
number is three.  Then there's A4, where letter is A and number
is four.  So the important point here is that in one new commit,
letter was changed, and in the other new commit, number was
changed, okay?  And we're on deputy, incidentally, on the B3
commit.  Now, what allows this?  Commits can share parents,
which means that new lineages can be created.  So this is how
this is allowed.  So B3 and A4, they both have the same parent,
which is just A3.  Now let's merge two commits from different
lineages.  So this is where we're going.  We're going here.  So
we've got the B4 commit we're going to create as a result of the
merge, and the reason this is allowed is because lineages can be
joined with a merge commit.  So B4 is going to have two parents,
A -- sorry, B3 and A4, and that's what joins these lineages
together.  And now let's do the merge.  So we just say merge.
Seems to go okay.  What are our steps?  So this is where we're
going.  Let's walk it through.  Commits have parents, which
means that it's possible to find the point at which two lineages
diverged.  So the B3 commit and the A4 commit, they're both the
start of two separate new lineages, but they did come from the
same place; they came from the A3 commit, and it's possible to
find that.  The most recent common ancestor, it's sometimes
called, or the base commit.  So now, let's bring the three
actors on stage.  So we've got the base commit, the A3 commit,
where both of the commits we're merging came from; we've got the
receiver commit, B3, which is the one that we're on; and the
giver commit, A4, which is what we're bringing in.  So first we
want to generate the diff that combines the changes made by the
receiver and the giver.  So we want to take the changes the
receiver made, take the changes that the giver made, and bring
those together.  And so just to recap, the receiver changed
letter from A to B, and the giver changed number from three to
four.  We just want to merge those two sets of changes together.
And so, if we consider the changes made to letter, then it was
A -- sorry, letter -- A in the base, B in the receiver, and A in
the giver.  Now, a merge has a base commit, as we've seen.
Which means that git can automatically resolve the merge of a
file that has changed from the base in only the receiver or only
the giver.  So this is exactly the case that we're facing here.
So letter changed in the receiver, but it stayed the same in the
giver, which means that the new diff is just: change letter to
B.  Super easy.  Same story for number.  This time it was the
giver that made the change, but that doesn't matter.  The diff
is just, hey, let's change number from three to four.  Now, step
two of a merge.  Apply the diff to the working copy.  Super
easy, we just follow the instructions in the diff, which is to
say, change letter to B and number to four.
And then, step three of the merge, apply the diff to the index.
Super easy, just change the entries to B and four.  Their
hashes, obviously.  Now the updated index is committed, and the
main thing to note here: two parents.  So the B4 commit has B3
and A4.  And then finally, just point HEAD at the new commit, so
now deputy is on the B4 commit, okay?  So that was merging two
commits from two different lineages.  Now, let's merge commits
from different lineages, so the same thing that we just did
before, but where both commits modify the same file.  So there's
something terrible coming.  So we're just going to check out
master and then merge in deputy just to bring ourselves up to
date.  So the point is, both master and deputy are on the B4
commit.  So check out master, change number to six and commit
that.  So here's our two new friends.  We've got B6, where
number is six, and then we've got B5, where number is five.  So
the point here is number was changed in both of them, which is
what I want.  Now let's merge deputy, and we've got a conflict.
So it says, hey, there's a conflict in number.txt.  So let's
just follow those steps again, nice and easy.  So we generate
the diff of the changes that were made by the receiver and the
giver.  So the letter file, easy-peasy.  It's B in the base, the
receiver and the giver, so there's no diff.  Number.txt, the
story's different.  So, in the base, it's four.  In the
receiver, it's six, and in the giver it's five.  Okay?  So let's
think about what git wants to do at this point.  It just says,
hey, I'm going to write a diff where number is changed from four
to six.  That seems pretty okay.  And oh hey, I'm going to write
a diff where number is changed from four to five.  That seems
pretty okay, too, but unfortunately those two changes aren't
compatible.  So that's what produces a conflict.
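
Editor's sketch, not part of the talk: finding the base commit
and combining the receiver's and giver's changes can be modelled
as below.  It works on whole-file contents, as the talk does,
whereas git itself merges within files; the breadth-first base
search is an assumption that only suits simple histories like
this one, and all function names are hypothetical.

```python
from collections import deque


def reachable(parents: dict, commit: str) -> set:
    """The commit itself plus everything reachable via parent pointers."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def base_commit(parents: dict, receiver: str, giver: str):
    """Nearest commit, walking back from receiver, that giver also has."""
    giver_side = reachable(parents, giver)
    queue, seen = deque([receiver]), set()
    while queue:
        c = queue.popleft()
        if c in giver_side:
            return c
        if c not in seen:
            seen.add(c)
            queue.extend(parents.get(c, []))
    return None


def three_way(base: dict, receiver: dict, giver: dict):
    """Merge per-file contents; returns (merged files, conflicted paths)."""
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(receiver) | set(giver)):
        b, r, g = base.get(path), receiver.get(path), giver.get(path)
        if r == g:        # both sides agree (or neither changed it)
            result = r
        elif r == b:      # only the giver changed it
            result = g
        elif g == b:      # only the receiver changed it
            result = r
        else:             # both changed it, differently
            conflicts.append(path)
            continue
        if result is not None:
            merged[path] = result
    return merged, conflicts


parents = {"a3": [], "b3": ["a3"], "a4": ["a3"],
           "b4": ["b3", "a4"], "b5": ["b4"], "b6": ["b4"]}
clean = three_way({"letter": "a", "number": "3"},    # base a3
                  {"letter": "b", "number": "3"},    # receiver b3
                  {"letter": "a", "number": "4"})    # giver a4
clash = three_way({"letter": "b", "number": "4"},    # base b4
                  {"letter": "b", "number": "6"},    # receiver b6
                  {"letter": "b", "number": "5"})    # giver b5
print(base_commit(parents, "b3", "a4"), clean, clash)
```

The first merge resolves automatically because each file changed
on one side only; the second leaves number unresolved because
both sides moved it away from the base in different directions.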
So the diff is, let's change number to six and let's change
number to five, which obviously doesn't work.  So git just kind
of merrily marches on, just trying to apply the diff to the
working copy, and that sets number.txt to look like this, which
I expect you've seen before.  So git is going to give you both
versions and just let you sort that stuff out, yeah?  And then
git applies the diff to the index as well.  So here's what the
index looked like before the merge, and I kind of lied to you a
little bit when I said that an index entry is just a file path
and a hash.  That was true, but there's also a number.  And that
number is called the stage.  And up until now, all of our index
entries had a stage of zero, so everything's been absolutely
fine.  A stage of zero means unconflicted.  So this is what
letter looks like.  Letter is fine.  It's still at a stage of
zero.  But number has been entered three times: it's got a stage
of one, where the hash is of the content in the base; a stage of
two, where the content is the content of the receiver; and a
stage of three, where the content is the content of the giver.
It's the presence of these three entries that tells git that the
file is in conflict.  Let's carry on.  The user's resolution is
to just say, 6 + 5 = 11.
So I'm going to set the number to be 11.  And then the user also
has to tell git that the conflict is resolved, and the way they
do that is just git add.  And so in this context, git add means,
hey, I've resolved these conflicts, everything is fine now.  So
here's the index after resolving the conflict.  So now the
letter entry is the same, and number has a single entry whose
stage is safely zero, which means it's unconflicted, and that
has the hash of the 11 content.  So now the user commits the
merge.  And everything's okay.  So that's the B11 commit
bringing those lineages together.  Now, let's remove a file.  So
here's where we were after the B11 commit: the working copy had
a B and 11 in it.  And the index had a B and 11 in it.  The user
runs git rm, and this removes the file from the working copy, so
it just deletes the file from disk, so letter.txt is just gone,
and it also removes the entry for that file from the index, so
now the index just points at the content of number.  Which means
that, excuse me, when the user commits, then, remember how a
commit works?  It just walks the index and produces a tree graph
of its content.  So the index no longer has an entry for
letter.txt, which means that the entry is going to be completely
missing from the tree, so now the new tree graph points at 11;
it doesn't point at letter.txt anymore.  Okay?  So let's copy
our repository.  So we're going to use cp for this, not git.  So
we're going to cd into the directory above and then cp alpha to
a new directory, bravo.  So now bravo has exactly the same
contents.  So this is a complete facsimile.  Let's connect a
repository to another repository.  So, cd back into alpha.  And
do git remote add.  And git remote add says, hey, alpha, it
turns out there's another repo that's a lot like you, it's
called bravo and it lives here.
So that's just a relative file path.  So what's changed?  The
config file has changed.  So there's just a couple of new lines
that say, hey, there's a new remote, bravo, and it's at this
URL.  Let's fetch a branch from a remote.  Fetching is basically
saying: what's going on in this other repository?  Let me get
that stuff, bring it over here, but not integrate it into my
stuff yet.  So let me just put it in this separate location and
just keep it separate for now.  So let's cd into bravo and let's
change number to 12 and commit that.  So here's where we are.
Bravo's on the 12 commit and alpha is still on the 11 commit.
So let's do the fetch.  Step one, find the head commit of the
branch being fetched, which is the 12 commit on bravo that we're
bringing in.  Step two, copy that commit and all its dependent
objects over to alpha.  Okay, so now alpha has the 12 commit and
its tree graph and all that good stuff.
But notice how alpha's master is still pointing at 11, right?
So though it's got the new objects, it hasn't updated itself to
point at the new commit, okay?  So to all intents and purposes,
if you looked at alpha then it would look the same; it's just
got these new objects.  Step three, point the ref for the remote
branch at the fetched commit.  What does that mean?  So inside
the refs/remotes directory there's a new folder called bravo,
and inside that there's a new file called master, which contains
the hash of the 12 commit.  So we've managed to record the state
of the remote repository in our local repository, which is what
fetching is all about.
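
Editor's sketch, not part of the talk: the first three fetch
steps described above can be modelled with plain dicts.  Each
repository is a dict of objects and refs, and every object lists
the hashes it links to (parents, trees, blobs); this layout, the
function names and the short stand-in hashes are assumptions.

```python
def reachable(objects: dict, start: str) -> set:
    """Everything an object depends on, found by following its links."""
    seen, stack = set(), [start]
    while stack:
        sha = stack.pop()
        if sha not in seen:
            seen.add(sha)
            stack.extend(objects[sha]["links"])
    return seen


def fetch(local: dict, remote: dict, remote_name: str, branch: str) -> str:
    # Step one: find the commit the remote branch points at.
    head = remote["refs"][f"refs/heads/{branch}"]
    # Step two: copy that commit and all its dependent objects.
    for sha in reachable(remote["objects"], head):
        local["objects"].setdefault(sha, remote["objects"][sha])
    # Step three: record where the remote branch is, separately
    # from our own branches.
    local["refs"][f"refs/remotes/{remote_name}/{branch}"] = head
    return head


shared = {
    "c11": {"links": ["t11"]},
    "t11": {"links": ["blob11"]},
    "blob11": {"links": []},
}
alpha = {"objects": dict(shared), "refs": {"refs/heads/master": "c11"}}
bravo = {
    "objects": {**shared,
                "c12": {"links": ["c11", "t12"]},
                "t12": {"links": ["blob12"]},
                "blob12": {"links": []}},
    "refs": {"refs/heads/master": "c12"},
}
fetch(alpha, bravo, "bravo", "master")
print(sorted(alpha["refs"].items()))
```

After the fetch alpha holds the new objects and a record of
where bravo's master is, while alpha's own master has not moved.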
So let's add that ref to our graph.  So notice that alpha's
master is still on 11, but we've got this record of bravo's
master pointing at 12, okay?  Next we point something called
FETCH_HEAD at the fetched commit.  So let's just add that to the
graph, also pointing at 12, and FETCH_HEAD is ultimately just a
record of the last time you fetched.  So it says, hey, I've got
this record of bravo's master, and it's got this hash.  Hey,
that's lovely.  So there's the hash of the 12 commit.  So what
have we learned from this?  Objects can be copied, which means
that they can be shared between repositories.  And a repository
can record, locally, the state of a branch on a remote.  Now
let's merge FETCH_HEAD.  So we do git merge FETCH_HEAD and it's
a fast-forward.  So FETCH_HEAD ultimately resolves to "please
merge in this commit."  So in our case, it's "please move your
master to this commit hash that I've got here."  So that's the
12 commit hash.  So the result is, before merging FETCH_HEAD,
master is pointing at 11; now it gets fast-forwarded to 12.
Okay?  So let's pull a branch from a remote, and we do git pull
bravo master, and it says already up-to-date, no change at all,
and the reason for this is because pull is just shorthand for
fetch and then merge FETCH_HEAD, okay?  Let's clone our
repository.  So we're going to cd into the directory above, and
then clone alpha into a new repo called charlie.  So, the steps.
Number one, create a new directory called charlie.  Number two,
cd into it.  Number three, call git init to create a .git
directory for charlie.  Number four, check out whatever branch
was checked out in alpha.  So master was checked out on alpha.
That's the thing that we're cloning.  So we just check out
master on charlie, and then five, do a pull so that charlie's
master is in the same state as alpha's master.  So again,
there's our old friend, the 12 commit.  So we can see that --
I'm sorry, master is now pointed at the 12 commit on charlie.
So that's pretty similar to the cp that we did; it's a little
more efficient, but ultimately it's the same result.  So now
let's cd back to alpha, let's create a new commit, 13, and let's
add charlie as a remote on alpha, and let's do a push.  So let's
do git push charlie master.  So what that says is, let's push
the state of alpha's master over to charlie.  Now, something
good happens.  Writing objects happens.  So we've kind of seen
this happen before.  Here's before the push.
Alpha's on 13 and charlie's on 12.  So after the push, charlie's
got the new 13 objects, but again charlie's master is still on
12, so something didn't happen, and what did happen was git
said, hey, I'm refusing to update the checked-out branch because
it will make the working copy inconsistent.  So what's happening
here?  So let's say I'm working on charlie and I'm making some
exciting changes to my working copy on charlie, and suddenly
someone pushes from alpha over to me.  So now, a push would
effectively be a checkout, so what's going to happen is it's
going to send some new objects, which is great, but it's also
going to say, hey, your working copy now needs to reflect
whatever commit was just pushed, so my working copy changes have
just gone away, gone, that's terrible.  And in a similar
fashion, the same thing happens if I've made some changes to the
index by staging some stuff: again my index is going to be blown
away by the contents of the commit that was just pushed.  So
that's no good.  So git won't push to branches that are
currently checked out in the receiving repo.  But that doesn't
make sense, because we push all the time, we push to GitHub.
How does that happen?
That happens with bare repositories.  So let's clone a bare
repository and see how that works.  So we're going to create a
new repository called delta that's a bare clone of alpha.  Now
notice, when I look inside the delta directory of our new repo,
there's no .git directory.  The contents of the .git directory
have been just puked into the top level.  So what does that
mean?  It means there's no working copy.  There's nothing there
for poor delta, unlike charlie.  So this poor bare repo has no
working copy.  So here's where we are: alpha and delta are both
on the 13 commit.  So let's cd back to alpha.  And add delta as
a remote.  And this is all recap.  Let's make a new commit
called 14.  So let's commit that.  So alpha is on the 14 commit
and delta is still on the 13 commit, and then we do git push
delta master and this totally works.  Everything is fine.  So
the objects get pushed over, so now the delta repository has the
14 commit.  And then master gets updated to point at 14, so it's
got the new stuff.  Phew.  Git is a graph.  This graph dictates
git's behavior, which means if you understand this graph then
you understand git.  Thank you.

[ Applause ]

LINDSEY:  That was extremely cool.  Thank you, Mary.  And by the
way, one reason why Mary has so much knowledge of git is because
she wrote an implementation of git in JavaScript, which is
called gitlet, and you can find it on her website, which is
maryrosecook.com.  We're going to have a short break and then
reconvene in 15 minutes.
           
 [ Short Break ]