<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jan Varwig &#187; git</title>
	<atom:link href="http://jan.varwig.org/archive/tag/git/feed" rel="self" type="application/rss+xml" />
	<link>http://jan.varwig.org</link>
	<description>Somewhere between Hello World and HAL9000</description>
	<lastBuildDate>Sat, 03 Dec 2011 00:15:10 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Advanced Git Part 2</title>
		<link>http://jan.varwig.org/archive/advanced-git-part-2?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=advanced-git-part-2</link>
		<comments>http://jan.varwig.org/archive/advanced-git-part-2#comments</comments>
		<pubDate>Thu, 26 May 2011 14:09:17 +0000</pubDate>
		<dc:creator>Jan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[advanced git]]></category>
		<category><![CDATA[bcruhr]]></category>
		<category><![CDATA[bcruhr4]]></category>
		<category><![CDATA[git]]></category>

		<guid isPermaLink="false">http://jan.varwig.org/?p=554</guid>
		<description><![CDATA[In the first post of this series, I explained the data structures Git uses to store files and working directory history in its database. A string of commit objects is used to keep track of your progress. But I did not explain how you can actually access a commit from outside without knowing its internal [...]]]></description>
			<content:encoded><![CDATA[<p>In the <a href="http://jan.varwig.org/archive/advanced-git">first post of this series</a>, I explained the data structures Git uses
to store files and working directory history in its database. A string of commit
objects is used to keep track of your progress. But I did not explain how you
can actually access a commit from outside without knowing its internal SHA1
identifier. This is one of the mysteries that will be revealed in this post,
in which I talk about the <strong>structure of the <code>.git</code> directory</strong> and about what
exactly a branch is and <strong>how branches work</strong>.</p>

<p>As before, I want you to <strong>go into a repository of your choice</strong> and poke around a bit yourself while you&#8217;re reading my explanations.</p>

<h3>The .git directory</h3>

<p>In your working directory, run <code>cd .git</code> to visit the .git directory.
Here, Git stores everything it needs to run: configuration, the database,
hooks and refs. I want to explain every subdirectory briefly, before going
into the details of the more interesting parts.</p>

<div class="dean_ch" style="white-space: wrap;"><br />
jan@mops $ ls .git<br />
COMMIT_EDITMSG<br />
FETCH_HEAD<br />
HEAD<br />
ORIG_HEAD<br />
config<br />
description<br />
hooks<br />
index<br />
info<br />
logs<br />
objects<br />
packed-refs<br />
refs<br />
&nbsp;</div>

<h4>The configuration file <code>config</code></h4>

<p>This file stores options you have configured directly via <code>git config</code> or
automatically through other commands. For example, <code>git clone</code> populates this
file with the default &#8220;origin&#8221; remote:</p>

<div class="dean_ch" style="white-space: wrap;"><br />
[remote &quot;origin&quot;]<br />
&nbsp; url = user@server:path<br />
&nbsp; fetch = +refs/heads/*:refs/remotes/origin/* &nbsp;<br />
&nbsp;</div>

<p>You can edit this file in any text editor. This is sometimes easier than using
<code>git config</code>.</p>

<h4>The uppercase files</h4>

<p>You will notice some files in the .git directory that are named in uppercase
letters. Let&#8217;s keep this brief; I will get into more detail later.</p>

<div class="dean_ch" style="white-space: wrap;"><br />
COMMIT_EDITMSG &#8211; Used to pass the commit message to your text editor<br />
FETCH_HEAD &nbsp; &nbsp; &#8211; Git stores the last fetched branches in here<br />
HEAD &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &#8211; Points to the branch you&#8217;re currently working on<br />
ORIG_HEAD &nbsp; &nbsp; &nbsp;&#8211; Used to back up the value of HEAD before a potentially dangerous operation<br />
MERGE_HEAD, CHERRY_PICK_HEAD &#8211; Used temporarily during merging or cherry-picking<br />
&nbsp;</div>

<h4><code>description</code>, <code>hooks</code> and <code>info</code></h4>

<p>The <code>description</code> file contains a description of your repository. <strong>You&#8217;ll likely never use this</strong>
unless you plan to publish your repository through <a href="https://git.wiki.kernel.org/index.php/Gitweb">gitweb</a>.</p>

<p>The <code>hooks</code> directory contains <a href="http://www.kernel.org/pub/software/scm/git/docs/githooks.html">callback scripts</a>
that Git executes every time a certain event occurs (like a commit or a rebase).
These can be used, for example, to send out emails every time someone pushes a commit to a server.</p>

<p>Inside the <code>info</code> directory, the only file you&#8217;ll probably ever touch is the
<code>exclude</code> file, which contains your <strong>private excludes</strong>. You can use it to prevent
temporary files from showing up in <code>git status</code> without adding them to <code>.gitignore</code>.</p>

<h4><code>index</code> and <code>logs</code></h4>

<p>The index is a central mechanism of Git. Basically it contains the <strong>content of
your next commit</strong>. I like to call the index an <em>unborn commit</em>. By adding
and removing files through <code>git add</code> and <code>git rm</code> you shape its content to 
your liking and then store it in the database as a proper commit through <code>git commit</code>.</p>

<p>The <code>logs</code> directory contains special files known as <em>reflogs</em>. Each of the files
here corresponds to a branch. Whenever the branch is moved to a different commit (by a commit, a reset or a rebase, for example), an entry
is appended to its reflog. This makes it possible to see what commit your <strong>branch
was pointing to, at any given moment in time</strong> using <code>git reflog &lt;branchname&gt;</code>.
I will talk a bit more about the reflog in the next part of the series.</p>

<h4>The <code>objects</code> directory</h4>

<p>Now it gets interesting. The <code>objects</code> directory contains the <strong>actual database</strong> of
all the objects in the repository. The objects are stored in files and directories
that are based on the objects&#8217; SHA1 ids. The first two characters of the SHA1 form
a directory, the rest is the filename. If you <code>cat</code> any of the files in there, 
you&#8217;ll see the binary contents of the object, compressed with zlib. To see the
uncompressed content, use the following command (you&#8217;ll obviously need Ruby for this):</p>

<div class="dean_ch" style="white-space: wrap;"><br />
ruby -rzlib -e 'puts Zlib::Inflate.inflate(File.binread(ARGV[0]))' &lt;PATH_TO_OBJECT_FILE&gt;<br />
&nbsp;</div>

<p>Remember what I told you at the end of part one? That git stores all of its
objects as actual files? Here you see them. Also, remember that I told you that
it didn&#8217;t actually do that <em>all of the time</em>?
Well, run <code>git gc</code> and list the contents of the objects directory again. Most
of the directories should be gone now. They went into one of the files in <code>objects/pack</code>.
These are <strong>compressed archives</strong> that allow for much more efficient storage of the objects.
But it helps to still think of them as the actual files we&#8217;ve seen before.</p>
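
<p>To make the layout tangible, here is a minimal Python sketch of how a loose object could be located and inflated. It is not Git&#8217;s actual implementation. The function names are my own invention for illustration, and the header format (type, a space, the size, a NUL byte) is a storage detail that I have not discussed above. The script writes into a temporary directory that merely stands in for <code>.git/objects</code>:</p>

```python
import hashlib
import os
import tempfile
import zlib


def object_path(objects_dir, sha1_hex):
    """First two hex characters name the directory, the remaining 38 the file."""
    return os.path.join(objects_dir, sha1_hex[:2], sha1_hex[2:])


def write_loose_object(objects_dir, obj_type, payload):
    """Store payload as a zlib-compressed loose object and return its id."""
    # The id is the SHA1 of the header plus the payload, before compression.
    store = obj_type.encode() + b" " + str(len(payload)).encode() + b"\0" + payload
    sha1_hex = hashlib.sha1(store).hexdigest()
    path = object_path(objects_dir, sha1_hex)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as handle:
        handle.write(zlib.compress(store))
    return sha1_hex


def read_loose_object(objects_dir, sha1_hex):
    """Inflate a loose object and split the header from the payload."""
    with open(object_path(objects_dir, sha1_hex), "rb") as handle:
        store = zlib.decompress(handle.read())
    header, _, payload = store.partition(b"\0")
    obj_type, _, size = header.partition(b" ")
    return obj_type.decode(), int(size), payload


if __name__ == "__main__":
    # A throwaway directory stands in for .git/objects.
    with tempfile.TemporaryDirectory() as objects_dir:
        object_id = write_loose_object(objects_dir, "blob", b"hello world\n")
        print(object_path(objects_dir, object_id))
        print(read_loose_object(objects_dir, object_id))
```

<p>Pointing <code>read_loose_object</code> at the <code>objects</code> directory of a real repository should work for any object that has not been packed yet. Objects that already live in a pack file are out of reach for this sketch.</p>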

<h4>The <code>refs</code> directory</h4>

<p>The <code>refs</code> directory sits at the interface between the user and the object database.
Here, <strong>branches and tags are stored</strong>, enabling you to access commits by an easy
to remember name instead of the SHA1. Inside <code>refs</code> you&#8217;ll see several subdirectories:
<code>heads</code> and <code>remotes</code> store branches for the local and remote repositories respectively, <code>tags</code> contains tags.
If you&#8217;ve used <code>git bisect</code> or <code>git stash</code> before, you&#8217;ll also find corresponding
files for them here.</p>

<p>You can take a look at what your refs are pointing to by just
looking at their content. They <strong>simply store the hash of the object</strong> they&#8217;re referencing in plain text.</p>

<p>You might be wondering why <code>git branch -av</code> shows you quite a lot more
branches than there are files in the refs directory. That&#8217;s because only refs
that were created or updated since the refs were last packed (for example by <code>git gc</code>) are stored here as individual files. The rest can be found in
the file <code>packed-refs</code> in your <code>.git</code> directory.</p>
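
<p>The lookup order can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not Git&#8217;s code: the function names are hypothetical, the all-ones and all-zeros hashes in the demo are made up, and symbolic refs inside <code>refs</code> are ignored. It relies on <code>packed-refs</code> holding one <code>hash name</code> pair per line, with comment lines starting with <code>#</code> and extra lines starting with <code>^</code> for annotated tags:</p>

```python
import os
import tempfile


def read_packed_refs(git_dir):
    """Return a dict mapping ref names to hashes from the packed-refs file."""
    refs = {}
    path = os.path.join(git_dir, "packed-refs")
    if not os.path.exists(path):
        return refs
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            # Skip the header comment and the "peeled" lines of annotated tags.
            if not line or line.startswith("#") or line.startswith("^"):
                continue
            sha1_hex, name = line.split(" ", 1)
            refs[name] = sha1_hex
    return refs


def resolve_ref(git_dir, name):
    """Prefer the loose file under refs/, fall back to packed-refs."""
    loose = os.path.join(git_dir, *name.split("/"))
    if os.path.isfile(loose):
        with open(loose) as handle:
            return handle.read().strip()
    return read_packed_refs(git_dir).get(name)


def make_demo_git_dir():
    """Build a throwaway directory with one loose ref and two packed refs."""
    git_dir = tempfile.mkdtemp()
    os.makedirs(os.path.join(git_dir, "refs", "heads"))
    with open(os.path.join(git_dir, "refs", "heads", "master"), "w") as handle:
        handle.write("05c80116a36bbbdd7a453255aee5a1d2c7b01fd7\n")
    with open(os.path.join(git_dir, "packed-refs"), "w") as handle:
        handle.write("# pack-refs with: peeled\n")
        handle.write("0" * 40 + " refs/heads/master\n")  # stale packed entry
        handle.write("1" * 40 + " refs/remotes/origin/stage\n")
    return git_dir


if __name__ == "__main__":
    demo = make_demo_git_dir()
    print(resolve_ref(demo, "refs/heads/master"))          # loose file wins
    print(resolve_ref(demo, "refs/remotes/origin/stage"))  # found in packed-refs
```

<p>The stale <code>master</code> entry in the demo shows why the order matters: the loose file is always the more recent value, so it has to be checked first.</p>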

<h3>Working with branches</h3>

<p>Now that you know how branches are stored, you can probably imagine how
some of Git&#8217;s common operations are implemented. Let&#8217;s take a simple commit
as an example.</p>

<p><img src="http://jan.varwig.org/wp-content/uploads/2011/05/advanced-git-2.png" alt="" title="advanced git 2" width="574" height="432" class="aligncenter size-full wp-image-601" /></p>

<ul>
  <li>
    Let&#8217;s assume you&#8217;re working in the master branch. Your HEAD will point to that
    branch. Execute a <code>cat&nbsp;.git/HEAD</code> and you&#8217;ll see a reference to the master
    branch:
<div class="dean_ch" style="white-space: wrap;"><br />
ref: refs/heads/master<br />
&nbsp;</div>
    Master itself might point to a commit:
<div class="dean_ch" style="white-space: wrap;"><br />
jan@mops$ cat .git/refs/heads/master<br />
05c80116a36bbbdd7a453255aee5a1d2c7b01fd7<br />
jan@mops$ git rev-parse master<br />
05c80116a36bbbdd7a453255aee5a1d2c7b01fd7<br />
&nbsp;</div>
    <code>HEAD</code> can either point to a branch, as shown, or directly to a commit
    (that&#8217;s called a <em>detached HEAD</em>, a term you might have encountered already).
    Git has no problems resolving <code>HEAD</code> to a commit in any case:
<div class="dean_ch" style="white-space: wrap;"><br />
jan@mops$ git rev-parse HEAD<br />
05c80116a36bbbdd7a453255aee5a1d2c7b01fd7<br />
&nbsp;</div>
    This situation is displayed in the illustration.
  </li>
  <li>
    Before you start editing, your working tree, your index and the tree object that belongs to the current commit that <code>HEAD</code> points to have identical content. This is <strong>situation 1</strong> in the illustration.
  </li>
  <li>
    You will now edit a file. The <code>git status</code> command will report that there&#8217;s
    a difference between your working directory and the index and list the
    file under &#8220;Changed but not updated&#8221; (newer Git versions call this &#8220;Changes not staged for commit&#8221;). This is <strong>situation 2</strong> in the illustration.
  </li>
  <li>
    After adding our changes to the index with <code>git add</code>, <code>git status</code> will now report
    a difference between the index and the <code>HEAD</code> under &#8220;Changes to be committed&#8221;. We&#8217;re now at <strong>situation 3</strong>.
  </li>
  <li>
    If you&#8217;re done with your work, you finally call <code>git commit</code>. Git then takes your
    index and creates a tree object from it. A commit object is created,
    containing the commit message, your name and the current time.
    The commit&#8217;s parent will be set to the commit that is referenced by
    the current HEAD and its tree reference will point to the tree that was just
    created. This is the transition from <strong>situation 4 to situation 5</strong>.
  </li>
  <li>
    Finally, to treat that newly created commit as the new tip of your development
    history, Git moves the current pointer to it. If <code>HEAD</code> references a branch,
    the branch is updated and <code>HEAD</code> itself stays unchanged; with a detached <code>HEAD</code>, the hash is written into <code>HEAD</code> directly. At every step you can see the pointers changing by
    looking into your <code>HEAD</code> and <code>refs/*</code> files. You&#8217;re now at <strong>situation 6</strong> and your repository is in a clean state again.
  </li>
</ul>
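
<p>The pointer update at the end of a commit can be sketched as follows. Again, this is a simplified Python illustration with hypothetical function names, not what Git really does internally (Git also takes locks and writes reflog entries, which I leave out here). It works on a temporary directory that plays the role of <code>.git</code>:</p>

```python
import os
import tempfile


def parse_head(text):
    """Return ("branch", refname) for a symbolic HEAD, ("detached", hash) otherwise."""
    text = text.strip()
    if text.startswith("ref: "):
        return ("branch", text[len("ref: "):])
    return ("detached", text)


def advance_head(git_dir, new_commit):
    """Mimic the final step of a commit: move the current tip to new_commit."""
    head_path = os.path.join(git_dir, "HEAD")
    with open(head_path) as handle:
        kind, target = parse_head(handle.read())
    if kind == "branch":
        # HEAD stays untouched; the branch file it names is rewritten.
        target_path = os.path.join(git_dir, *target.split("/"))
        os.makedirs(os.path.dirname(target_path), exist_ok=True)
    else:
        # Detached HEAD: the hash lives in HEAD itself.
        target_path = head_path
    with open(target_path, "w") as handle:
        handle.write(new_commit + "\n")
    return target_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as git_dir:
        with open(os.path.join(git_dir, "HEAD"), "w") as handle:
            handle.write("ref: refs/heads/master\n")
        written = advance_head(git_dir, "05c80116a36bbbdd7a453255aee5a1d2c7b01fd7")
        print(os.path.relpath(written, git_dir))
```

<p>With a symbolic <code>HEAD</code> the script rewrites <code>refs/heads/master</code> and leaves <code>HEAD</code> alone, which mirrors what you can observe with <code>cat</code> in a real repository before and after a commit.</p>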

<p>By now, you can probably already imagine how <strong>branches are created</strong>. Git simply
places a file with the name of the branch in <code>refs/heads</code> and lets it point to
the commit you provided to <code>git branch</code>.</p>

<p><strong>Checkouts</strong> are a little more interesting.
If you instruct git to checkout a branch, three things happen:</p>

<ul>
<li>The index is set to the same contents as the commit you&#8217;re checking out</li>
<li>The working directory is also adjusted to the same contents</li>
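<li>If you&#8217;re checking out an actual branch (as opposed to, say, a tag or a
SHA1-identified commit), Git updates <code>HEAD</code> to point to that branch. Otherwise <code>HEAD</code> points directly to the commit and you end up with a detached <code>HEAD</code>.</li>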
</ul>

<p>Now you know what the <code>HEAD</code> file I introduced in the &#8220;uppercase files&#8221; section is used for. Just as <code>HEAD</code> stores the pointer to your current branch, the other uppercase files point to other branches, or other commits that are interesting in some situations like a merge or fetch operation.</p>

<h3>Summary</h3>

<p>The <a href="http://jan.varwig.org/archive/advanced-git">previous part of the series</a> described the data structures behind Git&#8217;s object database.
Having walked through the contents of the <code>.git</code> directory, you now understand the operations that Git performs to organize the content in the object database and to create branches.
With this knowledge about these files, you should now have a clear idea of how Git implements its commands.</p>

<p>In the next part of the series, I want to take a closer look at some of them, especially the dreaded <code>rebase</code>.</p>
]]></content:encoded>
			<wfw:commentRss>http://jan.varwig.org/archive/advanced-git-part-2/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Advanced Git</title>
		<link>http://jan.varwig.org/archive/advanced-git?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=advanced-git</link>
		<comments>http://jan.varwig.org/archive/advanced-git#comments</comments>
		<pubDate>Thu, 28 Apr 2011 14:12:01 +0000</pubDate>
		<dc:creator>Jan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[advanced git]]></category>
		<category><![CDATA[bcruhr]]></category>
		<category><![CDATA[bcruhr4]]></category>
		<category><![CDATA[git]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://jan.varwig.org/?p=390</guid>
		<description><![CDATA[At the Barcamp Ruhr 4 this year I held an intermediate level talk about one of my favorite tools of all time: Git. After a very successful introductory presentation two years ago, I wanted to help people to get a deeper understanding of Git so they can use it better. If you used Git before [...]]]></description>
			<content:encoded><![CDATA[<p>At the Barcamp Ruhr 4 this year I held an intermediate level <a href="http://jan.varwig.org/archive/advanced-git-slides-from-barcamp-ruhr-4">talk</a> about one of my favorite tools of all time: Git.
After a very successful introductory presentation two years ago, I wanted to help people to get a deeper understanding of Git so they can use it better.</p>

<p>If you used Git before and kinda like it but feel unsure about using some of its advanced commands because you think you don&#8217;t completely understand what&#8217;s going on <strong>under Git&#8217;s hood</strong>, if you like what <strong>rebase</strong> can do for you but are <strong>afraid to use it</strong> because you&#8217;ve read somewhere that the sky will fall on your head if you make a mistake, then this article is for you.
Git only reveals its <strong>true, awesome power</strong> if you use it to its fullest potential. And to do that it is <strong>essential to understand how Git works internally</strong>.</p>

<p><span id="more-390"></span></p>

<p>The talk I gave at the Barcamp was roughly about four topics:</p>

<ol>
<li>Data structures in Git</li>
<li>The layout of the .git directory</li>
<li>The benefits of using rebase and why rebase isn’t nearly as harmful as everyone thinks</li>
<li>Several Tips and Tricks for making day-to-day tasks easier</li>
</ol>

<p>After the talk I decided to write it down here for the benefit of everyone who ever struggled to understand what Git exactly does when you tell it to pull, merge or commit. I will write a series of four posts, based on the topics of the talk.</p>

<h3>Data Structures in Git</h3>

<p>Git’s core database is a <strong>directed object graph</strong> with four different types of objects.
Each object has an identifier that is calculated as a SHA1 hash of its contents.
That hash is formed by a cryptographic function returning a 160-bit key. The fundamental property of a hash function in this context is that it’s a true function in the mathematical sense: the same input always yields the same output.
This ensures that identical objects are always assigned the same identifier.
<strong>There is no duplication, ever</strong>. I will describe this principle in more detail in the following
paragraphs.</p>

<p>It will help your understanding to refer back to the following graphic when reading these paragraphs:
<img src="http://jan.varwig.org/wp-content/uploads/2011/04/git_graph.jpg" alt="Illustration of Git&#039;s graph database" title="git_graph" width="500" height="626" class="size-full wp-image-531 aligncenter" />
This is a representation of Git&#8217;s object graph. For brevity I focused on the structural properties of each object, omitting the content (the binary content in case of a blob, the commit message in commits etc.). I also shortened the SHA1s to three characters.</p>

<p>The <strong>four object types</strong> in Git are:</p>

<h4>Blobs</h4>

<p>Blobs are simply chunks of binary data with no other properties: no metadata, no nothing. Just the <strong>pure data</strong>.
They are used to store the content of files in the repository.
They do not correspond 1:1 to files, however. They correspond to file <em>content</em>. Two files in your repository, with different names or at different locations, with <strong>identical file content will use the same blob object</strong> to represent that content in the database. The blob is identified by the SHA1 hash of its content.</p>
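
<p>You can reproduce a blob&#8217;s identifier yourself. The following Python sketch assumes one detail I haven&#8217;t mentioned: before hashing, Git prefixes the content with a small header consisting of the word <code>blob</code>, the content length and a NUL byte. The function name <code>git_blob_id</code> is made up for this example:</p>

```python
import hashlib


def git_blob_id(content):
    """Hash the header "blob SIZE NUL" followed by the raw bytes."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    readme = b"hello world\n"
    copy_elsewhere = b"hello world\n"  # same bytes, imagine a different path
    print(git_blob_id(readme))  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
    print(git_blob_id(readme) == git_blob_id(copy_elsewhere))  # True
```

<p>Neither the file name nor its location enters the calculation, which is exactly why two files with the same content end up sharing one blob.</p>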

<h4>Trees</h4>

<p>Trees represent <strong>directory structures</strong>. This is a tree object:</p>

<div class="dean_ch" style="white-space: wrap;"><br />
100644 blob 159202af1c0374e33374f2a0e20b5e0ecbc0c19e&nbsp; &nbsp; .gitignore<br />
100644 blob 37e1207dab85993425ee5f4ceb2a59055dccfc77&nbsp; &nbsp; .gitmodules<br />
100644 blob e04728e8d391f57a6fa0c3325118750c602ef5ef&nbsp; &nbsp; Capfile<br />
100644 blob 2af0fb1133d03dcedf1f2bbca9a9b04444ef84f0&nbsp; &nbsp; README<br />
100644 blob 3bb0e8592a41ae3185ee32266c860714980dbed7&nbsp; &nbsp; Rakefile<br />
100644 blob 70d0345e4619e790993e852fe0ed1946d8d53afc&nbsp; &nbsp; TODO.txt<br />
040000 tree 875c4668c815306dcb1de23407973e2f1fb9d3a8&nbsp; &nbsp; app<br />
040000 tree 942fb533688aa713f5302b525cf0b8cfeb245d8b&nbsp; &nbsp; config<br />
040000 tree da543b1ab388687f5612e6fb7c06fc778b8026bc&nbsp; &nbsp; db<br />
040000 tree 0269300738b048a5cc34769d1436d9f228499018&nbsp; &nbsp; doc<br />
040000 tree 5a86b1e544e01c8951edafc39a3b0ca7bf09c2e9&nbsp; &nbsp; lib<br />
040000 tree 0289883d028de7e3c8c54a7fa09c2851fda8346f&nbsp; &nbsp; public<br />
040000 tree 5ecf890b2a8c6d1e6b76b7d2ac25a4e40cf2cc67&nbsp; &nbsp; script<br />
040000 tree c900b82e1d3f53af6392341f4ecf2a271961c26a&nbsp; &nbsp; spec<br />
040000 tree 3d5fc32106bf1848bcb79ef8a9f0fbf06e858fed&nbsp; &nbsp; vendor<br />
&nbsp;</div>

<p>Where does this representation come from? Well, Git offers a command for that, <code>git cat-file</code>.
Its most common usage is <code>git cat-file -p &lt;object&gt;</code>. I got the above printout
by passing the SHA1 of a tree as <code>&lt;object&gt;</code>.</p>

<p>You can see that a tree is simply a <strong>listing of a directory</strong> that consists of 
<strong>links to other objects</strong>, blobs (for files) and trees (for subdirectories), together
with metadata (file permissions and filenames). What this says, for example, is that
there&#8217;s a file named &#8220;Capfile&#8221;, whose content is stored in the blob with the SHA1
<code>e04728e8d391f57a6fa0c3325118750c602ef5ef</code>:</p>

<div class="dean_ch" style="white-space: wrap;"><br />
$ git cat-file -p e04728e8d391f57a6fa0c3325118750c602ef5ef<br />
load 'deploy' if respond_to?(:namespace) # cap2 differentiator<br />
Dir['vendor/plugins/*/recipes/*.rb'].each { |plugin| load(plugin) }<br />
<br />
load 'config/deploy' # remove this line to skip loading any of the default tasks<br />
&nbsp;</div>

<p>Or that &#8220;app&#8221; is a subdirectory whose content is found in the tree <code>875c4668c815306dcb1de23407973e2f1fb9d3a8</code>:</p>

<div class="dean_ch" style="white-space: wrap;"><br />
$ git cat-file -p 875c4668c815306dcb1de23407973e2f1fb9d3a8<br />
040000 tree 3e6fae3a140890d75eb9d51ce0974f7969194661&nbsp; &nbsp; controllers<br />
040000 tree 77b99dd8afcf55ad613d51e9e18a3df1aafa3f62&nbsp; &nbsp; helpers<br />
040000 tree 41c9c92d8f11afe27f2f25fa6bad867d6427cfbe&nbsp; &nbsp; models<br />
040000 tree c958c76cd2ccda49b3d514a1a12b2236974300c2&nbsp; &nbsp; sweepers<br />
040000 tree 850a76c0d5d056c43d56bc5d987836ea296584f4&nbsp; &nbsp; views<br />
040000 tree 4c713208aee9f3bfb172424e3a68f2a1f10d715a&nbsp; &nbsp; workers<br />
&nbsp;</div>
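
<p>The output of <code>git cat-file -p</code> is a pretty-printed view. As far as I know, the raw tree object is a plain sequence of entries, each consisting of the mode, a space, the name, a NUL byte and the 20 raw bytes of the SHA1. Here is a small Python sketch that builds and parses such a payload. The function names are hypothetical and the sketch skips details such as the sort order of the entries:</p>

```python
import binascii


def build_tree(entries):
    """Serialize (mode, sha1_hex, name) triples into a raw tree payload."""
    return b"".join(
        mode.encode() + b" " + name.encode() + b"\0" + binascii.unhexlify(sha1_hex)
        for mode, sha1_hex, name in entries
    )


def parse_tree(payload):
    """Split a raw tree payload into (mode, type, sha1_hex, name) tuples."""
    entries = []
    position = 0
    while position != len(payload):
        space = payload.index(b" ", position)
        nul = payload.index(b"\0", space)
        mode = payload[position:space].decode()
        name = payload[space + 1:nul].decode()
        sha1_hex = binascii.hexlify(payload[nul + 1:nul + 21]).decode()
        # 40000 marks a subdirectory, 160000 a submodule commit;
        # everything else (files, symlinks) is stored as a blob.
        if mode == "40000":
            obj_type = "tree"
        elif mode == "160000":
            obj_type = "commit"
        else:
            obj_type = "blob"
        # Pad the mode to six digits, as the pretty-printer does.
        entries.append((mode.zfill(6), obj_type, sha1_hex, name))
        position = nul + 21
    return entries


if __name__ == "__main__":
    raw = build_tree([
        ("100644", "e04728e8d391f57a6fa0c3325118750c602ef5ef", "Capfile"),
        ("40000", "875c4668c815306dcb1de23407973e2f1fb9d3a8", "app"),
    ])
    for entry in parse_tree(raw):
        print("{} {} {}\t{}".format(*entry))
```

<p>The two entries in the demo are taken from the listing above. Note that the raw format stores the directory mode as <code>40000</code>; the leading zero only appears in the pretty-printed output.</p>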

<h4>Commits</h4>

<p>Until now, our trees and blobs have been floating around in the database with no way
of getting at any object without knowing its SHA1. Also, we&#8217;ve seen the information
that blobs and trees can store but there was no discernible way of storing the history
of anything. Pretty useless for a version control system, you say?</p>

<p>This is where commits come into play. <strong>Their job is to record history</strong>. Let&#8217;s
start by looking at the commit on top of our master branch with <code>git cat-file -p master</code>:</p>

<div class="dean_ch" style="white-space: wrap;"><br />
tree dcad9007245d68ff56d90fcf96af38f686eb61c1<br />
parent 4d9cb9b0d6248bb5c0868261039ef7f56ce47494<br />
author Jan Varwig &lt;jan@varwig.org&gt; 1300882710 +0100<br />
committer Jan Varwig &lt;jan@varwig.org&gt; 1300882710 +0100<br />
<br />
Wrote Helper methods in User to aid with taking down accounts<br />
&nbsp;</div>

<p>Here you can see what kind of information is stored in a commit:</p>

<ul>
<li>An <strong>author</strong> and a time of authoring as well as a committer and the date the
commit was created. This distinction is made because Git supports
patches that are authored by one person but committed by someone else,
something that&#8217;s not uncommon in big open source projects.<br />
This can also occur when <em>all</em> developers have commit rights: Whenever you
cherry-pick a commit, <em>you</em> become the committer, but the original author
remains the same.
However, for the sake of discussing Git&#8217;s data structure this distinction has no relevance.</li>
<li>A reference to a <strong>tree</strong> object that represents the <strong>state of your project
directory</strong>, as it was staged, at the time the commit was created.</li>
<li>Zero or more references to <strong>parent commits</strong>. This is what actually <strong>builds the history</strong> 
of your repository. Each regular commit has one parent, one previous state
of the working directory; only the very first commit of a repository has none. When you perform a merge, a commit can even have
two or more parents, pointing to the different branches of development that
have been merged.</li>
</ul>

<p>Taking the example from above, if we inspect the parent we see such a merge commit:</p>

<div class="dean_ch" style="white-space: wrap;"><br />
$ git cat-file -p 4d9cb9b0d6248bb5c0868261039ef7f56ce47494<br />
tree c7e8830f84488241ee185842b5e226c70e629653<br />
parent 49951c31b899631e974deb15766389978603b47a<br />
parent bfefebbfcdf1dec22eca969a91ac9bbf1d3d499e<br />
author Jan Varwig &lt;jan@varwig.org&gt; 1300445073 +0100<br />
committer Jan Varwig &lt;jan@varwig.org&gt; 1300445073 +0100<br />
<br />
Merge branch 'stage' of dev.9elements.de:imgly into stage<br />
&nbsp;</div>
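
<p>Because a commit is just a few header lines followed by a blank line and the message, it is easy to take apart. The following Python sketch parses the merge commit shown above. The function name is my own, and the sketch deliberately ignores multi-line headers such as signatures, so treat it as an illustration rather than a complete parser:</p>

```python
MERGE_COMMIT = """\
tree c7e8830f84488241ee185842b5e226c70e629653
parent 49951c31b899631e974deb15766389978603b47a
parent bfefebbfcdf1dec22eca969a91ac9bbf1d3d499e
author Jan Varwig <jan@varwig.org> 1300445073 +0100
committer Jan Varwig <jan@varwig.org> 1300445073 +0100

Merge branch 'stage' of dev.9elements.de:imgly into stage
"""


def parse_commit(text):
    """Split commit text into a header dict (lists of values) and the message."""
    head, _, message = text.partition("\n\n")
    headers = {}
    for line in head.split("\n"):
        key, _, value = line.partition(" ")
        # A key may repeat (several parents), so collect values in a list.
        headers.setdefault(key, []).append(value)
    return headers, message.strip()


if __name__ == "__main__":
    headers, message = parse_commit(MERGE_COMMIT)
    parents = headers.get("parent", [])
    print("tree:", headers["tree"][0])
    print("parents:", len(parents))
    print("merge commit:", len(parents) > 1)
    print("message:", message)
```

<p>Following the <code>parent</code> entries from commit to commit is all it takes to walk the history of a repository.</p>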

<p>This explains how Git strings a series of commits together to form a history, but I haven&#8217;t told you yet how to actually get at an object without knowing its SHA1. This is where <strong>branches</strong> come into play. They are essentially <strong>readable aliases for SHA1s</strong> that get updated every time you perform certain actions (like committing, merging, etc.). I will explain this in more detail in the next post of this series.</p>

<h4>Tags</h4>

<p>The last type of object is the tag. To be more specific, the <em>annotated</em> tag.
Simple tags are not objects (more on that later), but annotated tags are.
You get an annotated tag if you use the <code>-a</code> option when creating a tag:</p>

<div class="dean_ch" style="white-space: wrap;"><br />
$ git tag -a test_tag<br />
$ git cat-file -p test_tag<br />
object 4a06c46ee6d58ce4be09954ee054921b18269cd6<br />
type commit<br />
tag test_tag<br />
tagger Jan Varwig &lt;jan@varwig.org&gt; Fri Apr 15 19:42:02 2011 +0200<br />
<br />
This is the message for the test tag<br />
&nbsp;</div>

<p>A tag consists of an object reference, but what&#8217;s special about it is that it can refer
to <em>any kind</em> of object. What exactly the tag is referring to can be seen in the
<code>type</code> field. The tag we&#8217;re seeing here points to a commit with the SHA1 4a06c46ee6d58ce4be09954ee054921b18269cd6.
The tag also has a name, given in the <code>tag</code> field, a tagger and a message.</p>

<p>To be honest I never worked with annotated tags and most of you probably never will either.
Their main use case over regular tags is that they can be cryptographically signed
(as can commits).</p>

<h4>Summary</h4>

<p>That was it. Four very simple types of objects.</p>

<ul>
<li>Blobs &#8211; Storing the content of files</li>
<li>Trees &#8211; Storing the structure of your working directory</li>
<li>Commits &#8211; Putting trees into a sequence to preserve history</li>
<li>Tags &#8211; Reliable mechanism to point to objects in the database</li>
</ul>

<p>Maybe you should <strong>take a look at one of your own repositories</strong> now, starting with
<code>git cat-file -p master</code> and poking around a bit.</p>

<p>These objects and their references to each other are the absolute core of Git and
understanding their structure and relationships is essential to working
well with Git. As soon as you start thinking of your <strong>repository as this object graph</strong>, you&#8217;ll
realize that the Git toolchain is nothing but a set of <strong>manipulations on that graph database</strong>,
creating new objects all the time, pointing to other objects.</p>

<p>There are two additional implementation details that you should be aware of:</p>

<ol>
<li>Even though my description and the results of <code>git cat-file</code> make it seem like you&#8217;re
dealing with full fledged objects, the reality is that Git uses very efficient
compression algorithms to reduce the amount of actual data stored in its database.
Trees or blobs aren&#8217;t usually stored in full but described as differences to other
similar objects.<br />
But for reasoning about objects, you can and should think of them as being self-contained
and independent.</li>
<li>On the other hand, objects often actually <em>are</em> stored as individual, unpacked files in the database.
Also, Git does not delete objects during normal operation! If you remove a file from a tree, the blob
for that file will still exist. If you <strong>lose a commit</strong> through rebasing or merge problems
or by accidentally deleting a branch, <strong>as long as you know its SHA1, you can get back to it</strong>.
The only time Git actually deletes and packs objects is during its garbage collection run.
If you clone a new project, you will retrieve packed objects from the remote server, but
objects you create in your local repository will be stored unpacked at first.
Git repacks and garbage collects on its own from time to time, but you can also trigger
this process manually by calling <code>git gc</code>. Do this when you notice that working with your
repository becomes slow or that the repository becomes too large.</li>
</ol>

<h4>Next post</h4>

<p>In the <a href="http://jan.varwig.org/archive/advanced-git-part-2">next post</a> of this series, I will explain the <strong>structure of the <code>.git</code> directory</strong>. This is where you will find your branches and regular tags, as well as the actual object database files. You will learn what a <strong>branch actually is</strong> and how to manipulate branches effectively.</p>
]]></content:encoded>
			<wfw:commentRss>http://jan.varwig.org/archive/advanced-git/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Advanced Git slides from Barcamp Ruhr 4</title>
		<link>http://jan.varwig.org/archive/advanced-git-slides-from-barcamp-ruhr-4?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=advanced-git-slides-from-barcamp-ruhr-4</link>
		<comments>http://jan.varwig.org/archive/advanced-git-slides-from-barcamp-ruhr-4#comments</comments>
		<pubDate>Sun, 17 Apr 2011 08:38:20 +0000</pubDate>
		<dc:creator>Jan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[advanced git]]></category>
		<category><![CDATA[bcruhr]]></category>
		<category><![CDATA[bcruhr4]]></category>
		<category><![CDATA[git]]></category>
		<category><![CDATA[slides]]></category>

		<guid isPermaLink="false">http://jan.varwig.org/?p=475</guid>
		<description><![CDATA[At the Barcamp Ruhr 4 this year I held a session about Git for advanced users. I&#8217;m currently preparing the content of that session as a series of blog posts, but in the meantime, here are the slides: Download/View Slides The talk has also been recorded by Oliver Überholz, who promised to send me a [...]]]></description>
			<content:encoded><![CDATA[<p>At the <a href="http://barcampruhr3.de">Barcamp Ruhr 4</a> this year I held a session about Git for advanced users.
I&#8217;m currently preparing the content of that session as a series of blog posts, but in the meantime, here are the slides:</p>

<p><a href="http://jan.varwig.org/wp-content/uploads/2011/03/advanced_git.pdf">Download/View Slides</a></p>

<p>The talk has also been recorded by <a href="http://twitter.com/getoliverleon">Oliver Überholz</a>, who promised to send me a copy but so far hasn&#8217;t replied to my messages. Please give him a nudge :)</p>

<p><strong>UPDATE</strong>: <a href="http://jan.varwig.org/archive/advanced-git">First post of the <em>Advanced Git</em> series is online</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://jan.varwig.org/archive/advanced-git-slides-from-barcamp-ruhr-4/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>REST in Place now on Github</title>
		<link>http://jan.varwig.org/archive/rest-in-place-now-on-github?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=rest-in-place-now-on-github</link>
		<comments>http://jan.varwig.org/archive/rest-in-place-now-on-github#comments</comments>
		<pubDate>Sat, 20 Sep 2008 21:34:10 +0000</pubDate>
		<dc:creator>Jan</dc:creator>
				<category><![CDATA[on Rails]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[git]]></category>
		<category><![CDATA[github]]></category>
		<category><![CDATA[mercurial]]></category>
		<category><![CDATA[rails]]></category>
		<category><![CDATA[rest in place]]></category>

		<guid isPermaLink="false">http://jan.varwig.org/?p=90</guid>
		<description><![CDATA[After using Mercurial for 7 months we at 9elements have finally given in to the internet peer pressure and switched to git (well, to be honest, several shortcomings in Mercurial played an important role too). Since then I&#8217;ve become accustomed to git and today ported over REST in Place from Subversion to Github. The Github [...]]]></description>
			<content:encoded><![CDATA[<p>After using <a href="http://selenic.com/mercurial/">Mercurial</a> for 7 months we at 9elements have finally given in to the internet peer pressure and switched to <a href="http://git.or.cz/">git</a> (well, to be honest, several shortcomings in Mercurial played an important role too). Since then I&#8217;ve become accustomed to git and today ported over <a href="http://jan.varwig.org/projects/rest-in-place">REST in Place</a> from Subversion to Github.<br />
The Github project page is located at <a href="http://github.com/janv/rest_in_place/">http://github.com/janv/rest_in_place/</a>, and the repository can be found at <code>git://github.com/janv/rest_in_place.git</code>.</p>

<p>I&#8217;ve updated the README and the <a href="http://jan.varwig.org/projects/rest-in-place">project page</a> with the new information.</p>

<p>I&#8217;ve also published my <a href="http://github.com/janv/dbserialize/">dbserialize plugin at Github</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://jan.varwig.org/archive/rest-in-place-now-on-github/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>One of those nights</title>
		<link>http://jan.varwig.org/archive/one-of-those-nights?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=one-of-those-nights</link>
		<comments>http://jan.varwig.org/archive/one-of-those-nights#comments</comments>
		<pubDate>Thu, 03 Jan 2008 14:27:31 +0000</pubDate>
		<dc:creator>Jan</dc:creator>
				<category><![CDATA[on Rails]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[avi bryant]]></category>
		<category><![CDATA[evan phoenix]]></category>
		<category><![CDATA[git]]></category>
		<category><![CDATA[randal schwartz]]></category>
		<category><![CDATA[rubinius]]></category>
		<category><![CDATA[seaside]]></category>
		<category><![CDATA[smalltalk]]></category>
		<category><![CDATA[zed shaw]]></category>

		<guid isPermaLink="false">http://jan.varwig.org/archiv/one-of-those-nights</guid>
		<description><![CDATA[Last night Zed Shaw&#8217;s Rails is a Ghetto shitstorm was brought to my attention. Zed&#8217;s rant provides enough meat for a post of its own but it&#8217;s not what I want to write about today. Following some comments on Zed&#8217;s article on Technorati I stumbled (again) into one of those evenings full of great discoveries. [...]]]></description>
			<content:encoded><![CDATA[<p>Last night Zed Shaw&#8217;s <a href="http://www.zedshaw.com/rants/rails_is_a_ghetto.html">Rails is a Ghetto</a> shitstorm was brought to my attention.
Zed&#8217;s rant provides enough meat for a post of its own but it&#8217;s not what I want to write about today.
Following some comments on Zed&#8217;s article on Technorati I stumbled (again) into one of those evenings full of great discoveries.</p>

<h2>Git</h2>

<p>Shortly before <a href="http://git.or.cz/">Git</a> became really popular some months ago, I had become interested in <a href="http://www.darcs.net/">darcs</a> and distributed revision control systems in general.
The topic is kinda difficult though and none of the texts I was reading at the time could really communicate the benefits of DRCS to me.
I always had some gripes about svn but it wasn&#8217;t clear to me how DRCS were able to solve them.</p>

<p>I lost interest, following posts about git only loosely until last night a colleague pointed me to <a href="http://video.google.com/videoplay?docid=1251251453592758541">Randal Schwartz&#8217; Git presentation at Google Tech Talks</a>. Holy crap, I need to check this out. What appeals most to me:</p>

<ul>
<li><strong>The ability to have the entire repository available locally</strong><br />
I was extremely sceptical when I first heard about this, but when Schwartz claimed that the entire repository of the Linux kernel is half the size of a checkout I was sold.</li>
<li><strong>Subversion interoperability</strong><br />
Didn&#8217;t know about this before. Makes the transition much easier.</li>
<li><strong>Having local-only repositories inside your working dir</strong><br />
I have many smaller projects that I&#8217;d love to keep locally contained. In Subversion I always had to create a repository on my server for everything.</li>
<li><strong>Other small things</strong><br />
High compression, the simple database system behind git, the optimization for speed, staged commits, the ability to completely erase files from a repository (e.g. stuff not intended for publication, something that was very hard to do in Subversion), the placement of <em>all</em> metadata in a single directory (as opposed to littering every dir in the working copy with <code>.svn</code> directories)</li>
</ul>

<p>Despite some shortcomings, that was enough to make me install git on my Mac (<code>sudo port install git-core</code>). I&#8217;m eager to check it out later today.</p>

<h2>Smalltalk and Seaside</h2>

<p>Ever since watching Evan Phoenix&#8217;s <a href="http://rubyconf2007.confreaks.com/d2t1p3_rubinius.html">Rubinius Presentation</a> at RubyConf 2007 and listening to Avi Bryant&#8217;s <a href="http://itc.conversationsnetwork.org/shows/detail3432.html">Smalltalk&#8217;s Lessons for Ruby</a> keynote from RailsConf 2007 I&#8217;ve been curious about Smalltalk. I mean, I was curious about it before; after all, it&#8217;s probably the language that has the most influence on what I&#8217;m doing today (through its promotion of object-orientation and through providing key principles behind Ruby). But since listening to Avi and Evan I&#8217;ve become really interested in VM implementations (see <a href="http://users.ipa.net/~dwighth/smalltalk/bluebook/bluebook_imp_toc.html">Smalltalk-80: The Language and Its Implementation</a> for an excellent in-depth description of the original Smalltalk-80 interpreter) and real-world usage of Smalltalk.</p>

<p>To be honest, as much as I love Ruby as a language, its implementations all suck. And Evan explained why: Implementing most of the base language on another platform (C for MRI, Java for JRuby) turns out to be a leaky abstraction when you want to extend the language. Additionally, as pure and beautiful as the Ruby language is in concept, its implementation is just as ugly. On the one hand, what I like so much about Ruby is its conceptual purity: its very limited set of axioms, syntax and exceptions from its own rules. On the other hand, this purity is not present in the interpreter, where high-level data structures (like arrays and hashes) are implemented in C for performance reasons. Smalltalk has always had a strong philosophy of implementing as much as possible in Smalltalk itself and only resorting to C for a minimal subset (<a href="http://www.cincomsmalltalk.com/userblogs/avi/blogView?showComments=true&amp;entry=3284695382">&#8220;Turtles all the way down&#8221;</a>).</p>

<p><a href="http://www.rubini.us/">Rubinius</a> aims to implement a Ruby interpreter on the design principles of Smalltalk. I love the project and there seem to be only the most brilliant people working on it (Evan Phoenix, Eric Hodel, Ryan Davis and others, full time). As many others have said already, Rubinius is likely to become the main Ruby implementation if they manage to take off (and they undoubtedly will).</p>

<p>Yet, something was bugging me: If Ruby and Smalltalk are so similar, if Smalltalk has been around, specified and stable for 25 years in many different, <em>compatible</em> implementations, commercial <em>and</em> open source, why take the long route and bend Ruby to look like Smalltalk? Why not use Smalltalk directly? These questions became even more nagging after reading Randal Schwartz&#8217; (yeah, the guy who sold me on Git earlier) <a href="http://methodsandmessages.vox.com/library/post/transcript-show-hello-world-cr.html">Transcript show: &#8216;Hello, world!&#8217;, cr</a>
and <a href="http://www.akitaonrails.com/">Fabio Akita</a>&#8217;s excellent interview with Avi Bryant (<a href="http://www.akitaonrails.com/2007/12/15/chatting-with-avi-bryant-part-1">part 1</a>, <a href="http://www.akitaonrails.com/2007/12/22/chatting-with-avi-bryant-part-2">part 2</a>).
Listening to yet <em>another</em> chat with Avi (linked in the second part of Fabio&#8217;s interview) at <a href="http://twit.tv/floss21">Floss Weekly</a> (with Randal Schwartz again, highly recommended) before falling asleep, I finally decided to check out <a href="http://www.squeak.org/">Squeak</a>, the (most popular?) open source Smalltalk implementation. I was amazed at the simplicity of the installation: download the VM, download the Squeak image, load the image into the VM, done.</p>

<p>I have read about <a href="http://www.seaside.st/">Seaside</a>, Avi&#8217;s web development framework, before, mainly in regard to its clever use of continuations, but some of the stuff he described at Floss sounds almost too good to be true. Live debugging of your app <em>in the browser</em>? With hot code swapping over the wire? And I thought Rails&#8217; rdebug integration was great.</p>

<p>Well, at 2am I was finally falling asleep but the stuff I&#8217;ve been discovering will probably keep me occupied for quite some time.
As I explore and discover more about the topics mentioned, I&#8217;ll report my findings here on my blog.</p>
]]></content:encoded>
			<wfw:commentRss>http://jan.varwig.org/archive/one-of-those-nights/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

