
Question:
Does the staging area contain a snapshot of the contents of the last commit, or something else?
<hr />If git status shows:
On branch master
nothing to commit, working directory clean
and I then run git diff --cached, what is HEAD compared to?
Whether or not the working directory is clean doesn't really have anything to do with what the staging area contains. The staging area contains changes that have been staged for commit (e.g. with git add). Any files that <em>haven't</em> been staged will be in the same state there as they are in HEAD, unless you've done something unusual. git diff --cached compares the contents of the staging area with HEAD (it can also be invoked as git diff --staged). Since git commit turns the contents of the staging area into a commit, git diff --staged shows you what you would commit if you committed now.
Since your git status output says "nothing to commit", it's already telling you that there are no staged changes, so git diff --staged should report nothing.
<strong>Prerequisite reading</strong>: <a href="http://www.git-scm.com/book/en/v2/Git-Internals-Git-Objects" rel="nofollow">Read about Git objects, particularly trees and blobs, first</a>.
<strong>Short answer</strong>: when nothing is staged, the index contains the IDs of the blobs of the files in HEAD, plus a bunch of flags to track changes to the files. It does <em>not</em> contain a snapshot of the last commit itself, nor is it a tree object (exception below).
<a href="https://git.kernel.org/cgit/git/git.git/tree/Documentation/technical/index-format.txt?id=HEAD" rel="nofollow">The really long answer can be read here</a>.
<strong>Slightly longer answer</strong>: The index (.git/index) stores a list of the blob IDs and file paths of all tracked files (the same ones as in HEAD when nothing is staged), plus metadata about the files (permissions, modification times, owners, etc.).
The index can also contain pre-computed <a href="http://www.git-scm.com/book/en/v2/Git-Internals-Git-Objects" rel="nofollow">tree objects (how Git stores directories)</a> to speed up committing. It also stores information about conflicts.
So an "empty" index contains a list of all the file paths, their blob IDs, meta information about the files, and space to store conflict information. Because it only stores the blob IDs (160 bits each) rather than the file contents, the index avoids being redundant with HEAD. Index files for my projects range from less than 1K to 500K for large projects like Perl and Git.
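As a rough illustration of that layout, the following Python sketch reads the header and entries of an index file. It assumes index format version 2 only (no extended flags, no path compression), treats the 20-byte object name as SHA-1, and ignores extensions such as the cached trees; parse_index_v2 is a name chosen here, not part of any Git tooling.

```python
import struct


def parse_index_v2(data: bytes) -> list:
    """Parse a version-2 Git index; return a list of (mode, blob_id, stage, path)."""
    signature, version, count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("not a Git index file")
    if version != 2:
        raise ValueError(f"this sketch only handles version 2, got {version}")

    entries = []
    offset = 12
    for _ in range(count):
        # Ten 32-bit fields: ctime and mtime (seconds + nanoseconds each),
        # dev, ino, mode, uid, gid, file size.
        stat = struct.unpack(">10I", data[offset:offset + 40])
        mode = stat[6]
        # 20-byte (160-bit) object ID of the blob.
        blob = data[offset + 40:offset + 60].hex()
        (flags,) = struct.unpack(">H", data[offset + 60:offset + 62])
        stage = (flags >> 12) & 0x3   # non-zero while a merge conflict is recorded
        name_len = flags & 0x0FFF     # 0xFFF means "0xFFF bytes or longer"

        path_start = offset + 62
        if name_len < 0x0FFF:
            path_end = path_start + name_len
        else:
            path_end = data.index(b"\0", path_start)
        path = data[path_start:path_end].decode("utf-8", "replace")
        entries.append((mode, blob, stage, path))

        # Each entry is NUL-padded to a multiple of 8 bytes, with at least one NUL.
        offset += (62 + (path_end - path_start) + 8) & ~7
    return entries
```

Calling parse_index_v2 on the bytes of .git/index should list one entry per tracked file, provided the repository really uses a version-2 index; otherwise the sketch raises ValueError. Git itself can print similar information with git ls-files --stage.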
You can poke around the index using <a href="https://libgit2.github.com/libgit2/#HEAD/group/index/git_index_read" rel="nofollow">libgit2</a>, which has wrappers in many programming languages, for example <a href="https://metacpan.org/pod/Git::Raw" rel="nofollow">Git::Raw</a> in Perl.