Does the staging area contain the snapshot of content of the last commit? Or something else?<hr />
git status shows:
On branch master
nothing to commit, working directory clean
If I then run
git diff --cached
what is HEAD compared to?
Whether the working directory is clean has little to do with what the staging area contains. The staging area holds whatever has been staged for commit (e.g. with
git add). And yes, any file you <em>haven't</em> staged is in the same state in the staging area as it is on
HEAD, unless you've done something weird.
git diff --cached compares the contents of the staging area with
HEAD (it can also be spelled
git diff --staged). Since
git commit turns the contents of the staging area into a commit,
git diff --staged shows you what you would commit if you committed now.
When your git status output says "nothing to commit", it is already telling you that there are no staged changes, so
git diff --staged should print nothing.
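To make the comparison concrete, here is a toy model in Python. It is only a sketch, not how Git is implemented: each of HEAD, the staging area, and the working directory is reduced to a dict from path to a made-up blob ID, and the helper names (`changed_paths`, `diff_cached`, `diff_worktree`) are invented for illustration.

```python
from typing import Dict, Set

# Toy stand-in for a snapshot: path -> blob ID (any string plays the role of a hash).
Snapshot = Dict[str, str]


def changed_paths(old: Snapshot, new: Snapshot) -> Set[str]:
    """Paths that were added, removed, or now point at a different blob."""
    return {p for p in old.keys() | new.keys() if old.get(p) != new.get(p)}


def diff_cached(head: Snapshot, index: Snapshot) -> Set[str]:
    """Rough analogue of `git diff --cached`: HEAD versus the staging area."""
    return changed_paths(head, index)


def diff_worktree(index: Snapshot, worktree: Snapshot) -> Set[str]:
    """Rough analogue of plain `git diff`: staging area versus working directory."""
    return changed_paths(index, worktree)


# Right after a commit, all three agree.
head = {"README": "blob-a", "main.c": "blob-b"}
index = dict(head)
worktree = dict(head)
clean = diff_cached(head, index)

# Edit a file without staging it: only the working directory differs.
worktree["main.c"] = "blob-c"
unstaged = diff_worktree(index, worktree)
cached_before_add = diff_cached(head, index)

# Stage the edit (the analogue of `git add main.c`): now the index differs from HEAD.
index["main.c"] = worktree["main.c"]
staged = diff_cached(head, index)
```

In this model `clean` and `cached_before_add` are empty, which corresponds to "nothing to commit", while `staged` contains `main.c` once the edit has been added.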
<strong>Prerequisite reading</strong>: <a href="http://www.git-scm.com/book/en/v2/Git-Internals-Git-Objects" rel="nofollow">Read about Git objects, particularly trees and blobs, first</a>.
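<strong>Short answer</strong>: the index holds one entry per tracked file, recording the ID of the blob that would go into the next commit, plus a bunch of flags to track changes to the files. Right after a commit, with nothing staged, those are exactly the blob IDs of the files in HEAD. It does <em>not</em> contain a snapshot of the last commit's content, nor is it a tree object (exception below).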
<a href="https://git.kernel.org/cgit/git/git.git/tree/Documentation/technical/index-format.txt?id=HEAD" rel="nofollow">The really long answer can be read here</a>.
<strong>Slightly longer answer</strong>: The index (
.git/index) stores the file path and blob ID of every tracked file (the same ones as in HEAD when nothing is staged), plus metadata about the files (permissions, modification times, owners, etc.).
The index can also cache the IDs of pre-computed <a href="http://www.git-scm.com/book/en/v2/Git-Internals-Git-Objects" rel="nofollow">tree objects (how Git stores directories)</a> to speed up committing. It also stores information about conflicts, by keeping several staged versions of a conflicted path.
So an "empty" index contains a list of all the filepaths, their blob IDs, meta information about the files, and space to store conflict information. Because it stores only the blob IDs (160-bit SHA-1 hashes) rather than the file contents, the index avoids being redundant with HEAD. Index files for my projects range from under 1 KB to about 500 KB for large projects like Perl and Git.
You can poke around the index using <a href="https://libgit2.github.com/libgit2/#HEAD/group/index/git_index_read" rel="nofollow">libgit2</a>, which has wrappers in many programming languages, for example <a href="https://metacpan.org/pod/Git::Raw" rel="nofollow">Git::Raw</a> in Perl.
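If you would rather not pull in a binding, the on-disk layout described in the index-format document is simple enough to read by hand. The following Python sketch decodes the header and the entries of a version 2 index; it assumes SHA-1 object IDs, rejects other index versions, and ignores the extensions (such as the cached trees) and the trailing checksum. `IndexEntry` and `parse_index` are names made up for this example. (For a quick look without any code, `git ls-files --stage` prints similar information.)

```python
import struct
from typing import List, NamedTuple


class IndexEntry(NamedTuple):
    mode: int      # file type and permission bits, e.g. 0o100644
    size: int      # file size in bytes as recorded at staging time
    blob_id: str   # hex SHA-1 of the staged blob
    stage: int     # 0 normally, 1-3 for the sides of a merge conflict
    path: str


def parse_index(data: bytes) -> List[IndexEntry]:
    """Decode the entries of a version 2 Git index held in `data`."""
    signature, version, count = struct.unpack_from(">4sII", data, 0)
    if signature != b"DIRC":
        raise ValueError("not a Git index file")
    if version != 2:
        raise ValueError("this sketch only handles index version 2")

    entries = []
    offset = 12  # the header is 12 bytes long
    for _ in range(count):
        # Ten 32-bit stat fields, a 20-byte object ID, then 16 bits of flags.
        fields = struct.unpack_from(">10I20sH", data, offset)
        mode, size = fields[6], fields[9]
        object_id, flags = fields[10], fields[11]

        # The path follows the 62 fixed bytes and is NUL-terminated.
        name_start = offset + 62
        name_end = data.index(b"\x00", name_start)
        path = data[name_start:name_end].decode("utf-8", "replace")

        stage = (flags >> 12) & 0x3
        entries.append(IndexEntry(mode, size, object_id.hex(), stage, path))

        # Entries are padded with 1 to 8 NUL bytes to a multiple of 8 bytes.
        entry_len = 62 + (name_end - name_start)
        offset += (entry_len + 8) & ~7
    return entries
```

Calling it as `parse_index(open(".git/index", "rb").read())` from the top of a repository should list every tracked path with its blob ID, which shows that the index records IDs and metadata rather than file contents.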