Is it possible to get information about how much space the changes in each commit take up, so I can find commits which added big files or a lot of files? This is all to try to reduce the git repo size (by rebasing and maybe filtering commits).

+3  A: 

You are probably looking for git log --log-size.

My personal suggestion though - I think you are asking the wrong question.

Disk, memory and CPU are too cheap nowadays to care about this. Having the full repository and history of your project is a valuable resource, and unless you do version control of HD video you should maybe consider expanding the storage you allocate to your repo.

Hope that helps, Sorin

Sorin Mocanu
You are right about cheap hardware. But this is also about teaching my coworkers not to create giant commits, and about the network: pushing, pulling and cloning are a pain when the repo weighs about 100-200 MB.
tig
The --log-size option only tells you how big the log message is -- not the size of files committed.
Pat Notz
A: 

git cat-file -s <object>, where <object> can refer to a commit, blob, tree, or tag.
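As a rough illustration, here is a small sketch that asks for the size of one file at one revision. The function name blob_size is made up for this example. Note that when <object> is a commit, -s reports the size of the commit object itself (its metadata), not the size of the files the commit touched.

```shell
# Sketch: print the uncompressed size in bytes of a file at a given revision.
# blob_size is a hypothetical helper name, not a git command.
# Usage: blob_size <rev> <path>    e.g. blob_size HEAD README
blob_size() {
    # "<rev>:<path>" names the blob stored at that path in that revision
    git cat-file -s "$1:$2"
}
```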

Ramkumar Ramachandra
A: 

You could do this:

git ls-tree -r -t -l --full-name HEAD | sort -n -k 4

This will show the largest files at the bottom (the fourth column is the file (blob) size; the -l option is what adds it).

If you need to look at different branches you'll want to change HEAD to those branch names. Or, put this in a loop over the branches, tags, or revs you are interested in.
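The loop over branches could look roughly like this. It is only a sketch: the function name largest_per_branch and the cut-off of 5 files are arbitrary choices for the example, and it assumes you run it inside the repository.

```shell
# Sketch: show the 5 largest blobs on every local branch.
# largest_per_branch is a hypothetical helper name; 5 is an arbitrary cut-off.
largest_per_branch() {
    git for-each-ref --format='%(refname:short)' refs/heads |
    while read -r branch; do
        echo "== $branch"
        # -l adds the blob size as the fourth column; largest files end up last
        git ls-tree -r -l --full-name "$branch" | sort -n -k 4 | tail -n 5
    done
}
```

To cover tags as well, refs/tags can be passed to git for-each-ref alongside refs/heads.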

Pat Notz
+1  A: 

Forgot to reply, my answer is:

git rev-list --all --pretty=format:'%H%n%an%n%s'    # get all commits (hash, author, subject)
git diff-tree -r -c -M -C --no-commit-id <sha>      # get the blobs changed by each commit
git cat-file --batch-check                          # feed it the blob ids on stdin to get the size of each blob
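Glued together, the three steps might look like the sketch below. It is a simplified variant rather than exactly what I ran: it drops -c, -M and -C so that each raw diff line has a single new blob id in the fourth field, which means merge commits contribute nothing. The function name commit_sizes is made up, and the sizes are uncompressed blob sizes, not what the objects occupy in a packfile.

```shell
# Sketch: print "<bytes> <commit>" for every commit, largest last.
# commit_sizes is a hypothetical helper name. Sizes are uncompressed blob sizes.
commit_sizes() {
    git rev-list --all |
    while read -r sha; do
        # Raw diff line: ":oldmode newmode oldid newid status<TAB>path"
        # Keep the new id, skipping deletions (all-zero id) and submodules (mode 160000).
        bytes=$(git diff-tree -r --root --no-commit-id "$sha" |
            awk '$4 !~ /^0+$/ && $2 != "160000" { print $4 }' |
            git cat-file --batch-check='%(objectsize)' |
            awk '{ total += $1 } END { print total + 0 }')
        echo "$bytes $sha"
    done | sort -n
}
```

Running commit_sizes | tail -n 20 then gives the twenty commits that introduced the most new blob data.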
tig