Is it possible to get info about how much space is wasted by the changes in each commit, so I can find commits which added big files or a lot of files? This is all to try to reduce the git repo size (by rebasing and maybe filtering commits).
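You are probably looking for git log --log-size. Note that it prints the length of each commit message in bytes, not the size of the files the commit touches.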
My personal suggestion though - I think you are asking the wrong question.
Disk, memory and CPU are too cheap nowadays to worry about this. Having the full repository and history of your project is a valuable resource, and unless you do version control of HD video you should maybe consider expanding the storage you allocate to your repo.
Hope that helps, Sorin
git cat-file -s <object>
where <object> can refer to a commit, blob, tree, or tag.
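As a small illustration of the command above, here is a sketch of a helper that prints both the type and the size in bytes of whatever object name you give it. The function name describe_object is made up for this example. Keep in mind that for a commit the size is that of the commit object itself (a few hundred bytes of metadata), not of the files it references.

```shell
# Hypothetical helper: print "<type> <size in bytes>" for any object name
# that git can resolve (a hash, HEAD, a tag, or <rev>:<path> for a file).
describe_object() {
    printf '%s %s\n' "$(git cat-file -t "$1")" "$(git cat-file -s "$1")"
}

# Example calls, to be run inside a repository (the path is a placeholder):
#   describe_object HEAD
#   describe_object 'HEAD^{tree}'
#   describe_object HEAD:path/to/file
```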
You could do this:
git ls-tree -r -t -l --full-name HEAD | sort -n -k 4
This will show the largest files at the bottom. The fourth column is the file (blob) size in bytes; it is only printed when -l is given.
If you need to look at different branches you'll want to change HEAD to those branch names. Or, put this in a loop over the branches, tags, or revs you are interested in.
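The loop over branches mentioned above could look roughly like this. It is only a sketch: the function name and the choice of showing five entries per branch are mine, and it assumes ls-tree is run with -l so that the size column is present. It leaves out -t, so only blobs (and submodule entries, which show "-" as their size) are listed.

```shell
# Hypothetical helper: for every local branch, print its five largest blobs.
largest_files_per_branch() {
    git for-each-ref --format='%(refname:short)' refs/heads |
    while read -r branch; do
        echo "== $branch"
        # -l adds the object size as the fourth column; the numeric sort
        # puts the largest entries last, and tail keeps the final five.
        git ls-tree -r -l --full-name "$branch" | sort -n -k 4 | tail -n 5
    done
}

# Usage, inside a repository:
#   largest_files_per_branch
```

To cover tags as well, refs/tags can be added as a second pattern to the for-each-ref call.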
Forgot to reply, my answer is:
git rev-list --all --pretty=format:'%H%n%an%n%s'  # get all commits (hash, author, subject)
git diff-tree -r -c -M -C --no-commit-id <sha>  # get new blobs for each commit
git cat-file --batch-check  # get size of each blob (blob ids are read from stdin)
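Put together, the three steps above might be scripted as follows. Treat it as a sketch under a few assumptions rather than the exact script I used:

- The function name commit_sizes is made up.
- -c is left out, so merge commits are expected to show up with a total of 0.
- --root is added so the very first commit is counted too.
- The sizes are uncompressed object sizes, which overestimate what the blobs occupy in a packfile.

```shell
# Hypothetical helper: print "<bytes> <commit>" for every commit reachable
# from any ref, sorted so the commits that add the most data come last.
commit_sizes() {
    git rev-list --all | while read -r sha; do
        # Raw diff-tree lines look like
        #   :<oldmode> <newmode> <oldblob> <newblob> <status>TAB<path>
        # Keep the new blob id unless the entry is a deletion (all zeros)
        # or the content did not change (pure rename or mode change).
        total=$(git diff-tree -r -M -C --no-commit-id --root "$sha" |
            awk '$4 !~ /^0+$/ && $3 != $4 { print $4 }' |
            git cat-file --batch-check='%(objectsize)' |
            # Lines such as "<id> missing" (e.g. submodules) have two
            # fields and are ignored in the sum.
            awk 'NF == 1 { s += $1 } END { print s + 0 }')
        printf '%s %s\n' "$total" "$sha"
    done | sort -n
}

# Usage, inside a repository:
#   commit_sizes | tail -n 20
```

It starts a few git processes per commit, so it will be slow on large histories; for a one-off cleanup that was acceptable to me.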