Howdy!
I have been trying to figure out how to quickly retrieve the number of files on a given HFS+ drive with Python.
I have been playing with os.statvfs and such, but can't quite get anything that seems helpful.
Any ideas?
Edit: Let me be a bit more specific. =]
I am writing a Time Machine-like wrapper around rsync for various reasons, and would like a very fast estimate (it does not have to be perfect) of the number of files on the drive rsync is going to scan. That way, as rsync builds its initial file list (if you call it like `rsync -ax --progress`, or with the `-P` option), I can watch its progress and report a percentage and/or ETA back to the user.
This is completely separate from the actual backup, whose progress is no trouble to track. But with the drives I am working on, which hold several million files, the user ends up watching a counter of files climb with no upper bound for several minutes.
I have tried os.statvfs using exactly the method described in one of the answers so far, but the result does not make sense to me:
>>> import os
>>> os.statvfs('/').f_files - os.statvfs('/').f_ffree
64171205L
The more portable way (walking the tree) gives me around 1.1 million on this machine, which matches every other indicator I have seen, including rsync running its preparations:
>>> sum(len(filenames) for path, dirnames, filenames in os.walk("/"))
1084224
Note that the first method is instantaneous, while the second one made me come back 15 minutes later to update this post, because it took just that long to run.
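For what it's worth, since I am calling rsync with `-x` (don't cross filesystem boundaries), the walk above may not even be counting quite the same set of files if other filesystems are mounted under the root. A variant that prunes mount points might look like this; it is still O(number of files), so it is only a slow baseline, not a solution:

```python
import os

def count_files_one_fs(root):
    """Count files under root without descending into directories on a
    different device (roughly the set rsync -x would scan).
    Still walks every entry, so it is no faster than os.walk."""
    root_dev = os.lstat(root).st_dev
    total = 0
    for path, dirnames, filenames in os.walk(root):
        # Prune subdirectories that live on another device (mount points),
        # ignoring entries that vanish or can't be stat'ed mid-walk.
        kept = []
        for d in dirnames:
            try:
                if os.lstat(os.path.join(path, d)).st_dev == root_dev:
                    kept.append(d)
            except OSError:
                pass
        dirnames[:] = kept
        total += len(filenames)
    return total
```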
Does anyone know of a similarly fast way to get this number, or what is wrong with how I am interpreting the os.statvfs numbers?