Tricky to answer - which is probably why it spent so long getting any answer at all.
In part, the answer will depend on the file system - different file systems will give different answers. However, doing 'ls
' requires reading the pages that hold the directory entries, plus reading the pages that hold the inodes identified in the directory. How many pages that is - and therefore how many disk reads - depends on the page size and on the directory size. If you think in terms of 6-8 bytes of overhead per file name, you won't be too far off. If the names are about 12 characters each, then you have about 20 bytes per file, and if your pages are 4096 bytes (4KB), then you have about 200 files per directory page.
If you just list names and not other attributes with 'ls
', you are done. If you list attributes (size, etc), then the inodes have to be read too. I'm not sure how big a modern inode is. Once upon a couple of decades ago on a primitive file system, it was 64-bytes each; it might have grown since then. There will be a number of inodes per page, but you can't be sure that the inodes you need are contiguous (adjacent to each other on disk). In the worst case, you might need to read another page for each separate file, but that is pretty unlikely in practice. Fortunately, the kernel is pretty good about caching disk pages, so it is unlikely to have to reread a page. It is impossible for us to make a good guess on the density of the relevant inode entries; it might be, perhaps, 4 inodes per page, but any estimate from 1 to 64 might be plausible. Hence, you might have to read 50 pages for a directory containing 200 files.
When it comes to running 'cat
' on the files, the system has to locate the inode for each file, just as with 'ls
'; it then has to read the data for the file. Unless the data is stored in the inode itself (I think that is/was possible in some file systems with biggish inodes and small enough file bodies), then you have to read one page per file - unless partial pages for small files are bunched together on one page (again, I seem to remember hearing that could happen in some file systems).
So, for a 200 file directory:
- Plain
ls
: 1 page
ls -l
: 51 pages
cat *
: 251 pages
I'm not sure I'd trust the numbers very far - but you can see the sort of data that is necessary to improve the estimates.