Is there a maximum number of inodes in a single directory?

I have a directory with 2 million+ files and can't get the ls command to work against it. So now I'm wondering if I've exceeded a limit on inodes in Linux. Is there a limit lower than the 2^64 numerical limit?

A: 

Can you get a real count of the number of files? Does it fall very near a 2^n boundary? Could you simply be running out of RAM to hold all the file names?

I know that in Windows, at least, filesystem performance would drop dramatically as the number of files in a folder went up, but I thought that Linux didn't suffer from this issue, at least if you were using a command prompt. God help you if you try to get something like Nautilus to open a folder with that many files.

I'm also wondering where these files come from. Are you able to calculate file names programmatically? If that's the case, you might be able to write a small program to sort them into a number of sub-folders. Often specifying the name of a particular file directly will get you access where trying to list the directory to look the name up will fail. For example, I have a folder in Windows with about 85,000 files where this works.

If this technique is successful, you might try finding a way to make this sorting permanent, even if it's just running that small program as a cron job. It'll work especially well if you can sort the files by date somehow.
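
For example, a minimal sketch of such a program as a shell script, assuming the overfull directory is a hypothetical /data/incoming and that bucketing by modification month is acceptable (GNU date is assumed for the -r option; this is a sketch, not a tuned solution):

#!/bin/sh
# Sketch: move each regular file into a YYYY-MM subdirectory based on its
# modification time. /data/incoming is a placeholder path.
cd /data/incoming || exit 1
for f in ./*; do
    [ -f "$f" ] || continue          # skip the subdirectories we create
    d=$(date -r "$f" +%Y-%m) || continue
    mkdir -p "$d"
    mv -- "$f" "$d/"
done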

Joel Coehoorn
+1  A: 

No. Inode limits are per-filesystem, and decided at filesystem creation time. You could be hitting another limit, or maybe 'ls' just doesn't perform that well.

Try this:

tune2fs -l /dev/DEVICE | grep -i inode

It should tell you all sorts of inode related info.

Jordi Bunster
A: 

Unless you are getting an error message, ls is working but very slowly. You can try looking at just the first ten files like this:

ls -f | head -10

If you're going to need to look at the file details for a while, you can put them in a file first. You probably want to send the output to a different directory than the one you are listing at the moment!

ls > ~/lots-of-files.txt

If you want to do something to the files, you can use xargs. If you decide to write a script of some kind to do the work, make sure that your script will process the list of files as a stream rather than all at once. Here's an example of moving all the files.

ls | xargs -I thefilename mv thefilename ~/some/other/directory

You could combine that with head to move a smaller number of the files.

ls | head -10000 | xargs -I x mv x /first/ten/thousand/files/go/here

You can probably combine ls | head into a shell script that will split up the files into a bunch of directories with a manageable number of files in each.
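
A minimal sketch of such a script, assuming placeholder source and destination paths, batches of 10,000 files moved into numbered subdirectories, and plain file names without spaces or newlines (the same assumption the xargs examples above make):

#!/bin/sh
# Sketch: peel off 10,000 unsorted entries at a time (ls -f avoids the sort)
# and move them into numbered subdirectories of a placeholder destination.
src=/path/to/huge/dir
dst=/path/to/buckets
i=0
cd "$src" || exit 1
while : ; do
    batch=$(ls -f | grep -v -e '^\.$' -e '^\.\.$' | head -10000)
    [ -n "$batch" ] || break                  # nothing left to move
    bucket=$(printf '%s/dir_%04d' "$dst" "$i")
    mkdir -p "$bucket"
    printf '%s\n' "$batch" | xargs -I name mv -- name "$bucket/"
    i=$((i + 1))
done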

Joseph Bui
ls | head -10 doesn't work to get an immediate result, because ls is sorting -- so it needs to read everything before it can print anything.
Charles Duffy
In that case, try: ls -f | head -10
Joseph Bui
+2  A: 

df -i should tell you the number of inodes used and free on the filesystem

tonylo
+2  A: 

Try "ls -U" or "ls -f".

"ls", by default, sorts the files alphabetically. If you have 2 million files, that sort can take a long time. If "ls -U" (or perhaps "ls -f"), then the file names will be printed immediately.

Rob Adams
A: 

Maximum directory size is filesystem-dependent, and thus the exact limit varies. However, having very large directories is a bad practice.

You should consider making your directories smaller by sorting files into subdirectories. One common scheme is to use the first two characters for a first-level subdirectory, as follows:

${topdir}/aa/aardvark
${topdir}/ai/airplane

This works particularly well if you are using UUIDs, GUIDs, or content-hash values for naming.
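
A minimal sketch of how an existing flat directory might be fanned out this way, assuming a placeholder ${topdir} that currently holds all the files and names at least two characters long:

#!/bin/sh
# Sketch: fan a flat directory out into two-character prefix subdirectories,
# e.g. aardvark -> aa/aardvark. /path/to/topdir is a placeholder.
topdir=/path/to/topdir
cd "$topdir" || exit 1
for f in ./*; do
    [ -f "$f" ] || continue              # skip the prefix directories
    name=${f#./}
    prefix=$(printf '%s' "$name" | cut -c1-2)
    mkdir -p "$prefix"
    mv -- "$f" "$prefix/$name"
done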

Charles Duffy
A: 

As noted by Rob Adams, ls is sorting the files before displaying them. Note that if you are using NFS, the NFS server will be sorting the directory before sending it, and 2 million entries may well take longer than the NFS timeout. That makes the directory unlistable via NFS, even with the -f flag.

This may be true for other network file systems as well.

While there's no enforced limit to the number of entries in a directory, it's good practice to keep the number of entries you anticipate within some limit.

mpez0
A: 

With NetBackup, the binaries that analyze the directories on clients perform a type of listing that times out because of the enormous quantity of files in each folder (about one million per folder, in an SAP work directory).

My solution was (as Charles Duffy wrote in this thread) to reorganize the folders into subfolders with fewer files.

Thanks to all.

mario
A: 

Another option is find:

find . -name '*' -exec somecommand {} \;

{} is replaced by the path of each file found (relative to the starting point, e.g. ./filename).

The advantage (and disadvantage) is that the files are processed one after another.

find . -name '*' > ls.txt

would write all file names to ls.txt.

find . -name '*' -exec ls -l {} \; > ls.txt

would write the ls -l output for each file to ls.txt.

markus
You have to include the wildcard within single quotes if you don't want it to be expanded by the shell (the expansion can be quite long if there are 2+ million files!)
Didier Trosset
You should learn about the `xargs` command. It is much more efficient than the -exec option of the find command.
Didier Trosset
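
For example, a hedged sketch of an xargs-based variant of the ls -l listing above, assuming GNU or BSD find and xargs for -print0/-0 (which keep unusual file names intact):

find . -name '*' -print0 | xargs -0 ls -l > ls.txt

Here ls -l is launched once per batch of file names rather than once per file, which is where the efficiency gain comes from.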