tags:

views: 224

answers: 4

I'm using PHP to make a simple caching system, but I'm going to be caching up to 10,000 files in one run of the script. At the moment I'm using a simple loop with

$file = "../cache/".$id.".htm";
$handle = fopen($file, 'w');
fwrite($handle, $temp);
fclose($handle);

($id being a random string which is assigned to a row in a database)

but it seems a little bit slow. Is there a better method of doing this? Also, I read somewhere that on some operating systems you can't store thousands and thousands of files in a single directory; is this relevant to CentOS or Debian? Bear in mind this folder may well end up holding over a million small files.

Simple questions, I suppose, but I don't want to start scaling this code and then find out I'm doing it wrong; I'm only testing by caching 10-30 pages at a time at the moment.

A: 

File I/O in general is relatively slow. If you are looping over thousands of files and writing them to disk, the slowness could be normal.

I would move that over to a nightly job if that's a viable option.

Zack
Well, I can also have it only cache on page request; I suppose in this case that would be a better option?
zuk1
+3  A: 

Remember that in UNIX, everything is a file.

When you put that many files into a directory, something has to keep track of those files. If you do an:

ls -la

You'll probably notice that the '.' entry has grown to some size. This is where all the info on your 10,000 files is stored.

Every seek and every write into that directory will involve parsing that large directory entry.

You should implement some kind of directory hashing system. This'll involve creating subdirectories under your target dir.

e.g.

/somedir/a/b/c/yourfile.txt
/somedir/d/e/f/yourfile.txt

This'll keep the size of each directory entry quite small, and speed up IO operations.
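
A minimal PHP sketch of this idea, assuming the cache ID can be reused for the path segments (the helper name, directory depth, and permissions are illustrative, not from the answer):

// Illustrative only: derive nested subdirectories from the first
// characters of the cache ID, e.g. "abc123" -> ../cache/a/b/c/abc123.htm
function cache_path($id, $depth = 3) {
    $dir = "../cache/" . implode("/", str_split(substr($id, 0, $depth)));
    if (!is_dir($dir)) {
        mkdir($dir, 0777, true); // create the nested dirs on first use
    }
    return $dir . "/" . $id . ".htm";
}

file_put_contents(cache_path($id), $temp);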

Paul Alan Taylor
OK, this is actually damned easy to do given the way my system will be. Thanks, this is the sort of info I was looking for.
zuk1
you should accept the answer, then.
ithcy
I was going to; however, I wanted to wait a while in case anyone else weighed in with a contrary/improved answer.
zuk1
"You should implement some kind of directory hashing system." Most filesystems do this for you.
jrockway
Is this the same for folder listings? I.e. if my cache folder has 100,000,000 subfolders in it, would requesting a file from one of those subfolders be slow because of the number of folders in its parent folder?
zuk1
jrockway may be able to speak to this with more authority, but I don't think NTFS works the same way as some of the UN*X fs's - employing a master file table instead.
Paul Alan Taylor
A: 

The number of files you can effectively use in one directory is not operating-system but filesystem dependent.

You can split your cache dir effectively by taking the md5 hash of the filename and using its first 1, 2 or 3 characters as a subdirectory. Of course, you have to create the directory if it doesn't exist, and use the same approach when retrieving files from the cache.

For a few tens of thousands, 2 characters (256 subdirs from 00 to ff) would be enough.
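
A rough sketch of that approach in PHP (the cache root and the two-character split are just the example values from this answer; error handling is omitted):

// Illustrative: split the cache by the first two characters of the md5 hash.
function cache_file($id) {
    $subdir = "../cache/" . substr(md5($id), 0, 2);
    if (!is_dir($subdir)) {
        mkdir($subdir); // create e.g. ../cache/3f on first use
    }
    return $subdir . "/" . $id . ".htm";
}

// Writing and reading both go through the same path function.
file_put_contents(cache_file($id), $temp);
$cached = file_get_contents(cache_file($id));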

Csaba Kétszeri
A: 

You may want to look at memcached as an alternative to the filesystem. Keeping the cache in memory will give a huge performance boost.

http://php.net/memcache/
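
A minimal sketch using the pecl Memcache extension linked above (the server host, port, and one-hour expiry are assumptions):

$memcache = new Memcache;
$memcache->connect('localhost', 11211); // assumed local memcached server

// Store the rendered page in memory instead of writing it to disk.
$memcache->set($id, $temp, 0, 3600);

// Later, on a page request:
$page = $memcache->get($id);
if ($page === false) {
    // not cached (or expired) - regenerate and re-cache here
}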

Al