views:

101

answers:

4

Hi,

I am trying to build a Quadtree data structure(or let's just say a tree) on the secondary memory(Hard Disk).

I have a C++ program to do so and I use fopen to create the files. Also, I am using tesseral coding to store each cell in a file named with its corresponding code to store it on the disk in one directory.

The problem is that after creating about 1,100 files, fopen just returns NULL and stops creating new files. I can create further files manually in that directory, but using C++ it can not create any further files.

I know about max limit of inode on ext3 filesystem which is (from Wikipedia) 32,000 but mine is way less than that, also note that I can create files manually on the disk; just not through fopen.

Also, I really appreciate any idea regarding the best way to store a very dynamic quadtree on disk(I need the nodes to be in separate files and the quadtree might have a depth of 50).

Using nested directories is one idea, but I think it will slow down the performance because of following the links on the filesystem to access the file.

Thanks, Nima

+1  A: 

Whats the errno value of the failed fopen() call?

Do you keep the files you have created open? If yes you are most probably exceeding the maximum number of open files per process.

Frank Meerkötter
fopen returns NULL.Oh, yes... You are right. I didn't notice that.I am keeping them open.Thanks.
Nima
A: 

When you use directories as data structures, you delegate the work of maintaining that structure to the file system, which is not necessarily designed to do that.

Edit: Frank is probably right that you'v exceeded the number of available file descriptors. You can increase those, but that shows that you're also using internals of your ABI as a data structure. Slow and (as resources are exhausted) unstable.

Either code for a very specific OS installation, or use a SQL database.

Potatoswatter
A SQL database would be a terrible idea for this too. The best option would be a file system that is optimized for this type of work (I don't know of any), or a key-value store.
Zifre
@Zifre: A key-value store is one of the fundamental applications of a SQL database.
Potatoswatter
A: 

I have no idea why fopen wouldn't work. Look at errno.

However, storing everything in one directory is a bad idea. When you add a lot of files, it will get slow. Having a directory for every level of the tree will also be slow.

Instead, combine multiple levels into one directory. You could, for example, have one directory for every four levels of the tree. This would limit the number of directories, amount of nesting, and number of files per directory, giving very good performance.

Zifre
A: 

The limitation could come from:

  • stdio (C library). most 256 handles. Can be increased to 1024 (in VC, call _setmaxstdio)
  • OS kernel on the file hanldes per process (usually 1024).
Sherwood Hu