I'm not sure what the "general" name for something like this might be. I'm looking for a library that gives me a file format for storing different types of binary data in a single file that can grow as needed. Requirements:

  • open source, non-GPL (LGPL ok)
  • C interface
  • the file format is a single file
  • multiple files within using a POSIX-like file API (or multiple "blobs" within using some other API)
  • file/structure editing is done in-place
  • reliable first, performant second
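To make the "multiple blobs within a single file" idea concrete, here is a minimal sketch in C. Every name in it (`blob_header`, `blob_put`, `blob_get`, `blob_demo`, `BLOB_MAX_NAME`) is hypothetical and is not the API of whefs or of any other library mentioned in this thread. It is append-only (the newest record with a given name wins), so it does not satisfy the in-place editing requirement, and it writes headers in host byte order, so the resulting file is not portable across platforms.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLOB_MAX_NAME 255 /* hypothetical limit on blob name length */

/* Hypothetical record layout: header, then name bytes, then data bytes. */
typedef struct {
    uint32_t name_len;
    uint32_t data_len;
} blob_header;

/* Append a named blob to the end of the container. Returns 0 on success. */
int blob_put(const char *path, const char *name, const void *data, uint32_t len)
{
    size_t nlen = strlen(name);
    if (nlen > BLOB_MAX_NAME) return -1;
    FILE *f = fopen(path, "ab");
    if (!f) return -1;
    blob_header h = { (uint32_t)nlen, len };
    int ok = fwrite(&h, sizeof h, 1, f) == 1
          && fwrite(name, 1, nlen, f) == nlen
          && fwrite(data, 1, len, f) == len;
    if (fclose(f) != 0) ok = 0;
    return ok ? 0 : -1;
}

/* Copy the newest blob called `name` into buf (capacity cap bytes).
   Returns its length, or -1 if it is missing or does not fit. */
long blob_get(const char *path, const char *name, void *buf, uint32_t cap)
{
    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    size_t want = strlen(name);
    long found = -1;
    blob_header h;
    char key[BLOB_MAX_NAME];
    while (fread(&h, sizeof h, 1, f) == 1) {
        int match = 0;
        if (h.name_len == want && want <= sizeof key) {
            if (fread(key, 1, want, f) != want) break;
            match = memcmp(key, name, want) == 0;
        } else if (fseek(f, (long)h.name_len, SEEK_CUR) != 0) {
            break;
        }
        if (match && h.data_len <= cap) {
            if (fread(buf, 1, h.data_len, f) != h.data_len) {
                found = -1; /* truncated record: buf is no longer valid */
                break;
            }
            found = (long)h.data_len;
        } else {
            if (match) found = -1; /* newest version does not fit in buf */
            if (fseek(f, (long)h.data_len, SEEK_CUR) != 0) break;
        }
    }
    fclose(f);
    return found;
}

/* Small self-check: write three records, read back the newest "a.bin". */
int blob_demo(void)
{
    const char *path = "blob_demo.bin"; /* scratch file in the current directory */
    char buf[32];
    remove(path);
    assert(blob_put(path, "a.bin", "one", 3) == 0);
    assert(blob_put(path, "b.bin", "other", 5) == 0);
    assert(blob_put(path, "a.bin", "three", 5) == 0); /* supersedes the first */
    long n = blob_get(path, "a.bin", buf, sizeof buf);
    remove(path);
    return (n == 5 && memcmp(buf, "three", 5) == 0) ? 0 : -1;
}
```

A real library would add an index, free-space reuse, and some form of journaling on top of this; the linear scan and the lack of crash safety are exactly the parts that make "reliable first" hard.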

Examples include whefs, HDF, CDF, and NetCDF.

Problems with the above:

  • whefs doesn't appear to be very mature, but best describes what I'm after
  • HDF, CDF, NetCDF are usable (also very reliable and fast), but they're rather complicated and I'm not entirely convinced of their support for opaque binary "blobs"

Edit:
Forgot to mention, one other relevant question:
http://stackoverflow.com/questions/1361560/simple-virtual-filesystem-in-c-c
Another similar question:
http://stackoverflow.com/questions/374417/is-there-an-open-source-alternative-to-windows-compound-files

Edit:
Added condition of in-place editing.

A: 

It sounds like you're talking about the Linux loopback device, which lets you treat a file on a filesystem as a first-class block device (and then proceed to mkfs, mount, etc.).

(What sort of platform are you targeting? A fully-featured Unix-like? Something in the embedded space with a small footprint?)

crazyscot
Cross-platform is the intent. Certainly Windows and an embedded Linux version. I considered the loopback device, but didn't mention it because I wasn't sure whether it could grow in size and it won't work on Windows.
Ioan
Right. If I were in your shoes, then, I'd also be looking into the various log-structured filesystems to see if there was one with an acceptable license which you could hack up well enough to work in userland and to work off a file as backing store (as opposed to a block device).
crazyscot
A: 

The wxWidgets library supports ZIP files (see http://docs.wxwidgets.org/stable/wx_wxarc.html#wxarc). This also has the advantage that you can look at the contents using a ZIP manager (e.g. WinZip).

A commercial alternative is Chilkat (http://www.chilkatsoft.com/).

If security is a concern, encrypt the file contents and mangle the file names in the ZIP archive.

Patrick
Security isn't a concern. I haven't looked into normal archive types too much, but I wonder how well they perform with regard to large data sets, high speed, random access to the internal file contents, and simultaneous readers/writers...
Ioan
A: 

This appears to do what I was looking for: libgsf

Still need to test its reliability/performance and how cross-platform the binary format is.

Ioan
A: 

Maybe the Eet library from the Enlightenment project?

http://en.wikipedia.org/wiki/Enlightenment_Foundation_Libraries#EET
http://docs.enlightenment.org/api/eet/html/

kazanaki
Nice idea, but not mature yet. It depends on Eina which is currently unstable. Also, it basically stores a hash map of "chunks", which requires an extra layer to manage them.
Ioan
The Ogg container format is also very similar to Eet in simplicity: http://www.xiph.org/ogg/doc/
Ioan
A: 

What about BerkeleyDB? It's not exactly a filesystem but it's quite transparent to store 'binary data' in a file. License seems to be quite permissive as well.

lorenzog
According to http://www.oracle.com/technology/software/products/berkeley-db/htdocs/licensing.html it requires a commercial license for closed-source applications. Also, I don't know how much faster it is than SQLite, which was a few times slower than simply storing directly to a file. This was a simple test, batch commit to store data.
Ioan
Well, you did not mention you were doing a closed-source application. And yes, it might be just like SQLite, but with a few more years of stability on the embedded side.
lorenzog
Sorry, my comment was misleading. The current BerkeleyDB open-source license is GPL-like (as in, it requires your entire application source be released), violating one of the requirements I mentioned in the question.
Ioan