I'm serializing a large number of objects to binary files. I want to keep everything neatly organized and don't really want hundreds of files in a folder. Is there any way to group them into zip files and then access the individual files within a zip?

For example, say I created 100 binary files and zipped them. Would I be able to access a single file in that zip and deserialize it without unzipping everything?

+1  A: 

Yes. A zip archive has a central directory that records where each entry starts, so a reader can jump straight to a specific file without unpacking the rest. If you intend to spend much more time reading from the archive than changing it, this should be effective. If you have to be able to commit changes back to non-volatile storage, then some format other than zip would be better, since replacing an entry generally means rewriting the archive.
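To make the "read one entry without unzipping everything" point concrete, here is a minimal sketch in Python using only the standard library. The entry names, the `write_archive` and `read_one` helpers, and the use of `pickle` as the serializer are all assumptions for illustration, not anything from the question; an in-memory buffer stands in for a file on disk. On .NET the same idea should be expressible with `System.IO.Compression.ZipArchive`, though that is not shown here.

```python
import io
import pickle
import zipfile


def write_archive(target, objects):
    """Serialize each object into its own entry of a single zip archive."""
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, obj in objects.items():
            # pickle is only a stand-in for whatever binary serializer you use.
            zf.writestr(name, pickle.dumps(obj))


def read_one(source, name):
    """Look up one entry by name and deserialize only that entry."""
    with zipfile.ZipFile(source, "r") as zf:
        return pickle.loads(zf.read(name))


# Hypothetical data: 100 small objects, one archive entry each.
objects = {
    f"object_{i:03d}.bin": {"id": i, "payload": list(range(i))}
    for i in range(100)
}

# A BytesIO buffer stands in for a file path such as "objects.zip".
buffer = io.BytesIO()
write_archive(buffer, objects)

# Only the requested entry is decompressed and deserialized.
restored = read_one(buffer, "object_042.bin")
```

Passing a real file path instead of `buffer` works the same way. Note that this sketch only covers the read-mostly case described above; it does not attempt to update an entry in place.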

Another approach you could try would be storing blobs (binary large objects) in a lightweight database.
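As a rough sketch of the blob approach, the following Python example stores each serialized object as a BLOB row keyed by name. SQLite is my assumption for "lightweight database" (the answer does not name one), and the table layout, the `save_object` and `load_object` helpers, and the use of `pickle` are hypothetical choices for illustration.

```python
import pickle
import sqlite3

# An in-memory database keeps the sketch self-contained; pass a file path
# instead to get a single database file on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objects (name TEXT PRIMARY KEY, data BLOB NOT NULL)"
)


def save_object(conn, name, obj):
    """Insert the serialized object, or overwrite it if the name exists."""
    conn.execute(
        "INSERT OR REPLACE INTO objects (name, data) VALUES (?, ?)",
        (name, pickle.dumps(obj)),
    )
    conn.commit()


def load_object(conn, name):
    """Fetch and deserialize a single object by name."""
    row = conn.execute(
        "SELECT data FROM objects WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    return pickle.loads(row[0])


# Hypothetical data: store 100 objects, then update one of them.
for i in range(100):
    save_object(conn, f"object_{i:03d}", {"id": i})
save_object(conn, "object_042", {"id": 42, "updated": True})

loaded = load_object(conn, "object_042")
count = conn.execute("SELECT COUNT(*) FROM objects").fetchone()[0]
```

Unlike the zip approach, updating a single object here is one statement and does not touch the other rows, which is the main reason to consider a database when writes are frequent.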

Ben Voigt
A: 

You may want to use HDF5, a file format for structured storage and a set of libraries to work with it.

I haven't used it yet, but I will have to adopt it on a future project. Quoting from their site:

HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of datatypes, and is designed for flexible and efficient I/O and for high volume and complex data. HDF5 is portable and is extensible, allowing applications to evolve in their use of HDF5. The HDF5 Technology suite includes tools and applications for managing, manipulating, viewing, and analyzing data in the HDF5 format

The HDF5 technology suite includes:

  • A versatile data model that can represent very complex data objects and a wide variety of metadata.

  • A completely portable file format with no limit on the number or size of data objects in the collection.

  • A software library that runs on a range of computational platforms, from laptops to massively parallel systems, and implements a high-level API with C, C++, Fortran 90, and Java interfaces.

  • A rich set of integrated performance features that allow for access time and storage space optimizations.

  • Tools and applications for managing, manipulating, viewing, and analyzing the data in the collection.

I know they provide a wrapper for .NET, and you can also find some C# examples of its use.

Andrea Parodi
Space-efficient perhaps, but it doesn't seem to be at all access-time efficient. Reading < 1MB of data from a 1GB HDF5 file took so long that I have to assume it read the whole file. And this was on a solid-state drive, so seek time can't explain the inefficient random access.
Ben Voigt