I need an indexed file format that can hold a few hundred large variable sized binary blobs.

Blobs are around 1-5 MB each, and the file could be as large as 1 GB. I need to be able to quickly find, read, add and remove blobs without recreating the entire file. I have no need to compress the blobs; however, if blobs are removed, I'd like to reclaim or reuse the space.

Ideally there would be a Java API.

I'm currently doing this with a ZIP format, but there's no known way to update a ZIP file without recreating it and performance is bad.

I've looked into SQLite, but its blob performance was slow, and it's overkill for my needs.

Any thoughts, or should I roll my own? And if I do roll my own, any book or web page suggestions?
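To make the roll-your-own option concrete, here is a minimal sketch of the kind of thing I have in mind: one data file, an index from key to (offset, length), and a first-fit free list so that removed blobs leave reusable holes. The class name `BlobFile` and its methods are hypothetical, and the sketch makes large simplifying assumptions: the index and free list live only in memory (a real format would have to persist them, for example in a header or trailer), adjacent free extents are not merged, and there is no crash safety.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: blobs in one data file, index and free list kept in memory only. */
class BlobFile implements AutoCloseable {
    /** A contiguous region of the data file. */
    private static final class Extent {
        final long offset;
        final int length;
        Extent(long offset, int length) { this.offset = offset; this.length = length; }
    }

    private final RandomAccessFile file;
    private final Map<String, Extent> index = new HashMap<>();
    private final List<Extent> free = new ArrayList<>();

    BlobFile(String path) throws IOException {
        // Assumption: the file is new or empty, since the index is not persisted.
        file = new RandomAccessFile(path, "rw");
    }

    void put(String key, byte[] data) throws IOException {
        remove(key);                          // replacing a blob frees its old extent first
        long offset = allocate(data.length);
        file.seek(offset);
        file.write(data);
        index.put(key, new Extent(offset, data.length));
    }

    /** Returns the blob, or null if the key is unknown. */
    byte[] get(String key) throws IOException {
        Extent e = index.get(key);
        if (e == null) return null;
        byte[] data = new byte[e.length];
        file.seek(e.offset);
        file.readFully(data);
        return data;
    }

    /** Drops the key and puts its extent on the free list; no bytes are rewritten. */
    boolean remove(String key) {
        Extent e = index.remove(key);
        if (e == null) return false;
        free.add(e);
        return true;
    }

    /** First fit: reuse the first hole that is big enough, else append at end of file. */
    private long allocate(int length) throws IOException {
        for (int i = 0; i < free.size(); i++) {
            Extent e = free.get(i);
            if (e.length >= length) {
                free.remove(i);
                if (e.length > length) {
                    // Keep the unused tail of the hole available for later blobs.
                    free.add(new Extent(e.offset + length, e.length - length));
                }
                return e.offset;
            }
        }
        return file.length();
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

With blobs of 1-5 MB and only a few hundred entries, a linear scan of the free list should be cheap; the harder parts that this sketch skips are persisting the index safely and dealing with fragmentation.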

Thanks...

+2  A: 

Berkeley DB Java Edition does what you need. It's free.

Steve Emmerson
Looks perfect, but GPL or License will be a problem for my company I think.
awbranch
Then how about Apache Derby? Or, as meriton pointed out, filesystems are designed for this kind of thing. Perhaps your use cases warrant an indexed file format, in which case you should mention your requirements.
The Alchemist
I may be able to get away with just using a folder in the filesystem. It's for a project file format for a multimedia editing program that can combine images, sounds and video. Sort of like a PowerPoint file.
awbranch
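The folder-in-the-filesystem approach could look roughly like the sketch below: one file per blob inside a project directory, with the filesystem itself doing the indexing and space reuse. `FolderBlobStore` and its method names are hypothetical, and the sketch assumes keys are simple names made of letters, digits, dots, dashes and underscores.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: stores each blob as its own file inside one project folder. */
class FolderBlobStore {
    private final Path root;

    FolderBlobStore(Path root) throws IOException {
        this.root = Files.createDirectories(root);
    }

    /** Maps a key to a file path, rejecting keys that could escape the folder. */
    private Path pathFor(String key) {
        if (!key.matches("[A-Za-z0-9_-][A-Za-z0-9._-]*")) {
            throw new IllegalArgumentException("unsupported key: " + key);
        }
        return root.resolve(key);
    }

    /** Writes to a temporary file first so a failed write never leaves a half-written blob. */
    void put(String key, byte[] data) throws IOException {
        Path target = pathFor(key);
        Path tmp = Files.createTempFile(root, "blob", ".tmp");
        Files.write(tmp, data);
        Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
    }

    /** Returns the blob, or null if no blob is stored under this key. */
    byte[] get(String key) throws IOException {
        Path p = pathFor(key);
        return Files.exists(p) ? Files.readAllBytes(p) : null;
    }

    /** Deletes the blob; the filesystem reclaims the space. */
    boolean remove(String key) throws IOException {
        return Files.deleteIfExists(pathFor(key));
    }
}
```

The trade-off is that a project is then a directory rather than a single file, so copying or e-mailing it means bundling the folder first.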
A: 

You need some kind of virtual file system. Our SolFS is one of the options, though we only have a JNI layer for Java, as the engine is written in C. There is one more option, CodeBase, but as they don't provide an evaluation version of their file system, I know little about it.

SolFS is well suited to your task, because it lets you have alternative streams for files and associate searchable metadata with each file or even with each alternative stream.

Eugene Mayevski 'EldoS Corp