ansaurus

Question

mmap-loadable data structure library for C++ (or C)

Answer 1

+2 A:

Sounds like maybe you could use one of the "perfect hash" utilities out there. These spend some time opimising the hash function for the particular data, so there are no hash collisions and (for minimal perfect hash functions) so that there are no (or at least few) empty gaps in the hash table. Obviously, this is intended to be generated rarely but used frequently.

CMPH claims to cope with large numbers of keys. However, I have never used it.

There's a good chance it only generates the hash function, leaving you to use that to generate the data structure. That shouldn't be especially hard, but it possibly still leaves you where you are now - maintaining at least some of the code yourself.

Steve314 2010-02-20 09:44:46

Answer 2

+2 A:

You could try to create a memory mapped file and then create the STL map structure with a customer allocator. Your customer allocator then simply takes the beginning of the memory of the memory mapped file, and then increments its pointer according to the requested size. In the end all the allocated memory should be within the memory of the memory mapped file and should be reloadable later.

You will have to check if memory is free'd by the STL map. If it is, your customer allocator will lose some memory of the memory mapped file but if this is limited you can probably live with it.

Patrick 2010-02-20 09:48:27

The issue with STL data structures in memory mapped files is pointers - you'd have to be sure you can always map the file to the exact same address range. I don't know if this is possible or not (or even if the iPhone *supports* memory-mapped files) - just something to check. If you can scan all the nodes and find their pointers, of course, you could do some kind of fix - but that kind of messing with private internals is very very fragile.

Steve314 2010-02-20 09:58:37

@Steve: iPhoneOS do support `mmap(3)` the API call, and I think the kernel, which is almost the same as Mac OS X's, genuinely support it too.

KennyTM 2010-02-20 10:05:23

You're right Steve, you have to be sure that the memory mapped file can be loaded at exactly the same memory address as where you created the file.Luckily in Windows you can choose the address using the function MapViewOfFileEx. For iPhone I don't know whether this is possible

Patrick 2010-02-20 10:21:26

mmap allows you to specify the address, but there isn't any warranty that you will get it. You *should* get it as long as it is the first thing you do after startup.

jbcreix 2010-02-20 12:22:01

You could replace pointers by a template pointer class that keeps the load address of the mmap() as a static member, and the offset relative to the mmap area as a member.

Carsten Kuckuk 2010-02-21 13:41:04

Answer 3

A:

Just thought of another option - Datadraw. Again, I haven't used this, so no guarantees, but it does claim to be a fast persistent database code generator.

Steve314 2010-02-20 10:02:38

Sounds good, but I need to create the files during runtime. (Updated question to clarify.)

KennyTM 2010-02-20 10:08:42

That may rule out my perfect hash answer, but not necessarily datadraw. "Persistent" doesn't mean "precompiled" - it just means there's built-in support for updating/reloading a file.

Steve314 2010-02-20 10:14:19

Actually, I think I see the confusion - the code Datadraw generates *doesn't* have your data embedded into it. It just generates code to handle (mutable) data structures.

Steve314 2010-02-20 10:19:12

Answer 4

A:

WRT boost.intrusive, I've just been having a look. It's interesting. And annoying, as it makes one of my own libraries look a bit pointless.

I thought this section looked particularly relevant.

If you can use "smart pointers" for links, presumably the smart pointer type can be implemented using a simple offset-from-base-address integer (and I think that's the point of the example). An array subscript might be equally valid.

There's certainly unordered set/multiset support (C++ code for hash tables).

Steve314 2010-02-20 10:38:29

Answer 5

A:

Boost intrusive containers and offset pointers look great. Has anybody experienced this?

2010-03-25 15:33:15

Answer 6

A:

Using cmph would work. It does have the serialization machinery for the hash function itself, but you still need to serialize the keys and the data, besides adding a layer of collision resolution on top of it if your query set universe is not known before hand. If you know all keys before hand, then it is the way to go since you don't need to store the keys and will save a lot of space. If not, for such a small set, I would say it is overkill.

Probably the best option is to use google's sparse_hash_map. It has very low overhead and also has the serialization hooks that you need.

http://google-sparsehash.googlecode.com/svn/trunk/doc/sparse_hash_map.html#io

Davi 2010-07-10 22:38:57

ansaurus

tags:

views:

answers:

mmap-loadable data structure library for C++ (or C)

Details:

related questions