views: 280
answers: 2
I have an object graph in Objective-C on the iPhone platform that I wish to persist to flash when closing the app. The graph has about 100k-200k objects and contains many loops (by design). I need to be able to read/write this graph as quickly as possible.

So far I have tried using NSCoder. This not only struggles with the loops but also takes an age and a significant amount of memory to persist the graph - possibly because an XML document is used under the covers. I have also used an SQLite database but stepping through that many rows also takes a significant amount of time.
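For reference, the archiving I tried looks roughly like this - just a sketch, with rootNode, path, and the NSCoding conformance on my node class standing in for the real code:

    #import <Foundation/Foundation.h>

    // Archive the whole graph in one shot: the entire object graph has to
    // be in memory, and the write is atomic.
    NSData *data = [NSKeyedArchiver archivedDataWithRootObject:rootNode];
    BOOL ok = [data writeToFile:path atomically:YES];

    // Read it back the same way.
    id restored = [NSKeyedUnarchiver unarchiveObjectWithFile:path];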

I have considered using Core Data but fear I will suffer the same issues as with SQLite or NSCoder, as I believe Core Data's backing stores work in much the same way.

So is there any other way I can handle the persistence of this object graph in a lightweight way - ideally something like Java's serialization? I've been thinking of trying Tokyo Cabinet, or of writing the memory occupied by a bunch of C structs out to disk - but that's going to be a lot of rewrite work.

+1  A: 

I would recommend re-writing as C structs. I know it will be a pain, but not only will it be quick to write to disk, it should also perform much better in memory.

Before anyone gets upset, I am not saying people should always use structs, but there are some situations where this is actually better for performance. Especially if you pre-allocate your memory in contiguous blocks of, say, 20k at a time (with pointers into the block), rather than allocating lots of little chunks inside a loop.

I.e. if your loop continually allocates objects, that is going to slow it down. If you have preallocated 1000 structs and just keep an array of pointers (or a single pointer) into them, it is an order of magnitude faster.

(I have had situations where even my desktop Mac was too slow and did not have enough memory to cope with millions of objects being created in a row.)
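Something like the following is what I have in mind - just a sketch, with GraphNode and the fixed four-edge layout made up for illustration. Storing links as indices into the block rather than raw pointers keeps loops trivial to persist and the file valid across runs:

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>

    typedef struct {
        int32_t value;
        int32_t edges[4];   /* indices into the block; -1 = no edge, loops are fine */
    } GraphNode;

    /* Write the whole preallocated block to disk in a single call. */
    static int saveGraph(const GraphNode *nodes, size_t count, const char *path) {
        FILE *f = fopen(path, "wb");
        if (!f) return -1;
        size_t written = fwrite(nodes, sizeof(GraphNode), count, f);
        fclose(f);
        return written == count ? 0 : -1;
    }

    /* Read it back; the caller frees the returned buffer. */
    static GraphNode *loadGraph(size_t count, const char *path) {
        FILE *f = fopen(path, "rb");
        if (!f) return NULL;
        GraphNode *nodes = malloc(count * sizeof(GraphNode));
        if (nodes && fread(nodes, sizeof(GraphNode), count, f) != count) {
            free(nodes);
            nodes = NULL;
        }
        fclose(f);
        return nodes;
    }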

Jacob
As long as you're not transferring your dump files from computer to computer, a raw dump of memory will be fine. If you are trying to share the file between different machines you will have "byte order" problems. This is definitely better in limited circumstances.
corydoras
+1  A: 

Rather than rolling your own, I'd highly recommend taking another look at Core Data. Core Data was designed from the ground up for persisting object graphs. An NSCoder-based archive, like the one you describe, requires you to have the entire object graph in memory, and all writes are atomic. Core Data brings objects in and out of memory as needed, and can write only the part of your graph that has changed to disk (via SQLite).

If you read the Core Data Programming Guide or their tutorial guide, you can see that they've put a lot of thought into performance optimizations. If you follow Apple's recommendations (which can seem counterintuitive, like their suggestion to denormalize your data structures at some points), you can squeeze a lot more performance out of your data model than you'd expect. I've seen benchmarks where Core Data handily beat hand-tuned SQLite for data access within databases of the size you're looking at.

On the iPhone, you also get some memory advantages by controlling the batch size of fetches, and NSFetchedResultsController is a very nice helper class.
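As a rough sketch of what I mean (the "Node" entity name and the context variable here are placeholders, not from your code):

    #import <CoreData/CoreData.h>

    NSFetchRequest *request = [[NSFetchRequest alloc] init];
    [request setEntity:[NSEntityDescription entityForName:@"Node"
                                   inManagedObjectContext:context]];
    // Objects are faulted in from SQLite 50 at a time instead of the
    // whole 100k-200k graph being loaded into memory at once.
    [request setFetchBatchSize:50];

    NSError *error = nil;
    NSArray *nodes = [context executeFetchRequest:request error:&error];
    [request release];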

It shouldn't take that long to build up a proof-of-principle Core Data implementation of your graph to compare it to your existing data storage methods.

Brad Larson
This will never match the performance of a bulk load/unload of memory, if that is all the user is trying to achieve.
corydoras
But that's not all they're trying to do. They're trying to manage a complex object graph, including serialization and deserialization, a case that Core Data is ideally suited for. Also, if done right using Core Data, the whole 100k-200k element graph should not need to be loaded into memory at once, a huge benefit for a memory-constrained device like the iPhone.
Brad Larson