



Hi guys,

I've got a little conundrum: would it be better to use direct file management, or a CoreData SQLite database?

Here's my scenario:

I have a bunch of 'user' objects, each with a list of 'post' objects. This is easily done in CoreData, and would be great. However, the 'post' objects are downloaded from a web server, and each has a unique identifier. I don't want multiple 'post' objects with the same ID. I could solve this by caching CoreData responses in an NSDictionary, but that would not fit well with the application's design. As far as I am aware, when adding a new 'post' to my CoreData NSManagedObjectContext, I would have to look up the unique ID to check whether it already exists (fast), then add the post if it does not (slow), or update the existing one if it does (fast). This effectively replaces the old post. How would you guys handle this?
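
For reference, the check-then-insert-or-update flow described above might look roughly like this in Swift. It is only a sketch: the entity name 'Post' and the attributes 'postID' and 'body' are hypothetical stand-ins for whatever the real model uses.

```swift
import CoreData

/// Finds the post with the given ID, or inserts a new one, then applies the new data.
/// "Post", "postID" and "body" are hypothetical names, not taken from a real model.
@discardableResult
func upsertPost(id: String, body: String,
                in context: NSManagedObjectContext) throws -> NSManagedObject {
    // Look up an existing object by its unique ID.
    let request = NSFetchRequest<NSManagedObject>(entityName: "Post")
    request.predicate = NSPredicate(format: "postID == %@", id)
    request.fetchLimit = 1

    // Reuse the existing object if there is one, otherwise insert a new one.
    let post = try context.fetch(request).first
        ?? NSEntityDescription.insertNewObject(forEntityName: "Post", into: context)

    post.setValue(id, forKey: "postID")
    post.setValue(body, forKey: "body")
    return post
}
```

Calling this twice with the same ID should leave a single 'Post' object holding the most recent body. The caller still has to save the context.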

I've been trying to think of alternatives for a few days now, but no matter which way I look at it, CoreData is going to be slower than my alternative:

A file architecture inside the Caches/ directory of an iOS application could solve the problem. Something like this:

  • Users/
    • {unique ID}.user
    • {unique ID}.user
  • Posts/
    • {unique ID}.post
    • {unique ID}.post

Then, when retrieving a post object or user object, I can check the files for the existence of the data, and cache the file contents in an NSDictionary. If the ID exists in the dictionary, retrieve it from there instead. Replacing previous 'user' and 'post' objects is as simple as overwriting the file and updating the cache.
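
A minimal sketch of that file-plus-dictionary scheme in Swift, just to make the idea concrete. The class name 'PostFileStore' and its methods are made up for illustration, posts are treated as opaque Data, and the IDs are assumed to be safe to use as file names.

```swift
import Foundation

/// Hypothetical file-backed store: one "{unique ID}.post" file per post,
/// fronted by an in-memory dictionary cache.
final class PostFileStore {
    private let directory: URL
    private var cache: [String: Data] = [:]

    init(directory: URL) throws {
        self.directory = directory
        try FileManager.default.createDirectory(at: directory,
                                                withIntermediateDirectories: true)
    }

    private func fileURL(for id: String) -> URL {
        directory.appendingPathComponent(id).appendingPathExtension("post")
    }

    /// Overwrites any earlier file with the same ID, then updates the cache.
    func save(_ data: Data, id: String) throws {
        try data.write(to: fileURL(for: id), options: .atomic)
        cache[id] = data
    }

    /// Checks the dictionary first and falls back to reading the file.
    func load(id: String) -> Data? {
        if let cached = cache[id] { return cached }
        guard let data = try? Data(contentsOf: fileURL(for: id)) else { return nil }
        cache[id] = data
        return data
    }

    /// Intended to be called from a memory-warning handler.
    func purgeCache() {
        cache.removeAll()
    }
}
```

For the layout above, the store would be pointed at a Posts/ folder inside the app's Caches/ directory, with a second instance (or a generalised version) handling Users/.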

My second alternative would clearly be faster - however, I would not be taking advantage of any efficiencies built into CoreData, and I would have to provide my own memory management scheme to clear my cached dictionaries when a memory warning occurs.

Is there some way of 'uniquing' in CoreData? That would solve my problem. Something similar to using a primary key in an ordinary SQLite database.

I'll start doing tests to verify speeds of both methods, but I thought I'd post this up here before starting in case anyone has any better solutions.

+1  A: 

This exact question comes up a lot.

You can check whether a value exists in Core Data without reading in the entire object. Just set the fetch request to fetch only the specific property you want to test, the ID in this case, and have it return its results as dictionaries. Provide a predicate that looks for one or more IDs; if the fetch returns any dictionaries, you know you have existing objects.
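
Something along these lines, as a rough Swift sketch. The entity name 'Post' and the attribute 'postID' are assumptions for illustration; swap in the names from your own model.

```swift
import CoreData

/// Returns the subset of `candidateIDs` that already exist in the store,
/// fetching only the ID property rather than whole managed objects.
/// "Post" and "postID" are hypothetical model names.
func existingPostIDs(among candidateIDs: [String],
                     in context: NSManagedObjectContext) throws -> Set<String> {
    let request = NSFetchRequest<NSDictionary>(entityName: "Post")
    request.resultType = .dictionaryResultType   // rows come back as dictionaries
    request.propertiesToFetch = ["postID"]       // only the ID column is read
    request.predicate = NSPredicate(format: "postID IN %@", candidateIDs)

    // Dictionary results reflect what is in the persistent store, so objects
    // inserted into the context but not yet saved will not show up here.
    let rows = try context.fetch(request)
    return Set(rows.compactMap { $0["postID"] as? String })
}
```

Any downloaded ID that is not in the returned set needs a new object; the rest only need updating.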

It's very rare that you can end up with a custom system which is faster and more robust than Core Data. It's rarely worth even trying.

Remember as well that premature optimization is the root of all evil. All this work is predicated on the premise that the simplest Core Data implementation is too slow. Have you actually tested that it is too slow? If not, do so before you try more elaborate designs.

Thanks for the hints. Put me in the right direction! CoreData is a lot faster than I expected.
+1  A: 

After testing, I've found that CoreData at least halves the amount of time taken. The test I ran was as follows: I added 1000 posts to an empty CoreData object graph, and then retrieved 100 of those objects for updating. The time taken to add the objects was 0.069s, and the time taken to retrieve them was 0.181s. I measured these values on a 3G iPad device. Using files, adding these objects took 10 times longer, and retrieving them took 4 times longer.

My recommendation: Stick to using CoreData!
