I have some inefficiency in my app that I'd like to understand and fix.

My algorithm is:

fetch object collection from network
for each object:
  if (corresponding locally stored object not found): -- A
    create object
    if (a nested related object locally not found): -- B
      create a related object

I am doing the checking on lines A and B by creating a predicate query with the relevant object's key that's part of my schema. I see that both A (always) and B (if execution branched into that part) generate a SQL select like:

2010-02-05 01:57:51.092 app[393:207] CoreData: sql: SELECT <a bunch of fields> FROM ZTABLE1 t0 WHERE  t0.ZID = ? 
2010-02-05 01:57:51.097 app[393:207] CoreData: annotation: sql connection fetch time: 0.0046s
2010-02-05 01:57:51.100 app[393:207] CoreData: annotation: total fetch execution time: 0.0074s for 0 rows.
2010-02-05 01:57:51.125 app[393:207] CoreData: sql: SELECT <a bunch of fields> FROM ZTABLE2 t0 WHERE  t0.ZID = ? 
2010-02-05 01:57:51.129 app[393:207] CoreData: annotation: sql connection fetch time: 0.0040s
2010-02-05 01:57:51.132 app[393:207] CoreData: annotation: total fetch execution time: 0.0071s for 0 rows.

0.0071s per query is fine on a 3GS device, but 100 of these add up to a blocker of roughly 700ms.

In my code, I'm using a helper to do these fetches:

- (MyObject *)myObjectById:(NSNumber *)myObjectId {
    NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] init];
    [fetchRequest setEntity:[self objectEntity]]; // my entity cache
    [fetchRequest setPredicate:[self objectPredicateById:myObjectId]]; // predicate cache
    NSError *error = nil;
    NSArray *fetchedObjects = [moc executeFetchRequest:fetchRequest error:&error];
    [fetchRequest release]; // released once here, before either return path
    if ([fetchedObjects count] == 1) {
        return [fetchedObjects objectAtIndex:0];
    }
    return nil;
}

MyObject *obj = [self myObjectById:myObjectId];
if (!obj) {
   // [NSEntityDescription insertNewObjectForEntityForName: ... etc
}

I feel this is wrong and I should do the check some other way. It should only hit the database once and should come from memory thereafter, right? (The SQL gets executed even for objects that I know for sure exist locally and should have been loaded to memory with previous queries.) But, if I only have myObjectId from an external source, this is the best I could think of.

So, perhaps the question is: if I have myObjectId (a Core Data int64 property on MyObject), how should I correctly check whether the relevant local object exists in the Core Data store? Preload the whole set of possible matches and then run the predicate against a local array?

(One possible solution is moving this to a background thread. This would be fine, except that when I get the changes from the thread and do [moc mergeChangesFromContextDidSaveNotification:aNotification]; (getting changed objects from background thread by way of notification), this still blocks.)

+2  A: 

You could probably take a lesson from email clients.

They work by first querying the server for a list of message IDs. Once the client has that list, it compares it against its local data store to see if anything is different.

If there is a difference, it takes one of two actions:
1. If the message exists on the client but not on the server, and the protocol is IMAP, delete it locally.
2. If it exists on the server but not on the client, download the rest of the message.

In your case, first query for all of the IDs. Then send a follow-up query to grab all of the data for the ones you don't already have.

If you have a situation where the record might exist locally, but might have been updated since the last download on the server, then your query should include the last updated date.

Chris Lively
Querying the server is cheap for me and done asynchronously. My problem is that the "query for all of the local IDs" part is inefficient and/or I can't figure out how to do it correctly.
Jaanus
+1  A: 

It sounds like what you need is an NSSet of NSManagedObjectIDs which is loaded into memory or stored somewhere more quickly accessed than your persistent object store.

That way you can compare object IDs from the network with object IDs from your cache without having to perform a fetch request on a large data set.

Maybe add the ID to the cache from within -awakeFromInsert within your managed entity classes?

Victorb
Objects from the network have a different NSNumber "id" that is unrelated to NSManagedObjectID. This is the attribute that I'm using in the example. So an NSSet of NSManagedObjectIDs alone is not enough, since I have no "external ID->managedId" mapping.
Jaanus
+2  A: 

You should do one fetch across all the objects, but only fetch the server ID for the objects.

Use setPropertiesToFetch: with setResultType: set to NSDictionaryResultType.
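
As a rough sketch of that idea (not a definitive implementation), the helper below fetches only the ID attribute as dictionaries and collects the values into an NSSet. The function name ExistingServerIDs is hypothetical, the entity name "MyObject" and attribute name "myObjectId" are assumed from the question, and the code assumes ARC (under manual reference counting, release the request as the question's helper does).

```objective-c
#import <CoreData/CoreData.h>

// Hypothetical helper: returns the set of server IDs already stored locally.
// "MyObject" and "myObjectId" are assumed names taken from the question.
static NSSet *ExistingServerIDs(NSManagedObjectContext *moc) {
    NSFetchRequest *request = [[NSFetchRequest alloc] init];
    [request setEntity:[NSEntityDescription entityForName:@"MyObject"
                                   inManagedObjectContext:moc]];
    // Ask for plain dictionaries containing only the ID attribute,
    // so no full managed objects are built for this check.
    [request setResultType:NSDictionaryResultType];
    [request setPropertiesToFetch:[NSArray arrayWithObject:@"myObjectId"]];

    NSError *error = nil;
    NSArray *rows = [moc executeFetchRequest:request error:&error];
    if (rows == nil) {
        NSLog(@"ID fetch failed: %@", error);
        return [NSSet set];
    }
    // Each row is an NSDictionary keyed by "myObjectId";
    // valueForKey: on the array collects that value from every row.
    return [NSSet setWithArray:[rows valueForKey:@"myObjectId"]];
}
```

With the set in hand, each incoming ID can be checked with [existingIDs containsObject:serverID] instead of a per-object fetch. One caveat: a dictionary-result fetch reflects what is in the persistent store, so objects inserted into the context but not yet saved will not appear in the set.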

gerry3
+5  A: 

Read "Implementing Find-or-Create Efficiently" in Core Data Programming Guide.

Basically you need to create an array of IDs or properties like names, or anything you have from the managed object entity.

Then you need to create a predicate that will filter the managed objects using this array.

[fetchRequest setPredicate:[NSPredicate predicateWithFormat: @"(objectID IN %@)", objectIDs]];

Of course "objectIDs" could be anything that you can use to identify. It doesn't have to be the NSManagedObjectID.

Then you could do one fetch request, and iterate the resulting fetched objects, to find duplicates. Add a new one if it doesn't exist.
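
The steps above could look roughly like the following sketch. It is a variation that indexes the fetched objects in a dictionary, not necessarily the exact approach in the guide. FindOrCreateObjects is a hypothetical name, "MyObject" and "myObjectId" are assumed from the question, and the code assumes ARC.

```objective-c
#import <CoreData/CoreData.h>

// Hypothetical helper: makes sure one MyObject exists for every ID in
// serverIDs (an array of NSNumber), using one fetch instead of one per ID.
static void FindOrCreateObjects(NSArray *serverIDs, NSManagedObjectContext *moc) {
    NSFetchRequest *request = [[NSFetchRequest alloc] init];
    [request setEntity:[NSEntityDescription entityForName:@"MyObject"
                                   inManagedObjectContext:moc]];
    // One query that matches every incoming ID at once.
    [request setPredicate:[NSPredicate predicateWithFormat:@"myObjectId IN %@",
                                                           serverIDs]];

    NSError *error = nil;
    NSArray *existing = [moc executeFetchRequest:request error:&error];
    if (existing == nil) {
        NSLog(@"Find-or-create fetch failed: %@", error);
        return;
    }

    // Index the objects that already exist by their server ID.
    NSMutableDictionary *byID =
        [NSMutableDictionary dictionaryWithCapacity:[existing count]];
    for (NSManagedObject *object in existing) {
        [byID setObject:object forKey:[object valueForKey:@"myObjectId"]];
    }

    // Create only the objects that are missing.
    for (NSNumber *serverID in serverIDs) {
        if ([byID objectForKey:serverID] != nil) {
            continue; // already stored locally
        }
        NSManagedObject *created =
            [NSEntityDescription insertNewObjectForEntityForName:@"MyObject"
                                          inManagedObjectContext:moc];
        [created setValue:serverID forKey:@"myObjectId"];
        // Record the new object so a repeated ID in the input is not created twice.
        [byID setObject:created forKey:serverID];
    }
}
```

The same pattern would apply to the nested related objects from the question: collect their IDs from the whole network batch first, then run one IN fetch per entity.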

Jesse Armand
Link? I've read Core Data Programming Guide at http://developer.apple.com/Mac/library/documentation/Cocoa/Conceptual/CoreData/cdProgrammingGuide.html, but I don't recall anything about Find-or-Create being there.
Jaanus
Here's the link http://developer.apple.com/Mac/library/documentation/Cocoa/Conceptual/CoreData/Articles/cdImporting.html#//apple_ref/doc/uid/TP40003174
Jesse Armand
I marked this one as the answer because it carries the same idea as several others (cache the relevant identifier property), but has the added bonus of linking to official documentation. Find-or-Create is exactly what this is about. / As for myself, since my data is small and I eventually need to access all of it anyway, I ended up just fetching the whole object set into memory, and this cached way is super fast.
Jaanus
+1  A: 

After fighting with this same issue for ages, I finally stumbled across this blog entry, which totally resolved it (and is a reusable code block as a bonus!).

http://henrik.nyh.se/2007/01/importing-legacy-data-into-core-data-with-the-find-or-create-or-delete-pattern

The code sample doesn't cover the network part, but you simply need to load the network data into an NSDictionary, and the code then deals with syncing the local Core Data context.

Mark Mackay