I am making an app that parses feeds from XML and stores them using Core Data. The issue I am dealing with at the moment is duplicate entries. Every feed item I parse contains a unique id, which I get in my model as an int. What I need is to tell Core Data not to store an object if another one with the same id already exists.

As an example, let's say my model has the following properties:

Story.id (int) - primary key
Story.title (NSString)
Story.date (NSDate)

What is the best way to implement that?

My approach would be to keep a record (an array) of all the IDs available in the database and before inserting anything, check if it exists. That could work for the size of my app, but I get the feeling that's not the right approach.

+4  A: 

I see two ways of doing this. It seems to me that the latter (your proposed method) is the better solution.

I changed id to primaryKey since I don't think it would be a good idea to use id as a variable or method name in Objective-C, where it is already the name of a built-in type. It might work, I've never really tried. I also assumed primaryKey to be an NSNumber, since that is how it would be stored in Core Data.

Method One would execute a fetch request on the context each time:

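for (id data in someSetOfDataToImport) {
    // Assumes each imported item exposes its key under "primaryKey" as an NSNumber.
    NSNumber * primaryKey = [data valueForKey:@"primaryKey"];

    NSFetchRequest * request = [[NSFetchRequest alloc] init];
    [request setEntity:[NSEntityDescription entityForName:@"Story" inManagedObjectContext:context]];
    // %@ rather than %d, because primaryKey is an object (NSNumber), not a C int.
    [request setPredicate:[NSPredicate predicateWithFormat:@"primaryKey == %@", primaryKey]];
    NSUInteger count = [context countForFetchRequest:request error:nil];
    [request release];

    // A story with this key already exists, so skip it.
    if (count > 0)
        continue;

    // Insert data in Managed Object Context
}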

Method Two does what you proposed, caching the keys in an array and checking that instead of going to the store for every item:

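NSFetchRequest * request = [[NSFetchRequest alloc] init];
[request setEntity:[NSEntityDescription entityForName:@"Story" inManagedObjectContext:context]];
// executeFetchRequest: returns the stored objects; countForFetchRequest: would only return a number.
NSArray * allStories = [context executeFetchRequest:request error:nil];
[request release];

// Collect the keys that are already in the store.
NSMutableArray * allPrimaryKeys = [[allStories valueForKeyPath:@"@distinctUnionOfObjects.primaryKey"] mutableCopy];

for (id data in someSetOfDataToImport) {
    // data is typed id, so read the key with KVC instead of dot syntax.
    NSNumber * primaryKey = [data valueForKey:@"primaryKey"];
    if ([allPrimaryKeys containsObject:primaryKey])
        continue;

    // Remember the key so a repeat later in the same import is skipped too.
    [allPrimaryKeys addObject:primaryKey];

    // Insert data in Managed Object Context
}

[allPrimaryKeys release];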
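
A variation on Method Two, offered only as a sketch: containsObject: on an array scans it linearly, so if the key list grows large the same cache could be kept in an NSMutableSet instead. KeysToInsert is a hypothetical helper name, not anything from Core Data, and it only decides which keys to accept; the actual insert is left to the caller.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the keys from incomingKeys that are not yet in
// knownKeys. Each accepted key is added to knownKeys, so a key repeated within
// the same batch is accepted only once.
static NSArray *KeysToInsert(NSArray *incomingKeys, NSMutableSet *knownKeys) {
    NSMutableArray *accepted = [NSMutableArray array];
    for (NSNumber *key in incomingKeys) {
        if ([knownKeys containsObject:key])
            continue;               // already stored or already seen in this batch
        [knownKeys addObject:key];
        [accepted addObject:key];
    }
    return accepted;
}
```

The set could be seeded from the fetched stories with something like [NSMutableSet setWithArray:[allStories valueForKey:@"primaryKey"]], assuming every stored Story has a non-nil primaryKey.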
Cory Kilger
I have an app that uses the top approach - fetch to see if the item exists. For the kinds of volume I have, it works fine and is very fast. Either approach works fine, just a question of your requirements.
Hunter
Cory thanks for the code above. It helped a lot.
Dimitris
I use the first method to emulate an SQL REPLACE (insert or update): if I find no object I create a new one, if I find one I reuse it, and then I save. So it inserts if the item is new and updates if it already exists.
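
A minimal sketch of that find-or-create pattern, assuming an entity named Story with a primaryKey attribute as in the answer above. FindOrCreateStory is a hypothetical helper name, and the code uses manual reference counting like the rest of this thread (drop the release call under ARC).

```objective-c
#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>

// Hypothetical helper: returns the existing Story with the given key, or
// inserts a new one with that key. Returns nil if the fetch itself fails.
static NSManagedObject *FindOrCreateStory(NSManagedObjectContext *context,
                                          NSNumber *primaryKey,
                                          NSError **error) {
    NSFetchRequest *request = [[NSFetchRequest alloc] init];
    [request setEntity:[NSEntityDescription entityForName:@"Story"
                                   inManagedObjectContext:context]];
    [request setPredicate:[NSPredicate predicateWithFormat:@"primaryKey == %@", primaryKey]];
    [request setFetchLimit:1];      // one match is enough to decide

    NSArray *matches = [context executeFetchRequest:request error:error];
    [request release];

    if (matches == nil)
        return nil;                 // fetch failed; *error describes why
    if ([matches count] > 0)
        return [matches objectAtIndex:0];   // reuse the existing object

    // Nothing found: insert a new object and give it the key.
    NSManagedObject *story =
        [NSEntityDescription insertNewObjectForEntityForName:@"Story"
                                      inManagedObjectContext:context];
    [story setValue:primaryKey forKey:@"primaryKey"];
    return story;
}
```

The caller would then set title and date on the returned object and save the context, which covers both the insert and the update case.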
Grant M
+1  A: 

I would warn you against premature optimization. It's the root of all programming evil and a waste of time.

Frankly, on the iPhone, it's almost impossible to get such a huge object graph that you start to notice fetches bogging things down. I can't imagine you're processing hundreds of unique XML feeds every second.

Unless you're dealing with hundreds of thousands of primary keys at once, the predicate method will take a trivial amount of time and resources while minimizing complexity, maintenance and programming time. It's the simplest and quickest solution, so use it to start and then optimize only if you later determine it's a bottleneck.

TechZen
I think that that is the method I will use, yes.
Dimitris