I am importing a large plist/XML into Core Data. The structure is simple: let's say there is Person and Company, and a Company can have many Persons. The import goes successfully, but the plist has no established relations, so duplicates of Company are inserted every time multiple people have the same Company.

A potential solution lies in Apple's Core Data docs under 'Implementing Find-or-Create Efficiently':

Or if you import "flat" data with no relationships, you can create managed objects for the entire set and weed out (delete) any duplicates before save using a single large IN predicate.

I've stared at this sentence for ages and can't parse it. Wasn't I already using managed objects to import the entire set? What fetch request are they alluding to?

An algorithm or clarification would be much appreciated.

A: 

In your example, your best bet is to look up the company during the import so that you can set up the relationships correctly. Depending on the size of your data, you may even want to keep the company objects in memory in an NSDictionary so that you can easily join them to the person objects as they are imported.
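A minimal Swift sketch of that idea, assuming a `Company` entity with a `name` attribute and a `Person` entity with a `name` attribute and a to-one `company` relationship. `PersonRecord` and `importPeople` are hypothetical names for illustration only. It seeds the dictionary with one `IN` fetch for companies already in the store, then reuses or creates each company while inserting people:

```swift
import CoreData

/// Hypothetical shape of one flat row from the plist.
struct PersonRecord {
    let name: String
    let companyName: String
}

/// Imports people and links each one to a single shared Company object.
/// Entity and key names ("Company", "Person", "name", "company") are assumptions.
func importPeople(_ records: [PersonRecord],
                  into context: NSManagedObjectContext) throws {
    // Distinct company names that appear in the import.
    let names = Set(records.map { $0.companyName })

    // One fetch with an IN predicate finds companies already in the store.
    let request = NSFetchRequest<NSManagedObject>(entityName: "Company")
    request.predicate = NSPredicate(format: "name IN %@", Array(names) as NSArray)

    // Cache of company objects keyed by name.
    var companiesByName: [String: NSManagedObject] = [:]
    for company in try context.fetch(request) {
        if let name = company.value(forKey: "name") as? String {
            companiesByName[name] = company
        }
    }

    for record in records {
        // Reuse the cached company, or create it once and cache it.
        let company: NSManagedObject
        if let cached = companiesByName[record.companyName] {
            company = cached
        } else {
            company = NSEntityDescription.insertNewObject(forEntityName: "Company",
                                                          into: context)
            company.setValue(record.companyName, forKey: "name")
            companiesByName[record.companyName] = company
        }

        let person = NSEntityDescription.insertNewObject(forEntityName: "Person",
                                                         into: context)
        person.setValue(record.name, forKey: "name")
        person.setValue(company, forKey: "company")
    }

    try context.save()
}
```

The loop does a dictionary lookup per person rather than a fetch per person, so the store is queried only once for the whole import.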

Marcus S. Zarra
That sounds like what the docs tell me to avoid. The article says, 'This pattern does not scale well. If you profile your application with this pattern, you typically find the fetch to be one of the more expensive operations in the loop (compared to just iterating over a collection of items). Even worse, this pattern turns an O(n) problem into an O(n^2) problem.' The magnitude of this import will typically be in the 1000 - 10000 person range. If the performance difference is less than a second, I'm happy just implementing your suggestion.
spamguy
It is not the people you are storing in the dictionary but the company objects, which should be far fewer. If it is rare for a company to have more than one person, then this solution would not scale well and you would want to look at calling -countForFetchRequest:error: instead.
Marcus S. Zarra
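For reference, a hedged sketch of the count-based existence check mentioned in the last comment, using the Swift spelling of `-countForFetchRequest:error:`. The entity name `Company`, the key `name`, and the helper `companyExists` are assumptions for illustration:

```swift
import CoreData

/// Returns true when at least one Company with the given name matches.
/// Counting avoids materializing the managed object just to test existence.
func companyExists(named name: String,
                   in context: NSManagedObjectContext) throws -> Bool {
    let request = NSFetchRequest<NSManagedObject>(entityName: "Company")
    request.predicate = NSPredicate(format: "name == %@", name)
    request.fetchLimit = 1  // one match is enough to answer the question
    return try context.count(for: request) > 0
}
```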