Hi all,
Here is my problem: I have 5000 arrays. I want to remove from array4999 the elements that have the same name as an element in array5000, then add what remains to array5000. That gives a new array, say NewArray1. Then I remove from array4998 the elements that have the same name as an element in NewArray1, and merge the two to get NewArray2. I repeat this until all 5000 arrays have been filtered, leaving a single array with no duplicate names in it.

What I am planning to do is shown in the example below. Say I have arrays of Entity objects:

@interface Entity : NSObject  {
     NSString *firstName;
     NSString *lastName;
     NSNumber *nid;
     // object1 ... objectN: other fields, not used in the comparison
}
@end

I want to compare those two arrays on three key fields: firstName, lastName and nid (I do not care about the other fields). If the elements in array5000 and array4999 are same in these 3 fields, remove the corresponding elements in array4999 before adding it to array5000. The resulting NewArray1 should look as in the example below:

array5000 = {"Tom","Jackson",235,....},
         {"Dick","Martin",360,....},
         {"Jimmy","Green",568,....}
array4999 = {"John","Mouson",125,....},
         {"Dick","Martin",360,....}  

NewArray1= {"Tom","Jackson",235,....},
         {"Dick","Martin",360,....},
         {"Jimmy","Green",568,....}
         {"John","Mouson",125,....};

I found a method, - (void)removeObjectsInArray:(NSArray *)otherArray, which is close to what I need, but it removes an element from the receiver only if it is completely identical to an element in otherArray. I only want to remove elements whose firstName, lastName and nid fields are the same. So I want to filter the elements with identical names out of array4999, then add the rest to array5000 to get NewArray1. Next, I filter array4998: remove the elements whose names already appear in NewArray1, then merge NewArray1 and array4998 to get NewArray2. I continue one array at a time in descending order until all 5000 arrays are done. Since performance is an issue, can anyone give me some ideas on my problem? Some code samples would be appreciated.
Thanks in advance.

A: 

The absolute simplest approach would be to use a nested loop, as follows:

NSMutableArray * array3 = [[array1 mutableCopy] autorelease];
for (Entity * entity2 in array2) {
    BOOL noMatchFound = YES;
    for (Entity * entity1 in array1) {
        if ([entity1.firstName isEqual:entity2.firstName] &&
              [entity1.lastName  isEqual:entity2.lastName] &&
              [entity1.nid       isEqual:entity2.nid]) {
            noMatchFound = NO;
            break;
        }
    }
    if (noMatchFound) {
        [array3 addObject:entity2];
    }
}

The resulting array3 will now contain all of the original entities from array1, plus any entities from array2 whose firstName, lastName and nid properties do not all match those of an entity in array1.

Note that this code assumes that the Entity class has @property definitions for all three of your key fields.

e.James
Thank you for your code. Is there any way to avoid the nested loop? The new array3 is going to be compared with array4, just as array1 was, to give, say, array5, and so on iteratively up to arrayN. N could reach 5000 at most. So what I am looking for is an efficient approach with good performance.
Unless you come up with a 1:1 mapping (like a hash) for these array values, or manage to sort them so you can perform a binary search (like a sorted list or a tree structure), I think you're stuck with an N^2 iteration through each array, for a total of O(M*N^2).
SauceMaster
Can I use a SUBQUERY to avoid the iteration? I am not sure whether that idea is appropriate for this problem.
A: 

One thing you can do is:

  • start by defining a -compositeKey method on your entities that returns a single key combining all the fields that you care about for your uniqueness check

  • insert all the entities from each array into a single dictionary, using your composite key as the dictionary key

  • extract the result array from the dictionary using -allValues

For example:

NSMutableDictionary *uniqueResults = [NSMutableDictionary dictionary];

for (NSArray *curArray in allArrays) {
    for (Entity *entity in curArray) {
        if ([uniqueResults objectForKey:[entity compositeKey]] == nil) {
             [uniqueResults setObject:entity forKey:[entity compositeKey]];
        }
    }
}

NSArray *resultArray = [uniqueResults allValues];

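By leveraging the NSDictionary hash table implementation, you'll do much better than O(n^2): each lookup and insert takes roughly constant time on average, so the whole merge is about linear in the total number of entities.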
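The -compositeKey method is not spelled out above, so here is a minimal sketch of one way it might look, together with a variant of the loop that keeps entities in first-seen order (-allValues returns them in no particular order). The trimmed-down Entity class, the tab separator and the helper name MergeUniqueEntities are assumptions for illustration, not part of the original code:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical, trimmed-down Entity with only the three key fields.
@interface Entity : NSObject
@property (nonatomic, copy) NSString *firstName;
@property (nonatomic, copy) NSString *lastName;
@property (nonatomic, strong) NSNumber *nid;
- (NSString *)compositeKey;
@end

@implementation Entity
- (NSString *)compositeKey {
    // Assumption: names never contain a tab, so two different
    // (firstName, lastName, nid) triples cannot yield the same key.
    return [NSString stringWithFormat:@"%@\t%@\t%@",
            self.firstName, self.lastName, self.nid];
}
@end

// Hypothetical helper: walks the arrays in the order given and keeps
// the first entity seen for each composite key.
static NSArray *MergeUniqueEntities(NSArray *allArrays) {
    NSMutableSet *seenKeys = [NSMutableSet set];
    NSMutableArray *result = [NSMutableArray array];
    for (NSArray *curArray in allArrays) {
        for (Entity *entity in curArray) {
            NSString *key = [entity compositeKey];
            if (![seenKeys containsObject:key]) {
                [seenKeys addObject:key];
                [result addObject:entity];
            }
        }
    }
    return result;
}
```

To match the order in the question, allArrays would start with array5000, followed by array4999, and so on down to the first array.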

David Gelhar
Thanks, David. That's a good solution. But what I was trying to do is make a new array with the name fields and the UID field. I am now working on a test case to find out whether its performance is OK.