tags:

views:

271

answers:

5

Hi,

I have a two ArrayList given below as sample. AccVO: compareKey,Amount fields

List 1:

AccVO[001,500]                                                   
AccVO[002,600]                                                   
AccVO[003,800]

List2:

AccVO[001,100]                                                                   
AccVO[001,100]                                                                   
AccVO[001,300] 
AccVO[003,300]  
AccVO[003,300]  
AccVO[003,200]
AccVO[005,300]  
AccVO[005,300]

I have sorted the two lists. I have to compare the two lists with the compare key and fetch the records of List2 to insert into database.

Sample Code:

for(AccVO accvo1 : List1){
    for(AccVO accvo2 : List2){
    if(accvo1.getCmpkey().equals(accvo2.getCmpkey())){    
            //insert the recs into the table      
           }     
     }
}

Since my list size will be larger i.e handling millions of records, i need some optimistic logic in looping the records.

Thanking you in advance Prasanna

A: 

I would suggest using a HashMap for the second list.

Amarghosh
+5  A: 

Because your lists are sorted, you can use an index into both arrays and increment only the smaller key each time:

int i = 0,j = 0;

while (i < List1.size() && j < List2.size()){
  int x = List1.get(i).getCmpKey().CompareTo(List2.get(j).getCmpKey();
  if (x == 0){ 
    //add record
    j++;
  }else if(x < 0){
    i++;
  }else{
    j++;
  }
}
steadfastbuck
+1 - given that the inputs are already sorted, this is `O(N)`, compared with `O(NlogN)` for binary search and `O(N)` with `O(N)` extra space for HashSets. The only case where the other approaches might be better is when one of the lists is tiny compared with the other one.
Stephen C
I think you forgot an extra check in the loop for j < list2.size(). I'd prefer a simple solution that uses some more memory over one that's "complicated" enough for such bugs to slip in...
Wouter Coekaerts
@Stephen - Wouldn't it be O(N+M), O(NlogM), O(N) and O(N+M) respectively?
steadfastbuck
@Wouter: Everything should be made as simple as possible, but not simpler
Adriaan Koster
you should add j++; in the add record part
Zappi
You could do it with iterators as well, having the added benefit that it would work well on a linkedlist too.
wds
+3  A: 

If your equals (and hashCode) implementation are based on getCmpkey() you can use Sets.

set1 = new HashSet(list1)
set2 = new HashSet(list2)
set2.retainAll(set1);
for(AccVO a : set2) //insert

This will have O(1) for individual removes (O(n) for n elements in set1).

Thomas Jung
This will use a bit of extra memory for the hashsets. But unless this is *really* performance critical code, I prefer this solution over the one from steadfastbuck because it is a lot easier to read (and less tricky, and shorter).
Wouter Coekaerts
Actually, the bit of extra memory will be a lot more. A HashSet is a fancy HashMap wrapper. So n entries will have n Entry objects, arrays for buckets etc. That's a lot of work for the garbage collector.
Thomas Jung
Dos HashSet work for the second list? We have multiple records for the same compareKey value. But I don't understand this AccVO object (array?) anyway.
Andreas_D
@Andreas_D I suppose this is a glitch in the question. The database will not be happy if you tried to insert a non-unique compound key anyway.
Thomas Jung
A: 

If the type of the Lists is not specified you should use Iterators to traverse them. This will give you guaranteed O(n) (linear) performance even if you use a List implementation that's not backed by an Array (assuming it you can iterate in O(n) time).

For example, if you happened to be given a LinkedList as your list class, the following implementation is still O(n) but an implementation that used get() to index into the list would be O(n^2). So if you list sizes were each 1 million, this implementation would be 1 million times faster than an indexing implementation.

Iterator i1 = list1.iterator();
Iterator i2 = list2.iterator();
if (i1.hasNext() && i2.hasNext() {
   AccVO v1 = i1.next();
   AccVO v2 = i2.next();
   while (true) {
      int i = v1.getCmpKey().compareTo(v2.getCmpKey());
      if (i == 0) {

         // insert record into DB here

         if (i2.hasNext()) {
            v2 = i2.next()
         } else {
            break;
         } 
      } else if (i < 0) {
         if (i1.hasNext()) {
            v1 = i1.next();
         } else {
            break;   
         }
      } else {
         if (i2.hasNext()) {
            v2 = i2.next();
         } else {
            break;
         }
      }
   }
}
Vladimir Giverts
A: 

I think I'd do something along the lines of

Set<Integer> goodKeys = new HashSet<Integer>(); 
for (AccVO a : List1) 
    goodKeys.add(a.getCmpkey());
for (AccVO a : List2)
    if (goodKeys.contains(a.getCmpkey()))
     // insert the recs into the table

I could then hang on to the list of good keys, if desired, extract the first chunk into getKeys(List<AccVO> list), extract the remainder into insertRecordsForKeys(Set<Integer> keys), etc. Easier to test, easier to debug, easier to reuse, easier to work with.

Carl Manaster