views:

283

answers:

3

I'm looking for a standard algorithm/code (Java) which compares two integer lists (old and new) and gives a third result list which provides actions to convert the 'old' list into the 'new' list.

For example:

old-> 1, 2, 3, 4
new-> 9, 2, 3, 6, 4

so the result should be something like:

1-, 9+, 2, 3, 4-, 6+, 4+

Here, the suffix:

  - = Deleted item from old list.
  + = New added item to old list.

and the rest (w/o suffix), are numbers which are unchanged (i.e Value as well as Index). I believe something using the LCS (longest common sequence) would do this job! But I can't really figure-out if there is any.

Any pointers will be highly appreciated.

+3  A: 

Levenshtein distance algorithm seems to work for you (essentially the LCS algorithm you mentioned). Just record the action you choose in another table (right after when you choose the minimum, you need to record which action has resulted the min cost to be able to look it up afterward).

if (seq1[i] == seq2[j] && d[i - 1, j - 1] <= d[i - 1, j] + 1
                       && d[i - 1, j - 1] <= d[i, j - 1] + 1) {
     d[i, j] = d[i - 1, j - 1];
     action[i, j] = MATCHED;
} else if (d[i - 1, j] < d[i, j - 1]) // If cost of insertion is less:
{
     d[i, j] = d[i - 1, j] + 1;
     action[i, j] = INSERTION;
} else {
     d[i, j] = d[i, j - 1] + 1;
     action[i, j] = DELETION;
}

Then use action[i, j] to recursively go back through the process and pushing the chosen action in a stack.

Mehrdad Afshari
Hi,Thanks for your reply. I'm Sorry but i can't really understand, as how to reach to the solution. What is the multi-dimension array(d) doing here ? How do i populate it ? Basically, How do i start when all i have is two flat lists.
Abhishek
"d" is the array holding solutions to subproblems (d[i, j] = minimum actions required to change a[0..i] to b[0..j], therefore d[a.length, b.length] will be the solution to the complete problem). If you are familiar with LCS or dynamic programming it should be familiar to you, otherwise I suggest reading the LCS section from Introduction to Algorithms or elsewhere.
Mehrdad Afshari
+2  A: 

I implemented something in C#. Porting it to Java ...

(edit)

Here is the Java version:

enum Action {
 UNCHANGED, ADDED, REMOVED
}

static class DiffResult<T> {
 private T value;
 public Action type;

 public DiffResult(T value, Action type) {
  super();
  this.value = value;
  this.type = type;
 }

 public T getValue() {
  return value;
 }

 public Action getType() {
  return type;
 }
}


public static <T> List<DiffResult<T>> listDiff(List<T> originalList,
  List<T> newList) {
 List<DiffResult<T>> result = new ArrayList<DiffResult<T>>();

 int maxCount = Math.max(originalList.size(), newList.size());
 for (int i = 0; i < maxCount; i++) {
  if (newList.size() < i + 1)
   result.add(new DiffResult<T>(originalList.get(i),
     Action.REMOVED));
  else {
   if (originalList.size() < i + 1) {
    result.add(new DiffResult<T>(newList.get(i), Action.ADDED));
   } else {
    if (originalList.get(i).equals(newList.get(i)))
     result.add(new DiffResult<T>(originalList.get(i),
       Action.UNCHANGED));
    else {
     result.add(new DiffResult<T>(originalList.get(i),
       Action.REMOVED));
     result.add(new DiffResult<T>(newList.get(i),
       Action.ADDED));
    }
   }
  }
 }
 return result;
}

public static void main(String[] args) {
 List<Integer> oldList = new ArrayList<Integer>();
 oldList.add(1);
 oldList.add(2);
 oldList.add(3);
 oldList.add(4);

 List<Integer> newList = new ArrayList<Integer>();
 newList.add(9);
 newList.add(2);
 newList.add(3);
 newList.add(6);
 newList.add(4);

 List<DiffResult<Integer>> diff = listDiff(oldList, newList);

 for (DiffResult<Integer> d : diff) {
  System.out.println("Item: " + d.getValue() + " -> " + d.getType());
 }
}
bruno conde
A: 

Hi All,

Just for future references. Both the 1st and 2nd answers are good. The 1st answer is clue to what i was looking for. The optimal way to compare sequences. and, The 2nd answer is a working code to compare sequences. But this doesn't give a optimal result for coverting one list to another. But good for a simple diff!!

Thanks all for the answers!!

Thanks, Abhishek.

Abhishek