This came up in a real-world situation, and I thought I would share it, as it could lead to some interesting solutions. Essentially, the algorithm needs to diff two lists, but let me give you a more rigorous definition of the problem.
Mathematical Formulation
Suppose you have two lists, L
and R
each of which contain elements from some underlying alphabet S
. Moreover, these lists have the property that the common elements that they have appear in order: that is to say, if L[i] = R[i*]
and L[j] = R[j*]
, and i
< j
then i
* < j
*. The lists need not have any common elements at all, and one or both may be empty. [Clarification: You may assume no repetitions of elements.]
The problem is to produce a sort of "diff" of the lists, which may be viewed as new list of ordered pairs (x,y)
where x
is from L
and y
is from R
, with the following properties:
- If
x
appears in both lists, then(x,x)
appears in the result. - If
x
appears inL
, but not inR
, then(x,NULL)
appears in the result. - If
y
appears inR
, but not inL
, then(NULL,y)
appears in the result.
and finally
- The result list has "the same" ordering as each of the input lists: it shares, roughly speaking, the same ordering property as above with each of the lists individually (see example).
Examples
L = (d)
R = (a,b,c)
Result = ((NULL,d), (a,NULL), (b,NULL), (c,NULL))
L = (a,b,c,d,e)
R = (b,q,c,d,g,e)
Result = ((a,NULL), (b,b), (NULL,q), (c,c), (d,d), (NULL,g), (e,e))
Does anyone have any good algorithms to solve this? What is the complexity?