I've got two ways of fetching a bunch of data. The data is stored in a sorted vector<map<string, int> >.
I want to identify whether there are inconsistencies between the two vectors.
What I'm currently doing (pseudo-code):
    for i in 0... min(length(vector1), length(vector2)):
        for (k, v) in vector1[i]:
            if v != vector2[i][k]:
                // report that k is bad for index i,
                // with vector1 having v, vector2 having vector2[i][k]
    for i in 0... min(length(vector1), length(vector2)):
        for (k, v) in vector2[i]:
            if v != vector1[i][k]:
                // report that k is bad for index i,
                // with vector2 having v, vector1 having vector1[i][k]
This works in general, but breaks horribly if vector1 has a, b, c, d and vector2 has a, b, b1, c, d (it reports brokenness for b1, c, and d). I'm after an algorithm that tells me that there's an extra entry in vector2 compared to vector1.
I think I want to do something like this: when I encounter mismatched entries, I look ahead through the following entries in the second vector. If a match is found before the end of the second vector, I store the index i of the matching entry in the second vector and move on to matching the next entry in the first vector, beginning at vector2[i+1].
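To make the idea concrete, here is a rough C++ sketch of that look-ahead scan. It is only an illustration under my own assumptions: the names Entry, Mismatch and lookahead_diff are made up for this post, entries are compared with the map's own operator!=, and it only classifies entries as extra in one vector or the other (it does not try to pair up "similar but different" entries).

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using Entry = std::map<std::string, int>;

// Hypothetical report type for this sketch, not part of my real code.
struct Mismatch {
    enum Kind { ExtraInFirst, ExtraInSecond } kind;
    std::size_t index;  // index within the vector that holds the extra entry
};

// For each entry of v1, scan ahead in v2 starting at the first unmatched
// position. If the entry is found, everything skipped over in v2 is reported
// as extra in v2; if not, the v1 entry is reported as extra in v1.
std::vector<Mismatch> lookahead_diff(const std::vector<Entry>& v1,
                                     const std::vector<Entry>& v2) {
    std::vector<Mismatch> out;
    std::size_t j = 0;  // next unmatched position in v2
    for (std::size_t i = 0; i < v1.size(); ++i) {
        std::size_t k = j;
        while (k < v2.size() && v2[k] != v1[i]) ++k;
        if (k == v2.size()) {
            // No match anywhere in the rest of v2: keep j where it is.
            out.push_back({Mismatch::ExtraInFirst, i});
            continue;
        }
        for (std::size_t s = j; s < k; ++s)
            out.push_back({Mismatch::ExtraInSecond, s});
        j = k + 1;  // resume matching just after the matched entry
    }
    // Anything left over in v2 was never matched.
    for (; j < v2.size(); ++j)
        out.push_back({Mismatch::ExtraInSecond, j});
    return out;
}
```

For vector1 = a, b, c, d and vector2 = a, b, b1, c, d this reports only the entry at index 2 of vector2 as extra. The scan is quadratic in the worst case, which is part of why I'm asking whether something neater exists.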
Is there a neater way of doing this? Some standard algorithm that I've not come across?
I'm working in C++, so C++ solutions are welcome, but solutions in any language or pseudo-code would also be great.
Example
Given the arbitrary map objects a, b, c, d, e, f and g;
With vector1: a, b, d, e, f and vector2: a, c, e, f
I want an algorithm that tells me either:
Extra b at index 1 of vector1, and vector2's c != vector1's d.
or (I'd view this as an effectively equivalent outcome)
vector1's b != vector2's c and extra d at index 2 of vector1.
Edit
I ended up using std::set_difference, and then doing some matching on the diffs from both sets to work out which entries were similar but different, and which were completely absent from the other vector.
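Roughly, the std::set_difference step looks like the sketch below. The helper names only_in and same_keys are invented for illustration, and the similarity rule (two entries count as "similar but different" when they have exactly the same keys) is just one possible assumption, not necessarily the matching I actually used. Both vectors must already be sorted with the default operator< for std::map.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <string>
#include <vector>

using Entry = std::map<std::string, int>;

// Entries present in `a` but absent from `b`.
// Precondition: both vectors are sorted with std::map's operator<.
std::vector<Entry> only_in(const std::vector<Entry>& a,
                           const std::vector<Entry>& b) {
    std::vector<Entry> out;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(out));
    return out;
}

// Assumed similarity rule for this sketch: same set of keys,
// regardless of the values stored under them.
bool same_keys(const Entry& x, const Entry& y) {
    return x.size() == y.size() &&
           std::equal(x.begin(), x.end(), y.begin(),
                      [](const Entry::value_type& p,
                         const Entry::value_type& q) {
                          return p.first == q.first;
                      });
}
```

Calling only_in(vector1, vector2) and only_in(vector2, vector1) gives the two diffs; entries from one diff that satisfy same_keys with an entry from the other can then be reported as changed, and the rest as missing from the other vector.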