I have a file containing data as follows:

10 20 30 40 70 
20 30 70 
30 40 10 20 
29 70 
80 90 20 30 40 
40 45 65 10 20 80 
45 65 20 

I want to remove all subset transactions from this file.

The output file should look like this:

10 20 30 40 70 
29 70 
80 90 20 30 40
40 45 65 10 20 80 

Where records like

20 30 70 
30 40 10 20 
45 65 20 

are removed because they are subsets of other records.

+2  A: 

Hint: the easiest way is to use std::set.

#include <set>
#include <algorithm>

...

std::set< int > s1, s2;

...

// Is s1 a subset of s2? std::includes requires sorted ranges,
// which std::set iteration guarantees.
bool test = std::includes(s2.begin(), s2.end(), s1.begin(), s1.end());

You could store every record as a set and then run on the order of n^2 std::includes tests to find the records that are subsets of another record. When comparing two records, check which set is larger so that you remove the smaller one (the subset) and keep the larger one.

This is not the fastest way to do it, but probably the easiest.
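As a rough illustration of the approach above, here is a minimal sketch of the pairwise test. The helper names parse_record and maximal_records are hypothetical, not from any library, and the sketch assumes each line holds whitespace-separated integers. It also assumes that when two records contain exactly the same items, only the first one should be kept.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parse one line of whitespace-separated integers into a set.
std::set<int> parse_record(const std::string& line) {
    std::istringstream in(line);
    std::set<int> items;
    int value;
    while (in >> value) items.insert(value);
    return items;
}

// Hypothetical helper: return the indices (in input order) of records that
// are not a subset of any other record. Performs O(n^2) std::includes tests.
std::vector<std::size_t> maximal_records(const std::vector<std::set<int>>& records) {
    std::vector<std::size_t> kept;
    for (std::size_t i = 0; i < records.size(); ++i) {
        bool redundant = false;
        for (std::size_t j = 0; j < records.size() && !redundant; ++j) {
            if (i == j) continue;
            // True when every item of records[i] also appears in records[j].
            const bool subset = std::includes(records[j].begin(), records[j].end(),
                                              records[i].begin(), records[i].end());
            // Drop record i if it is strictly smaller than record j, or if the
            // two are identical and j came first (assumption: keep first copy).
            if (subset && (records[i].size() < records[j].size() || j < i)) {
                redundant = true;
            }
        }
        if (!redundant) kept.push_back(i);
    }
    return kept;
}
```

To use it, read the file line by line with std::getline, keep the original lines in a vector alongside the parsed sets, and write out only the lines whose indices are returned. Working with indices rather than the sets themselves preserves the original item order within each line.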

Kornel Kisielewicz