Suppose there are 4 sets:

s1={1,2,3,4};
s2={2,3,4};
s3={2,3,4,5};
s4={1,3,4,5};

Is there a standard metric for expressing the degree of similarity of this group of 4 sets?

Thank you for suggesting the Jaccard method. However, it seems to be pairwise. How can I compute the degree of similarity of the whole group of sets?

+2  A: 

Your question isn't very specific. But I suppose you mean something like the "edit distance" between them? I.e. how much you need to change s1 to get to s2?

Check out the Wikipedia article on Edit distance.
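
For plain sets, one way to read "how much you need to change s1 to get to s2" is the number of single-element insertions and deletions, which equals the size of the symmetric difference. This is only a sketch under that assumption; the function name `set_edit_distance` is made up for illustration.

```python
def set_edit_distance(a, b):
    """Count the elements that must be added to or removed from a to obtain b.

    Assumption: an "edit" is one insertion or one deletion of an element,
    so the distance is the size of the symmetric difference.
    """
    return len(a ^ b)


s1 = {1, 2, 3, 4}
s2 = {2, 3, 4}
s4 = {1, 3, 4, 5}

print(set_edit_distance(s1, s2))  # 1: remove the element 1
print(set_edit_distance(s2, s4))  # 3: remove 2, add 1 and 5
```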

adamse
A: 

You could compute the size of the intersection of each pair of sets.
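
A minimal sketch of that idea, assuming the intersection is taken per pair of sets. The helper name `pairwise_intersection_sizes` is hypothetical.

```python
from itertools import combinations

sets = {
    "s1": {1, 2, 3, 4},
    "s2": {2, 3, 4},
    "s3": {2, 3, 4, 5},
    "s4": {1, 3, 4, 5},
}


def pairwise_intersection_sizes(named_sets):
    """Map each unordered pair of set names to the size of their intersection."""
    return {
        (a, b): len(named_sets[a] & named_sets[b])
        for a, b in combinations(named_sets, 2)
    }


sizes = pairwise_intersection_sizes(sets)
for pair, size in sizes.items():
    print(pair, size)
```

Note that raw intersection sizes are not normalized, so large sets tend to look more similar than small ones.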

jspcal
+7  A: 

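Pairwise, you can compute the Jaccard distance of two sets: one minus the size of their intersection divided by the size of their union. If you treat the sets as vectors of booleans in a space where {1, 2, 3…} are all unit vectors, it is the number of coordinates where the two vectors differ, divided by the number of coordinates where at least one of them is set.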
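
As a rough sketch, the Jaccard distance of two Python sets can be computed like this. The function name and the handling of two empty sets are my own choices, not part of any standard library.

```python
def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets."""
    union = a | b
    if not union:
        # Convention assumed here: two empty sets are treated as identical.
        return 0.0
    return 1 - len(a & b) / len(union)


s1 = {1, 2, 3, 4}
s2 = {2, 3, 4}

print(jaccard_distance(s1, s2))  # 0.25 (intersection has 3 elements, union has 4)
```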

Tobu
+1, and probably the mean of the (6) Jaccard coefficients is what @Soup is looking for.
Nick D
Seconding your idea of taking the mean.
Tobu
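
The mean of the six pairwise Jaccard coefficients suggested in these comments could be sketched as follows. The function names are hypothetical, and the sets are assumed to be non-empty.

```python
from itertools import combinations


def jaccard_index(a, b):
    """Size of the intersection divided by the size of the union."""
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(sets):
    """Average the Jaccard index over all unordered pairs of sets."""
    pairs = list(combinations(sets, 2))
    return sum(jaccard_index(a, b) for a, b in pairs) / len(pairs)


group = [{1, 2, 3, 4}, {2, 3, 4}, {2, 3, 4, 5}, {1, 3, 4, 5}]

# The six coefficients are 0.75, 0.6, 0.6, 0.75, 0.4 and 0.6.
print(round(mean_pairwise_jaccard(group), 4))  # 0.6167
```

A value of 1 would mean all sets are identical, and 0 would mean no two sets share an element.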
+2  A: 

As Tobu said, I'd use the Jaccard index, which is just the size of the intersection divided by the size of the union of the sets.
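
If you want a single number for the whole group rather than for a pair, one possible extension (an assumption on my part, not something stated in the question) is to intersect and unite all the sets at once. The name `group_jaccard_index` is made up for this sketch.

```python
def group_jaccard_index(sets):
    """Size of the intersection of all sets divided by the size of their union.

    Assumes `sets` is a non-empty list of sets with a non-empty union.
    """
    common = set.intersection(*sets)
    everything = set.union(*sets)
    return len(common) / len(everything)


group = [{1, 2, 3, 4}, {2, 3, 4}, {2, 3, 4, 5}, {1, 3, 4, 5}]

# Only 3 and 4 appear in every set; the union is {1, 2, 3, 4, 5}.
print(group_jaccard_index(group))  # 0.4
```

This version is strict: a single set that lacks an element removes that element from the numerator, so it tends to give lower values than an average over pairs.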

Aly
Thanks for cleaning up the link, Nick D.
Aly
A: 

You could compute the Euclidean distance between each pair of sets (for example, by treating each set as a boolean vector), and build a dendrogram from those distances to visualize similarity.
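
A sketch of the distance step, assuming each set is encoded as a 0/1 vector over the union of all elements. The helper names are hypothetical, and only the standard library is used.

```python
import math
from itertools import combinations

sets = {
    "s1": {1, 2, 3, 4},
    "s2": {2, 3, 4},
    "s3": {2, 3, 4, 5},
    "s4": {1, 3, 4, 5},
}

# Fixed coordinate order: every element that appears in any set.
universe = sorted(set().union(*sets.values()))


def to_vector(s):
    """Encode a set as a list of 0/1 membership flags over the universe."""
    return [1 if x in s else 0 for x in universe]


def euclidean(u, v):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


vectors = {name: to_vector(s) for name, s in sets.items()}
distances = {
    (a, b): euclidean(vectors[a], vectors[b])
    for a, b in combinations(vectors, 2)
}

for pair, d in distances.items():
    print(pair, round(d, 3))
```

With this encoding, the distance is the square root of the number of elements in which the two sets differ. To draw the dendrogram itself you would feed these distances to a hierarchical clustering routine, for instance `linkage` and `dendrogram` from `scipy.cluster.hierarchy` if SciPy is available.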

Alex Reynolds