I have two dendrograms that I want to compare in order to find out how "similar" they are. But I don't know of any method for doing so (let alone code that implements it, say, in R).

Any leads?

Thanks, Tal

A: 

If you have access to the underlying distance matrix that generated each dendrogram (you probably do if you generated the dendrograms in R), couldn't you just use the correlation between the corresponding entries of the two matrices? I know this doesn't address the letter of what you asked, but it's a good solution to the spirit of what you asked.
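A minimal sketch of that idea, written in Python for illustration only. It assumes both distance matrices are square, symmetric, and list the objects in the same order; the function names are hypothetical and not taken from any library.

```python
import math


def upper_triangle(matrix):
    """Return the entries above the diagonal, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def distance_matrix_correlation(d1, d2):
    """Correlate corresponding pairwise distances of two matrices.

    Assumption: row/column i refers to the same object in d1 and d2.
    """
    if len(d1) != len(d2):
        raise ValueError("matrices must have the same size")
    return pearson(upper_triangle(d1), upper_triangle(d2))


# Toy data: d2 is d1 with every distance doubled.
d1 = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
d2 = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]
print(distance_matrix_correlation(d1, d2))
```

Only the upper triangle is used because the matrices are symmetric and the diagonal is always zero, so including the rest would just duplicate values. A value near 1 means the two matrices rank the pairwise distances in much the same way.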

dsimcha
Hi dsimcha, Thanks for the idea. In my particular situation, I have the distance matrix for only one of the two. So your solution is not applicable. But thanks again!
Tal Galili
+3  A: 

As you know, dendrograms arise from hierarchical clustering, so what you are really asking is how to compare the results of two hierarchical clustering runs. There are no standard metrics that I know of, but I would look at the number of clusters found and compare membership similarity between like clusters. Here is a good overview of hierarchical clustering that my colleague wrote, on clustering Scotch whiskies.

Paul
Hi Paul, Thank you for the answer, I'll read it through later. Thanks, Tal
Tal Galili
+5  A: 

Comparing dendrograms is not quite the same as comparing hierarchical clusterings, because the former includes the lengths of branches as well as the splits, but I also think that's a good start. I would suggest you read E. B. Fowlkes & C. L. Mallows (1983). "A Method for Comparing Two Hierarchical Clusterings". Journal of the American Statistical Association 78 (383): 553–584 (link).

Their approach is based on cutting the trees at each level k, getting a measure Bk that compares the groupings into k clusters, and then examining the Bk vs k plots. The measure Bk is based upon looking at pairs of objects and seeing whether they fall into the same cluster or not.

I am sure that one can write code based on this method, but first we would need to know how the dendrograms are represented in R.
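As a rough sketch of the pairwise measure described above, here is a Python illustration that computes Bk for a single k. It assumes each tree has already been cut into k clusters and that the result is one cluster label per object, in the same object order for both trees (in R, `cutree` returns labels of this kind). The function name and the zero-return convention for the degenerate case are my own choices, not taken from the paper.

```python
import math
from itertools import combinations


def fowlkes_mallows(labels_a, labels_b):
    """Pair-counting similarity of two flat clusterings of the same objects.

    Returns (#pairs together in both) / sqrt(#pairs together in A * #pairs together in B).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same objects")

    together_both = together_a = together_b = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        together_a += same_a
        together_b += same_b
        together_both += same_a and same_b

    # Assumption: if one clustering puts no two objects together,
    # the ratio is undefined; this sketch returns 0.0 in that case.
    if together_a == 0 or together_b == 0:
        return 0.0
    return together_both / math.sqrt(together_a * together_b)


# Toy labels for four objects, each tree cut into k = 2 clusters.
tree1_k2 = [0, 0, 1, 1]
tree2_k2 = [1, 1, 0, 0]  # same grouping, different label names
print(fowlkes_mallows(tree1_k2, tree2_k2))
```

Repeating this for each k and plotting the results against k gives the kind of Bk vs k plot the paper examines. The measure depends only on which objects are grouped together, so the label names themselves do not matter.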

Aniko
That is VERY helpful Aniko - thank you! I will read further into this.
Tal Galili
+1 thanks for the link. This looks like the ticket Tal.
Paul