tags:

views:

52

answers:

1

hi.... i have calculated the tf-idf values of terms of document 1 and document 2..now i dont know how to use these tf-idf values...basically i want to find similarity between two documents(in my case are webpages)..can any body tell how to implement cosine similarity, jaccard coefficient to find similarity...c# code would be appreciated..pls help...thanks

A: 

I recommend a visit to Apache Mahout. It provides a complete kit of tools for this. Even if you don't want to use them, you can get the answers to these questions by looking at existing implementations.

bmargulies
thanks for the reply....could u give me the link
jaskirat