Hello!
I would like to calculate term weights using tf-idf. I've drafted an equation that should yield the tf-idf value on the left-hand side. Is it correct?
Tf-idf for DOCUMENT:
tf-idf(WORD) = ( occurrences(WORD,DOCUMENT) / number-of-words(DOCUMENT) ) * log10( documents(ALL) / ( 1 + documents(WORD, ALL) ) )
- occurrences(WORD,DOCUMENT): number of occurrences of WORD in DOCUMENT
- number-of-words(DOCUMENT): number of words in DOCUMENT
- documents(ALL): number of documents in the database
- documents(WORD, ALL): number of documents in the database which contain WORD
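To make the formula concrete, here is a minimal Python sketch of it. The function name tf_idf, the toy corpus, and the whitespace tokenization are assumptions made only for illustration, as is returning 0 for an empty document.

```python
import math

def tf_idf(word, document, all_documents):
    """tf-idf of `word` in `document`, following the formula above.

    `document` is a list of tokens; `all_documents` is a list of such lists.
    """
    if not document:
        return 0.0  # assumption: an empty document gives every word weight 0
    # occurrences(WORD, DOCUMENT) / number-of-words(DOCUMENT)
    tf = document.count(word) / len(document)
    # documents(WORD, ALL): documents that contain the word at least once
    containing = sum(1 for doc in all_documents if word in doc)
    # log10( documents(ALL) / (1 + documents(WORD, ALL)) )
    idf = math.log10(len(all_documents) / (1 + containing))
    return tf * idf

# Hypothetical toy database of four documents, split on whitespace
corpus = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "birds sing".split(),
    "fish swim".split(),
]
score = tf_idf("cat", corpus[0], corpus)  # (1/6) * log10(4/3)
```

One thing the sketch makes visible: because of the 1 + in the denominator, a word that appears in every document gets a slightly negative value, since log10(N / (N + 1)) is below zero, and a word that appears in all but one document gets exactly zero.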
It would be great if you could help me. Thank you very much in advance!