Dear Folks,

I am trying to organize the files on my department's shared file server, which holds thousands of documents of various file types. My idea is to sort them by content-related keywords. Only a few files have valid information in the keywords file attribute provided by Windows, so I thought I could let a desktop search engine index the files (and their content) and then use the keywords it generates for the index.

The problem is that I don't know how to read these generated keywords from the search index.

Neither Microsoft nor Copernic seems to provide any information on how to access their index files. MSDN only explains how to query the Windows Search engine directly from your program, and the results contain only Windows file attributes and file information, not the generated keywords used for indexing. Copernic does not seem to provide any information at all.

I would be very grateful for any ideas on how to access these generated keywords. Thank you in advance!

+1  A: 

If Google Desktop Search is an option, you may use the Google Desktop Search API. A more programming-intensive option is to use Lucene. Somewhere in the middle is Nutch.
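To give a feel for the programming-intensive route, here is a minimal sketch of deriving keywords from term statistics, which is roughly the kind of information a full-text index is built on. It is not Lucene and does not use any search engine's API; it is a standard-library Python illustration using TF-IDF scoring. The names `tokenize`, `top_keywords`, and `STOP_WORDS` are made up for this example, and it assumes the plain text of each document has already been extracted (extracting text from Office or PDF files is a separate problem).

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "with"}


def tokenize(text):
    """Lowercase the text and return its alphabetic words minus stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def top_keywords(docs, k=3):
    """Return {name: [keywords]} ranked by TF-IDF over the given documents.

    docs maps a document name to its already-extracted plain text.
    """
    term_counts = {name: Counter(tokenize(text)) for name, text in docs.items()}

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())

    n_docs = len(docs)
    result = {}
    for name, counts in term_counts.items():
        # Terms that occur in every document get a score of zero.
        scores = {
            term: tf * math.log((1 + n_docs) / (1 + doc_freq[term]))
            for term, tf in counts.items()
        }
        # Sort by descending score, then alphabetically so ties are stable.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        result[name] = ranked[:k]
    return result


if __name__ == "__main__":
    sample = {
        "budget.txt": "budget budget report for the finance department",
        "minutes.txt": "meeting minutes of the department meeting",
    }
    for name, keywords in top_keywords(sample, k=2).items():
        print(name, keywords)
```

Words that appear in every document (here "department") score zero and drop to the bottom, so the top entries tend to be the terms that distinguish one file from the others. A real indexer would add stemming, a proper stop-word list, and handling for non-English text.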

Yuval F