views:

55

answers:

1

I have to search a given file name (let say Keyword) in a directory containing files. If there were only few keywords to be searched, I could have used regular search (like creating an array of file names residing in the specified directory and then search each file name with the given keyword). Since I need to search very large number of keywords dynamically, its not efficient to search using regular. I had couple of ideas:

1.using hashing (but not clear how to design it)

2.Using Bloom Filters for searching (please google , if u dont know about it, its working is very interesting!): Problem in using bloom filters is "False positives are possible, but false negatives are not". I might miss some results....

+1  A: 

Before searching, create a trie of all positive matches.

Creating the trie will take O(n) where n is the number of words.

To to search, try to match the word against the trie. Look-ups are done in O(m) where m is the length of the word to look-up.

Total runtime: O(n + nm) => O(nm) to find all the words.

Ben S
It's not true that a trie is "more compact" than a list of words! Tries are rather space-inefficient; that's how you get the savings in time. If the search space is truly huge (the OP doesn't say) then you have to get clever, and use something like a PATRICIA, or else you simply won't have the RAM.
Jonathan Feinberg
Hey Bens!!! thanks for the idea... space is no problem...only time has to be considered here...your idea seems to workout for me....Is there any java library for trie?
Reddy
Hi Jonathan !!! "very large" here i refer is very large no. of file names to be searched..
Reddy