Hello all,
I am developing a component that stores URLs and maintains a set of keywords associated with each URL. For example:
URL: http://www.imdb.com Keywords: search, movies, movie-index, reviews
There is no limit on the number of keywords per URL. The number of URLs may be large, somewhere between 10K and 100K. What is the best approach to associate and store the URLs with their keywords? It should support searching by keyword and listing URLs by keyword combination. I feel that a relational DB is not a good fit for this.
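To make the requirement concrete, here is a rough sketch of the behaviour I am after, written as an in-memory mapping from keyword to URLs (an inverted index). The class name, the method names and the second URL are made up for illustration; this only shows the operations I need and is not a proposed final design.

```python
from collections import defaultdict


class KeywordIndex:
    """Hypothetical in-memory inverted index: keyword -> set of URLs."""

    def __init__(self):
        self._urls_by_keyword = defaultdict(set)
        self._keywords_by_url = defaultdict(set)

    @staticmethod
    def _normalize(keyword):
        # Assumption: keywords are case-insensitive and surrounding spaces are noise.
        return keyword.strip().lower()

    def add(self, url, keywords):
        """Associate a URL with any number of keywords."""
        for keyword in keywords:
            kw = self._normalize(keyword)
            if kw:
                self._urls_by_keyword[kw].add(url)
                self._keywords_by_url[url].add(kw)

    def keywords_for(self, url):
        """List the keywords stored for one URL."""
        return sorted(self._keywords_by_url.get(url, ()))

    def search_all(self, *keywords):
        """URLs that carry every one of the given keywords."""
        sets = [self._urls_by_keyword.get(self._normalize(k), set()) for k in keywords]
        if not sets:
            return []
        return sorted(set.intersection(*sets))

    def search_any(self, *keywords):
        """URLs that carry at least one of the given keywords."""
        found = set()
        for k in keywords:
            found |= self._urls_by_keyword.get(self._normalize(k), set())
        return sorted(found)


index = KeywordIndex()
index.add("http://www.imdb.com", ["search", "movies", "movie-index", "reviews"])
index.add("http://www.example.com", ["search", "news"])  # hypothetical second URL

print(index.search_all("search", "movies"))
print(index.search_any("news", "reviews"))
```

Something like this would probably fit in memory for 100K URLs, but it has no persistence and no ranking, which is why I am asking about existing tools.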
Maybe my question boils down to "how does a search engine work", but I am looking for more specific information: are there tools available to store the keywords and index them? I have heard of Apache Lucene, but that seems to be more of a full-text search engine.
What does Stack Overflow use internally to associate keywords with articles? :)
Thanks,
-Keshav