Hi,

I want to develop a website with its own search engine, and I would like to use a web framework such as Django or Rails for the development. The search would be based on the vector space model (data is represented as a term-by-document matrix). Everything would run on one server; there would be no separate server for indexing or search. I would also expect the system to run fast.

My question is: does anyone have experience developing such a system and could share some thoughts? The way I see it, loading the term-by-document matrix into an array and searching it on every request, and doing this very frequently, will be very slow.

Thanks!
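
For reference, the concern about reloading a dense term-by-document matrix can be illustrated with a sparse alternative. The sketch below is only an assumption about how the vector space model might be stored: it keeps TF-IDF weights in an inverted index (term to documents) built once and held in memory, so a query touches only the postings of its own terms. The function names, the whitespace tokenizer, and the particular TF-IDF formula are all hypothetical choices, not anything from a specific library.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build a sparse inverted index: term -> {doc_id: tf-idf weight}.

    `docs` maps a document id to its text. Tokenization is a plain
    lowercase whitespace split, which is an illustrative simplification.
    """
    n_docs = len(docs)
    term_counts = {d: Counter(text.lower().split()) for d, text in docs.items()}
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(t for counts in term_counts.values() for t in counts)
    idf = {t: math.log(n_docs / df) + 1.0 for t, df in doc_freq.items()}

    index = defaultdict(dict)
    norms = {}
    for d, counts in term_counts.items():
        weights = {t: tf * idf[t] for t, tf in counts.items()}
        # Vector length, stored so cosine similarity is cheap at query time.
        norms[d] = math.sqrt(sum(w * w for w in weights.values()))
        for t, w in weights.items():
            index[t][d] = w
    return index, idf, norms


def search(query, index, idf, norms):
    """Rank documents by cosine similarity to the query."""
    q_counts = Counter(query.lower().split())
    q_weights = {t: tf * idf[t] for t, tf in q_counts.items() if t in idf}
    q_norm = math.sqrt(sum(w * w for w in q_weights.values()))
    if q_norm == 0:
        return []  # no query term is known to the index

    scores = defaultdict(float)
    for t, qw in q_weights.items():
        # Only documents containing the term are visited.
        for d, dw in index[t].items():
            scores[d] += qw * dw
    ranked = [(d, s / (q_norm * norms[d])) for d, s in scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


docs = {
    1: "solr search engine",
    2: "django web framework",
    3: "search engine for django",
}
index, idf, norms = build_index(docs)
results = search("django search", index, idf, norms)
```

Here `results` is a list of `(doc_id, score)` pairs, best match first. In a web application the index would be built at startup or by a background job rather than per request.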

A: 

You wouldn't develop your own search engine; that would be insane. Just use something that is already available.

  • Sphinx...
  • Solr

I absolutely love Solr and recommend you use it. It's built on Lucene, the Java search library, and is open source.

Take a look here:

http://lucene.apache.org/solr/
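
To give an idea of how a Django (or any Python) app might talk to Solr, here is a minimal sketch that queries Solr's HTTP `select` handler. The base URL `http://localhost:8983/solr` and the core name `articles` are assumptions for illustration, as are the helper names; adjust them to your own installation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_select_url(solr_base, core, query, rows=10):
    """Build the URL for a query against Solr's /select handler."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{solr_base.rstrip('/')}/{core}/select?{params}"


def solr_search(solr_base, core, query, rows=10):
    """Run the query and return the list of matching documents.

    Requires a running Solr instance; the core name is up to you.
    """
    url = build_select_url(solr_base, core, query, rows)
    with urlopen(url, timeout=5) as resp:
        data = json.load(resp)
    return data["response"]["docs"]


# Hypothetical base URL and core name, for illustration only.
url = build_select_url("http://localhost:8983/solr", "articles", "title:django", rows=5)
```

Since Solr runs as its own process, the web app only sends HTTP requests and never loads the index itself. It can still run on the same machine as the site.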

Laykes
Do you know if it is possible to control weights given to particular terms when documents are indexed by Solr?
spacemonkey
Yes, through scoring. You can set a score multiplier for any index, matching method, etc.
Laykes
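
As a rough sketch of what such a score multiplier can look like, the snippet below builds query parameters that use Solr's eDisMax query parser with per-field boosts in `qf`. Note that this is query-time boosting, which may not be exactly the index-time weighting asked about. The field names `title` and `body` and the helper name are hypothetical.

```python
from urllib.parse import urlencode


def boosted_query_params(user_query, field_boosts):
    """Build a Solr query string using eDisMax with per-field boosts.

    `field_boosts` maps a field name to its multiplier, e.g. a match in
    `title` can be made to count more than a match in `body`.
    """
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())
    return urlencode({"defType": "edismax", "q": user_query, "qf": qf})


# Hypothetical fields: title matches weigh three times as much as body matches.
params = boosted_query_params("vector space model", {"title": 3.0, "body": 1.0})
```

The resulting string can be appended to a `/select?` URL on the Solr core being queried.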