I am trying to build a search for my website over a MySQL database. I started with Sphinx but hesitated when I learned that its index doesn't update in real time. I saw that it offers near-real-time updates, but I am concerned this doesn't fit my system well: new content is added to the database minute by minute and needs to be searchable immediately, and re-indexing after every update seems strange.

I am currently looking into Solr, which is built on Lucene, but it doesn't seem to fit my needs either because it looks more like a file-based search than a database search. It also looks like an awful lot to configure for a relatively simple search.

I also found this Stack Overflow question but had a few problems with it as well. The first is that I am searching through many fields, not just one. I am also worried that searches done purely in SQL may be too slow over my database, which will hopefully hold hundreds of thousands of records, if not more.

If anyone has opinions on the software I have mentioned, or on anything I haven't, all ideas are welcome. I am using Java for the back end, if that makes any difference. Thanks.

+3  A: 

At their core, databases are just files. What is wrong with a file based search?

It sounds like Solr fulfills your requirements. If you use the example setup provided in the download, there isn't much to getting started. All you would need to do is configure your schema.xml for your data.

To get real-time search, you would need to add your documents to the Solr index as they are created. This is a simple HTTP POST to one of Solr's servlets, or it can be done through SolrJ (their Java client).
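As a rough illustration of the POST approach, here is a minimal sketch that uses only the JDK. It builds Solr's XML `<add>` message for one document, posts it, and then posts a `<commit/>` so the document becomes visible to searches. The class name, the field names (`id`, `title`), and the URL `http://localhost:8983/solr/update` are assumptions for the example; your URL and fields depend on your install and your schema.xml.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class SolrUpdateSketch {

    // Escape the characters that are special in XML text and attribute values.
    static String escapeXml(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Build a Solr XML <add> message holding a single document.
    static String buildAddXml(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            xml.append("<field name=\"").append(escapeXml(e.getKey())).append("\">")
               .append(escapeXml(e.getValue())).append("</field>");
        }
        return xml.append("</doc></add>").toString();
    }

    // POST an XML message to the update URL and return the HTTP status code.
    static int post(String updateUrl, String xml) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(updateUrl).toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        try (OutputStream os = conn.getOutputStream()) {
            os.write(xml.getBytes(StandardCharsets.UTF_8));
        }
        int status = conn.getResponseCode();
        conn.disconnect();
        return status;
    }

    public static void main(String[] args) throws IOException {
        // Assumed URL; pass your own as the first argument.
        String updateUrl = args.length > 0 ? args[0] : "http://localhost:8983/solr/update";

        // Hypothetical row that was just inserted into MySQL.
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "42");
        doc.put("title", "Tom & Jerry");

        System.out.println("add status: " + post(updateUrl, buildAddXml(doc)));
        System.out.println("commit status: " + post(updateUrl, "<commit/>"));
    }
}
```

You would call something like this from the same code path that writes the row to the database. Committing after every single document can get expensive under load, so batching a few adds per commit may be worth considering; SolrJ wraps the same add/commit steps if you would rather not build the XML yourself.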

If you are searching over many columns, I think Solr will be more efficient and easier to use than a database. It will also provide a richer feature set such as faceting and stemming.

Brad
Nothing wrong with file-based search. It just seemed like it might be a bit more difficult to set up as a result. But it is looking like Solr is the right choice at this point. Thanks for the info on real-time search with Solr.
UmYeah
+2  A: 

There is also just plain Lucene, and Xapian; the latter has PHP bindings.