I'm switching full-text search on my site to Sphinx. I'm going to use SphinxSE to perform the searches.

I created 2 indexes, as specified in the manual: http://www.sphinxsearch.com/docs/manual-0.9.9.html#live-updates

It seems to work, with each index holding its own documents, but I'm somewhat confused about how I should handle updating, merging, and rebuilding the indexes.

The way I understand it, I set up a cron job to run "indexer delta --rotate" every 5 minutes or so, which adds new submissions to the delta index. Then, once a day, I merge the delta index into the main index by running "indexer main delta --rotate". Then, once a month or so, I run "indexer --all" to rebuild all indexes.

Am I doing this right, or am I missing something?

+1  A: 

Sounds pretty much like the setup I did for a customer. And no, the search won't stop working during updates. From the Sphinx docs:

--rotate is used for rotating indexes. Unless you have the situation where you can take the search function offline without troubling users, you will almost certainly need to keep search running whilst indexing new documents. --rotate creates a second index, parallel to the first (in the same place, simply including .new in the filenames). Once complete, indexer notifies searchd via sending the SIGHUP signal, and searchd will attempt to rename the indexes (renaming the existing ones to include .old and renaming the .new to replace them), and then start serving from the newer files. Depending on the setting of seamless_rotate, there may be a slight delay in being able to search the newer indexes.

bemace
+1  A: 

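--rotate just builds the new index in temporary files next to the existing ones (so you need enough free disk space for both copies), then signals searchd to switch over to the new files when it's done.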

As for the delta, you need a pre-query that computes the cutoff, max(id). The main index covers the IDs up to that cutoff, and the delta covers the IDs above it.

If you have a timestamp column (indexed, if possible), you can split on that instead:

main -> where timefile < today()
delta -> where timefile >= today()
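For the id-based split, a rough sketch along the lines of the live-updates example in the manual linked in the question might look like this. It assumes a `documents` table with `id`, `title`, and `body` columns and a helper table `sph_counter`; those names come from the manual's example, not from the asker's schema, and the connection settings are left out:

```
# Helper table, created once in the database (MySQL syntax):
# CREATE TABLE sph_counter (
#     counter_id INTEGER PRIMARY KEY NOT NULL,
#     max_doc_id INTEGER NOT NULL
# );

source main
{
    # type, sql_host, sql_user, sql_pass, sql_db go here

    # Pre-query: store the current max id as the cutoff before indexing main
    sql_query_pre = REPLACE INTO sph_counter SELECT 1, MAX(id) FROM documents
    sql_query     = SELECT id, title, body FROM documents \
        WHERE id <= ( SELECT max_doc_id FROM sph_counter WHERE counter_id = 1 )
}

# delta inherits the connection settings from main
source delta : main
{
    # Override the inherited pre-query so that indexing delta does not move the cutoff
    sql_query_pre = SET NAMES utf8
    sql_query     = SELECT id, title, body FROM documents \
        WHERE id > ( SELECT max_doc_id FROM sph_counter WHERE counter_id = 1 )
}
```

With this layout the cutoff only advances when main is reindexed, so delta keeps growing until then. If you merge rather than rebuild, the documented form is "indexer --merge main delta --rotate"; without --merge, "indexer main delta --rotate" reindexes both indexes from the database.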

Moosh
