Hi everybody.

I have had this problem for a long time and can't find a solution. I guess it is something everybody using Sphinx has faced, but I cannot find any useful information.

I have one main index and a delta index. A PHP module queries both indexes and then shows the results. For each ID in the result set, I create a model object and display the main data for that model.

I physically delete one document from the database.

When I query the index, the ID of the deleted document is still there (in the Sphinx result set). I could detect this in code and avoid showing it, but the result set Sphinx gives me is still wrong: total_found is xxx when it should really be xxx-1. For example, Sphinx gives me the first 20 results, but one of those 20 no longer exists, so I can show only 19.

I re-index the main index once per day and the delta index every 5 minutes.

Is there a solution for this?

Thanks in advance!

A: 

I suppose you could ask Sphinx for maybe 25 results and then, when you fetch the full data from your DB, put a LIMIT 20 on the query.
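
A minimal sketch of that over-fetch idea, written in Python rather than the asker's PHP. The names visible_page, lookup and rows_in_db are made up for illustration, and the set stands in for a real "SELECT id ... WHERE id IN (...)" query:

```python
from typing import Callable, Iterable, List, Set


def visible_page(
    sphinx_ids: List[int],
    existing_ids: Callable[[Iterable[int]], Set[int]],
    page_size: int = 20,
) -> List[int]:
    """Return up to page_size IDs that still exist, keeping Sphinx's ranking order."""
    # existing_ids is a hypothetical callback, e.g. SELECT id FROM table WHERE id IN (...)
    alive = existing_ids(sphinx_ids)
    return [doc_id for doc_id in sphinx_ids if doc_id in alive][:page_size]


# Stand-in for the database: document 7 was deleted after the last reindex.
rows_in_db = {3, 5, 9, 11}


def lookup(ids: Iterable[int]) -> Set[int]:
    return rows_in_db.intersection(ids)


# Ask Sphinx for a few more matches than the page needs (here 5 for a page of 4).
page = visible_page([5, 7, 3, 9, 11], lookup, page_size=4)
```

This only hides stale rows on the current page. total_found still counts the deleted documents, and the offsets of later pages can shift by however many stale IDs were dropped.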

Ty W
A: 

Maybe this fits my needs better, but it involves changing the database.

http://sphinxsearch.com/docs/current.html#conf-sql-query-killlist

Alejandro G.
A: 

What I've done in my Ruby Sphinx adapter, Thinking Sphinx, is to track when records are deleted, and update a boolean attribute for the records in the main index (I call it sphinx_deleted). Then, whenever I search, I filter on values where sphinx_deleted is 0. In the sql_query configuration, I have the explicit attribute as follows:

SELECT fields, more_fields, 0 as sphinx_deleted FROM table

And of course there's the attribute definition as well.

sql_attr_bool = sphinx_deleted

Keep in mind that these updates to attributes (using the Sphinx API) are only stored in memory - the underlying index files aren't changed, so if you restart Sphinx, you'll lose this knowledge, unless you do a full index as well.
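
As a rough illustration of the two steps (this is not Thinking Sphinx's own code, which is Ruby), here is a sketch that assumes a client object exposing the UpdateAttributes, SetFilter and Query methods of the Python sphinxapi module shipped with Sphinx. The helper names mark_deleted and search_live are hypothetical:

```python
def mark_deleted(client, index, doc_ids):
    """Flag documents as deleted in searchd's in-memory attributes.

    client is assumed to be a sphinxapi.SphinxClient (or anything with the
    same UpdateAttributes method); values maps document ID to the list of
    new attribute values, one per attribute name.
    """
    values = {int(doc_id): [1] for doc_id in doc_ids}
    return client.UpdateAttributes(index, ["sphinx_deleted"], values)


def search_live(client, query, index):
    """Run a query that only matches documents not flagged as deleted."""
    client.SetFilter("sphinx_deleted", [0])
    return client.Query(query, index)
```

With something like this in place, the application would call mark_deleted right after removing a row from the database, so total_found and pagination stay consistent until the next full reindex.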

This is a bit of work, but it will ensure your result count and pagination will work neatly.

pat
