views:

2199

answers:

2

hello,

we decided to use mongodb for a web application (instead of mysql) but want to stay with sphinx for indexing/searching all data stored in mongodb. the mongodb object-id is a hash by default (and we want to keep it that way), which creates a problem when using sphinx. as it says in the sphinx documentation:

ALL DOCUMENT IDS MUST BE UNIQUE UNSIGNED NON-ZERO INTEGER NUMBERS (32-BIT OR 64-BIT, DEPENDING ON BUILD TIME SETTINGS).

so ... what's the best way to solve this problem ... how can we map the mongodb object-id to a non-zero integer (and back)?

thanks!

UPDATE

casey's answer is the right direction to look into. however, as it turns out, string attributes in the current dev version are only available for the sql datasource. for xmlpipe it's necessary to apply a patch to the checked-out source. more information on this can be found in the sphinx forum at:

http://sphinxsearch.com/forum/view.html?id=4102

+9  A: 

You can't use the object ID as a Sphinx document ID: MongoDB object IDs are 12 bytes (96 bits), which is bigger than the maximum size of Sphinx's document IDs (64 bits, and only 32 bits on some builds).
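To make the size mismatch concrete, here is a small Python check. The ObjectId value is a made-up sample, and `SPHINX_MAX_DOC_ID` and `objectid_as_int` are just names I picked for illustration:

```python
# Largest document ID a 64-bit Sphinx build can hold.
SPHINX_MAX_DOC_ID = 2**64 - 1


def objectid_as_int(oid_hex: str) -> int:
    """Interpret the 24-character hex form of an ObjectId as one integer."""
    return int(oid_hex, 16)


sample = "4b2b9f67a1f631733d917a7a"  # made-up ObjectId, illustration only
value = objectid_as_int(sample)

print(len(sample) * 4)            # 96 (bits in an ObjectId)
print(value > SPHINX_MAX_DOC_ID)  # True, so it cannot be a document ID
```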

Instead, you could increment a unique ID while generating the XML that Sphinx is going to process (I'm assuming you are using xmlpipe to get your Mongo data into Sphinx?) and store the MongoDB object ID as a string attribute in Sphinx.

You'll need the latest development version of Sphinx to do this - see my answer to this question for a little more detail: http://stackoverflow.com/questions/1644800/sphinx-without-using-an-autoincrement-id/1650190#1650190
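A rough sketch of what the xmlpipe generator could look like in Python. Treat it as an illustration, not a tested Sphinx setup: the field names (`title`, `body`), the attribute name `mongo_id`, and the function `build_xmlpipe2` are all made up, the documents are plain dicts standing in for whatever your MongoDB driver returns, and it assumes your Sphinx build accepts `type="string"` attributes in an xmlpipe2 schema.

```python
from xml.sax.saxutils import escape


def build_xmlpipe2(docs):
    """Build an xmlpipe2 stream with sequential integer document IDs.

    `docs` is any iterable of dicts with "_id", "title" and "body" keys
    (hypothetical field names). Returns the XML text and a dict mapping
    each generated integer ID back to the original object ID string.
    """
    lines = [
        '<?xml version="1.0" encoding="utf-8"?>',
        "<sphinx:docset>",
        "<sphinx:schema>",
        '<sphinx:field name="title"/>',
        '<sphinx:field name="body"/>',
        # Assumption: string attributes are supported by your build.
        '<sphinx:attr name="mongo_id" type="string"/>',
        "</sphinx:schema>",
    ]
    id_map = {}
    # Start at 1 because Sphinx document IDs must be non-zero.
    for doc_id, doc in enumerate(docs, start=1):
        mongo_id = str(doc["_id"])
        id_map[doc_id] = mongo_id
        lines.append('<sphinx:document id="%d">' % doc_id)
        lines.append("<title>%s</title>" % escape(doc["title"]))
        lines.append("<body>%s</body>" % escape(doc["body"]))
        lines.append("<mongo_id>%s</mongo_id>" % escape(mongo_id))
        lines.append("</sphinx:document>")
    lines.append("</sphinx:docset>")
    return "\n".join(lines), id_map


sample_docs = [
    {"_id": "4b2b9f67a1f631733d917a7a", "title": "first & only", "body": "hello"},
    {"_id": "4b2b9f67a1f631733d917a7b", "title": "second", "body": "world"},
]
xml_text, id_map = build_xmlpipe2(sample_docs)
print(xml_text)
```

Because the object ID travels along as an attribute, search results carry it back to you and you never have to convert the integer ID into an object ID. Note that the counter only yields the same IDs on a reindex if the documents come out in the same order.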

casey
thanks very much ... seems to be exactly what i need! generally i have no problem in running a development version. i'll try tomorrow and set 'answered', if everything works as expected. thanks again!
harald
A: 

Something I've been looking at playing with is SOLR.

EllisGL
