I have a fairly simple need to do a conditional update in Solr, which is easily accomplished in MySQL.

For example,

  • I have 100 documents with a unique field called <id>
  • I am POSTing 10 documents, some of which may be duplicate <id>s, in which case Solr would update the existing records with the same <id>s
  • I have a field called <dateCreated> and I would like to only update a <doc> if the new <dateCreated> is greater than the old <dateCreated> (this applies to duplicate <id>s only, of course)

How would I be able to accomplish such a thing?

The context: I am trying to combat race conditions in which multiple adds for the same ID execute in the wrong order.

Thanks.

+1  A: 

I can think of two ways:

  1. Write your own UpdateHandler and override addDoc to implement that checking.
  2. Put the appropriate locks (critical sections) in your client code in order to fetch the stored document, compare the dates, and conditionally add the new document in a thread-safe manner.
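As a rough illustration of option 2, here is a minimal Python sketch of the fetch, compare, and conditionally add sequence guarded by a lock. `InMemoryIndex` and `add_if_newer` are hypothetical names, and the in-memory dictionary is only a stand-in for Solr; a real client would replace `get` and `add` with a query and an update request against the server. Note also that a process-local lock like this one only helps if every write goes through the same process.

```python
import threading
from datetime import datetime


class InMemoryIndex:
    """Hypothetical stand-in for a Solr core: maps id -> document dict."""

    def __init__(self):
        self._docs = {}

    def get(self, doc_id):
        # In a real client this would be a query for the stored document.
        return self._docs.get(doc_id)

    def add(self, doc):
        # In a real client this would be an add/update request.
        self._docs[doc["id"]] = doc


# One lock shared by all writers in this process (the critical section).
_update_lock = threading.Lock()


def add_if_newer(index, doc):
    """Write doc unless a stored doc with the same id is as new or newer.

    Returns True if the document was written, False if it was skipped.
    """
    with _update_lock:
        existing = index.get(doc["id"])
        if existing is not None and doc["dateCreated"] <= existing["dateCreated"]:
            return False
        index.add(doc)
        return True


index = InMemoryIndex()
add_if_newer(index, {"id": "42", "dateCreated": datetime(2010, 1, 2)})
# Arrives late but is older than the stored copy, so it is skipped.
add_if_newer(index, {"id": "42", "dateCreated": datetime(2010, 1, 1)})
```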

Remember that Solr is not a database; comparing it to MySQL is comparing apples and oranges.

Mauricio Scheffer
Thanks, I was hoping for something that is already supported. #1 sounds useful but complicated - I'm not a Java developer. #2 is probably the approach I will go for within the constraints of the deadline.
Artem Russakovskii
A: 

With really custom addition logic like this, I find that writing your own client-side updater works better. It keeps you from mucking around in Solr internals, which makes it easier to upgrade Solr in the future. You can definitely do this in SolrJ, but if you aren't a Java dev, there is probably a client-side library in your own preferred language... PHP, Python, Ruby, C# etc...
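One piece such a client-side updater might include is a pre-filter that collapses each batch so only the newest document per ID is ever sent. The sketch below is an assumption about how that could look, not part of any Solr client library; `newest_per_id` is a hypothetical helper, and the field names `id` and `dateCreated` are taken from the question.

```python
from datetime import datetime


def newest_per_id(docs):
    """Return one document per id, keeping the one with the latest dateCreated."""
    newest = {}
    for doc in docs:
        current = newest.get(doc["id"])
        # Keep this doc if the id is new or this copy is strictly newer.
        if current is None or doc["dateCreated"] > current["dateCreated"]:
            newest[doc["id"]] = doc
    return list(newest.values())


batch = [
    {"id": "1", "dateCreated": datetime(2010, 1, 2)},
    {"id": "2", "dateCreated": datetime(2010, 1, 1)},
    {"id": "1", "dateCreated": datetime(2010, 1, 1)},  # stale duplicate
]
to_post = newest_per_id(batch)  # two documents remain, one per id
```

This only handles duplicates within a single batch; comparing against what is already indexed still needs a lookup before the add.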

The rsolr Ruby gem (http://github.com/mwmitchell/rsolr/tree/master) makes it VERY easy to hack together a custom load script.

Eric Pugh