I think I'm missing something obvious here. I have to imagine a lot of people open up their Solr servers to other developers and don't want them to be able to modify the index.

Is there something in solrconfig.xml that can be set to effectively make the index read-only?

Update for clarification: My goal is to use Solr with an existing Lucene index managed by another application. This works just fine, but I want to be sure Solr never tries to write to this index.

+2  A: 

You can probably just remove the line that defines your solr.XmlUpdateRequestHandler in solrconfig.xml.

Replication is a nice way to set up a read-only index while still being able to index. Set up a master with restricted access and a read-only slave (by removing the XmlUpdateRequestHandler from the slave's config). The slave replicates from the master but won't accept indexing requests directly.
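As a rough sketch of that master/slave setup, assuming Solr 1.4's built-in ReplicationHandler: the host name, port, and poll interval below are placeholders, not values from this thread.

```xml
<!-- Master solrconfig.xml: publish a new index version after each commit. -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
  </lst>
</requestHandler>

<!-- Slave solrconfig.xml: poll the master and pull new index versions. -->
<!-- "master-host:8983" and the 60-second interval are assumed values. -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```

The two fragments go into separate solrconfig.xml files, one per server. Only the master needs to be reachable by whatever does the indexing.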

UPDATE

I just read that in Solr 1.4 you can disable a component. I tried it on the /update requestHandler, and I was no longer able to index.
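A minimal sketch of what that looks like in solrconfig.xml, assuming the stock /update handler definition and the 'enable' attribute:

```xml
<!-- enable="false" tells Solr to skip this handler definition at load time. -->
<!-- Reload the core or restart the servlet container after changing it. -->
<requestHandler name="/update"
                class="solr.XmlUpdateRequestHandler"
                enable="false" />
```

It may be worth confirming the result on your own setup by posting a test document and checking that it never shows up in the index, since results seem to vary.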

Pascal Dimassimo
Apparently commenting out that request handler won't disable anything because it's just acting as an override (according to http://wiki.apache.org/solr/SolrRequestHandler). I guess you could stick in a bogus class for the /update request handler, but that seems like a bad idea.
wynz
Thanks, it's good to know.
Pascal Dimassimo
See my update about disabling a component.
Pascal Dimassimo
Nice find! I thought this 'disable component' feature was the golden ticket, but unfortunately it doesn't seem to let you disable core components, including /select, /update, and /admin. Thanks for your help in looking for a solution.
wynz
Did you test it? I tried it earlier, and either removing the requestHandler declaration or using the enable attribute was enough to disable /update on my index.
Pascal Dimassimo
Yeah, I tested it, but I'll try again. Changing the 'enable' attribute definitely worked for custom requestHandlers, but I couldn't get it to disable /update.
wynz
Yep, when 'enable' is set to false Solr just ignores the definition and uses the core handler. Same thing happens when I comment out the whole requestHandler line for /update.
wynz
Weird... Do you run Solr on Tomcat or Jetty? Mine is running on Tomcat.
Pascal Dimassimo
Tomcat 6.0.28 and Solr 1.4.1
wynz
And don't forget that each time you change something in solrconfig.xml, you have to reload your core or restart your web server.
Pascal Dimassimo
So I went and tested with a fresh Solr install. I found that commenting out the /update handler or setting 'enable' to false initially prevents documents from being added. However, they still get queued up and are committed later, when Tomcat restarts. So close... :(
wynz
+2  A: 

Exposing a Solr instance to the public internet is a bad idea. Even though you can strip some components to make it read-only, it just wasn't designed with security in mind. It's meant to be used as an internal service, just as you wouldn't expose an RDBMS.

From the Solr Security wiki page:

First and foremost, Solr does not concern itself with security either at the document level or the communication level. It is strongly recommended that the application server containing Solr be firewalled such the only clients with access to Solr are your own. A default/example installation of Solr allows any client with access to it to add, update, and delete documents (and of course search/read too), including access to the Solr configuration and schema files and the administrative user interface.

Even ajax-solr, a Solr client for JavaScript meant to run in a browser, recommends talking to Solr through a proxy.

Take for example guardian.co.uk: it's well-known that they use Solr for searching, but they built an API to let others access their content. This way they can define and control exactly what and how they want people to search for things.

Otherwise, any script kiddie can write a trivial loop to DoS your Solr instance and therefore bring down your site.
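If a firewall or proxy can't cover everything, one complementary option is path-based authentication in the servlet container. The fragment below is only a sketch for Solr's web.xml under Tomcat: the role name "solr-admin" and the realm name are made up for illustration, and the URL patterns assume a single-core install where the handlers live at /update and /admin.

```xml
<!-- Require an authenticated user in the (hypothetical) solr-admin role -->
<!-- for the write and admin paths; /select stays open to other clients. -->
<security-constraint>
  <web-resource-collection>
    <web-resource-name>Solr write and admin paths</web-resource-name>
    <url-pattern>/update/*</url-pattern>
    <url-pattern>/admin/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>solr-admin</role-name>
  </auth-constraint>
</security-constraint>

<login-config>
  <auth-method>BASIC</auth-method>
  <realm-name>Solr</realm-name>
</login-config>

<security-role>
  <role-name>solr-admin</role-name>
</security-role>
```

This only restricts who can reach those paths. It does nothing about query-based DoS against /select, so it is not a substitute for keeping Solr behind your own API.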

Mauricio Scheffer
+1 well said and thanks for the links
Pascal Dimassimo
Those are good suggestions, and hopefully anyone setting up Solr for production would follow them. But this doesn't really get to the question asked. I'll edit the question to clarify my particular use case.
wynz
@wynz: ok, it's cool if it's for internal use only.
Mauricio Scheffer