Hi,

I've got a problem that I can't seem to get to the bottom of. I'm a bit of a Solr newbie, so stick with me here.

I am writing a script that uses the PHP SolrClient to populate a Solr instance with a small set of documents, around 250.

The script runs through and seems to be populating fine. However, the indexed document count never exceeds 123.

Looking in the Tomcat logs, I've noticed this after committing to the index:

9-Oct-2010 22:59:09 org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {commit=} 0 867
19-Oct-2010 22:59:09 org.apache.solr.core.SolrCore execute
INFO: [corecv] webapp=/solr path=/update/ params={indent=on&wt=xml&version=2.2} status=0     QTime=867 
19-Oct-2010 22:59:09 org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=true,waitFlush=true,waitSearcher=true,expungeDeletes=false)
19-Oct-2010 22:59:09 org.apache.solr.core.SolrDeletionPolicy onInit
    INFO: SolrDeletionPolicy.onInit: commits:num=1
    commit{dir=/opt/solr/data/index,segFN=segments_7nw,version=1287434278309,generation=9932,filenames=[_6af.tii, _6ag.tis, _6ag.fdt, _6af.fdx, segments_7nw, _6af.prx, _6ag.fnm, _6ag.nrm, _6af.fnm, _6ag.fdx, _6ag.prx, _6ag.tii, _6af.tis, _6af.fdt, _6ag.frq, _6af.nrm, _6af.frq, _6ag_1.del]
19-Oct-2010 22:59:09 org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: newest commit = 1287434278309
19-Oct-2010 22:59:09 org.apache.solr.core.SolrDeletionPolicy onCommit
INFO: SolrDeletionPolicy.onCommit: commits:num=2
    commit{dir=/opt/solr/data/index,segFN=segments_7nw,version=1287434278309,generation=9932,filenames=[_6af.tii, _6ag.tis, _6ag.fdt, _6af.fdx, segments_7nw, _6af.prx, _6ag.fnm, _6ag.nrm, _6af.fnm, _6ag.fdx, _6ag.prx, _6ag.tii, _6af.tis, _6af.fdt, _6ag.frq, _6af.nrm, _6af.frq, _6ag_1.del]
    commit{dir=/opt/solr/data/index,segFN=segments_7nx,version=1287434278310,generation=9933,filenames=[_6ah.fnm, _6ah.fdx, _6ah.tis, segments_7nx, _6ah.prx, _6ah.frq, _6ah.fdt, _6ah.nrm, _6ah.tii]
19-Oct-2010 22:59:09 org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: newest commit = 1287434278310
19-Oct-2010 22:59:09 org.apache.solr.search.SolrIndexSearcher <init>
INFO: Opening Searcher@19e34ed main
19-Oct-2010 22:59:09 org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming Searcher@19e34ed main from Searcher@1629d39 main
    fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0

No idea what's going on here, but it looks like documents are being deleted after I commit.

My PHP code loops through a DB recordset, creates and populates a SolrInputDocument for each row, and commits after each iteration of the loop, e.g.:

$solrClient = new SolrClient($options);
foreach ($dataset as $row) {

  // Build one Solr document per DB row.
  $document = new SolrInputDocument();
  $document->addField("content", $row["content"]);
  ....

  $document->addField("id", $row["id"]);

  // Send the document and commit it straight away.
  $solrClient->addDocument($document);
  $solrClient->commit();
}
$solrClient->optimize();

I've trimmed the code, as it's a bit long.

Any idea why this could cause documents not to appear in the index? I've even used Luke to inspect the index and can see the documents in there. The odd thing is that the documents being removed don't seem to be consistent.
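
One check that may be worth running on the recordset before indexing, assuming "id" is the uniqueKey in the schema (the schema isn't shown here, so that is a guess): Solr replaces an existing document when a new one arrives with the same uniqueKey, so repeated ids in the dataset would keep the final count below the number of rows. A rough sketch of that check follows; findDuplicateIds is a made-up helper name, and the sample rows are invented stand-ins for the real recordset.

```php
<?php
// Hypothetical helper: returns every id that occurs more than once in
// $dataset, mapped to the number of times it occurs.
function findDuplicateIds(array $dataset): array
{
    // Collect the "id" column and count occurrences of each value.
    $ids = array_map('strval', array_column($dataset, 'id'));
    $counts = array_count_values($ids);

    // Keep only the ids that appear more than once.
    return array_filter($counts, function ($n) {
        return $n > 1;
    });
}

// Invented sample rows standing in for the DB recordset.
$dataset = [
    ['id' => 'a1', 'content' => 'first'],
    ['id' => 'b2', 'content' => 'second'],
    ['id' => 'a1', 'content' => 'third'],
];

$distinct = count(array_unique(array_column($dataset, 'id')));
printf("%d rows, %d distinct ids\n", count($dataset), $distinct);
print_r(findDuplicateIds($dataset));
```

If the number of distinct ids comes out at 123 for the real data, the missing documents would be explained by overwrites rather than by anything going wrong at commit time. If it comes out at around 250, this is not the cause and the problem lies elsewhere.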

Any help would really be useful.

Thanks,

Grant