I have a set of documents that all have a "timestamp" field which is stored as a long integer number. The field is indexed in my Lucene index as a number using NumericField with a precision step of 8: NumericField("timestamp", 8). This is done so I can do numeric range queries to retrieve all documents that fall within a specific time range.
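For context, here is a rough plain-Java sketch of how I understand a precision step of 8 to work: each 64-bit value is indexed as several terms, each dropping 8 more low-order bits than the last. The class and method names are hypothetical, and this is only a simplified model of the idea, not Lucene's actual term encoding.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of trie prefixes for a long value (assumption: not Lucene's real byte encoding).
final class TriePrefixSketch {

    /** Number of prefix terms produced for one 64-bit value at the given precision step. */
    static int termsPerValue(int precisionStep) {
        return (64 + precisionStep - 1) / precisionStep;
    }

    /** Prefixes of the value at shifts 0, step, 2*step, ... below 64 bits. */
    static List<Long> prefixes(long value, int precisionStep) {
        // Flip the sign bit so that unsigned bit order matches signed numeric order.
        long sortable = value ^ 0x8000000000000000L;
        List<Long> out = new ArrayList<>();
        for (int shift = 0; shift < 64; shift += precisionStep) {
            out.add(sortable >>> shift); // lower-precision prefix at this shift
        }
        return out;
    }

    public static void main(String[] args) {
        long timestamp = 1300000000000L; // hypothetical value, for illustration only
        System.out.println(termsPerValue(8) + " terms: " + prefixes(timestamp, 8));
    }
}
```

If this model is right, a range filter should be able to combine full-precision terms near the range edges with lower-precision terms in the middle, and an exact value should still be matched through its full-precision (shift 0) term.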

The search I run has two parts: a query and a filter. I get the document hits by calling the following method:

IndexSearcher.search(query, filter, myCollector);

The query parameter is one that matches all documents. The filter parameter is a numeric range filter on the "timestamp" field, which I create as follows:

filter = NumericRangeFilter.newLongRange("timestamp", 8, startTime, endTime, false, true);

Occasionally, I want to retrieve a single document with a very specific timestamp. If that timestamp is timeX, I create the filter as follows:

filter = NumericRangeFilter.newLongRange("timestamp", 8, timeX-1, timeX, false, true);

But with this filter, the document that should be found is never found. I have even tried expanding the time range as follows, but with no success:

filter = NumericRangeFilter.newLongRange("timestamp", 8, timeX-1, timeX+500, false, true);

Strangely, a filter that should NOT have found the document actually did find the document:

filter = NumericRangeFilter.newLongRange("timestamp", 8, timeX, timeX+1000, false, true);

This filter should NOT have found the document since the minInclusive argument is false.
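To make my expectation explicit, here is a minimal plain-Java sketch of the bounds check I assume the filter performs with minInclusive=false and maxInclusive=true. It models only the range semantics, not Lucene's trie lookup; the class name, the inRange helper, and the sample value of timeX are hypothetical.

```java
// Plain-Java model of the range bounds only (assumption: it does not use or emulate Lucene itself).
final class ExpectedRangeSemantics {

    /** True if value lies between min and max under the given inclusiveness flags. */
    static boolean inRange(long value, long min, long max,
                           boolean minInclusive, boolean maxInclusive) {
        boolean aboveMin = minInclusive ? value >= min : value > min;
        boolean belowMax = maxInclusive ? value <= max : value < max;
        return aboveMin && belowMax;
    }

    public static void main(String[] args) {
        long timeX = 1300000000000L; // hypothetical timestamp, for illustration only

        // (timeX-1, timeX]: I expect a match.
        System.out.println(inRange(timeX, timeX - 1, timeX, false, true));       // true
        // (timeX-1, timeX+500]: I expect a match.
        System.out.println(inRange(timeX, timeX - 1, timeX + 500, false, true)); // true
        // (timeX, timeX+1000]: I expect no match, because the lower bound is exclusive.
        System.out.println(inRange(timeX, timeX, timeX + 1000, false, true));    // false
    }
}
```

What I actually observe is the opposite of all three results: the first two filters miss the document and the third one finds it.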

I have also noticed that sometimes when I have several documents with exactly the same timestamp, a query will return some, but not all, of the documents.

This behavior has caused me to lose confidence in the so-called "trie" indexing for numeric range queries in Lucene. Am I doing something wrong here? Have I misunderstood how this is supposed to work? Has anyone else had problems like this?