Hi,

I have two sets of search indexes: TestIndex (used in our test environment) and ProdIndex (used in our production environment). The Lucene search query +date:[20090410184806 TO 20091007184806] works fine against the test index but gives this error message against the prod index:

"maxClauseCount is set to 1024"

If I raise the clause limit just before executing the search query, I do not get this error:

    BooleanQuery.SetMaxClauseCount(Int16.MaxValue);
    searcher.Search(myQuery, collector);

Am I missing something here? Why am I not getting this error with the test index? The schemas of the two indexes are the same; they differ only in the amount of data. The prod index has more records (around 1300) than the test index (around 950).

Thanks for reading.

+1  A: 

The range query essentially gets transformed to a boolean query with one clause for every possible value, ORed together.

For example, the query +price:[10 TO 13] is transformed to the boolean query

+(price:10 price:11 price:12 price:13)

assuming all the values 10-13 exist in the index.

I suppose all 1300 of your values fall within the range you have given, so the boolean query has 1300 clauses, which exceeds the default limit of 1024. In the test index, the limit is not reached because there are only 950 values.
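
As a rough illustration of the expansion described above, here is a minimal sketch in plain Java. It does not use Lucene at all: the class `RangeExpansionSketch` and the helper `expand` are hypothetical names, and the sorted set simply stands in for the distinct terms stored in a field. It assumes terms are compared as strings, so it only mimics the general idea of one clause per matching term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

final class RangeExpansionSketch {

    // Hypothetical helper (not a Lucene API): returns one clause per
    // indexed term that falls inside the inclusive range [lower, upper].
    static List<String> expand(String field, SortedSet<String> indexedTerms,
                               String lower, String upper) {
        List<String> clauses = new ArrayList<>();
        // subSet() excludes its upper bound, so append "\0" to make the
        // upper end of the range inclusive.
        for (String term : indexedTerms.subSet(lower, upper + "\0")) {
            clauses.add(field + ":" + term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        SortedSet<String> prices =
            new TreeSet<>(Arrays.asList("09", "10", "11", "12", "13", "14"));
        // prints [price:10, price:11, price:12, price:13]
        System.out.println(expand("price", prices, "10", "13"));
    }
}
```

The number of clauses equals the number of distinct indexed terms inside the range, not the width of the range itself. With timestamps stored to the second, nearly every record contributes its own term.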

Shashikant Kore
Thanks, Shashikant, for your answer. What is the solution to this issue? BooleanQuery.SetMaxClauseCount(Int16.MaxValue); is supposedly a very expensive call. Thanks.
Ed
The downside is that query performance degrades with the number of unique timestamps. But it is not that bad. You can try it out and check whether the performance is acceptable. You should mostly be fine. Lucene 2.9 (Java) has improved range queries dramatically. I am not sure when this will be ported to the .NET version. Meanwhile, there are other tricks you can use for date queries. Typically, they involve breaking the date into year, month and day. This needs a lot of work to translate the user query into the underlying Lucene format. Try searching for "lucene date query" to get interesting ideas.
Shashikant Kore
In the meantime, you can design your date field differently. Could you restrict it to days in a single year (thus restricting it to 365 values)? Or split the date into year, month and day and use a more complex query? I know this is inelegant, but it may work.
Yuval F
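
One way to picture the coarser-date idea from the comments above is the following sketch in plain Java. It assumes the timestamps are 14-character yyyyMMddHHmmss strings, as in the question, and that a day-level value would be indexed in a separate field. The class `DateResolutionSketch` and both helpers are hypothetical names used only for illustration. Lucene's own `DateTools` class offers a resolution setting for a similar purpose, but this sketch does not depend on it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class DateResolutionSketch {

    // Hypothetical helper: keeps only the yyyyMMdd prefix of a
    // yyyyMMddHHmmss timestamp, i.e. reduces it to day resolution.
    static String toDay(String timestamp) {
        if (timestamp.length() != 14) {
            throw new IllegalArgumentException(
                "expected yyyyMMddHHmmss but got: " + timestamp);
        }
        return timestamp.substring(0, 8);
    }

    // Counts the distinct day-level terms the timestamps would produce.
    static int distinctDayTerms(List<String> timestamps) {
        Set<String> days = new TreeSet<>();
        for (String timestamp : timestamps) {
            days.add(toDay(timestamp));
        }
        return days.size();
    }

    public static void main(String[] args) {
        List<String> stamps = Arrays.asList(
            "20090410184806", "20090410235959", "20091007184806");
        // Three distinct second-level terms collapse to two day-level terms.
        System.out.println(distinctDayTerms(stamps)); // prints 2
    }
}
```

With day resolution, the range in the question becomes [20090410 TO 20091007], which covers at most 181 distinct terms regardless of how many records the index holds, at the cost of losing time-of-day precision in that field.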
A: 

I had the same problem. My solution was to catch BooleanQuery.TooManyClauses and dynamically increase maxClauseCount.

Here is some code that is similar to what I have in production.

Good Luck, Randy


    // Runs the query; whenever the expanded query exceeds the clause limit,
    // doubles the limit and tries again.
    // Assumes a logger field named "log" is defined elsewhere in the class.
    private static Hits searchIndex(Searcher searcher, Query query)
        throws IOException
    {
        // Loop until the search succeeds; every path either returns or retries.
        while (true)
        {
            try
            {
                return searcher.search(query);
            }
            catch (BooleanQuery.TooManyClauses e)
            {
                // Double the number of boolean clauses allowed.
                // The default is in org.apache.lucene.search.BooleanQuery and is 1024.
                String defaultClauses = Integer.toString(BooleanQuery.getMaxClauseCount());
                int oldClauses = Integer.parseInt(System.getProperty("org.apache.lucene.maxClauseCount", defaultClauses));
                int newClauses = oldClauses * 2;
                log.error("Too many clauses for query: " + oldClauses + ".  Increasing to " + newClauses, e);
                System.setProperty("org.apache.lucene.maxClauseCount", Integer.toString(newClauses));
                BooleanQuery.setMaxClauseCount(newClauses);
            }
        }
    }
Randy Stegbauer