I'm using Hibernate Search / Lucene to maintain a really simple index to find objects by name - no fancy stuff.

My model classes all extend a class NamedModel which looks basically as follows:

@MappedSuperclass
public abstract class NamedModel {
    @Column(unique = true)
    @Field(store = Store.YES, index = Index.UN_TOKENIZED)
    protected String name;
}

My problem is that I get a BooleanQuery$TooManyClauses exception when querying the index for objects with names starting with a specific letter, e.g. "name:l*". A query like "name:lin*" works without problems; in fact, any query with more than one letter before the wildcard works.

While searching the net for similar problems, I only found people using pretty complex queries and that always seemed to cause the exception. I don't want to increase maxClauseCount because I don't think it's a good practice to change limits just because you reach them.

What's the problem here?

+3  A: 

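Lucene rewrites a prefix query such as name:l* into a BooleanQuery with one clause for every indexed term starting with l (something like name:lou OR name:la OR name: ...), so that each matching term can be scored. If more terms match than BooleanQuery's maxClauseCount allows (1024 by default), the rewrite throws TooManyClauses. A longer prefix such as lin* matches fewer terms and stays under the limit, which is why it works.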
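To see why a one-letter prefix trips the limit while a longer one does not, here is a minimal simulation of the expansion in plain Java. It is a sketch, not Lucene's actual code: the class PrefixExpansionSketch, the method expand and the sample names are made up for illustration, and only the limit of 1024 mirrors Lucene's default.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Simplified simulation of prefix expansion; NOT Lucene's implementation.
class PrefixExpansionSketch {

    // Mirrors the default clause limit of Lucene's BooleanQuery.
    static final int MAX_CLAUSE_COUNT = 1024;

    // Builds one "clause" per indexed term that starts with the prefix and
    // fails as soon as another clause would exceed the limit.
    static List<String> expand(SortedSet<String> terms, String prefix) {
        List<String> clauses = new ArrayList<>();
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // terms are sorted, so no later term can match
            }
            if (clauses.size() >= MAX_CLAUSE_COUNT) {
                throw new IllegalStateException("too many clauses for prefix " + prefix);
            }
            clauses.add("name:" + term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>();
        for (int i = 0; i < 2000; i++) {
            terms.add("l" + i); // 2000 untokenized names starting with "l"
        }
        terms.add("linda");
        terms.add("linus");

        // Only two terms start with "lin", so the expansion stays small.
        System.out.println(expand(terms, "lin"));

        // Every term starts with "l", so the expansion exceeds the limit.
        try {
            expand(terms, "l");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the name field is untokenized, every distinct name is one term, so the number of clauses for name:l* is simply the number of distinct names starting with l.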

As a workaround, you may use a ConstantScoreQuery wrapping a PrefixFilter instead of a PrefixQuery:

// instead of new PrefixQuery(prefix)
new ConstantScoreQuery(new PrefixFilter(prefix));

However, this changes the scoring of documents (and hence the sort order if you sort by score). As we needed score (and boost), we decided on a solution that uses PrefixQuery where possible and falls back to the ConstantScoreQuery only where needed:

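// prefix is the final Term from the enclosing scope
final Query query = new PrefixQuery(prefix) {
  @Override
  public Query rewrite(final IndexReader reader) throws IOException {
    try {
      // normal path: expands to a BooleanQuery and keeps per-term scoring
      return super.rewrite(reader);
    } catch (final BooleanQuery.TooManyClauses e) {
      log.debug("falling back to ConstantScoreQuery for prefix " + prefix + " (" + e + ")");
      // too many matching terms: constant score, but keep the configured boost
      final Query q = new ConstantScoreQuery(new PrefixFilter(prefix));
      q.setBoost(getBoost());
      return q;
    }
  }
};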

(As an enhancement, one could use some kind of LRUMap to cache prefixes that failed before, to avoid going through a costly rewrite again.)
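One way such a cache could look, sketched with nothing but java.util.LinkedHashMap in access order. The class FailedPrefixCache and its methods markFailed and hasFailed are hypothetical names for illustration; this is not part of Lucene or of the code we actually ran.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: remembers prefixes whose rewrite already hit the clause limit.
class FailedPrefixCache {

    private final Map<String, Boolean> failed;

    FailedPrefixCache(final int capacity) {
        // accessOrder = true keeps entries ordered from least to most recently used
        this.failed = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                // drop the least recently used prefix once capacity is exceeded
                return size() > capacity;
            }
        };
    }

    // Record that rewriting this prefix threw TooManyClauses.
    synchronized void markFailed(String prefix) {
        failed.put(prefix, Boolean.TRUE);
    }

    // Uses get() rather than containsKey() so that a hit refreshes the entry's recency.
    // Synchronized because get() reorders an access-ordered map.
    synchronized boolean hasFailed(String prefix) {
        return failed.get(prefix) != null;
    }
}
```

In the rewrite override, one would check hasFailed(prefix.text()) before calling super.rewrite(reader) and go straight to the ConstantScoreQuery on a hit, and call markFailed(prefix.text()) in the catch block. Note that a cached failure can become stale if matching terms are later removed from the index.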

I can't help you with integrating this into Hibernate Search though. You might ask after you've switched to Compass ;)

sfussenegger
Thank you very much. As far as I can tell, there's no direct way to change the generated `Query` type in Hibernate Search. But I changed the analyzer to `KeywordAnalyzer`, which doesn't generate the failing queries and also fits my needs better.
Koraktor