I'm using Google App Engine (Java) with JDO. How can I do the JDO equivalent of the following query?

select * from table where field like '%foo%'

The only recommendation I have seen so far is to use Lucene. I'm kind of surprised that something this basic is not possible out of the box on GAE.

+6  A: 

You can't do substring searches of that sort on App Engine. The App Engine datastore is built to be scalable, and it refuses to execute any query it can't satisfy with an index. Indexing a query like this is next to impossible, because a match can start at any position inside each record's 'field' property, so an ordinary sorted index on that property doesn't help. A relational database will typically execute this query by doing a full table scan and checking each record individually - unscalable, to say the least.
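
To make the cost concrete, here is a minimal plain-Java sketch of what a '%foo%' match boils down to without a suitable index. The class and method names (SubstringScan, scan) are made up for illustration and have nothing to do with the JDO or datastore APIs; the point is only that every stored value has to be inspected.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not part of JDO or the App Engine API.
final class SubstringScan {
    // Returns every value that contains the needle.
    // The loop visits each value once, so the work grows with the
    // number of records no matter how few of them match.
    static List<String> scan(List<String> values, String needle) {
        List<String> hits = new ArrayList<>();
        for (String value : values) {
            if (value.contains(needle)) {
                hits.add(value);
            }
        }
        return hits;
    }
}
```

Calling SubstringScan.scan on a list of a million values performs a million contains checks even if only one value matches, which is exactly the kind of work the datastore declines to do.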

The solution, as you've already found out, is to use full-text indexing, such as Lucene. There are libraries for running Lucene on App Engine, such as GAELucene. This also gives you the power of proper full-text search, rather than naive substring matching.

Nick Johnson
Nick, thanks for the reply. Would it be possible for the datastore to create an index on individual words, rather than matching '%foo%'? I mean, Google is obviously able to do keyword searches, if not regexp-like searches. What I'm really trying to accomplish is to scan a set of recipes for keywords, so perhaps I formulated my question poorly. Thanks.
Caffeine Coma
Yes - and what you're referring to is called an "inverted index" - and it's what libraries like Lucene use. For Python, there's SearchableModel, which implements this pattern. You could do the same in Java, if you wanted, but you're probably better off just using Lucene.
Nick Johnson
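
For anyone curious what the inverted index idea from the comment above looks like, here is a rough in-memory sketch in plain Java for the recipe keyword case. It is not how SearchableModel or Lucene are implemented, and the names (RecipeKeywordIndex, add, search) are invented for this example. It assumes recipes are identified by a numeric ID and that splitting on non-alphanumeric characters is a good enough tokenizer. How the postings would be persisted in the datastore is left out.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of an inverted index: word -> IDs of recipes containing it.
final class RecipeKeywordIndex {
    private final Map<String, Set<Long>> postings = new HashMap<>();

    // Splits the text into lowercase words and records the recipe ID under each word.
    void add(long recipeId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(recipeId);
            }
        }
    }

    // Looks up a single whole word; this is one map lookup, not a scan of every recipe.
    Set<Long> search(String keyword) {
        Set<Long> ids = postings.get(keyword.toLowerCase(Locale.ROOT));
        return ids == null ? Collections.<Long>emptySet() : Collections.unmodifiableSet(ids);
    }
}
```

Note the trade-off: this matches whole words only, so a recipe titled "Butterscotch pudding" is found under "butterscotch" but not under "butter". That is usually what you want for keyword search, and it is the reason the lookup can be answered from an index at all.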