I'd like to implement a filter/search feature in my application using Lucene.

Querying a Lucene index gives me a Hits instance, which is nothing more than a list of the Documents matching my criteria.

Since I generate the indexed Documents from my own objects, what is the best way to find the original object that corresponds to a specific Lucene Document?


A better description of my situation:
- Three model classes for now: Folder (can have other Folders or Lists as children), List (can have Tasks as children) and Task (can have other Tasks as children). They are all DefaultMutableTreeNode subclasses. I'll add the Tag entity in the future.
- Each Task has a text, a start date, a due date, some boolean flags.
- They are displayed in a JTree.
- The whole tree is saved in an XML file.
- I'd like to do things like these:
* Search Tasks with Google-like queries.
* Find all Tasks that start today.
* Filter Tasks by Tag.
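
To make the "Documents generated from my objects" step concrete, here is a minimal sketch of how a Task could be flattened into field name/value pairs before indexing. It uses only the JDK; `TaskFields`, the field names, and the `id` parameter are hypothetical and not taken from the original application. Dates are written as `yyyyMMdd` strings so that "starts today" becomes an exact match on one field value. Turning the map into an actual Lucene Document is left out.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: flattens a Task into the name/value pairs that would
// become the fields of its Document. Not part of Lucene.
final class TaskFields {
    // yyyyMMdd, e.g. 20240131; sorts lexicographically in date order.
    private static final DateTimeFormatter DAY = DateTimeFormatter.BASIC_ISO_DATE;

    private TaskFields() {}

    static Map<String, String> toFields(String id, String text,
                                        LocalDate start, LocalDate due) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", id);                  // assumed unique per Task
        fields.put("text", text);              // the part to search Google-style
        fields.put("start", start.format(DAY));
        fields.put("due", due.format(DAY));
        return fields;
    }

    // "Find all Tasks that start today" reduces to an exact match on "start".
    static boolean startsOn(Map<String, String> fields, LocalDate day) {
        return day.format(DAY).equals(fields.get("start"));
    }
}
```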

+2  A: 

You can't, not with vanilla Lucene. You said yourself that you converted your objects into Documents and then stored the Documents in Lucene. How would you expect that process to be reversible?

If you want to store and retrieve your own objects in Lucene, I strongly recommend that you use Compass instead. Compass is to Lucene what Hibernate is to JDBC: you define a mapping between your objects and Lucene documents, and Compass takes care of the conversion.

skaffman
Hibernate Search is to information retrieval what Hibernate is to relational databases. I haven't examined Hibernate Search in depth, but I have looked at Compass, and I believe it made a fundamental design mistake by implementing a JDBC-based `Directory` instead of an `IndexReader`. I really discourage the use of Compass.
erickson
Compass can use whatever Lucene directory you choose; the JDBC-based one is just one option. You can also use RAM directories and file system directories. If that's the basis on which you've been recommending against Compass, you've been doing so on wrong information.
skaffman
And Hibernate Search is for indexing Hibernate databases; it is *not* a general indexing mechanism. Lucene (and Compass) are.
skaffman
If using other directories, how is atomicity preserved between the index and the stored entities?
erickson
By using Transactions. Hibernate Search has the same problem to solve, and solves it in the same way.
skaffman
Sounds interesting, I'll look into it!
Giuseppe
+2  A: 

Add a "stored" field that contains an object identifier. For each hit, look up the original object via the identifier.

Without knowing more context, it's hard to be more specific.

erickson
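
A minimal sketch of the lookup side of this suggestion, using only the JDK. `NodeRegistry` is a hypothetical helper, not part of Lucene, and the choice of random UUIDs as identifiers is an assumption. The string returned by `register` is what would be written into the stored identifier field of the node's Document; after a search, the identifiers read back from the hits are passed to `resolveAll`. The Lucene calls themselves are omitted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import javax.swing.tree.DefaultMutableTreeNode;

// Hypothetical helper: remembers which identifier belongs to which tree node.
final class NodeRegistry {
    private final Map<String, DefaultMutableTreeNode> nodesById = new HashMap<>();

    // Assigns a fresh identifier to the node and records the pair.
    // The returned value is what the node's Document would store.
    String register(DefaultMutableTreeNode node) {
        String id = UUID.randomUUID().toString();
        nodesById.put(id, node);
        return id;
    }

    // Returns the node for an identifier, or null if it is unknown.
    DefaultMutableTreeNode resolve(String id) {
        return nodesById.get(id);
    }

    // Maps identifiers taken from search hits back to nodes, keeping hit
    // order and skipping identifiers that no longer have a node.
    List<DefaultMutableTreeNode> resolveAll(List<String> hitIds) {
        List<DefaultMutableTreeNode> result = new ArrayList<>();
        for (String id : hitIds) {
            DefaultMutableTreeNode node = nodesById.get(id);
            if (node != null) {
                result.add(node);
            }
        }
        return result;
    }
}
```

If the index is kept on disk between runs, the identifiers would have to be saved together with the nodes (for example in the XML file) so that both sides still agree after a restart.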
Yes, this is the easy way to do this. I guess you could serialize your objects into documents, and then recreate them, but this sounds like a bad design.
Yuval F
Since my objects are stored in a Tree, I would have to walk the whole tree to find the object I'm looking for. This would make Lucene useless.
Giuseppe
Hardly. Lucene is an information retrieval system. Its data structures are different from those used to efficiently look up a record by key. I'm not sure what kind of "Tree" you are referring to, but if you mean a `java.util.TreeMap`, rather than walking the whole tree, you'll get an O(log n) lookup (or an O(1) lookup, if you switch to a `HashMap`). It's a similar story if you use a B-Tree on disk. Lucene offers many features not available from a simple tree on its own: tokenization, stemming, relevance ranking, etc. Perhaps you are using one or the other incorrectly if the difference isn't apparent.
erickson
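
To illustrate the `HashMap` point for a tree of `DefaultMutableTreeNode`s: the sketch below walks the tree once (for instance right after it is loaded) to build an identifier-to-node map, after which each hit costs one hash lookup. `TreeIndex` and the `idOf` function are hypothetical; how a node exposes its identifier depends on the model classes, which are not shown in the thread.

```java
import java.util.Enumeration;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import javax.swing.tree.DefaultMutableTreeNode;
import javax.swing.tree.TreePath;

// Hypothetical helper: builds a key -> node map over an existing tree.
final class TreeIndex {
    private TreeIndex() {}

    // Single traversal of the tree; every later lookup is a map access.
    static Map<String, DefaultMutableTreeNode> build(
            DefaultMutableTreeNode root,
            Function<DefaultMutableTreeNode, String> idOf) {
        Map<String, DefaultMutableTreeNode> byId = new HashMap<>();
        Enumeration<?> nodes = root.depthFirstEnumeration();
        while (nodes.hasMoreElements()) {
            DefaultMutableTreeNode node = (DefaultMutableTreeNode) nodes.nextElement();
            byId.put(idOf.apply(node), node);
        }
        return byId;
    }

    // Path from the root to the node, computed by following parent links
    // upward, so no search of the tree is involved.
    static TreePath pathTo(DefaultMutableTreeNode node) {
        return new TreePath(node.getPath());
    }
}
```

The resulting `TreePath` can be handed to `JTree.setSelectionPath` or `JTree.scrollPathToVisible` to show a hit in the tree. The map has to be kept up to date when nodes are added or removed.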
I'd like to use Lucene to find objects in the Tree (a DefaultTreeModel) so that I can avoid walking through it. What I'm trying to say is that this would be useless if I had to walk the tree anyway to get the objects corresponding to the Documents returned by Lucene.
Giuseppe