I'm working on legacy code that builds an index of the popular terms found in another index. There are no unit tests in place, and the indexing process is a pain to wait for because the first index takes so long to build.

I want to structure the second (popular term) index differently. Is there a best practice for testing whether a Lucene index is being created properly?

EDIT>> Per @Pascal's advice I'm now using a RAMDirectory. To test the index I just wrote, I open an IndexReader on it and iterate through the terms, printing each one to make sure the data looks all right.

Code:

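// dir2 is the Directory the popular-term index was written to.
IndexReader reader = IndexReader.open(dir2);
try {
    TermEnum terms = reader.terms();
    System.out.println("Here come the terms!");
    // The enumeration starts before the first term, so next() is called first.
    while (terms.next()) {
        if (terms.term().field().equals("FULLTEXT")) {
            System.out.println(terms.term());
        }
    }
    terms.close();
    // numDocs() excludes deleted documents; maxDoc() would count them too.
    System.out.println("Number of Docs: " + reader.numDocs());
} finally {
    reader.close();
}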

If the index is really large, I let it run for a bit and then just stop it midway through.

Also, Luke is a great tool for inspecting the index if you want to be more thorough... I'm just looking for something fast.

Any other ideas are welcome!

+2  A: 

When unit-testing a Lucene index, I often use a RAMDirectory, as it is quick to build.
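A minimal sketch of that approach, assuming Lucene 3.6 on the classpath (the `IndexWriterConfig` constructor and `Version.LUCENE_36` are specific to that line of releases; older versions build the `IndexWriter` differently). The class name `RamIndexSketch`, the helpers `buildIndex` and `termsIn`, and the sample texts are made up for illustration; only the `FULLTEXT` field name comes from the question.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TermEnum;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

// Hypothetical helper class: builds a tiny in-memory index and reads its terms back.
public class RamIndexSketch {
    static final String FIELD = "FULLTEXT";

    // Index each text as one document in a RAMDirectory (nothing touches disk).
    static Directory buildIndex(String... texts) throws IOException {
        Directory dir = new RAMDirectory();
        IndexWriterConfig config = new IndexWriterConfig(
                Version.LUCENE_36, new WhitespaceAnalyzer(Version.LUCENE_36));
        IndexWriter writer = new IndexWriter(dir, config);
        try {
            for (String text : texts) {
                Document doc = new Document();
                doc.add(new Field(FIELD, text, Field.Store.NO, Field.Index.ANALYZED));
                writer.addDocument(doc);
            }
        } finally {
            writer.close(); // commits the documents so a reader can see them
        }
        return dir;
    }

    // Collect the text of every term in the given field, in index (sorted) order.
    static List<String> termsIn(Directory dir, String field) throws IOException {
        List<String> result = new ArrayList<String>();
        IndexReader reader = IndexReader.open(dir);
        try {
            TermEnum terms = reader.terms();
            while (terms.next()) {
                if (field.equals(terms.term().field())) {
                    result.add(terms.term().text());
                }
            }
            terms.close();
        } finally {
            reader.close();
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Directory dir = buildIndex("lucene index", "lucene test");
        System.out.println(termsIn(dir, FIELD));
    }
}
```

In a unit test, the printed list can be replaced by an assertion against the terms you expect, which avoids eyeballing the output and avoids rebuilding the slow source index: feed the popular-term code a small hand-built RAMDirectory instead.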

Pascal Dimassimo
Good idea. That way it also doesn't persist, right?
stinkycheeseman
Yes, it won't persist to disk, but it will stay in memory for the duration of the test.
Pascal Dimassimo
This works really well, thanks Pascal!
stinkycheeseman