I am interviewing candidates for a position developing an application which relies heavily on Lucene. In addition to the usual questions I ask, I'd like to be able to ask one or two Lucene-specific questions that will give me a rough idea of how familiar they are with the library. The problem is that I have no experience with Lucene myself. Any suggestions?
If the candidate has a long history of Java development, familiarity with the Lucene API shouldn't be that important. Someone unfamiliar with Lucene might take a little longer to get started, but in the long run, I would feel much more comfortable with a very experienced Java developer than with a somewhat experienced Java developer who has Lucene experience. In fact, I might prefer a very experienced non-Java programmer if their portfolio was impressive.
A couple of questions I would ask:
- What is Lucene's core data structure? (inverted index)
- How does Lucene compute the relevancy of a document? (vector space model, boolean model)
- What is a segment? (a portion of the index)
- How is text indexed? (analyzers, tokenizers)
- What is a document? (collection of fields)
- What does the Lucene query syntax look like? (boolean query, boost, fuzzy searches)
- How does it differ from a relational database, and when would you use one over the other?
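If you have no Lucene background yourself, a toy version of the first few ideas may help you judge the answers. The sketch below is plain Java and is not Lucene's API or implementation; the class `ToyInvertedIndex` and its methods `analyze`, `add`, and `search` are hypothetical names made up for illustration. It only shows, roughly, what "a document is a collection of fields", "an analyzer turns text into tokens", and "an inverted index maps terms to documents" mean.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy inverted index: illustrates the concepts only, not how Lucene is implemented. */
public class ToyInvertedIndex {
    // A "document" here is simply a map of field name -> text (a simplification).
    private final List<Map<String, String>> documents = new ArrayList<>();
    // Postings: "field:term" -> sorted ids of the documents containing that term.
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    /** Stand-in for an analyzer: lowercase, then split on non-alphanumeric characters. */
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Stores the document and records every term of every field; returns the new id. */
    int add(Map<String, String> doc) {
        int id = documents.size();
        documents.add(doc);
        for (Map.Entry<String, String> field : doc.entrySet()) {
            for (String term : analyze(field.getValue())) {
                postings.computeIfAbsent(field.getKey() + ":" + term, k -> new TreeSet<>())
                        .add(id);
            }
        }
        return id;
    }

    /** Single-term lookup: a map access, with no scan over the stored documents. */
    Set<Integer> search(String field, String term) {
        List<String> analyzed = analyze(term);
        if (analyzed.isEmpty()) {
            return new TreeSet<>();
        }
        TreeSet<Integer> hits = postings.get(field + ":" + analyzed.get(0));
        return hits == null ? new TreeSet<>() : new TreeSet<>(hits);
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add(Map.of("title", "Lucene in Action", "body", "Search with an inverted index"));
        index.add(Map.of("title", "SQL Basics", "body", "Relational tables and an index on a column"));
        System.out.println(index.search("body", "Index"));   // prints [0, 1]
        System.out.println(index.search("title", "lucene")); // prints [0]
    }
}
```

A candidate who knows Lucene should be able to say where this toy falls short: no relevance scoring, no segments, no query parser, and nothing persisted to disk.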
This is a tricky task. You're looking for the guy who knows more about Lucene than you do; therefore, you can't be a reliable judge of the candidates' knowledge (although you should be able to at least eliminate the ones who obviously know less than you).
My advice is to ask the candidates to explain to you some aspect of Lucene that you are confused about. When the interview's over, you can look it up to see if the answer made sense. This has the added benefit of testing their ability to communicate complex ideas. (And if the answer is "I don't know", then you should take that as a good sign: people who are willing to admit their ignorance are worth a lot more than those who aren't.)