I've got a database with a lot of books in it, with fields like title, description, authors, etc.
I'm indexing title with a boost of 100f and description with a boost of 0.1f, both fields tokenized and stemmed.
I'm searching with a single input field that searches all available fields, using a BooleanQuery joined with BooleanClause.Occur.SHOULD that contains a WildcardQuery for each field. I also remove all stop words from the query before building it.
The problem I'm having shows up when I search for the string "de wetenschap van het leven" (without the quotes). After removing the stop words I get "wetenschap leven".
The title query becomes "*wetenschap* *leven*", the description query the same, with a wrapping BooleanQuery joined with BooleanClause.Occur.SHOULD.
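A minimal sketch of that rewriting step in plain Java (no Lucene involved). The stop word list and the names `WildcardQueryText` and `build` are made up for illustration, and the real analyzer may tokenize differently:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class WildcardQueryText {
    // Assumed stop word list; the real list is not shown here.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("de", "van", "het"));

    // Lowercases the input, drops stop words and wraps every
    // remaining term in leading and trailing wildcards.
    static String build(String input) {
        List<String> parts = new ArrayList<>();
        for (String term : input.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (term.isEmpty() || STOP_WORDS.contains(term)) {
                continue;
            }
            parts.add("*" + term + "*");
        }
        return String.join(" ", parts);
    }

    public static void main(String[] args) {
        // prints: *wetenschap* *leven*
        System.out.println(build("de wetenschap van het leven"));
    }
}
```

The same text is then used for both the title and the description field.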
The following books are in the database:
- Wetenschappelijk denken. Een inleiding voor de medische en biomedische wetenschappen en voor de andere levenswetenschap.
- De wetenschap van de aarde. Over een levende planeet
- Atlas van de menselijke levensloop
- De wetenschap van het leven. Over eenheid in biologische diversiteit
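To illustrate why these titles are hard to tell apart with substring-style wildcards, here is a rough plain-Java sketch that counts how many of the two search terms occur anywhere in each title. It ignores stemming, field boosts and Lucene's actual scoring, so it only approximates what the wildcard terms can match; the class and method names are hypothetical:

```java
import java.util.Locale;

class WildcardMatchCount {
    // Counts how many terms occur as a substring of the lowercased title,
    // roughly what a "*term*" wildcard does on unstemmed tokens.
    static int matchedTerms(String title, String... terms) {
        String lower = title.toLowerCase(Locale.ROOT);
        int count = 0;
        for (String term : terms) {
            if (lower.contains(term)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] titles = {
            "Wetenschappelijk denken. Een inleiding voor de medische en biomedische wetenschappen en voor de andere levenswetenschap.",
            "De wetenschap van de aarde. Over een levende planeet",
            "Atlas van de menselijke levensloop",
            "De wetenschap van het leven. Over eenheid in biologische diversiteit"
        };
        for (String title : titles) {
            System.out.println(matchedTerms(title, "wetenschap", "leven") + "  " + title);
        }
    }
}
```

Under these assumptions three of the four titles contain both terms, so substring matching alone cannot tell the title with the exact words "wetenschap" and "leven" apart from the ones that merely contain them.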
The book is returned within the first 4 results, which is good, but in this implementation we cut off at 3 and the rest is hidden behind a "read more" link. Just raising the cutoff is not an option.
To me, the book "De wetenschap van het leven. Over eenheid in biologische diversiteit" matches the query "more" than the others do (or so I feel), but I'm unable to find the right index/search combination to make it rank that way. Does anyone have an idea?