The title is a bit awkward, but I couldn't find a better one. My problem is as follows:
I have several users stored as documents, and for each document I store several key-value pairs, or items (each of which has an id). If I apply highlighting with hl.snippets=5, I get the first 5 items. But every user could have several hundred items, so
- you will not get the 5 most relevant items. You will get the first 5 items ...
Another problem is that
- the highlighted text won't contain the id, so retrieving additional information about the highlighted item is ugly.
Example where items are emails:
user1 has item1 { text:"developers developers developers", id:1, title:"ms" }
item2 { text:"c# development", id:2, title:"nice!" }
...
item77 ...
user2 has item1 { text:"nice restaurant", id:3, title:"bla"}
item2 { text:"best cafe", id:4, title:"blup"}
...
item223 ...
Now if I use highlighting on the text field and query for "restaurant", I get user2 and the text nice <b>restaurant</b>. But how can I determine the id of the highlighted text, e.g. to display the title of this item? And what happens if more relevant items are listed at the end of the item list? Highlighting won't display those ...
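To make the request concrete, here is a minimal sketch of how such a highlighting query could be built. The host, the core name `users`, and the helper `build_highlight_query` are assumptions for illustration only; the parameters `hl`, `hl.fl` and `hl.snippets` are the standard Solr highlighting parameters.

```python
from urllib.parse import urlencode


def build_highlight_query(base_url, term, field="text", snippets=5):
    """Build a Solr select URL that highlights `term` in `field`.

    `base_url` points at a hypothetical core; adjust it to your setup.
    """
    params = {
        "q": f"{field}:{term}",   # search only the item text field
        "hl": "true",             # switch highlighting on
        "hl.fl": field,           # field to highlight
        "hl.snippets": snippets,  # maximum snippets returned per document
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical local Solr instance with a core called "users".
url = build_highlight_query("http://localhost:8983/solr/users", "restaurant")
print(url)
```

Nothing in this request carries the id of the item a snippet came from, which is exactly the gap described above.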
So how can I find the best items of a document with multiple such items?
I added my two findings as answers, but as I will point out, each of them has its own drawbacks.
Could anyone point me to a better solution?