I am currently performing a full-text search on my "pages" in a database. While users get the results they want, I am unable to show them why those particular results came up.
Specifications on what I am looking for:
- I have HTML data, meaning that if you search for a term such as "test" and the resulting page contains
<b>here is some test</b> page
I should be able to highlight the term without adversely affecting the HTML code on the page.
- I only want to return a portion of the document, much like Google does, where the portion returned contains a good share of my search terms. How can I determine which portion contains the most terms? Would it be best to pick the section with the most term occurrences overall, the section that covers the most distinct search terms, or a combination of both? Or should multiple snippets be included?
- I would like to do this server-side, if that is a viable option.
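To make the highlighting requirement concrete, here is a minimal sketch in Python (the language is an assumption, nothing above fixes one). It uses the standard library's `html.parser` to re-emit the markup unchanged and wrap matches only inside text nodes. The names `Highlighter` and `highlight` and the choice of `<mark>` as the wrapper are hypothetical. It matches substrings case-insensitively, assumes a non-empty term list, and does not special-case `<script>` or `<style>` content.

```python
import re
from html.parser import HTMLParser


class Highlighter(HTMLParser):
    """Re-emit HTML as-is, wrapping term matches found in text nodes only."""

    def __init__(self, terms):
        # convert_charrefs=False so entities such as &amp; arrive as their own events
        super().__init__(convert_charrefs=False)
        self.pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Copy the raw tag text, so attributes are never searched or altered
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Only text between tags is eligible for highlighting
        self.out.append(self.pattern.sub(lambda m: f"<mark>{m.group(0)}</mark>", data))

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")


def highlight(html_text, terms):
    parser = Highlighter(terms)
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(highlight("<b>here is some test</b> page", ["test"]))
```

With the example from the list above, the term inside the `<b>` element gets wrapped while the tags themselves pass through untouched, and a term that appears only in an attribute value (say, an `href`) is left alone.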
I am unsure what the best way of going about these two things is. I do know of one issue that can easily be overlooked and needs to be taken into account:
a. Snipping off HTML data at random points can completely ruin the page if you are not careful; for example, not closing a div tag can throw my whole layout off. What are the best solutions for this?
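One possible way around the unclosed-tag problem, sketched here in Python as an assumption rather than a known best practice, is to track which elements are open while copying the markup and to close them when the cut happens. `Truncator`, `truncate_html`, and the `limit` parameter (counted in visible characters) are hypothetical names, and the sketch assumes `limit` is positive.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class Truncator(HTMLParser):
    """Copy HTML until `limit` visible characters are emitted, tracking open tags."""

    def __init__(self, limit):
        super().__init__(convert_charrefs=False)
        self.remaining = limit
        self.open_tags = []
        self.out = []
        self.done = False

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        self.out.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        if not self.done:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.done or tag not in self.open_tags:
            return
        # Close any inner elements left open, then the requested one
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        if self.done:
            return
        if len(data) >= self.remaining:
            self.out.append(data[:self.remaining])
            self.done = True
        else:
            self.out.append(data)
            self.remaining -= len(data)

    def _entity(self, text):
        # An entity counts as one visible character
        if self.done:
            return
        self.out.append(text)
        self.remaining -= 1
        self.done = self.remaining <= 0

    def handle_entityref(self, name):
        self._entity(f"&{name};")

    def handle_charref(self, name):
        self._entity(f"&#{name};")


def truncate_html(html_text, limit):
    parser = Truncator(limit)
    parser.feed(html_text)
    parser.close()
    # Close whatever is still open, innermost first
    closing = "".join(f"</{t}>" for t in reversed(parser.open_tags))
    return "".join(parser.out) + closing


print(truncate_html("<div><p>Hello <b>big world</b></p></div>", 9))
```

Here the cut lands in the middle of the `<b>` element, and the sketch appends the closing tags for `b`, `p`, and `div` so the excerpt cannot leak an open element into the surrounding layout.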
What are the best methods for achieving a search system like the one above?
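For the excerpt question, one scoring I can imagine (an assumption, not something I know to be the standard approach) is a sliding window over the plain text that prefers covering more distinct terms and breaks ties by total occurrences. The sketch below is Python; `best_window` and its `width` parameter (window size in words) are hypothetical, and it assumes the tags have already been stripped from `text`.

```python
import re


def best_window(text, terms, width=30):
    """Return the `width`-word window scoring highest on (distinct terms, total hits)."""
    words = text.split()
    wanted = {t.lower() for t in terms}
    # Strip punctuation and case so "Test," still matches "test"
    keys = [re.sub(r"\W+", "", w).lower() for w in words]

    best_start, best_score = 0, (-1, -1)
    for start in range(max(1, len(words) - width + 1)):
        hits = [k for k in keys[start:start + width] if k in wanted]
        # Tuples compare element by element: distinct coverage first, then volume
        score = (len(set(hits)), len(hits))
        if score > best_score:
            best_start, best_score = start, score
    return " ".join(words[best_start:best_start + width])


print(best_window("test test test foo bar test search", ["test", "search"], width=3))
```

In this example the window containing both "test" and "search" wins over the window containing "test" three times, which is the "combination of both" option from the list above. Returning the top few non-overlapping windows instead of one would give the multiple-snippet variant.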