How is web browser search implemented?

+2 Q:

I want to implement, in a Java desktop application, searching and highlighting of multiple phrases in HTML files, the way web browsers do it: HTML tags (within < and >) are ignored, but some tags are not ignored. For example, when searching for "each table", the text ...each table has name... will be highlighted, but the text ...has each Table is... will not be highlighted, because a tag between "each" and "Table" interrupts the meaning of the text.
Web browsers implement this somehow; how can I get at that implementation? Or is there some source on the net? I tried Google, but without success :(

+2 A:

Browsers do not search the HTML source itself; they search the rendered output of that HTML.

Get a suitable HTML renderer and get its output as text. Then search that text with an ordinary string-searching algorithm.

The example in your question would produce a newline character in the rendered output, so a normal string search will behave as you expect.
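
A minimal sketch of this idea in plain Java, not how any particular browser does it. It assumes a hand-written tag scanner in place of a real renderer; the class name RenderedTextSearch, its methods and the FLOW_BREAKING tag set are made up for this example. Inline tags are dropped, flow-breaking tags become a newline, and every extracted character remembers its offset in the HTML source so that a hit can be mapped back for highlighting. Entities, comments and script content are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class RenderedTextSearch {
    // Assumption: a hand-picked set of tags that break the text flow.
    private static final Set<String> FLOW_BREAKING =
            Set.of("p", "br", "div", "table", "tr", "td", "li", "h1", "h2", "h3");

    private final StringBuilder text = new StringBuilder();        // visible text
    private final List<Integer> sourceOffset = new ArrayList<>();  // text index -> HTML index

    RenderedTextSearch(String html) {
        int i = 0;
        while (i < html.length()) {
            char c = html.charAt(i);
            int close = (c == '<') ? html.indexOf('>', i) : -1;
            if (close >= 0) {
                // A tag: emit a newline if it breaks the flow, otherwise drop it.
                if (FLOW_BREAKING.contains(tagName(html.substring(i + 1, close)))) {
                    append('\n', i);
                }
                i = close + 1;
            } else {
                append(c, i);
                i++;
            }
        }
    }

    private void append(char c, int htmlIndex) {
        text.append(c);
        sourceOffset.add(htmlIndex);
    }

    // "/p", "p class=x" and "br/" all yield the bare lower-case tag name.
    private static String tagName(String inside) {
        String s = inside.trim();
        if (s.startsWith("/")) {
            s = s.substring(1);
        }
        int end = 0;
        while (end < s.length() && Character.isLetterOrDigit(s.charAt(end))) {
            end++;
        }
        return s.substring(0, end).toLowerCase(Locale.ROOT);
    }

    /** Returns the HTML source offset at which each case-insensitive hit starts. */
    List<Integer> find(String phrase) {
        List<Integer> hits = new ArrayList<>();
        if (phrase.isEmpty()) {
            return hits;
        }
        String visible = text.toString();
        for (int at = 0; at + phrase.length() <= visible.length(); at++) {
            if (visible.regionMatches(true, at, phrase, 0, phrase.length())) {
                hits.add(sourceOffset.get(at));
            }
        }
        return hits;
    }
}
```

With this sketch, new RenderedTextSearch("each <b>table</b> has name").find("each table") returns [0], while the same search on "has each<p>Table is" returns an empty list, because the <p> tag turns into a newline between the two words.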

Faisal Feroz 2010-09-14 12:47:53

+1 Thanks, this is the best answer so far, but I want an algorithm that does this in a desktop app... I don't believe nobody has ever tried this :)

Zavael 2010-09-16 05:55:48

This seems pretty easy.

1) Search for the last word in the string.
2) Look at what's before the last word.
3) Decide if what's before the last word constitutes an interruption (for example, <div>).
4) If it is an interruption, continue.
5) Otherwise, evaluate the previous word against the search query.

I don't know if this is how browsers perform this operation, but this approach should work.
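
The steps above can be sketched roughly as follows. This is only one reading of them, under stated assumptions: the class BackwardPhraseMatch, the INTERRUPTING tag set and the regex tokenizer are hypothetical, words are compared whole (so attached punctuation such as "name..." will not match "name"), and HTML comments are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BackwardPhraseMatch {
    // Assumption: tags treated as interruptions; every other tag is ignored.
    private static final Set<String> INTERRUPTING = Set.of("p", "br", "div");
    private static final String BREAK = "\n"; // marker token, can never equal a query word

    // Either a tag (group 1 = tag name) or a word (run of characters without spaces or '<').
    private static final Pattern TOKEN = Pattern.compile("</?\\s*(\\w+)[^>]*>|[^\\s<]+");

    static List<String> tokenize(String html) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(html);
        while (m.find()) {
            if (m.group(1) == null) {
                tokens.add(m.group());                       // a word
            } else if (INTERRUPTING.contains(m.group(1).toLowerCase(Locale.ROOT))) {
                tokens.add(BREAK);                           // an interrupting tag
            }                                                // other tags are skipped
        }
        return tokens;
    }

    /** True if the query words occur in order with no interruption between them. */
    static boolean contains(String html, String query) {
        List<String> tokens = tokenize(html);
        String[] words = query.trim().split("\\s+");
        String last = words[words.length - 1];
        for (int end = 0; end < tokens.size(); end++) {
            // Step 1: find an occurrence of the last query word.
            if (!tokens.get(end).equalsIgnoreCase(last)) {
                continue;
            }
            // Steps 2-5: walk backwards; a BREAK token or a different word ends the walk.
            int t = end - 1;
            int w = words.length - 2;
            while (w >= 0 && t >= 0 && tokens.get(t).equalsIgnoreCase(words[w])) {
                t--;
                w--;
            }
            if (w < 0) {
                return true; // every query word was matched
            }
        }
        return false;
    }
}
```

Here an interruption simply fails the current candidate and the scan moves on to the next occurrence of the last word, which is how "continue" in step 4 is interpreted.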

babbitt 2010-09-14 12:48:46

So you suggest "splitting" the HTML text into pure-text parts and then searching within those parts? Or did I misunderstand you?

Zavael 2010-09-16 05:53:53

Try using the javax.swing.text.html package in Java.
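
One possible way to use that package, as a hedged sketch: the JDK's ParserDelegator feeds parse events to an HTMLEditorKit.ParserCallback, and HTML.Tag.breaksFlow() reports whether a tag causes a line break. The class name SwingHtmlText and its two methods are made up for this example, and the exact whitespace the Swing parser delivers around inline tags should be checked against your own documents before relying on it.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

final class SwingHtmlText {
    /** Extracts the text of an HTML string, adding a newline at every flow-breaking tag. */
    static String extract(String html) {
        StringBuilder out = new StringBuilder();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                out.append(data);
            }

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag.breaksFlow()) {
                    out.append('\n');
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if (tag.breaksFlow()) {
                    out.append('\n');
                }
            }

            @Override
            public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag.breaksFlow()) {
                    out.append('\n'); // tags without an end tag, such as <br>
                }
            }
        };
        try {
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected when reading from a String
        }
        return out.toString();
    }

    /** Case-insensitive phrase search on the extracted text. */
    static boolean contains(String html, String phrase) {
        return extract(html).toLowerCase(Locale.ROOT).contains(phrase.toLowerCase(Locale.ROOT));
    }
}
```

Because a newline is written wherever a flow-breaking tag starts or ends, a phrase that straddles two paragraphs no longer matches a plain substring search, while tags that do not break the flow contribute nothing to the text.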

Kuri 2010-09-14 13:12:15
