I have a document that was converted from PDF to HTML for use on a company website to be referenced and indexed for search. I'm attempting to format the converted document to meet my needs and in doing so I am attempting to clean up some of the junk that was pulled over from when it was a PDF such as page numbers, headers, and footers. luckily all of these lines that need to be removed are in blocks of 4 lines unfortunately they are not exactly the same therefore cannot be removed with a simple literal replace. The lines contain numbers which are incremental as they correlate with the pages. How can I remove the following example from my html file.
Title<br>
10<br>
<hr>
<A name=11></a>Footer<br>
I've tried many different regular expression attempts but as my skill in that area is limited I can't find the proper syntax. I'm sure i'm missing something fairly easy as it would seem all I need is a wildcard replace for the two numbers in the code and the rest is literal.
any help is apprciated