I'm trying to write a search-and-replace regex that detects whether the HTML returned by a web request is complete. I have had cases where the server returned incomplete HTML (only half of the page), so I want to detect that in the client and request the page again.
I was thinking the regex could look for the presence of <html[^>]*>, and then the absence of </html>. The replace part would then replace the whole HTML with a bit of special text.
I can't just check for the absence of </html> because the returned data might be a text file, and I can't check MIME types.
Any ideas? I just can't wrap my head around the look-behinds this would require. I'm not trying to parse HTML, just searching for bits of text, which is what regexes are for, right?
EDIT:
The regexes will be run by C#, but I write them in a regex editor. I can only use a search and replace regex to solve this, nothing else.
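To make the idea above concrete, here is a rough, untested-in-.NET sketch of the kind of pattern I have in mind. It uses two lookaheads anchored at the start of the input rather than look-behinds: one requires an opening <html ...> tag somewhere, the other rejects the input if a closing </html> appears anywhere. Python is used here only as a quick harness; the pattern sticks to constructs (inline `(?is)` flags, `\A`, lookaheads) that I believe .NET also accepts. The marker text `INCOMPLETE_HTML` and the function name `mark_if_truncated` are made up for illustration.

```python
import re

# (?is)  : case-insensitive, and "." also matches newlines
# \A     : only try the match at the very start of the input
# (?=...): an opening <html ...> tag must appear somewhere
# (?!...): a closing </html> tag must NOT appear anywhere
# .+     : consume the whole input so it can be replaced in one go
PATTERN = re.compile(r"(?is)\A(?=.*?<html[^>]*>)(?!.*?</html>).+")

# Hypothetical "special text" the client would look for afterwards.
MARKER = "INCOMPLETE_HTML"


def mark_if_truncated(body: str) -> str:
    """Return MARKER if body looks like truncated HTML, else body unchanged."""
    return PATTERN.sub(MARKER, body)


if __name__ == "__main__":
    print(mark_if_truncated("<html><body>hi</body></html>"))  # complete page
    print(mark_if_truncated("<html><body>hi"))                # truncated page
    print(mark_if_truncated("just a text file"))              # not HTML
```

In a regex editor the search pattern would be the string inside `r"..."` and the replacement would be the marker text. One caveat with this sketch: a plain text file that happens to contain an <html> tag but no </html> would also be replaced.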