I'm having a go at writing some C# code which overrides the Render method of System.Web.UI.Page and then reformats the HTML before presenting it to the browser. This is purely experimental, so overhead is not a concern right now.

I'm perhaps a little learned in the ways of the regular expression and would like to utilise them here, but I can't seem to think of a really concise and elegant way of nicely formatting an HTML document. I've managed to completely minify the HTML using regex, but as for correctly indenting it, I'm stumped.

So, if you had a string of HTML, using C#, how would you reformat it in much the same way as Visual Studio's Format Document function does? Any ideas would be greatly appreciated.

+2  A: 

Use Tidy. I have used this .net wrapper quite successfully.

Sky Sanders
This certainly seems as though it would do the trick. I must confess, though, that I'm somewhat averse to using a library written in (portable ANSI) C _and_ a .NET wrapper just to accomplish this. The unwieldiness of this suggested solution far outweighs, in my opinion, the novelty of a nicely formatted document. I'd really rather do this in C# code or in nothing else, really.
David Foster
@david - parsing and formatting html is not a trivial task. Ask anyone who has tried. Tidy is *a*, if not *the*, gold standard. I am not sure I understand your aversion to using code that works and has been accepted, refined and improved over the past 10 (+?) years, regardless of language or distributable. But then, it is not for me to question your motives. Good luck with that.
Sky Sanders
http://tidy.sourceforge.net/
Femaref
I guess I have something of a minor disinclination to using other people's code unless absolutely necessary. It's especially difficult for me using a .NET wrapper called 'Mark.Tidy'. You are right, though, and I appreciate how difficult this task is by the very token that I'm here asking how it's done! :-D I was just hoping that merely indenting the document correctly would be slightly simpler.
David Foster
@David- so build it with whatever assembly name you like, or just add the source to your own project, as I have done in some cases. It works (tm)
Sky Sanders
@femaref - thanks for that. Perhaps I should have added a link to the project page in the first two words of the answer. ;-)
Sky Sanders
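
For anyone after the "merely indenting" case in plain C#, here is a minimal sketch that uses only the framework's own XML classes. It is not Tidy and not what the answer above recommends. It assumes the markup is already well-formed XHTML, so unclosed tags, unquoted attributes, or undeclared entities such as &nbsp; will make it throw an XmlException. The class name HtmlIndenter and the method name Indent are made up for illustration.

```csharp
using System.Text;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper: re-indents markup that is already well-formed XML.
// It does NOT repair real-world HTML the way Tidy does.
public static class HtmlIndenter
{
    public static string Indent(string xhtml)
    {
        // Parse drops insignificant whitespace by default, so any
        // existing indentation (or minification) is discarded here.
        XDocument doc = XDocument.Parse(xhtml);

        var settings = new XmlWriterSettings
        {
            Indent = true,             // one element per line, nested
            IndentChars = "  ",        // two spaces per nesting level
            NewLineChars = "\n",       // fixed newline for predictable output
            OmitXmlDeclaration = true  // no <?xml ...?> header in the page
        };

        var sb = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            doc.Save(writer);
        }
        return sb.ToString();
    }
}
```

In a Render override, the page output would first have to be captured into a string (for example by rendering into an HtmlTextWriter wrapped around a StringWriter) and then passed through Indent. Elements with mixed text and child content are left on one line by XmlWriter, so the result will not match Visual Studio's Format Document exactly.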