I have been researching this one for a while. I found several resources on the subject, and they all tend to use the same approach: override Page.Render, use an HtmlTextWriter to capture the output as a string, and then use a series of compiled regular expressions to remove the extra whitespace. Here is an example.
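For anyone who has not seen it, the general shape of that approach is roughly the following. This is only a minimal sketch and not the code from the resources I found: the class name WhitespaceStrippingPage and both regular expressions are placeholders I made up for illustration, and the patterns do nothing to protect whitespace-sensitive content such as pre or textarea elements.

```vbnet
Imports System.IO
Imports System.Text.RegularExpressions
Imports System.Web.UI

' Hypothetical base page; the name and the patterns are illustrative assumptions.
Public Class WhitespaceStrippingPage
    Inherits Page

    ' Indentation (spaces or tabs) at the start of each line.
    Private Shared ReadOnly LeadingIndent As New Regex("^[ \t]+", RegexOptions.Compiled Or RegexOptions.Multiline)

    ' Runs of whitespace (including line breaks) between two tags.
    Private Shared ReadOnly BetweenTags As New Regex(">\s+<", RegexOptions.Compiled)

    Protected Overrides Sub Render(ByVal writer As HtmlTextWriter)
        Using output As New StringWriter()
            ' Render the whole page into an in-memory buffer instead of the response.
            Dim bufferedWriter As New HtmlTextWriter(output)
            MyBase.Render(bufferedWriter)
            bufferedWriter.Flush()

            Dim html As String = output.ToString()
            html = LeadingIndent.Replace(html, String.Empty)
            ' Keep a single space so adjacent inline elements do not run together.
            html = BetweenTags.Replace(html, "> <")

            ' Send the reduced markup to the real response writer.
            writer.Write(html)
        End Using
    End Sub
End Class
```

A page would opt in by inheriting from this class instead of Page. Note that the entire page is buffered as a string and scanned once per pattern on every request.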
Well, I tried it and it works...but....
In Safari 5.0, this seems to cause all sorts of performance issues when loading images, along with "server too busy" errors. IE 7, Firefox 3.6 and Google Chrome 6.0 seem to work okay, but I haven't stress tested the server very much. Occasionally there seem to be lags in the time it takes to generate the page, but once the HTML is sent to the browser, the page displays quickly.
Anyway, when you think about it, it seems rather silly to have .NET build up all of these tabs and line breaks only to strip them all out again using string parsing - the least efficient way to strip them out. It would make more sense to override the HtmlTextWriter and pass it into the tree on the main page request in order to avoid putting them into the output at all - logically there should be a performance gain instead of a hit in that case.
Even if I can only remove 50% of the whitespace using this method, it will still leave much less work for the regular expressions to do - meaning it should perform better than it does with regular expressions alone.
I tried using a Control Adapter and overriding several of the HtmlTextWriter members:
- Directing all calls to WriteLine() to the corresponding Write() method
- Setting the NewLine property to an empty string
- Overriding the OutputTabs() method and simply removing the code
- Overriding the Indent property and returning 0
I also tried overriding RenderChildren, Render, BeginRender, and EndRender to pass in my custom HtmlTextWriter, but I can't seem to make even a simple label control remove the tabs before its tag. I also dug through the framework using Reflector, but I simply can't figure out how these characters are being generated - I thought I was using a "catch all" approach, but apparently I am missing something.
Anyway, here is what I have come up with. This code is not functioning the way I would like it to. Of course, I have also tried overriding the various Render methods on the page directly and passing in an instance of my custom HtmlTextWriter, but that didn't work either.
Public Class PageCompressorControlAdapter
    Inherits System.Web.UI.Adapters.ControlAdapter

    Protected Overrides Sub RenderChildren(ByVal writer As System.Web.UI.HtmlTextWriter)
        MyBase.RenderChildren(New CompressedHtmlTextWriter(writer))
    End Sub

    Protected Overrides Sub Render(ByVal writer As System.Web.UI.HtmlTextWriter)
        MyBase.Render(New CompressedHtmlTextWriter(writer))
    End Sub

    Protected Overrides Sub BeginRender(ByVal writer As System.Web.UI.HtmlTextWriter)
        MyBase.BeginRender(New CompressedHtmlTextWriter(writer))
    End Sub

    Protected Overrides Sub EndRender(ByVal writer As System.Web.UI.HtmlTextWriter)
        MyBase.EndRender(New CompressedHtmlTextWriter(writer))
    End Sub
End Class
Public Class CompressedHtmlTextWriter
    Inherits HtmlTextWriter

    Sub New(ByVal writer As HtmlTextWriter)
        MyBase.New(writer, "")
        Me.InnerWriter = writer.InnerWriter
        Me.NewLine = ""
    End Sub

    Sub New(ByVal writer As System.IO.TextWriter)
        MyBase.New(writer, "")
        MyBase.InnerWriter = writer
        Me.NewLine = ""
    End Sub

    Protected Overrides Sub OutputTabs()
        'Skip over the tabs
    End Sub

    Public Overrides Property NewLine() As String
        Get
            Return ""
        End Get
        Set(ByVal value As String)
            MyBase.NewLine = value
        End Set
    End Property

    Public Overrides Sub WriteLine()
    End Sub

    Public Overrides Sub WriteLine(ByVal value As Boolean)
        MyBase.Write(value)
    End Sub

    Public Overrides Sub WriteLine(ByVal value As Char)
        MyBase.Write(value)
    End Sub

    Public Overrides Sub WriteLine(ByVal buffer() As Char)
        MyBase.Write(buffer)
    End Sub

    Public Overrides Sub WriteLine(ByVal buffer() As Char, ByVal index As Integer, ByVal count As Integer)
        MyBase.Write(buffer, index, count)
    End Sub

    Public Overrides Sub WriteLine(ByVal value As Decimal)
        MyBase.Write(value)
    End Sub

    Public Overrides Sub WriteLine(ByVal value As Double)
        MyBase.Write(value)
    End Sub

    Public Overrides Sub WriteLine(ByVal value As Integer)
        MyBase.Write(value)
    End Sub

    Public Overrides Sub WriteLine(ByVal value As Long)
        MyBase.Write(value)
    End Sub

    Public Overrides Sub WriteLine(ByVal value As Object)
        MyBase.Write(value)
    End Sub

    Public Overrides Sub WriteLine(ByVal value As Single)
        MyBase.Write(value)
    End Sub

    Public Overrides Sub WriteLine(ByVal s As String)
        MyBase.Write(s)
    End Sub

    Public Overrides Sub WriteLine(ByVal format As String, ByVal arg0 As Object)
        MyBase.Write(format, arg0)
    End Sub

    Public Overrides Sub WriteLine(ByVal format As String, ByVal arg0 As Object, ByVal arg1 As Object)
        MyBase.Write(format, arg0, arg1)
    End Sub

    Public Overrides Sub WriteLine(ByVal format As String, ByVal arg0 As Object, ByVal arg1 As Object, ByVal arg2 As Object)
        MyBase.Write(format, arg0, arg1, arg2)
    End Sub

    Public Overrides Sub WriteLine(ByVal format As String, ByVal ParamArray arg() As Object)
        MyBase.Write(format, arg)
    End Sub

    Public Overrides Sub WriteLine(ByVal value As UInteger)
        MyBase.Write(value)
    End Sub

    Public Overrides Sub WriteLine(ByVal value As ULong)
        MyBase.Write(value)
    End Sub
End Class
In case you are not familiar with control adapters, simply place the XML below in a .browser file in the ASP.NET App_Browsers folder. You can change the controlType to apply the control adapter to a Label or some other control to narrow the scope of a test. If I can get this working, it won't be a big deal to add all of the controls in my project here if that turns out to be necessary.
<browsers>
    <browser refID="Default">
        <controlAdapters>
            <adapter controlType="System.Web.UI.Page"
                     adapterType="PageCompressorControlAdapter"/>
        </controlAdapters>
    </browser>
</browsers>
Anyway, you would think there would just be a simple configuration setting like VerboseHtml="false" or PreserveHtmlFormatting="false" or something along those lines. If you look at the output from MSN.com, they are using some kind of whitespace removal similar to this...and it appears to perform very well.