I have a filter which processes generated HTML and rewrites certain elements. For example, it adds class attributes to some anchors. Finally, it writes the processed HTML to the response (a subclass of HttpServletResponseWrapper). Naturally, this means that the processed HTML is a different length after it has passed through the filter.

I can see two ways of approaching this.

One is to iterate over the HTML, using a StringBuilder to build up the processed HTML, and write the processed HTML to the response once all filtering is complete. The other is to iterate over the HTML but to write it to the response as soon as each element has been processed.

Which is the better way for this operation, or is there another option which would be preferable? I am looking to minimise temporary memory usage primarily.

A: 

The second approach needs less memory and lets the client start receiving output sooner, but it is often more difficult to implement.

Maurice Perry
A: 

The complexity of streaming the response (i.e. writing it "on the go") lies in the code structure: your processing must produce the response characters in the order in which they are to be sent. But if you already assemble the response by appending to a StringBuilder, then your code is already suitable for streaming. Simply replace the StringBuilder with the PrintWriter that the ServletResponse.getWriter() method returns.
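To illustrate the swap, here is a minimal sketch that writes the rewriting step against java.lang.Appendable, which both StringBuilder and PrintWriter implement. The class name AnchorRewriter, the regular expression, and the class="ext" rule are made up for the example and are not taken from the question; the input is also held in memory as a CharSequence purely to keep the sketch short. In a real filter the Appendable would be the writer from getWriter(); a StringWriter stands in for it here.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AnchorRewriter {
    // Hypothetical rule: add class="ext" to every <a ...> opening tag.
    private static final Pattern ANCHOR = Pattern.compile("<a(\\s|>)");

    /** Appends the rewritten HTML to any Appendable (StringBuilder or Writer). */
    static void rewrite(CharSequence html, Appendable out) throws IOException {
        Matcher m = ANCHOR.matcher(html);
        int last = 0;
        while (m.find()) {
            out.append(html, last, m.start());                  // text before the tag, unchanged
            out.append("<a class=\"ext\"").append(m.group(1));  // rewritten tag start
            last = m.end();
        }
        out.append(html, last, html.length());                  // remaining tail
    }

    public static void main(String[] args) throws IOException {
        String html = "<p><a href=\"/x\">x</a></p>";

        // Approach 1: buffer everything, write once at the end.
        StringBuilder buffered = new StringBuilder();
        rewrite(html, buffered);

        // Approach 2: same method, but each piece goes straight to the writer.
        StringWriter sink = new StringWriter();   // stand-in for the response
        PrintWriter streamed = new PrintWriter(sink);
        rewrite(html, streamed);
        streamed.flush();

        System.out.println(buffered);
        System.out.println(sink);
    }
}
```

Both calls produce the same text; the only difference is whether the pieces accumulate in memory or are handed to the writer as they are produced.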

Note that in HTTP/1.0, the server must either provide the content length in the response headers, or close the connection at the end of the response. HTTP/1.1 adds "chunked transfer encoding", which allows data to be streamed without knowing the content length beforehand, and without preventing the connection from being reused for subsequent HTTP requests. The servlet container should handle this automatically, so you do not have to worry about it unless you are trying to support really old HTTP clients.
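For the curious, chunked framing on the wire looks roughly like the sketch below: each chunk is its byte length in hexadecimal, CRLF, the data, CRLF, and a zero-length chunk marks the end. This is only an illustration of the format; the class ChunkedSketch and its methods are invented for the example, and you would not write this yourself in a servlet because the container does it for you.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class ChunkedSketch {
    private static final byte[] CRLF = {'\r', '\n'};

    /** Writes one chunk: hex size, CRLF, payload, CRLF. */
    static void writeChunk(OutputStream out, byte[] data) throws IOException {
        if (data.length == 0) {
            return; // a zero-length chunk would signal end of body, so skip it
        }
        out.write(Integer.toHexString(data.length).getBytes(StandardCharsets.US_ASCII));
        out.write(CRLF);
        out.write(data);
        out.write(CRLF);
    }

    /** Writes the terminating zero-length chunk (no trailer fields). */
    static void finish(OutputStream out) throws IOException {
        out.write("0\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeChunk(wire, "Hello".getBytes(StandardCharsets.UTF_8));
        writeChunk(wire, ", world".getBytes(StandardCharsets.UTF_8));
        finish(wire);
        // Show CR and LF visibly instead of as line breaks.
        System.out.println(wire.toString("UTF-8").replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```

The sender never needs the total length up front, which is why streaming each processed element as it becomes available works on a persistent connection.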

Thomas Pornin