I have an embedded device that runs Java applications, which can, among other things, serve XHTML web pages (I could write the pages as something other than XHTML, but that is what I'm aiming for at the moment).
When a request arrives for a web page handled by my application, a method in my code is called with all the information about the request, including an output stream to which the page is written.
On one of my pages I would like to display a (log) file, which can be up to 1 MB in size.
I can display this file unescaped using the following code:
final PrintWriter writer; // Is initialized to a PrintWriter writing to the output stream.
final FileInputStream fis = new FileInputStream(file);
final InputStreamReader inputStreamReader = new InputStreamReader(fis);
try {
    writer.println("<div id=\"log\" style=\"white-space: pre-wrap; word-wrap: break-word\">");
    writer.println(" <pre>");
    int length;
    char[] buffer = new char[1024];
    while ((length = inputStreamReader.read(buffer)) != -1) {
        writer.write(buffer, 0, length);
    }
    writer.println(" </pre>");
    writer.println("</div>");
} finally {
    if (inputStreamReader != null) {
        inputStreamReader.close();
    }
}
This works reasonably well, and displays the entire file within a second or two (an acceptable timeframe).
This file can (and in practice, does) contain characters which are invalid in XHTML, most commonly <>. So I need to find a way to escape these characters.
The first thing I tried was a CDATA section, but, as documented here, CDATA sections do not display correctly in IE8.
The second thing I tried was a method like the following:
// Based on code: http://stackoverflow.com/questions/439298/best-way-to-encode-text-data-for-xml-in-java/440296#440296
// Modified to write directly to the stream to avoid creating extra objects.
private static void writeXmlEscaped(PrintWriter writer, char[] buffer, int offset, int length) {
    // The range to escape is [offset, offset + length), as in Writer.write(char[], int, int).
    final int end = offset + length;
    for (int i = offset; i < end; i++) {
        char ch = buffer[i];
        boolean controlCharacter = ch < 32;
        boolean unicodeButNotAscii = ch > 126;
        boolean characterWithSpecialMeaningInXML = ch == '<' || ch == '&' || ch == '>';
        if (characterWithSpecialMeaningInXML || unicodeButNotAscii || controlCharacter) {
            // Write the character as a numeric character reference, e.g. '<' becomes "&#60;".
            writer.write("&#" + (int) ch + ";");
        } else {
            writer.write(ch);
        }
    }
}
This correctly escapes the characters (I was going to expand it to cover characters that are invalid in HTML, if needed), but the web page then takes 15+ seconds to display, and other resources on the page (images, the CSS stylesheet) intermittently fail to load. I believe the requests for them time out because the processor is pegged.
I've tried putting a BufferedWriter in front of the PrintWriter, as well as changing the buffer size (both for reading the file and for the BufferedWriter) in various ways, with no improvement.
Is there a way to escape all XHTML-invalid characters that does not require iterating over every single character in the stream? Failing that, is there a way to speed up my code enough to display these files within a couple of seconds?
I'll consider reducing the size of the log files if I have to, but I was hoping to make them at least 250-500 KB in size (with 1 MB being ideal).
I already have a method to simply download the log files, but I would like to display them in the browser as well for simple troubleshooting/perusal.
If there's a way to set the headers so that IE8/Firefox will simply display the file in the browser as a text file, I would consider that as an alternative (and have an entire page dedicated to the file, with no XHTML of any kind).
EDIT:
After making the change suggested by Cameron Skinner and performance testing it, it looks like the escaped writing takes about 1.5-2x as long as the block-written version. It's not nothing, but I'm probably not going to get a huge speedup by tuning it further.
I may just need to reduce the max size of the log file.
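For illustration, here is a minimal sketch of one way the escaped writing could be restructured so that it stays closer to the block-written version. It is an assumption about what a cheaper version might look like, not necessarily the change referred to above, and it has not been measured on the device. The class name EscapedLogWriter and the scratch array are hypothetical. It classifies characters exactly as writeXmlEscaped does, but writes runs of safe characters with a single block write and builds each numeric reference in a reusable char array instead of concatenating strings.

```java
import java.io.IOException;
import java.io.Writer;

// Hypothetical helper; the name and structure are illustrative only.
final class EscapedLogWriter {
    private EscapedLogWriter() {}

    // Escapes buffer[offset .. offset + length) using numeric character
    // references for '<', '>', '&', control characters and non-ASCII
    // characters, the same classification as writeXmlEscaped.
    static void writeXmlEscaped(Writer writer, char[] buffer, int offset, int length)
            throws IOException {
        final int end = offset + length;
        // Scratch space for one reference; "&#65535;" is the longest (8 chars).
        final char[] entity = new char[8];
        int runStart = offset; // start of the current run of safe characters

        for (int i = offset; i < end; i++) {
            char ch = buffer[i];
            boolean needsEscape =
                    ch < 32 || ch > 126 || ch == '<' || ch == '&' || ch == '>';
            if (!needsEscape) {
                continue; // extend the current safe run
            }
            // Flush the safe run that precedes this character in one call.
            if (i > runStart) {
                writer.write(buffer, runStart, i - runStart);
            }
            // Build "&#<decimal>;" from right to left without allocating.
            int pos = entity.length;
            entity[--pos] = ';';
            int value = ch;
            do {
                entity[--pos] = (char) ('0' + value % 10);
                value /= 10;
            } while (value > 0);
            entity[--pos] = '#';
            entity[--pos] = '&';
            writer.write(entity, pos, entity.length - pos);
            runStart = i + 1;
        }
        // Flush whatever safe characters remain after the last escape.
        if (end > runStart) {
            writer.write(buffer, runStart, end - runStart);
        }
    }
}
```

In the read loop it would be called as EscapedLogWriter.writeXmlEscaped(writer, buffer, 0, length) in place of writer.write(buffer, 0, length). Every character is still inspected, so this only reduces the number of write calls and temporary objects; whether that is enough on this hardware is something I would have to measure.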