I have a few reports of people seeing raw HTML in their browser (instead of the browser interpreting it). It seems to happen on slow connections. When it happens, reloading the page makes it render correctly. Are there any HTML-specific things that would cause this (as opposed to server settings)?

+2  A: 

Maybe the connection timed out before the HTML could be fully sent. The DOM would essentially be incomplete and might not be interpreted properly. Just a guess.

Joe Philllips
A: 

If the HTTP headers were mangled so that the HTML was sent with the wrong MIME type (for example text/plain instead of text/html), the browser would display it as plain text rather than parsing it as HTML.
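
To make the MIME type point concrete, here is a minimal sketch in Python of the decision a client makes from the Content-Type header. The function names media_type and would_render_as_html are made up for illustration, and the sketch assumes the header arrives intact; real browsers may also sniff the body, which is ignored here.

```python
from email.message import Message


def media_type(content_type_header: str) -> str:
    """Return the lowercased type/subtype part of a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    # get_content_type() drops parameters such as charset and lowercases the result.
    return msg.get_content_type()


def would_render_as_html(content_type_header: str) -> bool:
    """Hypothetical helper: True if the declared type is one a browser parses as markup."""
    return media_type(content_type_header) in {"text/html", "application/xhtml+xml"}


if __name__ == "__main__":
    for header in ("text/html; charset=utf-8", "text/plain"):
        print(header, "->", would_render_as_html(header))
```

If a proxy or server on a flaky connection ends up sending text/plain, the second case applies and the user sees the markup as text.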

+1  A: 

I guess this is called the FOUC problem.

A flash of unstyled content (FOUC) is an instance where a web page appears briefly unstyled prior to loading an external CSS stylesheet. The page corrects itself as soon as the style rules are loaded and applied; however, the shift is quite visible and distracting. After the web page appears, the viewer sees unstyled HTML morph into a differently styled document.

Why does a page take longer to load?

One of the most problematic tasks when working on a Web browser is getting an accurate measurement of how long you're taking to load Web pages. In order to understand why this is tricky, we'll need to understand what exactly browsers do when you ask them to load a URL.

So what happens when you go to a URL like cnn.com? Well, the first step is to start fetching the data from the network. This is typically done on a thread other than the main UI thread.

As the data for the page comes in, it is fed to an HTML tokenizer. It's the tokenizer's job to take the data stream and figure out what the individual tokens are, e.g., a start tag, an attribute name, an attribute value, an end tag, etc. The tokenizer then feeds the individual tokens to an HTML parser.
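
As a rough illustration of tokenizing a stream as it arrives, here is a sketch using Python's standard html.parser module. It is not how any browser implements its tokenizer; the class TokenLogger and the function tokenize_in_chunks are names invented for this example.

```python
from html.parser import HTMLParser


class TokenLogger(HTMLParser):
    """Records start tags, end tags, and non-blank text as simple tuples."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        self.tokens.append(("start", tag, attrs))

    def handle_endtag(self, tag):
        self.tokens.append(("end", tag))

    def handle_data(self, data):
        if data.strip():
            self.tokens.append(("text", data))


def tokenize_in_chunks(chunks):
    """Feed network-sized chunks one at a time, as data would arrive on a slow link."""
    parser = TokenLogger()
    for chunk in chunks:
        parser.feed(chunk)  # an incomplete tag is buffered until the rest arrives
    parser.close()
    return parser.tokens


if __name__ == "__main__":
    # The first chunk ends in the middle of a tag.
    for token in tokenize_in_chunks(['<p cla', 'ss="x">Hi</p>']):
        print(token)
```

The point of the chunked feed is that a tag split across two network reads still comes out as one start-tag token.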

The parser's job is to build up the DOM tree for a document. Some DOM elements also represent subresources like stylesheets, scripts, and images, and those loads need to be kicked off when those DOM nodes are encountered.
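
The idea of kicking off subresource loads as nodes are encountered can be sketched like this, again with Python's html.parser. SubresourceFinder is a hypothetical name, and the sketch only records which URLs would be requested, in document order, rather than fetching anything.

```python
from html.parser import HTMLParser


class SubresourceFinder(HTMLParser):
    """Collects (kind, url) pairs for stylesheets, scripts, and images as tags are seen."""

    def __init__(self):
        super().__init__()
        self.loads = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link":
            rel = (attributes.get("rel") or "").lower().split()
            if "stylesheet" in rel and attributes.get("href"):
                self.loads.append(("stylesheet", attributes["href"]))
        elif tag == "script" and attributes.get("src"):
            self.loads.append(("script", attributes["src"]))
        elif tag == "img" and attributes.get("src"):
            self.loads.append(("image", attributes["src"]))


def find_subresources(html):
    finder = SubresourceFinder()
    finder.feed(html)
    finder.close()
    return finder.loads


if __name__ == "__main__":
    page = '<link rel="stylesheet" href="s.css"><script src="a.js"></script><img src="i.png">'
    print(find_subresources(page))
```

Note that the image is discovered from the markup alone, without waiting for s.css, which matches the point about starting image loads from the DOM tree.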

In addition to building up a DOM tree, modern CSS2-compliant browsers also build up separate rendering trees that represent what is actually shown on your screen when painting. It's important to note two things about the rendering tree vs. the DOM tree.

(1) If stylesheets are still loading, it is wasteful to construct the rendering tree, since you don't want to paint anything at all until all stylesheets have been loaded and parsed. Otherwise you'll run into a problem called FOUC (the flash of unstyled content problem), where you show content before it's ready.

(2) Image loads should be kicked off as soon as possible, and that means they need to happen from the DOM tree rather than the rendering tree. You don't want to have to wait for a CSS file to load just to kick off the loads of images.

There are two options for how to deal with delayed construction of the render tree because of stylesheet loads. You can either block the parser until the stylesheets have loaded, which has the disadvantage of keeping you from parallelizing resource loads, or you can allow parsing to continue but simply prevent the construction of the render tree. Safari does the latter.

External scripts must block the parser by default (because they can call document.write). An exception is when defer is specified for scripts, in which case the browser knows it can delay the execution of the script and keep parsing.
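
The difference between a blocking script and a deferred one can be shown with a toy model of the ordering. This is only a simulation under simplifying assumptions (no network timing, no async scripts), and execution_order and its item kinds are made up for the example.

```python
def execution_order(items):
    """Return the order of parse and run steps for a document.

    items is a list of (kind, name) pairs in document order, where kind is
    'markup', 'script' (blocking), or 'defer' (script with the defer attribute).
    """
    timeline = []
    deferred = []
    for kind, name in items:
        if kind == "markup":
            timeline.append(f"parse {name}")
        elif kind == "script":
            # The parser stops here until the script has run.
            timeline.append(f"run {name}")
        elif kind == "defer":
            # Parsing continues; the script runs once parsing is finished.
            deferred.append(name)
        else:
            raise ValueError(f"unknown item kind: {kind}")
    # Deferred scripts run after parsing, in document order.
    timeline.extend(f"run {name}" for name in deferred)
    return timeline


if __name__ == "__main__":
    document = [
        ("markup", "header"),
        ("defer", "d.js"),
        ("script", "a.js"),
        ("markup", "body"),
    ]
    print(execution_order(document))
```

In this model d.js appears first in the document but runs last, while a.js holds up parsing of everything after it.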

What are some of the relevant milestones in the life of a loading page as far as figuring out when you can actually reliably display content?

(1) All stylesheets have loaded.
(2) All data for the HTML page has been received.
(3) All data for the HTML page has been parsed.
(4) All subresources have loaded (the onload handler time).
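
The four milestones above could be tracked with a small state object along these lines. This is a simplified model, not taken from any browser's source; PageLoadState, pending_other, and safe_to_paint are hypothetical names, and the paint rule simply encodes the earlier point that nothing should be painted while stylesheets are still loading.

```python
from dataclasses import dataclass


@dataclass
class PageLoadState:
    pending_stylesheets: int = 0  # stylesheets requested but not yet loaded
    pending_other: int = 0        # images, scripts, and other subresources still loading
    html_received: bool = False
    html_parsed: bool = False

    def milestones_reached(self):
        """Return the numbers of the milestones (1 to 4) reached so far."""
        reached = []
        if self.pending_stylesheets == 0:
            reached.append(1)
        if self.html_received:
            reached.append(2)
        if self.html_parsed:
            reached.append(3)
        if self.html_parsed and self.pending_stylesheets == 0 and self.pending_other == 0:
            reached.append(4)  # roughly when the onload handler would fire
        return reached

    def safe_to_paint(self):
        """Painting before milestone 1 risks a flash of unstyled content."""
        return self.pending_stylesheets == 0


if __name__ == "__main__":
    state = PageLoadState(pending_stylesheets=1, pending_other=2,
                          html_received=True, html_parsed=True)
    print(state.milestones_reached(), state.safe_to_paint())
    state.pending_stylesheets = 0
    print(state.milestones_reached(), state.safe_to_paint())
```

The first state has fully parsed HTML but is still not safe to paint, which is the situation where showing content early would produce FOUC.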

You can find more info on this here.

Hope that helps in explaining why this happens.

kayteen