An application I'm building is generating XHTML documents that are going to be distributed in a bunch of different ways, including email. I can open these documents in Firefox or Chrome (and by "open" I mean from Windows Explorer, not through a web server). With IE 7, though, I'm having two - possibly three - different problems.

If the files are named with the extension ".xhtml", then IE launches and then closes. Sometimes it's still running in Task Manager and I have to kill it. Sometimes not.

If I name them with the extension ".htm" or ".html", then they open properly, except the IE Information Bar comes up telling me that it has blocked content of some kind. These documents don't contain any scripts or iframes or objects - they're as plain-vanilla XHTML as can be. They don't even reference external CSS.

When the customer I'm developing this for opens the documents in his environment (he's just using the ".xhtml" extension at this point), IE opens them and renders them as XML documents.

I've spent a fair amount of time on Google to try and get to the bottom of this, and everything I find there has to do with specifying the MIME type in the HTTP header, which is not especially useful as I'm not actually serving these files.

The files all (seem to) have the proper DOCTYPE, processing instruction, and namespace declarations; the top of each looks like this:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">

Any ideas?

A: 

Look at the .html entry under HKEY_CLASSES_ROOT in the Windows Registry, then try cloning it to make a corresponding .xhtml entry.

<Standard disclaimer about how messing with the registry is perilous>

ChrisW
Not only perilous, but I'd have to do it on every machine that I was emailing these documents to. And (see my answer), there's already an `.xhtml` entry.
Robert Rossney
+2  A: 

IE opens them and renders them as XML documents.

That is normal. If you want to distribute XHTML as files for viewing in IE you are going to have to stick with .html.

I don't know what's broken in your setup; perhaps a messed-up file association?

If I name them with the extension ".htm" or ".html", then they open properly, except the IE Information Bar comes up telling me that it has blocked content of some kind.

Curious. Until you find out what, exactly, IE is thinking is active content, try inserting the Mark Of The Web to mollify IE. This requires losing the XML prolog, but that was only using the defaults anyway, so including it didn't get you anything.

<!-- saved from url=(0014)about:internet -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">

Note you have to be using CRLF line endings (at least on the first line) for this to work. Ugh.
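
If the files are generated by a program, the steps above can be sketched roughly as follows in Python: drop the XML declaration, put the Mark of the Web on the first line, and force CRLF line endings. The function name `add_mark_of_the_web` is made up for illustration. The `(0014)` in the comment is the character count of the URL that follows it (`about:internet` is 14 characters).

```python
import re

# The Mark of the Web for the Internet zone; (0014) is len("about:internet").
MOTW = "<!-- saved from url=(0014)about:internet -->"


def add_mark_of_the_web(xhtml: str) -> bytes:
    """Return the document as UTF-8 bytes with the MotW first and CRLF endings.

    Hypothetical helper: a sketch of the approach, not a tested IE fix.
    """
    # Remove a leading XML declaration (and an optional BOM) if present.
    body = re.sub(r"\A\ufeff?\s*<\?xml\s[^>]*\?>\s*", "", xhtml)
    # Normalize every line ending to LF first so CRLFs are not doubled.
    body = body.replace("\r\n", "\n").replace("\r", "\n")
    text = MOTW + "\n" + body
    # Convert all line endings to CRLF, including the one after the MotW.
    return text.replace("\n", "\r\n").encode("utf-8")
```

Write the result in binary mode (for example `open(path, "wb")`) so the line endings are not translated again on the way to disk.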

bobince
The Mark of the Web (really, Microsoft? *Really?*) doesn't have any effect on that particular issue, sadly.
Robert Rossney
Are you saying you still get "Active content blocked" even with the Mark of the Web? The horribleness of the MotW concept aside, this should not happen. What zone does IE say the file is in? (It should be "Internet zone".)
bobince
A: 

You could try saving them as .html and adding a meta line in the head section that describes the content as HTML instead of something active:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
Perchik
Regrettably (see my answer) by the time the browser gets around to examining this tag the question "is she or isn't she text/html?" has already been answered.
Robert Rossney
+3  A: 

So, funny story. IE7 doesn't actually support strict XHTML.

Specifically, if you serve it XHTML with a content-type of application/xhtml+xml, it will go "oh, that new-fangled XHTML stuff, I don't know anything about it," and will treat it as an XML document. If, on the other hand, you serve it XHTML with a content-type of text/html, it goes, "this HTML tastes kinda funny, but I can choke it down."

There's a remarkable hack - suggested by the W3C, no less - to make IE render what it thinks is XML content as HTML. You basically add the IE-only xml-stylesheet processing instruction to the document and reference an XSLT identity transform with an output type of HTML. Other browsers ignore it; IE transforms what it thinks is an XML document (to itself) and then renders it as HTML. I don't know whether to be impressed or appalled by this hack.

But that hack only works when the document can resolve the reference to the transform. Whoever opens the email that these documents are attached to won't necessarily be able to do this. I suppose there's a way of including the transform inside the XML itself, but I have already spent Way Too Much Time on this. I'd spend more time if it got me to the right answer, but this just gets me to a different flavor of wrong answer.
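
For reference, here is a rough sketch of what that hack involves, written as Python that produces the two pieces. The processing instruction is spelled `xml-stylesheet` in the W3C recommendation. The file name `copy.xsl` and the helper `with_stylesheet_pi` are assumptions for illustration, and the stylesheet is the textbook XSLT 1.0 identity template with HTML output, not necessarily the exact one from the W3C note.

```python
# Textbook XSLT 1.0 identity transform, with the output method set to HTML.
IDENTITY_XSL = """\
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
"""


def with_stylesheet_pi(xhtml: str, href: str = "copy.xsl") -> str:
    """Insert an xml-stylesheet PI that points at the transform.

    Hypothetical helper. The PI has to come after the XML declaration
    (if there is one) and before the DOCTYPE and root element.
    """
    pi = f'<?xml-stylesheet type="text/xsl" href="{href}"?>\n'
    if xhtml.startswith("<?xml "):
        end = xhtml.index("?>") + 2  # end of the XML declaration
        return xhtml[:end] + "\n" + pi + xhtml[end:].lstrip()
    return pi + xhtml
```

The stylesheet has to be saved wherever `href` points, which is exactly the dependency that makes this unsuitable for documents sent as email attachments.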

So I'm going to do the stupid thing, and name the files with an .htm extension. Registry settings map file extensions to content types. So the .htm extension means text/html, and the .xhtml extension means application/xhtml+xml. IE, and everything else that uses the registry to determine the content type, will treat these documents as text/html, and they'll render, and the whole thing will more or less work. But I'm not happy about it.
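
To see what a given machine actually has registered, something like the following can be used. It is a sketch that assumes the mapping lives in the `Content Type` value under `HKEY_CLASSES_ROOT\<extension>`; the function name `registered_content_type` is hypothetical, and it returns `None` when run off Windows or when no such value is set.

```python
import sys


def registered_content_type(extension: str):
    # Look up the "Content Type" value under HKEY_CLASSES_ROOT\<extension>.
    # Returns the registered MIME type as a string, or None if unavailable.
    if sys.platform != "win32":
        return None  # the registry only exists on Windows
    import winreg  # standard library, Windows only

    try:
        with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, extension) as key:
            value, _value_type = winreg.QueryValueEx(key, "Content Type")
            return value
    except OSError:
        # Missing key or missing value both end up here.
        return None


if __name__ == "__main__":
    for ext in (".htm", ".html", ".xhtml"):
        print(ext, registered_content_type(ext))
```

This only reads the registry; it changes nothing, so it is safe to run on a customer's machine to compare their setup with yours.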

Robert Rossney
xml-stylesheet is not IE-only; it's a standard way of invoking XSLT. Even with it, though, IE is still in HTML mode and things like CSS case sensitivity and JavaScript DOM calls won't work in the way you expect for an XML document. Avoid; stick to XHTML-as-.html.
bobince
I knew there was something hinky about the xml-stylesheet PI, but I misremembered what: it's that the W3C's docs say, in essence, "this is a hack that will get replaced once we figure out how this feature should really be designed."
Robert Rossney