A: 

Don't use the <?xml ... ?> construct (I can't remember its proper name). Use a DOCTYPE:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
    <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=ISO-8859-1"/>
    ...
</head>
...
</html>
Matt Ball
Processing Instruction?
Vincent Ramdhanie
That's the one.
Matt Ball
I have removed <?xml ...?> from the jsp page, included the DOCTYPE. Page renders but still → shows up as a question mark. Did not solve the problem.
m_a_khan
Try using HTML rather than XHTML if you want to use the `&rarr;` entity. Also try using a numeric character reference such as `&#8594;` instead of `&rarr;`.
Matt Ball
It's an ‘XML Declaration’, which technically isn't a Processing Instruction even though it looks like one. You probably don't need an XML Declaration (because XML is UTF-8 by default; HTML still needs the meta) and shouldn't use it in HTML-compatible-XHTML as IE will go into quirks mode.
bobince
+1  A: 

The page stops rendering and throws XML parsing error such as "semi colon expected somewhere in my javascript" or "processing instruction not found" etc etc.

That will happen if you declare XHTML as XML instead of "HTML with XML syntax". Indeed get rid of that XML declaration. If you can, I'd go a step further and just use HTML as real HTML, i.e. use <!doctype html> or any other HTML strict doctype. Also see http://hsivonen.iki.fi/doctype/.

<% request.setCharacterEncoding("UTF-8"); %>

First detail is that the request.setCharacterEncoding("UTF-8") is way superfluous. At this stage it's already too late to set that. Second detail is that you're using scriptlets for that. I recommend not to do so. Use taglibs/EL where appropriate. If that's not possible, then the code logic actually belongs in a Java class, either directly or indirectly in a Servlet or Filter class.

Removing "charset=utf-8" from response.setContentType makes the page render. The only problem is that → shows up as a question mark "?"

The response.setContentType(..) is superfluous if you already set it as a <meta> tag in the HTML <head> which is imho much cleaner.

Finally you also need to set the response character encoding (that's different from setting the content type!) as follows:

<%@ page pageEncoding="UTF-8" %>

This, by the way, also implicitly sets the charset in the Content-Type response header. More background information and hints can be found here.
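To see why the arrow degrades to "?" when the response ends up encoded in a charset that has no such character, here is a minimal standalone Java sketch. It is not JSP code and not what the container literally runs; the class and method names are made up for illustration. It only shows what the JDK does when a string containing → is turned into bytes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any servlet or JSP API.
final class ArrowEncodingDemo {

    // Encodes the text with the given charset and renders the bytes as hex.
    static String hexBytes(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String arrow = "\u2192"; // RIGHTWARDS ARROW

        // UTF-8 can represent the arrow: three bytes, E2 86 92.
        System.out.println("UTF-8:      " + hexBytes(arrow, StandardCharsets.UTF_8));

        // ISO-8859-1 cannot, so getBytes substitutes the charset's
        // replacement byte 3F, which is a literal '?'.
        System.out.println("ISO-8859-1: " + hexBytes(arrow, StandardCharsets.ISO_8859_1));
    }
}
```

If the page shows "?" in place of the arrow, that suggests the character was lost on the server side at encoding time, rather than misinterpreted by the browser afterwards.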

Hope this helps.

BalusC
Removed request.setCharacterEncoding, included <meta .... > under <head>, page renders but → still shows up as question mark.
m_a_khan
I've edited my answer to show how to set the response encoding properly.
BalusC
<meta> with charset is ignored in XHTML by design. It's only for HTML (and HTML-pretending-to-be-XML in IE).
porneL
A: 

You probably have code like this:

<script type="text/javascript"> if (a && b) </script>

which is forbidden in XHTML mode, but required in text/html mode. You'll find an explanation of this problem in Sending XHTML as text/html Considered Harmful.

And code like:

<a href="foo?bar&baz">

is not allowed in any version of HTML or XHTML. It must always be written as:

<a href="foo?bar&amp;baz">
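If the attribute values are built on the server side, the escaping can be done in code rather than by hand. Below is a rough Java sketch of that idea; `AttributeEscaper` and `escapeAttribute` are hypothetical names, not an existing library API, and it only handles the characters that are special inside a double-quoted attribute value.

```java
// Hypothetical helper, shown only to illustrate the escaping rule.
final class AttributeEscaper {

    // Replaces the characters that are special inside a double-quoted
    // HTML/XHTML attribute value with entity references.
    static String escapeAttribute(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String url = "foo?bar&baz";
        // Prints: <a href="foo?bar&amp;baz">
        System.out.println("<a href=\"" + escapeAttribute(url) + "\">");
    }
}
```

Note that this assumes raw, not yet escaped input: feeding it a value that already contains `&amp;` would escape the ampersand a second time. In a JSP you would more likely rely on JSTL's `<c:out>` or `fn:escapeXml` than on a hand-rolled helper.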

Apparently you're not generating the page using an XML serializer (it wouldn't let you create invalid entities or improperly encoded characters), so I suggest that you use HTML 4 Strict or HTML5-as-text/html instead, which are more appropriate for hand-coded markup.

porneL