I'm currently trying to deploy some RSS feeds on a WebLogic Application Server. The feeds' views are .jspx files, like the one below:

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" 
    xmlns:georss="http://www.georss.org/georss"
    xmlns:jsp="http://java.sun.com/JSP/Page"
    xmlns:c="http://java.sun.com/jsp/jstl/core"
    xmlns:fmt="http://java.sun.com/jsp/jstl/fmt"
    xmlns:fn="http://java.sun.com/jsp/jstl/functions"
    xmlns:util="http://example.com/util">
    <jsp:directive.page pageEncoding="utf-8" contentType="application/xhtml+xml" /> 

    <jsp:useBean id="now" class="java.util.Date" scope="page" />

    [...]

    <c:forEach var="category" items="${categories}">
    <entry>
        <title>${util:htmlEscape(category.label)}</title>
        <id>${category.id}</id>
        <c:if test="${empty parentId}">
        <link href="${util:htmlEscape(fullRequest)}?parentId=${category.id}" />
        </c:if>
        <summary>${util:htmlEscape(category.localizedLabel)}</summary> 
    </entry>
    </c:forEach>
</feed>

The problem is that on my local development server (Apache Tomcat 6.0) everything renders fine, but on the WebLogic server I get all the UTF-8 characters back mangled.

In Firefox, I see something like <summary>Formaci�n</summary>. The byte sequence for the strange character is ef bf bd, and I get it for every non-ASCII character I expect to receive in the tests I'm conducting (á, ó, í). I've checked the content type and encoding in Firebug and they seem OK (Content-Type: application/xhtml+xml; charset=UTF-8).

In Chrome, the content gets truncated at the first occurrence of the strange character, with the error message: This page contains the following errors: error on line 1 at column 523: Encoding error.

I'm not sure what's happening, but I think it's related to something that the web server is doing, considering that on my local Tomcat everything's ok. Any ideas are welcome.

Thanks,
Alex

+1  A: 

The � is the Unicode replacement character, U+FFFD (in UTF-8 indeed 0xEF 0xBF 0xBD).

Firefox displays this character in place of any byte sequence that is not valid in the character encoding it has been instructed to use for the page.

The browser has been instructed to decode the page as UTF-8, and the character was originally ó (U+00F3), which is 0xC3 0xB3 in UTF-8 but the single byte 0xF3 in a single-byte charset such as ISO-8859-1. A lone 0xF3 is not a valid UTF-8 sequence, so the symptoms indicate that the server is actually encoding the response as ISO-8859-1 while instructing the browser to decode it as UTF-8.
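
To see the mismatch in isolation, here is a minimal standalone Java sketch. It is not WebLogic code, and the class and method names (EncodingMismatchDemo, encodeAsLatin1, decodeLenient, isValidUtf8, hex) are made up for illustration. It assumes the server writes the text as ISO-8859-1 and that the client decodes it as UTF-8, once leniently (substituting U+FFFD, as Firefox appears to do) and once strictly (failing on the first bad byte, as Chrome appears to do).

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    // Assumed server behaviour: the text is written out as ISO-8859-1 bytes.
    public static byte[] encodeAsLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Lenient client: malformed UTF-8 input is replaced with U+FFFD.
    public static String decodeLenient(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Strict client: any malformed UTF-8 input is reported as an error.
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Formats bytes as lowercase hex pairs separated by spaces.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "Formación", with the ó written as an escape so the source file
        // encoding cannot interfere with the demonstration.
        String original = "Formaci\u00f3n";

        byte[] wire = encodeAsLatin1(original);
        String seen = decodeLenient(wire);

        System.out.println(hex(wire));
        System.out.println(hex(seen.getBytes(StandardCharsets.UTF_8)));
        System.out.println(isValidUtf8(wire));
    }
}
```

The first line printed contains the single byte f3 where the ó was, the second contains ef bf bd in its place (the bytes you saw in Firefox), and the third is false, which matches Chrome refusing the document with an encoding error.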

I don't use WebLogic, so I googled a bit and came across this old bug report, which suggests adding the following to the weblogic.xml file to force it to parse JSP files as UTF-8.

<weblogic-web-app>
    <jsp-descriptor>
        <jsp-param>
            <param-name>encoding</param-name>
            <param-value>UTF-8</param-value>
        </jsp-param>
        <jsp-param>
            <param-name>compilerSupportsEncoding</param-name>
            <param-value>false</param-value>
        </jsp-param>
    </jsp-descriptor>
</weblogic-web-app>

See if that helps to solve your problem.

BalusC
Thanks for the response! It helped me better understand what was going on. I eventually found the issue and have already posted it as an answer. I tried the config sample you suggested, but it gave me an error when I tried to push it to the server, so I started trying other stuff :). Thanks again, Alex.
Alex Ciminian
You're welcome :)
BalusC
+1  A: 

The issue was coming from the order of the attributes in the jspx directive and the fact that I wasn't including the charset in the contentType attribute!

After switching:

<jsp:directive.page pageEncoding="utf-8" contentType="application/xhtml+xml" />

to:

<jsp:directive.page contentType="application/xhtml+xml; charset=UTF-8" 
     pageEncoding="UTF-8" />

the characters came out fine. I fiddled around a bit more and, curiously, found out that this:

<jsp:directive.page pageEncoding="UTF-8"
      contentType="application/xhtml+xml; charset=UTF-8" />

doesn't work. I don't really understand why, but I'm guessing that it's a bug in WebLogic. The version I deployed on was 10.0.

Alex Ciminian
Wow, that's pretty nasty. Good find!
BalusC