My Servlet just won't use UTF-8 for JSON responses.

MyServlet.java:

public class MyServlet extends HttpServlet {

  protected void doPost(HttpServletRequest req, HttpServletResponse res) throws ServletException, IOException {

    PrintWriter writer = res.getWriter();

    res.setCharacterEncoding("UTF-8");
    res.setContentType("application/json; charset=UTF-8");

    writer.print(getSomeJson());
  }
}

But special characters aren't showing up, and when I check the headers that I'm getting back in Firebug, I see Content-Type: application/json;charset=ISO-8859-1.

I did a grep -ri iso . in my Servlet directory, and came up with nothing, so nowhere am I explicitly setting the type to ISO-8859-1.

I should also specify that I'm running this on Tomcat 7 in Eclipse with a J2EE target as a development environment, with Solaris 10 and whatever they call their web server environment (somebody else admins this) as the production environment, and the behavior is the same.

I've also confirmed that the request submitted is UTF-8, and only the response is ISO-8859-1.

Update

I have amended the code above to show that I call getWriter() before I set the character encoding. I omitted this from my original example, and now I realize that it was the source of my problem. I read that you have to set the character encoding before you call HttpServletResponse.getWriter(); otherwise getWriter() locks in the default, ISO-8859-1, and later calls to setCharacterEncoding() or setContentType() no longer change the charset.

This was my problem. So the above example should be adjusted to:

public class MyServlet extends HttpServlet {

  protected void doPost(HttpServletRequest req, HttpServletResponse res) throws ServletException, IOException {

    res.setCharacterEncoding("UTF-8");
    res.setContentType("application/json");

    PrintWriter writer = res.getWriter();
    writer.print(getSomeJson());
  }
}
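
To illustrate why the special characters went missing, here is a minimal standalone sketch using only the JDK. It assumes the container's writer behaves like `String.getBytes(charset)`, replacing characters the charset cannot represent; the class name `CharsetDemo` and the helper `roundTrip` are made up for this illustration and are not part of the servlet API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {

    // Hypothetical helper: encodes text roughly the way a response writer
    // bound to `charset` would (assumption), then decodes it with the same
    // charset, as a client that honors the Content-Type header would.
    static String roundTrip(String text, Charset charset) {
        byte[] wire = text.getBytes(charset);
        return new String(wire, charset);
    }

    public static void main(String[] args) {
        // U+20AC (euro sign) has no mapping in ISO-8859-1.
        String json = "{\"price\":\"\u20ac5\"}";

        // With ISO-8859-1 the euro sign is replaced during encoding.
        System.out.println(roundTrip(json, StandardCharsets.ISO_8859_1).equals(json));

        // With UTF-8 the text survives unchanged.
        System.out.println(roundTrip(json, StandardCharsets.UTF_8).equals(json));
    }
}
```

Characters that ISO-8859-1 cannot represent are swapped for a replacement byte when encoded, which would match the symptom of special characters not showing up.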
+1  A: 

The code looks fine. Either you're not running the code you think you're running, or there's some Filter or proxy somewhere in the request-response chain which modifies the content type like that.

BalusC
I don't know where this could be happening unless it's the default behavior of my environment. There is one source file and I wrote it from scratch. But this is also my first Servlet, so I don't know what a `Filter` is.
sidewaysmilk
Nothing else in webapp's `/WEB-INF/web.xml`? What Tomcat version exactly? Did you modify anything in its `/conf/web.xml` file after downloading/installing it?
BalusC
Nothing else there. See my updated question.
sidewaysmilk
In other words, the original code in your question looked fine and you weren't running the code you thought you were running :)
BalusC
Yes. You are technically correct--the best kind of correct! I appreciate your response, but it didn't really do much to point me in the right direction. Thank you, though.
sidewaysmilk
+1  A: 

Once the response's writer has been obtained (or the response has been committed), its encoding cannot be changed.

The easiest way to force UTF-8 is to create your own filter which is the first to peek at the response and set the encoding.

Take a look at how Spring 3.0 does this. Even if you can't use Spring in your project, maybe you can get some inspiration (make sure your company policy allows you to borrow ideas from open-source code).

Leonel
You're right about not being able to change the encoding. I was setting it twice, but not explicitly. I was doing it implicitly with getWriter. See my updated question. Thanks for your answer, and +1.
sidewaysmilk
A: 

Aside from the specific problem, you really should consider getting the output stream and using a JSON library to write the contents directly as UTF-8 encoded JSON; there is no benefit to using writers. Some JSON packages only work with strings, which is unfortunate, but most allow using streams, which is safer and more efficient because the parser/generator can handle the escaping and encoding aspects together.
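
As a rough sketch of the stream-based approach, the snippet below writes JSON as UTF-8 bytes straight to an `OutputStream`. It uses only the JDK: the JSON is a hand-written string rather than the output of a JSON library, a `ByteArrayOutputStream` stands in for the stream you would get from `res.getOutputStream()`, and the names `Utf8StreamDemo`, `encode`, and `writeJson` are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class Utf8StreamDemo {

    // Encodes the JSON text explicitly as UTF-8, independent of any
    // platform or container default.
    static byte[] encode(String json) {
        return json.getBytes(StandardCharsets.UTF_8);
    }

    // Writes the encoded bytes directly to the stream, with no Writer involved.
    static void writeJson(OutputStream out, String json) throws IOException {
        out.write(encode(json));
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for res.getOutputStream() (assumption for this demo).
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeJson(out, "{\"name\":\"caf\u00e9\"}");

        // Decoding the bytes as UTF-8 gives back the original text.
        System.out.println(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

In a servlet you would presumably still call `res.setContentType("application/json; charset=UTF-8")` so the header tells the client how to decode the bytes.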

StaxMan
Thank you for your suggestion. I'll look into this. This is my first servlet, so I'm not too familiar with the libraries available. I am bound to using whatever is available on our default Solaris 10 environment.
sidewaysmilk