Hello, I'm building some websites with JSP scripting and I encountered a strange problem with URL-encoded web form parameters. The site itself is encoded in ISO-8859-1.

I have a simple webform with a field called description.

If I enter German umlauts or special characters like "ü" or "ß", they are automatically URL-encoded. But when I try to read this parameter, I always get null.

String description = request.getParameter("description");

If I enter some Chinese words, like 專業人士, they are URL-encoded, too. However, I can read them without getting null.

This behaviour doesn't change whether I use "post" or "get" as the method. I tried to "pimp" my web form with the attributes below, but that didn't help either.

accept-charset="ISO-8859-1" enctype="application/x-www-form-urlencoded"

My question is: Why can't I retrieve URL-encoded German umlauts, while some Chinese words work?

+1  A: 

I suspect your page or servlet encoding is UTF-8. A Latin-1 encoded umlaut is an invalid UTF-8 byte sequence, so you get null.

When you enter Chinese, the browser knows it can't encode this in Latin-1 so it changes into UTF-8 automatically. That's why Chinese works.

If you can post some HTTP trace, we can confirm this.
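
To illustrate the suspected mismatch, here is a minimal sketch using only java.net.URLEncoder and java.net.URLDecoder. The class name FormEncodingDemo and its helper methods are made up for this example, and it does not reproduce what Resin or any other container does internally. It only shows that "ü" URL-encoded as ISO-8859-1 does not survive being decoded as UTF-8.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class, not part of the original question or answer.
public class FormEncodingDemo {

    // URL-encode text using the given charset name.
    static String encode(String text, String charset) throws Exception {
        return URLEncoder.encode(text, charset);
    }

    // URL-decode text using the given charset name.
    static String decode(String encoded, String charset) throws Exception {
        return URLDecoder.decode(encoded, charset);
    }

    public static void main(String[] args) throws Exception {
        String umlaut = "\u00FC"; // "ü", written as an escape to avoid source-file encoding issues

        // What a browser would send for a Latin-1 form versus a UTF-8 form.
        String latin1 = encode(umlaut, "ISO-8859-1");
        String utf8 = encode(umlaut, "UTF-8");
        System.out.println("ISO-8859-1: " + latin1);
        System.out.println("UTF-8:      " + utf8);

        // Decoding the Latin-1 bytes as UTF-8 hits an invalid byte sequence.
        String mismatched = decode(latin1, "UTF-8");
        System.out.println("round trip with wrong charset ok? " + mismatched.equals(umlaut));

        // Decoding with the charset that was used for encoding works.
        String matched = decode(latin1, "ISO-8859-1");
        System.out.println("round trip with same charset ok?  " + matched.equals(umlaut));
    }
}
```

Plain Java substitutes the replacement character U+FFFD for the invalid byte. Whether a servlet container hands back null instead, as described in the question, is container-specific and is an assumption here.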

ZZ Coder
+1  A: 

Chinese with ISO-8859-1 won't work, but German should. If the Chinese is encoded anyway, maybe your form is encoded in UTF-8? Which browser do you use? What is the encoding of the JSP? You should always use UTF-8, not Latin-1. Nowadays every browser and server should support UTF-8 on every OS.

In my experience, GET often does not work well with encodings, but POST should. The form attributes accept-charset and enctype are correct. What server do you use?
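
When a parameter does arrive garbled because the server decoded UTF-8 bytes as ISO-8859-1 (a common symptom with GET query strings), the value can sometimes be repaired after the fact. The following is only a sketch of that workaround, not something the poster's setup is known to need; the class ParamRedecoder and its method redecode are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, assuming the container decoded UTF-8 bytes as ISO-8859-1.
public class ParamRedecoder {

    // Recover the original bytes via ISO-8859-1 (which maps every byte to one
    // char and back) and reinterpret them as UTF-8.
    static String redecode(String misdecoded) {
        byte[] rawBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String sent = "\u00FC"; // "ü" as typed into the form

        // Simulate the server side: UTF-8 bytes wrongly read as ISO-8859-1.
        String misdecoded = new String(sent.getBytes(StandardCharsets.UTF_8),
                                       StandardCharsets.ISO_8859_1);
        System.out.println("garbled length:  " + misdecoded.length());

        String repaired = redecode(misdecoded);
        System.out.println("repaired equals original? " + repaired.equals(sent));
    }
}
```

This only helps when the mismatch is exactly UTF-8 read as ISO-8859-1. Making the page, form, and server agree on one charset remains the cleaner fix.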

The JSP content type is typically set by:

<%@ page language="java" contentType="text/html; charset=ISO-8859-1" 
                         pageEncoding="ISO-8859-1" %>
Arne Burmeister
I know that Latin-1 is "outdated", but I'm stuck with some legacy systems. I can switch to UTF-8, but that's not that easy.
The encoding of the file itself, the meta tag, and the page directives are all set to ISO-8859-1:
<%@ page contentType="text/html; charset=ISO-8859-1"%>
<% request.setCharacterEncoding("ISO-8859-1"); response.setCharacterEncoding("ISO-8859-1"); %>
Browser: Firefox, IE 6, Safari, Opera ... whatever you want
Server: IIS + Resin (Java EE server from caucho.com)
GET for this form was only for testing. I prefer POST, too.
Thanks for your help.
Johannes
Do not set the encoding on the request or response in that way. Just set the contentType on the page.
Arne Burmeister
Thanks! I'll try that tomorrow. Why is it better to set these encodings via the page directive?
Johannes
OK, it was strange behaviour in IE6. It simply doesn't send the data ... I'm now using UTF-8 and everything works like a charm.
Johannes