views:

5749

answers:

5

Hi all,

I'm coding an Italian website where I need to validate some input data with an XHR call. My code for the Ajax request looks like this (I'm using jQuery 1.3.2):

 $.ajaxSetup({
    type: "POST",
    timeout: 10000,
    contentType: "application/x-www-form-urlencoded; charset=iso-8859-1"        
}); 


 $.ajax({
    url: "ajaxvalidate.do",
    data: {field:controlInfo.field,value:controlInfo.fieldValue},
    dataType: "json",
    complete: function() {
        //
    },
    success: function(msg) {
        handleAsyncMsg(controlInfo, msg, closureOnError);
    },
    error: function(xhr, status, e) {            
        showException(controlInfo.id, status);

    }

});

On the backend I have a Java Struts action to handle the XHR. I need to use the ISO-8859-1 encoding in the page to ensure the data (especially accented characters) are sent correctly on the synchronous submit.

Everything works like a charm in Firefox, but when I handle an async POST from IE 7 with accented characters I have a problem: I always receive invalid characters (UTF-8, maybe?). E.g. I type àààààààà in the form and I get this value in my request: Ã Ã Ã Ã Ã Ã Ã Ã. Since the request charset is correctly set to ISO-8859-1, I can't understand why the server still isn't parsing the form value correctly.

This is a log sample with all the request headers and the error (the server is an old BEA WebLogic 8.1):

Encoding: ISO-8859-1
Header: x-requested-with - Value: XMLHttpRequest
Header: Accept-Language - Value: it
Header: Referer - Value: https://10.172.14.36:7002/reg-docroot/conv/starttim.do
Header: Accept - Value: application/json, text/javascript
Header: Content-Type - Value: application/x-www-form-urlencoded; charset=iso-8859-1
Header: UA-CPU - Value: x86
Header: Accept-Encoding - Value: gzip, deflate
Header: User-Agent - Value: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
Header: Host - Value: 10.172.14.36:7002
Header: Content-Length - Value: 65
Header: Connection - Value: Keep-Alive
Header: Cache-Control - Value: no-cache
Header: Cookie - Value: JSESSIONID=JQJlNpVC86yTZJbcpt54wzt82TnkYmWYC5VLL2snt5Z8GTsQ1pLQ!1967684811
Attribute: javax.net.ssl.cipher_suite - Value: SSL_RSA_WITH_RC4_128_MD5
Attribute: javax.servlet.request.key-size - Value: 128
Attribute: javax.servlet.request.cipher_suite - Value: TLS_RSA_WITH_RC4_128_MD5
Attribute: javax.servlet.request.key_size - Value: 128
Attribute: weblogic.servlet.network_channel.port - Value: 7001
Attribute: weblogic.servlet.network_channel.sslport - Value: 7002
Attribute: org.apache.struts.action.MESSAGE - Value: org.apache.struts.util.PropertyMessageResources@4a97dbd
Attribute: org.apache.struts.globals.ORIGINAL_URI_KEY - Value: /conv/ajaxvalidate.do
Attribute: errors - Value: org.apache.struts.util.PropertyMessageResources@4a97e4d
Attribute: org.apache.struts.action.MODULE - Value: org.apache.struts.config.impl.ModuleConfigImpl@4aa2ff8
Attribute: weblogic.servlet.request.sslsession - Value: javax.net.ssl.impl.SSLSessionImpl@42157c5
field: nome - value: àààààààà - action: /endtim
+2  A: 

contentType: "application/x-www-form-urlencoded; charset=iso-8859-1"

You can say you're sending a form submission as ISO-8859-1 in the header, but that doesn't mean you actually are. jQuery uses the standard JavaScript encodeURIComponent() method to encode Unicode strings into query-string bytes, and that always uses UTF-8.
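The mismatch is easy to reproduce in a JS console (a quick sketch; unescape() happens to decode each %XX escape as a single ISO-8859-1 character, which is exactly what the server is doing here):

```javascript
// encodeURIComponent always emits UTF-8 percent-escapes,
// regardless of any charset declared in the Content-Type header.
var encoded = encodeURIComponent('\u00E0');   // 'à' -> '%C3%A0' (two UTF-8 bytes)

// A server parsing the body as ISO-8859-1 decodes those two bytes
// separately: 0xC3 -> 'Ã', 0xA0 -> no-break space. That is the
// 'Ã ' pattern from the question, once per accented character.
var misread = unescape(encoded);              // 'Ã' + '\u00A0'
```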

In any case, the ‘charset’ parameter on the ‘application/x-www-form-urlencoded’ media type is highly non-standard. As an ‘x-’ type it has no official MIME registration, HTML 4.01 doesn't specify such a parameter, and it would be very unusual for an ‘application/*’ type. WebLogic claims to detect this construct, for what it's worth.

So what you can do is either:

1: create the form-urlencoded POST body yourself, hacking it into ISO-8859-1 format manually, using something like

function encodeLatin1URIComponent(str) {
    // Keep only characters that fit in a single Latin-1 byte;
    // anything above U+00FF becomes '?'.
    var bytes = '';
    for (var i = 0; i < str.length; i++)
        bytes += str.charCodeAt(i) < 256 ? str.charAt(i) : '?';
    // escape() emits Latin-1 %XX escapes but leaves '+' alone,
    // so encode it explicitly.
    return escape(bytes).split('+').join('%2B');
}

instead of encodeURIComponent().
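To wire that up you'd serialise the parameters yourself and hand $.ajax a pre-built string (jQuery sends string data as-is; setting processData: false makes that explicit). A sketch under those assumptions — the encoder is repeated so the snippet stands alone, and the latin1Body helper name is made up:

```javascript
function encodeLatin1URIComponent(str) {
    // Keep only characters that fit in a single Latin-1 byte;
    // anything above U+00FF becomes '?'.
    var bytes = '';
    for (var i = 0; i < str.length; i++)
        bytes += str.charCodeAt(i) < 256 ? str.charAt(i) : '?';
    // escape() emits Latin-1 %XX escapes but leaves '+' alone.
    return escape(bytes).split('+').join('%2B');
}

// Hypothetical helper: serialise a flat object into a
// Latin-1 urlencoded body.
function latin1Body(params) {
    var parts = [];
    for (var key in params)
        parts.push(encodeLatin1URIComponent(key) + '=' +
                   encodeLatin1URIComponent(params[key]));
    return parts.join('&');
}

var body = latin1Body({ field: 'nome', value: '\u00E0\u00E0' });
// 'field=nome&value=%E0%E0' — single-byte Latin-1 escapes
```

You would then pass it along as something like $.ajax({ url: "ajaxvalidate.do", data: body, processData: false, ... }) in place of the object literal in the question.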

2: lose the ‘charset’ and leave it submitting UTF-8 as normal, and make your servlet understand incoming UTF-8. This is generally best, but it means mucking around with the servlet container config to make it choose the right encoding. For WebLogic this seems to mean using an <input-charset> element in weblogic.xml. By then you're looking at moving your whole app to UTF-8, which is by no means a bad thing (non-Unicode-capable websites are sooo 20th-century!) but may well be a lot of work.
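For option 2, the weblogic.xml fragment would look roughly like this (sketched from memory for the 8.1-era descriptor; check the docs for your exact version and DTD):

```xml
<weblogic-web-app>
  <charset-params>
    <input-charset>
      <!-- Apply UTF-8 decoding to request parameters for all paths -->
      <resource-path>/*</resource-path>
      <java-charset-name>UTF-8</java-charset-name>
    </input-charset>
  </charset-params>
</weblogic-web-app>
```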

bobince
Thanks bobince, I'll try to convert the pages but it'll take some time. I tried the JS encodeURI() and it worked, but I had to manually decode each field from the request on the server side. I also tried putting the XHR in UTF-8 with the pages still in ISO-8859-1 and it seems to work. Really strange...
Marco Z
A: 

I'm getting a similar error with IE: I set contentType: "application/json" in my Ajax call, and that is what gets sent on the wire by FF, but IE adds "application/x-www-form-urlencoded", so that on the wire it looks like: contentType: "application/x-www-form-urlencoded; application/json"

I don't have any fancy characters, but this seems to break the server (I'm using Jersey).

I could fiddle with the server, but it seems wrong that IE is adding the extra content type; I'd much rather fix that. Any ideas?

Thanks

A: 

Actually I just solved that; I needed to do:

jQuery.ajaxSetup({ contentType: "application/json;charset=utf-8" });

which overrides IE's behaviour of appending application/x-www-form-urlencoded to the content type.

A: 

Thanks a lot!!! You helped me find a solution to a problem that made me waste several days without finding any clue!!

A: 

Hi people, I was having similar problems working on a content comments system in our Spanish portal. What finally solved my problem, after many hours of searching, wasn't messing with the jQuery charset (which seems to use UTF-8 no matter what) but decoding from UTF-8 back to ISO-8859-1 in the PHP that processes the Ajax POST. PHP has a built-in function, utf8_decode(), so the first thing I do with the comments string is: $comentario = utf8_decode($_POST['comentario']);

(then I use the nl2br() and htmlentities() PHP functions to prepare the text to be stored with HTML entities instead of special characters)

Good Luck & Peace all over! Seba

Sebastian Alberoni