For our web app, we have multiple HTML pages containing text areas. All of our pages are rendered with an ISO-8859-1 charset. When a page is accessed through IE6 on a Windows machine and special characters such as a "smart quote" are pasted into the text area, some of our pages submit the form using the Windows 1252 character encoding, while others appear to submit using the UTF-8 character encoding. I've been tracking the submit character encoding by using the following hidden field:

<input type="hidden" name="_charset_" />

On the Windows 1252 submit character encoding pages, we receive a value of "windows-1252".

On the UTF-8 submit character encoding pages, we receive a blank value.
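To make the difference between the two cases concrete, here is a minimal Python sketch of how a backend might decode a submitted value based on the reported `_charset_`. The function name `decode_submission` is hypothetical, and treating a blank value as UTF-8 is only the assumption described above, not confirmed browser behavior. The input is assumed to be the raw bytes of the field after percent-decoding.

```python
from urllib.parse import unquote_to_bytes


def decode_submission(raw: bytes, reported_charset: str) -> str:
    """Decode a submitted field using the value of the _charset_ hidden field.

    Assumption: a blank _charset_ means the browser sent UTF-8.
    """
    encoding = reported_charset.strip() or "utf-8"
    try:
        return raw.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # Unknown label or bytes that do not match it: fall back to
        # Windows 1252 and mark undecodable bytes instead of failing.
        return raw.decode("windows-1252", errors="replace")


# A left "smart quote" (U+201C) is one byte in Windows 1252
# and three bytes in UTF-8.
left_quote = "\u201c"
cp1252_bytes = left_quote.encode("windows-1252")
utf8_bytes = left_quote.encode("utf-8")

# Percent-encoded form data can be turned back into raw bytes first.
raw_from_form = unquote_to_bytes("%93")
```

With this in place, `decode_submission(raw_from_form, "windows-1252")` and `decode_submission(utf8_bytes, "")` both yield the same left smart quote, even though the bytes on the wire differ.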

On the backend, we are using ISO-8859-1. While ideally we would want the submit character encoding to be ISO-8859-1 as well, I do not see an option for forcing that behavior in IE 6. Given the choice between Windows 1252 and UTF-8, I would prefer the content be submitted in Windows 1252 so that it is more likely to render correctly when the page re-renders in ISO-8859-1.
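The preference for Windows 1252 can be illustrated with a short Python sketch. It assumes the browser treats a page labeled ISO-8859-1 as Windows 1252 when displaying it, which many browsers do; the variable and function names are illustrative only.

```python
# A left "smart quote" (U+201C), as pasted into the text area.
smart_quote = "\u201c"

# The same character as it would be submitted under each encoding.
sent_as_cp1252 = smart_quote.encode("windows-1252")
sent_as_utf8 = smart_quote.encode("utf-8")

# What appears if the stored bytes are echoed back unchanged and the
# browser interprets them as Windows 1252 (assumption stated above).
shown_from_cp1252 = sent_as_cp1252.decode("windows-1252")
shown_from_utf8 = sent_as_utf8.decode("windows-1252")


def fits_iso_8859_1(text: str) -> bool:
    """Return True if every character in text exists in strict ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False
```

Here `shown_from_cp1252` is the original quote, while `shown_from_utf8` is three unrelated characters. Note that `fits_iso_8859_1(smart_quote)` is false: strict ISO-8859-1 has no smart quotes at all, which is why the browser cannot submit them in that encoding.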

I've looked into our pages in some depth and nothing jumps out at me as the reason why some pages submit in one character encoding and others in another.

1) When IE 6 returns a charset of blank, does that in fact equate to UTF-8? Does IE 6 always return a charset of blank when the submit character encoding is UTF-8, or only when it is unable to properly determine what character encoding to use?

2) What possible differences could there be on the pages that would result in IE 6 picking Windows 1252 on some pages and UTF-8 on others? I scanned the page for UTF-8 characters and for any accept-charset attributes and could not find either.

Additional Note: I found the information on the charset hidden input at the following link.

http://web.archive.org/web/20060427015200/ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html

A: 

MSDN states that IE only accepts "utf-8" as a value for the accept-charset attribute.

Mike Hearn