windows-1252

Detecting encoding conversion problems

The majority of content on my company's website starts life as a Word document (Windows-1252 encoded) and is eventually copied and pasted into our UTF-8-encoded content management system. The conversion usually chokes on a few characters (special break characters, smart quotes, scientific notation), which have to be cleaned up manually, ...
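
One way to spot such characters after the paste can be sketched as follows. This is a minimal illustration, not the poster's tooling: the function name `find_suspect_chars` and the choice of "suspect" ranges (C1 controls and U+FFFD) are assumptions.

```python
def find_suspect_chars(text):
    """Return (index, code point) pairs for characters that often signal a bad conversion."""
    suspects = []
    for i, ch in enumerate(text):
        cp = ord(ch)
        # U+0080..U+009F are C1 controls; they typically appear when
        # Windows-1252 bytes were decoded as ISO-8859-1.
        # U+FFFD is the replacement character left by a decoder that gave up.
        if 0x80 <= cp <= 0x9F or cp == 0xFFFD:
            suspects.append((i, f"U+{cp:04X}"))
    return suspects


# Smart-quoted Windows-1252 bytes wrongly decoded as ISO-8859-1.
garbled = b"\x93Hi\x94".decode("latin-1")
print(find_suspect_chars(garbled))  # [(0, 'U+0093'), (3, 'U+0094')]
```

Correctly converted text, with real smart quotes such as U+201C, produces an empty list.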

Streamwriter: Polish characters are skipped?

I'm trying to make a small tool to help some guys convert data between an SAP installation and an Axapta installation. I get a text file in Western European (Windows) encoding (1252). They have put in some special chars to replace some Polish characters. Now it's my job to replace those special chars with the correct Polish characters. ...
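
A hedged sketch of the replacement step: the placeholder mapping below is entirely hypothetical (the excerpt does not say which special chars were used), and `restore_polish` is an illustrative name. The point it shows is that Windows-1252 has no code points for letters such as ł, ą, or ę, so the result has to be written in an encoding that does, for example UTF-8.

```python
# Hypothetical placeholders; the real mapping depends on the export.
PLACEHOLDERS = {"¤": "ł", "¦": "ą", "§": "ę"}
TABLE = str.maketrans(PLACEHOLDERS)


def restore_polish(raw_bytes):
    """Decode Windows-1252 input and swap placeholders for Polish letters."""
    text = raw_bytes.decode("cp1252")
    return text.translate(TABLE)


fixed = restore_polish("Ko¤o".encode("cp1252"))
print(fixed)  # Koło

# Writing the result back as Windows-1252 cannot work for these letters.
try:
    fixed.encode("cp1252")
except UnicodeEncodeError:
    print("not representable in cp1252; write UTF-8 instead")

utf8_bytes = fixed.encode("utf-8")
```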

Java 1.6 Windows-1252 encoding fails on 3 characters

EDIT: I've been convinced that this question is somewhat nonsensical. Thanks to those who responded. I may post a follow-up question that is more specific. Today I was investigating some encoding problems and wrote this unit test to isolate a base repro case: int badCount = 0; for (int i = 1; i < 255; i++) { String str = "Hi " + ne...
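
For comparison, a similar probe can be written in Python. This is only a sketch of the idea, not a port of the truncated Java test: it asks Python's own `cp1252` codec which single byte values it refuses to decode. Other runtimes may map these bytes differently, so the count need not match the one in the question title.

```python
def undefined_cp1252_bytes():
    """Return byte values that Python's cp1252 codec cannot decode."""
    bad = []
    for value in range(256):
        try:
            bytes([value]).decode("cp1252")
        except UnicodeDecodeError:
            bad.append(value)
    return bad


print([hex(b) for b in undefined_cp1252_bytes()])
# ['0x81', '0x8d', '0x8f', '0x90', '0x9d']
```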

Windows C API for UTF8 to 1252

I'm familiar with WideCharToMultiByte and MultiByteToWideChar conversions and could use these to do something like: UTF8 -> UTF16 -> 1252. I know that iconv will do what I need, but does anybody know of any MS libs that will allow this in a single call? I should probably just pull in the iconv library, but am feeling lazy. Thanks ...
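
The two-step route described in the excerpt (decode to an intermediate Unicode form, then encode to 1252) looks like this in Python. It is a language-neutral illustration of the pipeline, not a Windows API answer; `utf8_to_cp1252` is a made-up helper name.

```python
def utf8_to_cp1252(data, errors="strict"):
    """Convert UTF-8 bytes to Windows-1252 bytes via an intermediate str."""
    text = data.decode("utf-8")                   # UTF-8 -> Unicode
    return text.encode("cp1252", errors=errors)   # Unicode -> 1252


print(utf8_to_cp1252("café “x”".encode("utf-8")))  # b'caf\xe9 \x93x\x94'

# Characters with no Windows-1252 equivalent need an explicit policy.
print(utf8_to_cp1252("ł".encode("utf-8"), errors="replace"))  # b'?'
```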

How to normalize text content to UTF 8 in java

We have a CMS which has several thousand text/html files in it. It turns out that users have been uploading text/html files using various character encodings (UTF-8, UTF-8 with BOM, Windows-1252, ISO-8859-1). When these files are read in and written to the response our CMS's framework forces a charset=UTF-8 on the response's content-type at...
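
A common heuristic for this kind of cleanup is to try UTF-8 first (stripping a BOM if present) and fall back to a single-byte encoding only when that fails. The sketch below is written in Python rather than Java, uses the hypothetical name `to_utf8`, and assumes that anything that is not valid UTF-8 is Windows-1252 or ISO-8859-1; it cannot tell those two apart with certainty.

```python
def to_utf8(raw):
    """Best-effort normalization of raw bytes to BOM-free UTF-8."""
    try:
        # "utf-8-sig" accepts UTF-8 with or without a leading BOM.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        try:
            # Assumption: non-UTF-8 input is Windows-1252.
            text = raw.decode("cp1252")
        except UnicodeDecodeError:
            # The few bytes cp1252 leaves undefined still decode as ISO-8859-1.
            text = raw.decode("latin-1")
    return text.encode("utf-8")


print(to_utf8(b"\xef\xbb\xbfabc"))  # b'abc'
print(to_utf8(b"caf\xe9"))          # b'caf\xc3\xa9'
```

Trying UTF-8 first works because text in the single-byte encodings that contains non-ASCII characters is rarely valid UTF-8 by accident.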

Jquery ajax call and charset windows-1252

Dear Stack Overflow, I have this problem. I'm working with an old version of MS SQL Server (2000) that has all the tables encoded in Windows-1252 (and that's it). I can write and read successfully with PHP using this line: <?php header('Content-Type: text/html; charset=windows-1252'); ?> If I make a normal post everything works as expected. If I...

Submitted character encoding -- _charset_ hidden field

For our web app, we have multiple HTML pages containing text areas. All of our pages are rendered with an ISO-8859-1 charset. When the page is accessed through IE6 on a Windows machine and special characters such as a "smart quote" are copied into the text area, some of our pages submit the page using the Windows-1252 character encodi...
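
The underlying mismatch is that smart quotes exist in Windows-1252 but not in ISO-8859-1. A small Python check makes that concrete; `chars_outside` is an illustrative helper, not part of any framework mentioned in the question.

```python
def chars_outside(text, charset="iso-8859-1"):
    """Return the distinct characters in text that charset cannot encode, in order."""
    missing = []
    for ch in text:
        try:
            ch.encode(charset)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


pasted = "\u201cquoted\u201d and plain"
print(chars_outside(pasted))            # the two smart quotes
print(chars_outside(pasted, "cp1252"))  # [] since Windows-1252 has both
```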

XMLReader -- Getting problem with utf characters

Hi, I am parsing a huge XML file whose declared encoding is <?xml version="1.0" encoding="ISO-8859-1"?> The db encoding is utf8 and I am running this query before anything is saved to the db: $sql='SET NAMES "utf8" COLLATE "utf8_swedish_ci"'; The problem is that sometimes some non-standard characters come in the ...