questions about character-encoding

How to parse a string that is in a different encoding from Java

I have a string that I have read in from a Word document. I think it is in "Cp1252" encoding. Java uses UTF8. How do I search that string for those special characters in Cp1252 and replace them with an appropriate UTF8 character? Specifically, I want to replace the "En Dash" character with a plain "-". The following code block takes t...

java

conversion

character-encoding
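
For the en dash question above, here is a minimal sketch rather than a definitive answer. It assumes the bytes from the Word document really are Cp1252, which Java names `windows-1252`, and that this charset is available in the running JDK. The class and method names (`EnDashDemo`, `decodeCp1252`, `replaceEnDash`) are hypothetical. The idea is to decode the bytes with the right charset first, after which the en dash is simply the character U+2013 in the resulting String.

```java
import java.nio.charset.Charset;

public class EnDashDemo {
    // Decode raw bytes as windows-1252 (assumed source encoding) into a Java String.
    public static String decodeCp1252(byte[] raw) {
        return new String(raw, Charset.forName("windows-1252"));
    }

    // Replace every en dash (U+2013) with an ASCII hyphen-minus.
    public static String replaceEnDash(String text) {
        return text.replace('\u2013', '-');
    }

    public static void main(String[] args) {
        // Byte 0x96 is the en dash in Cp1252.
        byte[] raw = {'1', (byte) 0x96, '2'};
        System.out.println(replaceEnDash(decodeCp1252(raw))); // prints 1-2
    }
}
```

Once the text is a Java String, the encoding only matters again when it is written back out as bytes.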

HTML To PDF Turkish Character Problem

Hello all, I want to convert an ASP.NET web page to PDF using iTextSharp. I wrote some code, but I cannot make it show the Turkish characters. Can anyone help me? Here is the code: using System; using System.IO; using iTextSharp.text; using iTextSharp.text.pdf; using System.Web.UI; using System.Web; using iTextSharp.text.html.simpl...

Encoding problem between jQuery and Java

My encoding is set to ISO-8859-1. I'm making an AJAX call using jQuery.ajax to a servlet. The URL (after it has been serialized by jQuery) ends up looking like this: https://myurl.com/countryAndProvinceCodeServlet?action=getProvinces&label=%C3%85land+Islands The actual label value is Åland Islands. When this comes to the servlet, ...
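
The `%C3%85` in that URL is the UTF-8 byte sequence for Å, so the label only survives if the receiving side decodes it as UTF-8. The sketch below illustrates the difference using `java.net.URLDecoder` directly. It does not show how any particular servlet container is configured, and the class and method names (`LabelDecodeDemo`, `decode`) are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class LabelDecodeDemo {
    // Percent-decode a query value using the named charset.
    public static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String encoded = "%C3%85land+Islands";
        // Decoding as UTF-8 turns the two bytes C3 85 into the single character U+00C5.
        System.out.println(decode(encoded, "UTF-8"));
        // Decoding as ISO-8859-1 maps each byte to its own character, so the label is one character longer.
        System.out.println(decode(encoded, "ISO-8859-1").length());
    }
}
```

What the console shows for the first line depends on the terminal's own encoding, so comparing string lengths or code points is a more reliable check than eyeballing the output.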

Zend_Cache: After loading cached data, character encoding seems messed up

Hi all, On my development server (localhost; default XAMPP on OS X) everything works fine, but when I deploy the exact same code (and data) to the staging server (managed Apache2 on Red Hat) it breaks. I'm caching some data using Zend_Cache with the File backend and auto-serialization. Special characters used in the original d...

cfsavecontent + cfinclude with utf-8 charset?

I have a line of ColdFusion code that includes a cfm file encoded with the utf-8 charset and saves it to a variable. The problem I am having is that there is no way to specify a charset in cfinclude, and the resulting variable does not seem to be read as utf-8 correctly, so any non-ASCII characters are rendered incorrectly. <cfsaveconte...

coldfusion

character-encoding

charset

[Adobe AIR] How can I detect and convert text encoding?

Some text (or HTML) documents from the web are not encoded as UTF-8, so I want to convert the encoding of a text document to UTF-8. Do you have any clues for dealing with text encoding? I also found that when the application draws an element with encoding-broken text (such as "¿©¼º ½̾ ±â"), the application is often killed with the alert dialog "adl quit une...

How to convert UTF8 to unicode

Hi, I am trying to convert a UTF8 string to a Java Unicode string. String question = request.getParameter("searchWord"); byte[] bytes = question.getBytes(); question = new String(bytes, "UTF-8"); The input is Chinese characters, and when I compare the hex code of each character it is the same Chinese character. So I'm pretty sure that the c...

Replace all special characters from a string using PHP

Hi, I am using a jQuery editor with PHP. It works fine for plain text (text without special characters), but if I try to post text that contains special characters, it does not store these special characters in the db table. When I tried to replace any special character with HTML codes, it worked fine. But it is too difficult to repla...

Glassfish JSF 2.0 charset problem

Hi! I'm working on a project developed with a JSF 2.0 (Mojarra 2.0.3) front end and deployed on a Glassfish v.3.0.1 server. The application must accept the ISO-8859-2 charset and write data to a MySQL database. The problem is that the data is not in the right charset. The request HTTP header has the attribute value: content-type: application/x-www-form-urlenco...

When does encoding actually matter? (e.g., string storing, printing?)

Just curious about the encodings that the system uses when storing strings (if it cares) and printing them. Question 1: If I store a one-byte string in std::string or a two-byte string in std::wstring, will the underlying integer value differ depending on the encoding currently in use? (I remember that Bjarne says that encoding is the map...

c++

character-encoding

print char using unicode value (java)

Hi, The code below prints ? rather than a random character. Any ideas? Please note that I wrote this as part of an exercise on method overloading, hence the 'complicated' setup. class TestRandomCharacter { public static void main(String[] args) { char ch = RandomCharacter.getRandomCharacter() ; System.out.println(...

java

character-encoding

how to reliably decode various encodings to system default encoding

I am trying to work with several documents that all have various encodings: some UTF-8, some ISO-8859-2, some ASCII, etc. Is there a reliable way of decoding to a standard encoding for processing? I have tried the following: import chardet encoding = chardet.detect(text) text = unicode(text,encoding['encoding']).decode(sys.getdefaulten...

python

character-encoding

Converting UTF-8 with C++ standard libraries (no /clr)

I have a string like this: "These are Pi (\u03a0) and Sigma (\u03a3).". How can I convert this so that it contains and prints the actual characters, using C++ standard libraries? This solution, http://msdn.microsoft.com/en-us/library/system.text.encoding.utf8(VS.80).aspx, uses the .NET Framework (/clr compilation), which I want to avoid, preferring C++ stan...

c++

utf-8

character-encoding