character-encoding

How to parse a string that is in a different encoding from Java

I have a string that I read in from a Word document. I think it is in "Cp1252" encoding. Java uses UTF8. How do I search that string for the special characters in Cp1252 and replace them with an appropriate UTF8 character? Specifically, I want to replace the "En Dash" character with a plain "-". The following code block takes t...
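As a minimal sketch of the replacement described above (written in Python rather than Java, and using made-up sample bytes, since the question's own code is cut off): in Cp1252 the en dash is the single byte 0x96, which decodes to U+2013. Once the bytes are decoded with the right charset, the replacement is an ordinary string operation.

```python
# Hypothetical sample: bytes as they might come out of a Cp1252-encoded document.
# 0x96 is the en dash in Cp1252 (windows-1252).
raw = b"pages 10\x9620"

# Decode with the source encoding first; the result is a Unicode string
# in which the en dash is the code point U+2013.
text = raw.decode("cp1252")

# Replace the en dash with a plain ASCII hyphen-minus.
cleaned = text.replace("\u2013", "-")

print(cleaned)  # pages 10-20
```

The key assumption is that the input really is Cp1252. If the bytes were decoded with a different charset earlier in the pipeline, the en dash will not appear as U+2013 and the replacement will not match.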

HTML To PDF Turkish Character Problem

Hello all, I want to convert an ASP.NET web page to PDF using iTextSharp. I wrote some code, but I cannot make it show Turkish characters. Can anyone help me? Here is the code: using System; using System.IO; using iTextSharp.text; using iTextSharp.text.pdf; using System.Web.UI; using System.Web; using iTextSharp.text.html.simpl...

Encoding problem between jQuery and Java

My encoding is set to ISO-8859-1. I'm making an AJAX call using jQuery.ajax to a servlet. The URL (after it has been serialized by jQuery) ends up looking like this: https://myurl.com/countryAndProvinceCodeServlet?action=getProvinces&label=%C3%85land+Islands The actual label value is Åland Islands. When this comes to the servlet, ...
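The URL in this excerpt shows what is going on: `%C3%85` is the UTF-8 byte sequence for Å, so the label was percent-encoded as UTF-8 even though the page uses ISO-8859-1. The sketch below (Python, not the servlet code, which is not shown in the excerpt) decodes the same query value both ways to illustrate the mismatch; the variable names are only illustrative.

```python
from urllib.parse import unquote_plus

# The label value exactly as it appears in the URL from the question.
query_value = "%C3%85land+Islands"

# Decoding the percent-escaped bytes as UTF-8 recovers the intended text.
as_utf8 = unquote_plus(query_value, encoding="utf-8")

# Decoding the same bytes as ISO-8859-1 turns the two UTF-8 bytes
# into two separate characters (mojibake).
as_latin1 = unquote_plus(query_value, encoding="iso-8859-1")

# The damage is reversible as long as nothing was dropped:
# re-encode as ISO-8859-1 to get the original bytes, then decode as UTF-8.
repaired = as_latin1.encode("iso-8859-1").decode("utf-8")

print(as_utf8)   # Åland Islands
print(repaired)  # Åland Islands
```

Whether the receiving side should decode as UTF-8, or the sending side should be changed instead, depends on the rest of the application, which the excerpt does not show.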

Zend_Cache: After loading cached data, character encoding seems messed up

Hi all. First: on my development server (localhost; default XAMPP on OS X) everything works fine, but when I deploy the exact same code (and data) to the staging server (managed Apache2 on Red Hat) it breaks. I'm caching some data with Zend_Cache, using the File backend and auto-serialization. Special characters used in the original d...

cfsavecontent + cfinclude with utf-8 charset?

I have a line of ColdFusion code that includes a cfm file encoded with the UTF-8 charset and saves it to a variable. The problem is that there is no way to specify a charset in cfinclude, and the resulting variable does not seem to be read as UTF-8 correctly, so any non-ASCII characters are rendered incorrectly. <cfsaveconte...

[Adobe AIR] How can I detect and convert text encoding?

Some text (or HTML) documents from the web are not encoded as UTF-8, so I want to convert their encoding to UTF-8. Do you have any clues for dealing with text encoding? I also found that when the application draws an element with encoding-broken text (such as "¿©¼º ½̾ ±â"), the application is often killed with an alert dialog "adl quit une...

How to convert UTF8 to Unicode

Hi, I am trying to convert a UTF8 string to a Java Unicode string. String question = request.getParameter("searchWord"); byte[] bytes = question.getBytes(); question = new String(bytes, "UTF-8"); The input is Chinese characters, and when I compare the hex code of each character it is the same Chinese character. So I'm pretty sure that the c...

Replace all special characters from a string using PHP

Hi, I am using a jQuery editor with PHP. It works fine for plain text (text without special characters), but if I try to post text that contains special characters, it does not store these special characters in the db table. When I tried replacing a special character with HTML codes, it worked fine. But it is too difficult to repla...

Glassfish JSF 2.0 charset problem

Hi! I'm working on a project developed with a JSF 2.0 (Mojarra 2.0.3) front end and deployed on a Glassfish v.3.0.1 server. The application must accept the ISO-8859-2 charset and write data to a MySQL database. The problem is that the data is not in the right charset. The request HTTP header has the attribute value: content-type: application/x-www-form-urlenco...

When does encoding actually matter? (e.g., string storing, printing?)

Just curious about the encodings that the system uses when storing strings (if it cares) and printing them. Question 1: If I store a one-byte string in std::string or a two-byte string in std::wstring, will the underlying integer value differ depending on the encoding currently in use? (I remember that Bjarne says that encoding is the map...

Print char using Unicode value (Java)

Hi, the code below prints ? rather than a random character. Any ideas? Please note that I wrote this as part of an exercise on method overloading, hence the 'complicated' setup. class TestRandomCharacter { public static void main(String[] args) { char ch = RandomCharacter.getRandomCharacter() ; System.out.println(...

How to reliably decode various encodings to system default encoding

I am trying to work with several documents that all have various encodings: some UTF-8, some ISO-8859-2, some ASCII, etc. Is there a reliable way of decoding to a standard encoding for processing? I have tried the following: import chardet encoding = chardet.detect(text) text = unicode(text,encoding['encoding']).decode(sys.getdefaulten...
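One alternative to statistical detection, sketched here under the assumption that the set of possible encodings is known in advance (as the question suggests), is to try strict decoders in order and keep the first one that succeeds. This is a different technique from the chardet call in the excerpt, and `decode_to_text` and its candidate list are hypothetical names chosen for illustration. Strict UTF-8 goes first because it rejects most byte sequences produced by single-byte encodings, and ASCII needs no separate entry because it is a subset of UTF-8.

```python
def decode_to_text(data, candidates=("utf-8", "iso-8859-2")):
    """Return (text, encoding_name) for the first candidate that decodes cleanly.

    Hypothetical helper: the candidate order is an assumption and matters,
    because single-byte encodings such as ISO-8859-2 accept any byte sequence.
    """
    for name in candidates:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            continue
    # Last resort: decode with the first candidate and mark undecodable bytes.
    return data.decode(candidates[0], errors="replace"), candidates[0]


# Sample inputs: the same Polish word in two encodings, plus plain ASCII.
samples = [
    "żółw".encode("utf-8"),
    "żółw".encode("iso-8859-2"),
    b"plain ascii",
]

results = [decode_to_text(sample) for sample in samples]
for text, encoding in results:
    print(encoding, text)
```

This is a heuristic, not reliable detection: text in another single-byte encoding would be accepted by the ISO-8859-2 decoder without error and silently produce wrong characters.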

Converting UTF-8 with C++ standard libraries (no /clr)

I have a string like this: "These are Pi (\u03a0) and Sigma (\u03a3).". How can I convert this to contain and print the actual characters, using C++ standard libraries? This solution, http://msdn.microsoft.com/en-us/library/system.text.encoding.utf8(VS.80).aspx, uses the .NET Framework (compiling with /clr), which I want to avoid, preferring C++ stan...