character-encoding

Convert string from UTF-8 to ISO 8859-1 in Java

I want to encode a UTF-8 string to an ISO 8859-1 string in Java. I have this: String title = new String(item.getTitle().getText().getBytes("ISO-8859-1")); But it isn't working; the output is Sørensen, for example ...
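A Java String is held as UTF-16 internally, so an encoding only comes into play when converting to or from bytes. One possible culprit in the snippet above is that new String(bytes) decodes with the platform default charset rather than ISO 8859-1. A minimal sketch, assuming the goal is to produce ISO 8859-1 bytes for output; the class and method names are hypothetical:

```java
import java.nio.charset.StandardCharsets;

final class Latin1Demo {
    // Encode text as ISO 8859-1 bytes; characters outside Latin-1 become '?'.
    static byte[] toLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Decode ISO 8859-1 bytes, naming the charset explicitly instead of
    // relying on the platform default.
    static String fromLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String title = "S\u00f8rensen"; // "Sørensen", escaped to avoid source-encoding issues
        byte[] latin1 = toLatin1(title);
        byte[] utf8 = title.getBytes(StandardCharsets.UTF_8);
        System.out.println(latin1.length + " bytes in ISO 8859-1, "
                + utf8.length + " bytes in UTF-8");
        System.out.println("round trip ok: " + fromLatin1(latin1).equals(title));
    }
}
```

If the text still looks wrong after this, the stream or console that finally displays the bytes may be using a different charset, which this sketch does not address.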

Czech char 'ě' on PHP page script

Hi guys, I'm not able to correctly show this char on my web pages. I'm using the UTF-8 charset for this page; do I have to use ISO-8859-2? I'm getting a string with this char from a DB, where it's saved as ě. My browser shows only the HTML tag. It's the only char (at this moment) that I can't show on my webpage. I've taken a look at t...

Why is my GetNextChar() in my DecoderFallbackBuffer Specialization Repeatedly Getting Called?

I need to produce my own DecoderFallback and DecoderFallbackBuffer classes to implement some custom stream decoding. I have found that the stream reader making use of it is calling GetNextChar() repeatedly even when my specialized DecoderFallbackBuffer.Remaining property returns 0 to indicate that there are no more characters to return. Th...

UTF GET parameter codification problem in JSP (JBoss 2.0.1)

I'm trying to take a string from a GET or POST parameter in JSP with some accents in UTF-8: <%@ page contentType="text/html; charset=UTF-8" pageEncoding="UTF-8" %> <% request.setCharacterEncoding("UTF-8"); String value = request.getParameter("q"); out.print(value+" | aáa"); %> The encoding of the hardcoded string is co...

In MySQL how can I tell what character set a particular table is using?

I have a large MySQL table that I think might be using the wrong character set. If so, I'll need to change it using ALTER TABLE mytable CONVERT TO CHARACTER SET utf8. But since this is a very large table, I'd rather not run this command unless I have to. So my question is: how can I ask MySQL what the character set is on a particular t...

Incorporating ISO 8859-1 Symbols / foreign languages into a WinForms application

Hi, I have a function that finds any ISO 8859-1 symbol within a given string, and tries converting it to its proper meaning. However, I get question marks instead where I'd like actual values like: ÿ é æ etc. Can you please point me in the right direction on how to properly handle foreign/unique symbols? ...

Efficient way to calculate byte length of a character, depending on the encoding

What's the most efficient way to calculate the byte length of a character, taking the character encoding into account? The encoding would only be known at runtime. In UTF-8, for example, the characters have a variable byte length, so each character needs to be determined individually. So far I've come up with this: char c = getCha...
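One simple (not necessarily the fastest) way to get this in Java is to encode the single code point with the charset chosen at runtime and count the resulting bytes. The sketch below assumes that approach; CharByteLength and byteLength are hypothetical names. Working with code points rather than char keeps supplementary characters (surrogate pairs) intact.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharByteLength {
    // Number of bytes one Unicode code point occupies in the given charset.
    static int byteLength(int codePoint, Charset charset) {
        String single = new String(Character.toChars(codePoint));
        return single.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // The charset name is only known at runtime; default to UTF-8 here.
        Charset runtimeCharset = Charset.forName(args.length > 0 ? args[0] : "UTF-8");
        int[] samples = {'a', 0x00E9, 0x20AC, 0x1F600};
        for (int cp : samples) {
            System.out.println("U+" + Integer.toHexString(cp).toUpperCase()
                    + " -> " + byteLength(cp, runtimeCharset) + " byte(s)");
        }
    }
}
```

Two caveats: a charset that writes a byte-order mark when encoding (such as Java's "UTF-16") includes the mark in the count, so UTF-16BE or UTF-16LE is the safer choice for measuring; and a character the charset cannot represent is counted as the length of its replacement byte sequence.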

NSStrings, C strings, pathnames and encodings in iPhone

I am using libxml2 in my iPhone app. I have an NSString that holds the pathname to an XML file. The pathname may include non-ASCII characters. I want to get a C string representation of the NSString to pass to xmlReadFile(). It appears that cStringUsingEncoding gives me the representation I seek. I am not clear on which encoding to u...

SQLite character encoding for Google Gears

We're using jQuery to get a JSON-string from our server (UTF-8 response, also UTF-8 request through jQuery) and put this JSON into a Google Gears WorkerPool. This workerpool processes the JSON and stores it into a Gears database (SQLite). It turns out that, apparently, SQLite stores data using iso-8859-1 rather than UTF-8. Since we're t...

How to read and write UTF-8 to disk on Android?

I cannot read and write extended characters (French accented characters, for example) to a text file using the standard InputStreamReader methods shown in the Android API examples. When I read back the file using: InputStreamReader tmp = new InputStreamReader(in); BufferedReader reader = new BufferedReader(tmp); String str; while ((st...
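The single-argument InputStreamReader constructor shown above decodes with the platform default charset, which may not match the charset the file was written in. A minimal sketch that names UTF-8 explicitly on both the write and the read side; it assumes a Java level with try-with-resources and StandardCharsets, and Utf8FileDemo with its methods is a hypothetical helper, not part of the Android API:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class Utf8FileDemo {
    // Write text to the file, encoding it as UTF-8.
    static void writeUtf8(File file, String text) throws IOException {
        try (BufferedWriter writer = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8))) {
            writer.write(text);
        }
    }

    // Read the whole file back, decoding it as UTF-8.
    static String readUtf8(File file) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        }
        return sb.toString();
    }

    // Write to a temporary file and read it back (for demonstration only).
    static String roundTrip(String text) {
        try {
            File tmp = File.createTempFile("utf8-demo", ".txt");
            try {
                writeUtf8(tmp, text);
                return readUtf8(tmp);
            } finally {
                tmp.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "caf\u00e9 \u00e0 la cr\u00e8me"; // "café à la crème"
        System.out.println("round trip ok: " + roundTrip(original).equals(original));
    }
}
```

On Android the FileInputStream and FileOutputStream would typically come from the app's own storage calls instead of a temporary file; the charset argument is the part that matters.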

Twitter Search API is returning weird characters - is it me or is it them?

We are building an app that accesses the Twitter search over JSONP. It mostly works fine, but occasionally the request returns a JSONP callback that consists of weird unparseable characters. Here is an example: http://search.twitter.com/search.json?result_type=recent&rpp=100&geocode=51.4375857,-0.1658648,1km&page=5&call...

Non-Latin characters in URLs - is it better to encode them or replace with their Latin "counterparts"?

We're implementing a blog for a site which supports six different languages, and five of them have non-Latin characters in their alphabets. We are not sure whether we should have them encoded (that is what we're doing at the moment): "Létání s potravinami: Co je dovoleno?" becomes l%c3%a9t%c3%a1n%c3%ad-s-potravinami-co-je-dovoleno and the b...
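The two options in the question can be compared side by side. The sketch below is only an illustration in Java, since the excerpt does not say what the site is built with; SlugDemo and its methods are hypothetical names. It produces a percent-encoded slug and an accent-stripped slug from the same title. Stripping works only for letters that decompose into a base letter plus combining marks; letters such as ø, and non-Latin scripts in general, would need a real transliteration table.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.text.Normalizer;
import java.util.Locale;

final class SlugDemo {
    // Lowercase the title and join runs of letters/digits with hyphens.
    static String hyphenate(String title) {
        String composed = Normalizer.normalize(title, Normalizer.Form.NFC);
        return composed.toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{N}]+", "-")
                .replaceAll("^-+|-+$", "");
    }

    // Option 1: keep the original letters and percent-encode their UTF-8 bytes.
    static String encodedSlug(String title) {
        try {
            return URLEncoder.encode(hyphenate(title), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Option 2: decompose accented letters and drop the combining marks.
    static String asciiSlug(String title) {
        String decomposed = Normalizer.normalize(hyphenate(title), Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // "Létání s potravinami: Co je dovoleno?"
        String title = "L\u00e9t\u00e1n\u00ed s potravinami: Co je dovoleno?";
        System.out.println(encodedSlug(title));
        System.out.println(asciiSlug(title));
    }
}
```

URLEncoder emits uppercase hex digits; percent-encoded octets are case-insensitive, so this is equivalent to the lowercase form shown in the question.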

How to manage formatting of text when reading a saved file?

Hello, I have a Java applet application in which I use a rich text area. I write Urdu, the national language of Pakistan. I managed to do so with Unicode. The problem is, when I write Urdu in the text area and select a font and color for each line, it does all of this, but when I save this file using UTF-8 encoding and then open it again it show...

How to convert Unicode to a printable string in a Qt stream

I'm writing a stream to a file and stdout, but I'm getting some kind of encoding like this: \u05ea\u05e7\u05dc\u05d9\u05d8 \u05e9\u05e1\u05d9\u05de\u05dc \u05e9\u05d9\u05e0\u05d5\u05d9 \u05d1\u05e1\u05d2\u05e0\u05d5\u05df \u05dc\u05d3\u05e2\u05ea\u05d9 \u05d0\u05dd \u05d0\u05e0\u05d9 \u05d6\u05d5\u05db\u05e8 \u05e0\u05d...

Convert a raw string to an array of big-endian words with Ruby

Hello, I would like to convert a raw string to an array of big-endian words. As an example, here is a JavaScript function that does it well (by Paul Johnston): /* * Convert a raw string to an array of big-endian words * Characters >255 have their high-byte silently ignored. */ function rstr2binb(input) { var output = Array(input.lengt...

MySQL replace accented characters

Hi, I would like to generate strictly alphanumeric logins from users' first and last names. Since many of them are foreigners, their names have special characters (é, è, ï, ...). I would like to remove the accents (e, e, i, ...) in the logins. Here is my query. Is there a character set that does not contain accents? UPDATE contacts...

How do I convert Windows 7 file-name encoding to UTF-8 for Ruby on Rails?

Hi (I've looked at the other questions; none seemed to quite fit my problem). I have some file names under Windows 7 that need to be translated into a MySQL database (UTF-8) with Ruby on Rails. An example file name includes "íéó" in some kind of Windows 7 file-system encoding. I've tried many combinations of gsub and ActiveSupport::Mul...

Load JSON in Python as header character set

Hi everyone, I've always found character sets and encodings complicated to understand and here I'm faced with another problem. My apologies for any inaccuracies. I'll do my best. I'm requesting data from a server which returns JSON. In the HTTP headers it also returns the character set like so: Content-Type: text/html; charset=UTF-8 ...

HTML character reference display problems.

Hey folks, I'm currently developing a site in Joomla, and one of the components I'm using makes use of a PHP file to administer the language (english.php, spanish.php). The problem I'm having is that if I use the plain text version of e.g. "á", it will show up in the browser tab title OK, but as a � in the body of the page. But if I use...

UTF-8 character encoding in Java

Hello, I am having some problems getting some French text to convert to UTF-8 so that it can be displayed properly, either in a console, text file or in a GUI element. The original string is HANDICAP╔ES, which is supposed to be HANDICAPÉES. Here is a code snippet that shows how I am using the Jackcess database driver to read in the Ac...