utf-8

How to know which special character is in a file?

My app needs to process text files during a batch process. Occasionally I receive a file with some special character at the end of the file. I am not sure what that special character is. Is there any way I can find out what that character is, so that I can tell the other team that is producing the file? I have used mozilla's library to gue...
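One way to approach the question above is to dump the code points of the last few characters in the file. The following is a minimal Python sketch, assuming the file is UTF-8; the helper name `describe_tail` and the sample bytes are hypothetical and only for illustration.

```python
import unicodedata

def describe_tail(data: bytes, count: int = 4, encoding: str = "utf-8"):
    """Return (code point, name) pairs for the last `count` characters of `data`."""
    # errors="replace" keeps going on invalid bytes; they show up as U+FFFD.
    text = data.decode(encoding, errors="replace")
    report = []
    for ch in text[-count:]:
        # Control characters have no Unicode name, so fall back to a placeholder.
        name = unicodedata.name(ch, "<unnamed or control character>")
        report.append((f"U+{ord(ch):04X}", name))
    return report

# Hypothetical sample: a final line followed by a stray Ctrl-Z byte (0x1A).
sample = b"last line\r\n\x1a"
for code_point, name in describe_tail(sample, count=3):
    print(code_point, name)
```

For a real file, the bytes could come from `pathlib.Path(path).read_bytes()`; the reported code point is what can be passed on to the team producing the file.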

mongodb mongo shell utf8 errors

When using MongoDB 1.4.1 and connecting from the shell, I have observed this (all running on Win64): db.foo.save({_id: "5" , "sub": "\u00f6"}). This seems like the ONLY way to insert a ö (http://www.fileformat.info/info/unicode/char/00f6/index.htm), and it is quite a long way from UTF-8 (hex) 0xC3 0xB6 (c3b6). So what is the best app...
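The two notations in the question above describe the same character at different levels: `\u00f6` names the code point U+00F6, while `c3 b6` is that code point serialized as UTF-8 bytes. A small Python sketch of the relationship (it says nothing about how the mongo shell itself handles input):

```python
# "\u00f6" is an escape for the code point U+00F6 (ö).
char = "\u00f6"

# Encoding the code point as UTF-8 yields the two bytes 0xC3 0xB6.
utf8_bytes = char.encode("utf-8")
print(utf8_bytes.hex())  # c3b6

# Decoding those bytes as UTF-8 gives the same character back.
round_tripped = utf8_bytes.decode("utf-8")
print(round_tripped == char)  # True
```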

How to test UTF-8 strings in Java

I have to test a web app and its API for UTF-8 strings. The web app has a text field and its API has a corresponding getter method. I have to make sure UTF-8 will work; how do I do that? ...

How do you get the glyph for a character encoded as 'ō' from a utf-8 encoded database field using php?

I have a MySQL database table with a collation of 'utf8_general_ci' and the value in the field is: x & #299; bán yá wén (without the spaces). When this is converted (for example by StackOverflow's editor) it looks like this: xī bán yá wén where the second character looks like a lower case i with a bar over the top. In PHP, what func...

How to display Arabic in JavaScript?

Hi guys, I am using UTF-8 in my JSP page. I have set the page's pageEncoding="UTF-8" contentType="text/html;" <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> But when I try to alert a UTF-8 value, it comes out as the same UTF-8 characters. ...

Storing UTF8 string in a UnicodeString

In Delphi 2007 you can store a UTF8 string in a WideString and then pass that onto a Win32 function, e.g. var UnicodeStr: WideString; UTF8Str: WideString; begin UnicodeStr:='some unicode text'; UTF8Str:=UTF8Encode(UnicodeStr); Windows.SomeFunction(PWideChar(UTF8Str), ...) end; Delphi 2007 does not interfere with the contents...

Four byte encoding of U+00F6 (LATIN SMALL LETTER O WITH DIAERESIS)?

Which character encoding (or combinations of encodings) represents the character ö (U+00F6, LATIN SMALL LETTER O WITH DIAERESIS or simply put chr(246) in ISO-8859-1) as the four octets combination chr(195) . chr(63) . chr(194) . chr(164)? ...

How to query MySQL for exact length and exact UTF-8 characters

I have a table with a dictionary of words in my language (Latvian). CREATE TABLE words ( value varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; And let's say it has 3 words inside: INSERT INTO words (value) VALUES ('tēja'); INSERT INTO words (value) VALUES ('vējš'); INSERT...

Problem with MySQL character set & GWT

Hi, I have a SmartGWT application which interacts with a MySQL database using RPC services. Think of it as a simple form with a textbox and Save and Load buttons. The collation of my database, tables, and all fields is utf8_persian_ci. All Java source files and the module HTML and XML files have been saved with the utf8 character set. I also have a meta tag in module...

MySQL latin1 Turkish data and Delphi 2010 UTF-8

Hello, I have tables with the latin1_general_ci collation that hold Turkish character values. I can use this data in Delphi 7 + Zeos with no problem. I want to upgrade to Delphi 2010, but Zeos is too slow there as far as I saw, so I want to use an ODBC + ADO or dbExpress solution. The dbExpress solution works fine, displays my data as entered and write ...

Differences between utf8 and latin1

What is the difference between utf8 and latin1? ...

UTF-8 conversion

Hey guys, I am grabbing a JSON array and storing it in an NSArray, however it includes JSON encoded UTF-8 strings, for example pass\u00e9 represents passé. I need a way of converting all of these different types of strings into the actual characters. I have an entire NSArray to convert. Or I can convert it when it is being displayed, wh...
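The `\u00e9` form in the question above is a JSON string escape, so a conforming JSON parser normally resolves it while parsing. The sketch below illustrates this in Python rather than Objective-C (an assumption made only to keep the examples in one language); the payload is a hypothetical sample.

```python
import json

# Hypothetical payload: the raw JSON text contains the six characters \u00e9.
raw_json = r'["pass\u00e9", "caf\u00e9"]'

# Parsing resolves the escapes into the actual characters.
words = json.loads(raw_json)
print(words)

# Serializing with ensure_ascii=True (the default) reintroduces the escapes.
escaped_again = json.dumps(words)
print(escaped_again)
```

If escaped sequences still appear after parsing, the text was probably escaped twice upstream, which is a separate problem from parsing.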

Ruby Rails, jQuery, Uploadify - weird UTF-8 error

Hi everyone, I'm setting up jQuery and Uploadify in my Rails app (with the uploadify-rails plugin). It's all going fine, the flash is loaded, the authenticity parameter is passed through along with the session key and so on. However, my MySQL queries on the way to handling the upload from the flash are all reporting a 'redundant UTF-8 se...

phpMyAdmin shows numbers or blob for MySQL's utf8_bin collation columns?

Hi! I have a table with a varchar column. Its collation is set to utf8_bin. My software that uses this table and column works perfectly. But when I look at the content in phpMyAdmin, I only see some hex values or [Blob xB]. Can I make phpMyAdmin show the content correctly? Besides, when I set the collation to utf8_general_ci or utf8_unico...

How do I decode mail header strings with their encoding type in them in PHP

I'm creating a small, web based, mail client in PHP and noticed that a number of email Subjects and Contents appear as follows: =?ISO-8859-1?Q?Everything_for_=A35_-_Box_Sets,_Games_?= =?ISO-8859-1?Q?and_CD_Soundtracks...hurry,_ends_soon?= =?utf-8?B?UGxheS5jb206IE9uZSBEYXkgT25seSDigJMgT3V0IG9mIHRoaXMgV29ybGQgRGVhbHMh?= =?windows-1252?Q?J...
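The `=?charset?Q?...?=` and `=?charset?B?...?=` forms quoted above are RFC 2047 encoded-words, where `Q` marks a quoted-printable-like encoding and `B` marks Base64. The question asks about PHP; the sketch below uses Python's standard `email.header` module instead, purely as an illustration of the decoding steps. The helper name `decode_mime_header` is hypothetical.

```python
from email.header import decode_header, make_header

def decode_mime_header(raw: str) -> str:
    """Decode RFC 2047 encoded-words in a header value into a plain string."""
    # decode_header splits the value into (bytes, charset) parts;
    # make_header reassembles them, decoding each part with its own charset.
    return str(make_header(decode_header(raw)))

# First encoded-word taken from the question's example subject line.
subject = "=?ISO-8859-1?Q?Everything_for_=A35_-_Box_Sets,_Games_?="
print(decode_mime_header(subject))

# A short Base64 encoded-word (hypothetical sample).
print(decode_mime_header("=?utf-8?B?UGxheQ==?="))
```

In the Q encoding an underscore stands for a space and `=A3` is the byte 0xA3, which is the pound sign in ISO-8859-1.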

How to read UTF-8 XML from VBS and get the correct character code

I'm trying to read an XML file from a VBS script. The XML is encoded in UTF-8 and has the appropriate header. From the VBS script I use the Microsoft XMLDOM parser to read the XML: Dim objXMLDoc Set objXMLDoc = CreateObject( "Microsoft.XMLDOM" ) objXMLDoc.load("vbs_strings.xml") Inside the XML I'm trying to write a character by code using the &#nnn; notation. Then I r...

SQLite character encoding for Google Gears

We're using jQuery to get a JSON-string from our server (UTF-8 response, also UTF-8 request through jQuery) and put this JSON into a Google Gears WorkerPool. This workerpool processes the JSON and stores it into a Gears database (SQLite). It turns out that, apparently, SQLite stores data using iso-8859-1 rather than UTF-8. Since we're t...

remove utf-8 figure spaces with php

I have some XML files with figure spaces in them, and I need to remove those with PHP. The utf-8 code for these is e2 80 a9. If I'm not mistaken, PHP does not seem to like 6 byte utf-8 chars; so far at least I'm unable to find a way to delete the figure spaces with functions like preg_replace. Does anybody have any tips or, even better, a solution to this...
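Before removing anything, it can help to check which character a byte sequence actually encodes. The Python sketch below (Python rather than PHP, as an illustration only) decodes the bytes quoted in the question: `e2 80 a9` is a three-byte UTF-8 sequence for U+2029 PARAGRAPH SEPARATOR, whereas FIGURE SPACE is U+2007, encoded as `e2 80 87`. The helper `strip_chars` is hypothetical, and which of the two characters the files really contain is not something the excerpt settles.

```python
import unicodedata

# Decode the bytes quoted in the question and look up the character's name.
claimed = bytes.fromhex("e280a9").decode("utf-8")
print(f"U+{ord(claimed):04X}", unicodedata.name(claimed))

# For comparison: the UTF-8 bytes of FIGURE SPACE (U+2007).
figure_space = "\u2007"
print(figure_space.encode("utf-8").hex(), unicodedata.name(figure_space))

def strip_chars(text: str, unwanted=("\u2007", "\u2029")) -> str:
    """Remove every occurrence of the given characters from text."""
    # Mapping a code point to None in a translate table deletes it.
    table = {ord(ch): None for ch in unwanted}
    return text.translate(table)

print(strip_chars("1\u2007000\u2029"))
```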

How to read and write UTF-8 to disk on Android?

I cannot read and write extended characters (French accented characters, for example) to a text file using the standard InputStreamReader methods shown in the Android API examples. When I read back the file using: InputStreamReader tmp = new InputStreamReader(in); BufferedReader reader = new BufferedReader(tmp); String str; while ((st...

Non-Latin characters in URLs - is it better to encode them or replace with their Latin "counterparts"?

We're implementing a blog for a site which supports six different languages and five of them have non-Latin characters in their alphabets. We are not sure whether we should have them encoded (that is what we're doing at the moment) Létání s potravinami: Co je dovoleno? becomes l%c3%a9t%c3%a1n%c3%ad-s-potravinami-co-je-dovoleno and the b...
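The two options in the title can be compared side by side. The Python sketch below builds a percent-encoded slug matching the form shown in the question, and an ASCII slug made by stripping diacritics. The function names are hypothetical, and the diacritic-stripping approach is an assumption about what the "Latin counterparts" alternative means, since the excerpt is cut off before describing it.

```python
import re
import unicodedata
from urllib.parse import quote

def percent_encoded_slug(title: str) -> str:
    """Lowercase, hyphenate, and percent-encode non-ASCII characters as UTF-8."""
    words = re.findall(r"\w+", title.lower())
    # quote() emits uppercase hex digits; lowercase them to match the question.
    return quote("-".join(words)).lower()

def ascii_slug(title: str) -> str:
    """Lowercase, hyphenate, and drop combining marks (é -> e, á -> a)."""
    decomposed = unicodedata.normalize("NFKD", title.lower())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return "-".join(re.findall(r"[a-z0-9]+", stripped))

title = "Létání s potravinami: Co je dovoleno?"
print(percent_encoded_slug(title))
print(ascii_slug(title))
```

Stripping combining marks only works for letters that decompose into a base letter plus an accent; letters such as ł or đ have no such decomposition and would need an explicit mapping.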