charset

Transform project from windows-1256 to utf-8 charset, what are the right steps?

I have a PHP & MySQL script that uses the windows-1256 charset. I now want to convert the whole script so that it is built entirely on the utf-8 charset, from the MySQL database to the PHP files. What are the right steps to achieve that? Note: I use a non-Latin language in the script (Arabic). ...

Regex and ISO-8859-1 charset in java

I have some text encoded in ISO-8859-1, from which I extract some data using regex. The problem is that the strings I get from the matcher object are in the wrong format, with scrambled chars like "ÅÄÖ". How do I stop the regex library from scrambling my chars? Edit: Here's some code: private HttpResponse sendGetRequest(String url) th...
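A minimal sketch of one common cause of this symptom, assuming the scrambling happens when the bytes are decoded rather than inside the regex library: decode the raw bytes with an explicit ISO-8859-1 charset first, then run the pattern over the resulting String. The class name `Latin1RegexSketch`, the helper `firstMatch`, and the sample bytes are hypothetical and not taken from the question's code.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Latin1RegexSketch {
    // Hypothetical helper: decode the bytes as ISO-8859-1, then return the
    // first regex match, or null if nothing matches.
    static String firstMatch(byte[] raw, String regex) {
        // Name the charset explicitly instead of relying on the platform default.
        String text = new String(raw, StandardCharsets.ISO_8859_1);
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // 0xC5, 0xC4, 0xD6 are the ISO-8859-1 bytes for Å, Ä, Ö.
        byte[] raw = {(byte) 0xC5, (byte) 0xC4, (byte) 0xD6, ' ', 'a', 'b', 'c'};
        String match = firstMatch(raw, "[\\u00C5\\u00C4\\u00D6]+");
        System.out.println("\u00C5\u00C4\u00D6".equals(match)); // prints "true"
    }
}
```

The regex operates on already-decoded characters, so if the String is correct before matching, the matched groups are correct too.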

How to set the charset with JAX-RS?

How can I set the charset with JAX-RS? I've tried @Produces("text/html; charset=UTF-8") but that was ignored and only text/html was sent in the HTTP header. I want to set the charset within a MessageBodyWriter, but don't want to extract the media type by analysing the @Produces annotation via reflection myself. ...

FreeMarker encoding confusion

When I read a UTF-8 encoded template with FreeMarker, special chars are rendered correctly in the browser, although freeMarkerConfig.getDefaultEncoding() returns "Cp1252". If I set freeMarkerConfig.setDefaultEncoding("UTF-8"), I see only question marks in the browser, although "UTF-8" is the actual encoding of the template file. In ever...

ASCII-characters instead of Swedish chars?

Hi everyone, I have tested PHP's IMAP library to fetch emails from a Gmail account, but I just can't get my head around making the characters display correctly. At first, I was close to pulling my hair out when I realized that I had accidentally fetched the attachments instead of the message body - not good, but now when that is so...

PHP + iconv - Transform UTF-4 string?

I'm writing an E-Mail parser. I noticed that I received some emails that state their charset is UTF-4. However, when trying to convert these with iconv to UTF-8 it fails. Now my question is: I've never ever heard of UTF-4. Is this even a valid charset? And if not - can I just treat it as UTF-8? Here is part of the mail header: ["mime...

GBP pound symbol appearing as unknown char in shop

Hi, For every occurrence of the pound symbol (£) in my store, I am instead seeing a '?' question mark symbol in a black diamond. Googling has resulted in suggestions of charset - mine is set as utf-8 as below... <meta http-equiv="content-type" content="text/html;charset=utf-8" /> I believe the store was originally set up in Os commer...

Why do I have to use set_charset("utf8") even though everything is utf-8 encoded? (MySQLi-PHP)

My table's collation is utf8_general_ci. My pages are encoded with UTF-8 (without BOM). Within my pages, my http-equiv meta tag sets the character set to utf8. My data has Turkish characters in it. When I output them, they are not shown as they should be, but when I do $db->set_charset("utf8");, it works. Why do I have to use $db->set_charset...

SQL restore and special characters

Dear All, I am using SQLYog for my work. I back up and restore a database. The problem is, when I restore the database, all characters like "à" become funny. I am confused: I am restoring a database on the same machine it was created on. Same dataset. Why does the character corruption happen? How can I fix it? Thanks!!! Sep ...

I got weird characters extracting data from MySQL db

Hi there! Well, I have a MySQL db, encoded as utf8_unicode_ci, and it runs like a charm with the current application (written in Code Igniter). Now, I'm developing a new PHP app, and when I try to retrieve the data, several characters are unreadable - the chars appear ok in the DB with phpMyAdmin, but when I try to put them up in a webpage, it...

UTF-8 not working in HTML forms

I have this form: <form method="post" enctype="multipart/form-data" accept-charset="UTF-8"> But when I submit an é character, it turns it into Ã©. Why doesn't this work? Yes, the MySQL database has all the character-sets set up correctly. (Database, tables.) If I manually put it in the database with Navicat it shows up fine on the we...

PHP function iconv character encoding from iso-8859-1 to utf-8

I'm trying to convert a string from iso-8859-1 to utf-8. But when it contains these two characters, € and •, the function returns a character that is a square with two numbers inside. How can I solve this issue? ...
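A hedged illustration of what may be going on here, assuming the source text is actually Windows-1252 rather than true ISO-8859-1 (the question does not confirm this). In Windows-1252, bytes 0x80 and 0x95 are € and •, while in ISO-8859-1 the same bytes are C1 control characters, which browsers often render as boxes. The sketch uses Java, the language of other snippets on this page, not PHP; the class name `Cp1252ToUtf8Sketch` and the helper `toUtf8` are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Cp1252ToUtf8Sketch {
    // Hypothetical helper: decode bytes with the given source charset and
    // re-encode the result as UTF-8.
    static byte[] toUtf8(byte[] raw, Charset source) {
        return new String(raw, source).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumption: these two bytes came from a Windows-1252 source.
        byte[] raw = {(byte) 0x80, (byte) 0x95};
        Charset cp1252 = Charset.forName("windows-1252");

        // Decoded as Windows-1252, the bytes are U+20AC (euro) and U+2022 (bullet).
        String decoded = new String(raw, cp1252);
        System.out.println(decoded.equals("\u20AC\u2022")); // prints "true"

        // Decoded as ISO-8859-1, the same bytes become control chars U+0080 and U+0095.
        String asLatin1 = new String(raw, StandardCharsets.ISO_8859_1);
        System.out.println(asLatin1.equals("\u0080\u0095")); // prints "true"
    }
}
```

If the assumption holds, naming the source charset as Windows-1252 in the conversion, rather than ISO-8859-1, would be the thing to try.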

Complete understanding of encodings and character sets

Can anybody tell me where to find a clear introduction to character sets, encodings and everything related to these things? Thanks! ...

Compare letters from different languages

There are some letters in different alphabets that look exactly the same, like A in Latin and А in Cyrillic. Do they play the same role when I use one of them in a utf-8 script? If not, how can I find out the code of a given letter? ...

PHP: Updating with öäå into MySQL

Hello. I have already done this: mysql_set_charset("utf8",$link); at the connection, mysql_query("SET NAMES 'UTF8'"); at the connection, + changed every table in the database from latin1 to utf8 (collation + character set, for every table + columns), + the files have the utf8 meta tag + header('Content-Type: text/html; charset=utf-8'); plus the files itself ...

Issue Decoding for a specific charset

Hi all, I'm trying to decode a char and get back the same char. Following is my simple test. I'm confused about whether I have to encode or decode. I tried both, and both print the same result. Any suggestions would be greatly appreciated. char inpData = '†'; String str = Character.toString((char) inpData); byte b[] = str.getBytes(Charset.forName("MacRoman"));...

BufferedWriter#write(int) javadoc query

The Javadoc for this says: Only the lower two bytes of the integer oneChar are written. What effect, if any, does this have on writing non-utf8 encoded chars which have been cast to an int? Update: The code in question receives data from a socket and writes it to a file. (A lot of things happen between receiving and writing, so I can...

UTF-8 encoding and http parameters

I am doing a simple ajax call with the YahooUI Javascript library as follows: YAHOO.util.Connect.setForm('myform'); YAHOO.util.Connect.asyncRequest('POST', url, ...); Following are the settings in my app: Tomcat version: 6.0.18 Tomcat server connector : URIEncoding="UTF-8" webapp page : Also stated in YahooUI connector library doc...

Problem converting ISO8859-1 to UTF-8 in PHP

Hello, I am attempting to take an ISO8859-1 string from a MySQL database and convert it to UTF-8 using PHP. However, when I use the utf8_encode function it removes almost all of the apostrophes from the string (the exceptions seem to be within html fields). Thanks ...

Use a different charset for some specific requests in ASP.NET

I'm now using UTF-8 in my web application, but some clients would POST data in other charsets like GB2312. I can't set the <globalization requestEncoding="GB2312" /> because it would affect the whole site. Can I use a charset for decoding the data for specific requests so that I can get the correct text via context.Request.Forms collec...