UTF8 or UTF-8?
Which of the two is correct terminology? ...
I'm a bit worried about whether this function sends emails that the majority of email and webmail clients will recognize correctly. Specifically, I'm most concerned about these doubts: Are the UTF-8 declarations and attachments well formed? Do I need to use quoted_printable_decode()? If yes, where? Content-Transfer-Encoding:...
If a file contains a £ (pound) sign then directory_iterator correctly returns the utf8 character sequence \xC2\xA3. wdirectory_iterator uses wide chars, but still returns the utf8 sequence. Is this the correct behaviour for wdirectory_iterator, or am I using it incorrectly? AddFile(testpath, "pound£sign"); wdirectory_iterator iter(test...
I am a big fan of vim and gvim. But whenever I write localization code in PHP and have to translate some strings (primarily in Russian), I have to open Notepad to translate all the entries. That kinda sucks, but so far I have not found out how to make gvim work in utf8 mode. Any ideas would be appreciated. ...
Hey, I have populated a MySQL table with utf-8 strings (using a python script). You can assume that the string in the DB was correctly encoded (I've verified this by extracting the string from MySQL Query Browser and running a utf-8 decode... got my original unicode string). Now the problem begins when I try to load this string using N...
I am changing all varchar columns in our Firebird database to UTF8; however, I don't understand the difference in varchar size. For example, with the charset and collation set to nothing, we can set the varchar size to 255, but if we set the charset and collation to UTF8 and set the varchar to 255, it reads different values. What would ...
Hello guys. I need help with a character encoding problem that I want to sort out once and for all. Here is an example of some content which I pull from an XML feed, insert into my database and then pull out. http://pastebin.com/d78d24f33 As you can see, a lot of special HTML characters get corrupted/broken. How can I once and for all sto...
Is it possible to sort an array with Unicode / UTF-8 characters in PHP using a natural order algorithm? For example (this array is correctly ordered): $array = array ( 0 => 'Agile', 1 => 'Ágile', 2 => 'Àgile', 3 => 'Âgile', 4 => 'Ägile', 5 => 'Ãgile', 6 => 'Test', ); If I try with asort($array)...
I need to convert large UTF-8 strings into ASCII. It should be reversible, and ideally a quick/lightweight algorithm. How can I do this? I need the source code (using loops) or the JavaScript code. (should not be dependent on any platform/framework/library) Edit: I understand that the ASCII representation will not look correct and wou...
I'm trying to debug something and I'm wondering if the following code could ever return true: public boolean impossible(byte[] myBytes) { if (myBytes.length == 0) return false; String string = new String(myBytes, "UTF-8"); return string.length() == 0; } Is there some value I can pass in that will return true? I've fiddled wit...
I'm reading an HTML document that contains UTF-8 chars but when I access the innerHTML of the document, all the "bad" chars show up as 0xfffd. I've tried it in all the major browsers and it behaves the same way. When I alert() the innerHTML it shows those chars as a "diamond with a ? mark". Surprisingly the following works perfectly, co...
Here's my problem. I have a MySQL table called quotes. In one of the rows, a quote contains the following characters: ‘ and ’. The row collation is utf8_unicode_ci. When using MySQL Query Browser and phpMyAdmin to retrieve the rows, the quotes come out as intended. However, when I retrieve them from the database using PHP and display ...
My Perl app and MySQL database now handle incoming UTF-8 data properly, but I have to convert the pre-existing data. Some of the data appears to have been encoded as CP-1252 and not decoded as such before being encoded as UTF-8 and stored in MySQL. I've read the O'Reilly article Turning MySQL data in latin1 to utf8 utf-8, but although it...
How do I set the character encoding for a specific table? E.g: CREATE TABLE COMMENTS ( ID INTEGER GENERATED BY DEFAULT AS IDENTITY (START WITH 0, INCREMENT BY 1) NOT NULL, TXT LONGVARCHAR, PRIMARY KEY (ID) ) By default it's encoded as ASCII but I'd rather use UTF-8 for this one table. ...
A requirement of the product that we are building is that its URL endpoints are semantically meaningful to users in their native language. This means that we need UTF-8 encoded URLs to support every alphabet under the sun. We would also not like to have to provide installation configuration documentation for every application server and...
Hello all, My PHP application is taking user input and sending it to a WCF Web Service. Sometimes my users copy and paste from Word and get UTF-16 characters into their message, such as the "En Dash" \u2013. I get the following error when this occurs: PHP Fatal error: SOAP-ERROR: Encoding: string '\xe2...' is not a valid utf-8 st...
I'm working on an iPhone app that needs to display superscripts and subscripts. I'm using a picker to read in data from a plist but the Unicode values aren't being displayed correctly in the pickerview. Subscripts and superscripts are not being recognized. I'm assuming this is due to the encoding of the plist as utf-8, so the question ...
Hello everyone, I have an XML structure like this. Some Student items contain invalid UTF-8 byte sequences, which may cause XML parsing to fail for the whole XML document. What I want to do is filter out the Student items which contain invalid UTF-8 byte sequences, and keep the ones with valid byte sequences. Any advice or samples about how to do this in ....
I've recently had to switch the encoding of the webapp I'm working on from ISO-xx to utf8. Everything went smoothly, except for properties files. I added „-Dfile.encoding=UTF-8“ in eclipse.ini and normal files work fine. Properties files, however, show some strange behaviour. If I copy utf8 encoded properties from Notepad++ and paste them in Eclipse, they sh...
What are the implications of a change from UTF-8 to UTF-16 for HTML encoding? I would like to know your thoughts on the issue. Are there things I need to think of before making such a change? Note: Interested due to enormous amounts of Japanese and Chinese text I need to handle. ...