I've got two options for Unicode that look promising for a MySQL database.
utf8_general_ci unicode (multilingual), case-insensitive
utf8_unicode_ci unicode (multilingual), case-insensitive
Can you please explain what is the difference between utf8_general_ci and utf8_unicode_ci? What are the effects of choosing one over the other when...
I'd like to have a canonical place to pool information about Unicode support in various languages. Is it a part of the core language? Is it provided in libraries? Is it not available at all? Is there a popular resource for Unicode information in a language? One language per answer please. Also if you could make the language a he...
I use the WPF RichTextBox control in a WPF project. The problem I am stuck on is that it cannot display Unicode the way System.Windows.Forms.RichTextBox does in a Windows Forms project.
E.g.: When I copy a paragraph of Chinese text and paste it into the WPF RichTextBox, the font breaks and the text cannot be displayed. But when I use System.Windows.Forms.RichTextb...
Hi,
I need to update the registry, specifically the Outlook 2003 Master category list to the following key. "Software\Microsoft\Office\11.0\Outlook\Categories"
It is stored as a REG_BINARY value and it involves Unicode conversion. I was able to read it successfully but I am unsure how I can write it back. The code I used to read it is ...
The problem: I edit a .vcproj file, save it as UTF-8 (and specify that in the xml header), and when I open it in VS, the next time it saves it the encoding reverts back to CP-1255/1252/1251 (depending on the Localized Settings on the machine).
This has become a problem in our R&D, since whenever someone commits a .vcproj file the encodi...
I have a rather large SQL file which starts with the byte order mark FFFE. I have split this file into 100,000-line chunks using the Unicode-aware Linux split tool. But when I pass these back to Windows, it does not like any of the parts other than the first one, because only the first one carries the FFFE byte order mark.
How can I add this two ...
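One possible way to put the mark back on each chunk, sketched in Python. This is only an illustration: the function name `ensure_utf16le_bom` is made up for this example, and it assumes the chunks are UTF-16 little-endian (which is what an FF FE mark indicates) and that each chunk is small enough to read into memory.

```python
import codecs
from pathlib import Path


def ensure_utf16le_bom(path):
    """Prepend the FF FE byte order mark to a file unless it is already there.

    Hypothetical helper for illustration. Returns True if the file was changed.
    """
    p = Path(path)
    data = p.read_bytes()  # assumes the chunk fits in memory
    if data.startswith(codecs.BOM_UTF16_LE):  # BOM_UTF16_LE is b"\xff\xfe"
        return False
    p.write_bytes(codecs.BOM_UTF16_LE + data)
    return True
```

Calling it on every chunk except the first (or on all of them, since it skips files that already start with the mark) should leave each part readable as UTF-16 on its own.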
Are these obsolete? They seem like the worst idea ever -- embed something in the contents of your file that no one can see, but impacts the file's functionality. I don't understand why I would want one.
...
I am trying to output a sigma character (σ) in a label in a FusionChart graph. How can I specify that character in a PHP string? I have tried the HTML entity &sigma;, but it is not interpreted correctly by the graph. Is there any way to specify the character in PHP using some sort of character code?
...
How do I see which character set a MySQL database, table, or column uses? Is there something like
SHOW CHARACTER SET FOR mydatabase;
and
SHOW CHARACTER SET FOR mydatabase.mytable;
and
SHOW CHARACTER SET FOR mydatabase.mytable.mycolumn;
...
I'm going to ask what is probably quite a controversial question: "Should one of the most
popular encodings, UTF-16, be considered harmful?"
Why do I ask this question?
How many programmers are aware of the fact that UTF-16 is actually a variable length encoding? By this I mean that there are code points that, represented as surrogate ...
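The variable-length point can be checked with a few lines of Python. This is a minimal sketch; `utf16_code_units` is a name invented for this example, not a library function.

```python
def utf16_code_units(ch):
    """Return how many 16-bit code units UTF-16 needs for the given string."""
    # utf-16-le writes no byte order mark, so every 2 bytes is one code unit
    return len(ch.encode("utf-16-le")) // 2


print(utf16_code_units("H"))           # U+0048, inside the BMP: prints 1
print(utf16_code_units("\U0001D11E"))  # U+1D11E, outside the BMP: prints 2
```

The second character is stored as the surrogate pair D834 DD1E, so code that treats one 16-bit unit as one character will miscount or split it.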
CharsetDecoder reads:
There are two general types of decoding errors. If the input byte sequence is not legal for this charset then the input is considered malformed. If the input byte sequence is legal but cannot be mapped to a valid Unicode character then an unmappable character has been encountered.
I understand the concept of m...
I am on python 2.6 for Windows.
I use os.walk to read a file tree. Files may have non-7-bit characters (German "ae" for example) in their filenames. These are encoded in Python's internal string representation.
I am processing these filenames with Python library functions and that fails due to wrong encoding.
How can I convert these fil...
Hi, I am trying to add "翻訳するテキストやWebページ " into a PostgreSQL table, but it's shown like this:
"& #32763;""& #35379;"す& #12427;テ& #12461;& #12473;& #12488;& #12420;Web& #12506;& #12540;& #12472;
How can I insert that in proper format?
<?php
$db = pg_connect("host=localhost port=5432 dbname=lang user=password=") or die(":(")...
How can I encode the Unicode character U+0048 (H), say, in a PowerShell string? In C# I would just do this: "\u0048", but that doesn't appear to work in PowerShell.
...
Hi,
I have a decryption routine in VB6. I now want the same decryption in C#.
The strings that need decryption are in Unicode, so I use Encoding.Unicode.GetString to read the input in C#. The input now looks exactly the same as in VB6.
The first few characters in the loop are decrypted ok! Then I encounter a difference...
The program ...
I seem to have the all-familiar problem of correctly reading and viewing a web page. It looks like Python reads the page in UTF-8 but when I try to convert it to something more viewable (iso-8859-1) I get this error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 2: ordinal not in range(128)
The code look...
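The asker's code is cut off, so the following is only a sketch of the usual two-step conversion, written for Python 3 (the error message above comes from Python 2). It assumes the page bytes really are UTF-8; `transcode` and the sample bytes are made up for this example.

```python
def transcode(raw, source="utf-8", target="iso-8859-1"):
    """Decode bytes from the source encoding, then encode them to the target.

    Hypothetical helper: naming both codecs explicitly avoids falling back
    to an implicit ASCII step.
    """
    text = raw.decode(source)   # bytes -> str
    return text.encode(target)  # str -> bytes in the target encoding


page_bytes = b"Universit\xc3\xa4t"  # stand-in for bytes read from a page
print(transcode(page_bytes))        # prints b'Universit\xe4t'
```

Characters that ISO-8859-1 cannot represent would still raise UnicodeEncodeError at the encode step, so an `errors=` argument may be needed for real pages.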
While searching for a proper way to trim non-breaking space from parsed HTML, I've first stumbled on Java's spartan definition of String.trim() which is at least properly documented. I wanted to avoid explicitly listing characters eligible for trimming, so I assumed that using Unicode-backed methods on Character class would do the job fo...
I have some French letters (é, è, à...) in a Django template, but when it is loaded by Django, a UnicodeDecodeError exception is raised.
If I don't load the template but directly use a Python string, it works OK.
Is there something I need to do to use Unicode with Django templates?
...
I'm working on a project in VS2008 that I'm compiling in MBCS, but I need to work with some UTF-8 strings to interact with some web services. I wrote a function that works perfectly with Unicode but not MBCS. Is there any way I can convert an MBCS string to UTF-8 or to Unicode?
Thanks!
...
Is there a way that I can add an encoding alias to Python? There are sites on the web that use the encoding 'windows-1251' but have their charset set to win-1251, so I would like win-1251 to be an alias for windows-1251.
...