Hi,
Delphi 2009 has changed its string type to use 2 bytes to represent a character, which allows support for Unicode character sets. Now when you get sizeof(string) you get length(String) * sizeof(char), with sizeof(char) currently being 2.
What I am interested in is whether anyone knows of a way which on a character by character basis it ...
I'm implementing a blog with tags, some of which contain French characters. My question is how to deal with spaces and Unicode (UTF-8) characters in the URL.
Let's say I have a tag called ohlàlà! and I have the following code in my tag cloud:
<%= link_to h(tag.name.capitalize), { :controller => :blog, :action => :tag, :id => h(tag.name...
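For illustration only, here is a minimal sketch of what percent-encoding such a tag looks like. It is written in Python rather than Rails, and it assumes the tag is encoded as UTF-8 before escaping; the variable names are made up for the example.

```python
from urllib.parse import quote, unquote

# The tag from the question, written with escapes so the code points are
# unambiguous (U+00E0 is "a with grave").
tag = "ohl\u00e0l\u00e0!"

# safe="" asks quote() to escape everything outside the unreserved set,
# including "/" and "!". Non-ASCII characters are UTF-8 encoded first.
encoded = quote(tag, safe="")
print(encoded)

# Spaces become %20 with quote() (quote_plus() would use "+").
print(quote("two words", safe=""))

# Decoding restores the original tag.
print(unquote(encoded) == tag)
```

Each à turns into two escaped bytes (%C3%A0) because it takes two bytes in UTF-8.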
Hello,
I am fighting with Python to understand how to check whether a string is in ASCII or not.
I am aware of ord(), however when I try ord('é'), I get TypeError: ord() expected a character, but string of length 2 found. I understand this is caused by the way I built Python (as explained in ord()'s documentation).
So my questio...
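A minimal sketch of one way to do this check, assuming Python 3 (where str is a sequence of code points, unlike the byte strings of the Python 2 build described above). The helper name is_ascii is made up for the example.

```python
def is_ascii(text: str) -> bool:
    """Return True if every character fits in 7-bit ASCII."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


print(is_ascii("hello"))
print(is_ascii("\u00e9"))  # U+00E9 is "e with acute"

# One code point, but two bytes once encoded as UTF-8. A byte string
# holding those two bytes is what ord() would reject as "length 2".
print(len("\u00e9".encode("utf-8")))
```

On Python 3.7 and later, str.isascii() gives the same answer without the try/except.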
I'm not exactly sure how to ask this question really, and I'm nowhere close to finding an answer, so I hope someone can help me.
I'm writing a Python app that connects to a remote host and receives back byte data, which I unpack using Python's built-in struct module. My problem is with the strings, as they include multiple character e...
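As a rough sketch of the general pattern (unpack the raw bytes with struct, then decode them explicitly), assuming Python 3. The record layout here, a 2-byte id and a 2-byte length followed by the text, is entirely hypothetical and not taken from the question, as is the helper name unpack_record.

```python
import struct


def unpack_record(payload: bytes, encoding: str = "utf-8"):
    """Unpack a hypothetical record: <id:uint16><length:uint16><text bytes>."""
    record_id, length = struct.unpack_from("<HH", payload, 0)
    raw = payload[4:4 + length]
    # struct only hands back bytes; turning them into text is a separate
    # step that needs to know which encoding the sender used.
    return record_id, raw.decode(encoding)


text_bytes = "caf\u00e9!".encode("utf-8")
sample = struct.pack("<HH", 7, len(text_bytes)) + text_bytes
print(unpack_record(sample))
```

If the remote host can send more than one encoding, the encoding argument would have to come from somewhere in the protocol; that part cannot be guessed from the bytes alone.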
I'm using the PHP function imagettftext() to convert text into a GIF image. The text I am converting has Unicode characters including Japanese. Everything works fine on my local machine (Ubuntu 7.10), but on my webhost server, the Japanese characters are mangled. What could be causing the difference? Everything should be encoded as UTF-8...
What is the difference between UTF and UCS?
What are the best ways to represent non-European character sets (using UTF) in C++ strings? I would like to know your recommendations for:
Internal representation inside the code
For string manipulation at run-time
For using the string for display purposes.
Best storage representation (i.e...
I know this is not a "real" programming question. But it relates to programming, so I am going to ask it anyway. I have a program that I need to test that reads the byte order mark of the file to see if it is UTF-8 or UTF-16. My problem is I cannot find a program/text editor that will allow me to set the byte order mark. Can anyb...
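If no editor is at hand, test files with a specific byte order mark can also be produced by a short script. This is a minimal sketch in Python 3 using the BOM constants from the standard codecs module; the helper names write_with_bom and detect_bom are made up for the example.

```python
import codecs
import os
import tempfile

# BOM byte sequences for the encodings mentioned in the question.
BOMS = {
    "utf-8": codecs.BOM_UTF8,          # EF BB BF
    "utf-16-le": codecs.BOM_UTF16_LE,  # FF FE
    "utf-16-be": codecs.BOM_UTF16_BE,  # FE FF
}


def write_with_bom(path, text, encoding):
    """Write text to path, prefixed with the BOM for the given encoding."""
    with open(path, "wb") as f:
        f.write(BOMS[encoding] + text.encode(encoding))


def detect_bom(path):
    """Return the encoding name whose BOM starts the file, or None."""
    with open(path, "rb") as f:
        head = f.read(3)
    for name, bom in BOMS.items():
        if head.startswith(bom):
            return name
    return None


with tempfile.TemporaryDirectory() as tmp:
    for encoding in BOMS:
        path = os.path.join(tmp, encoding + ".txt")
        write_with_bom(path, "test", encoding)
        print(encoding, detect_bom(path))
```

The explicit "-le" and "-be" codecs are used because they do not add a BOM on their own, so the script controls exactly which mark ends up in the file.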
How can I use/display characters like ♥, ♦, ♣, or ♠ in Java/Eclipse?
When I try to use them directly, i.e. in the source code, Eclipse cannot save the file:
What can I do?
Edit: How can I find the Unicode escape sequence?
...
I'm looking for a way to match only fully composed characters in a Unicode string.
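One possible reading of "fully composed" is: the string is already in NFC and contains no leftover combining marks. Under that assumption (which may not be exactly what is wanted here), a minimal Python 3 sketch with the standard unicodedata module could look like this; the helper name is_fully_composed is made up.

```python
import unicodedata


def is_fully_composed(text):
    """True if text is unchanged by NFC and has no combining marks left."""
    return (
        unicodedata.normalize("NFC", text) == text
        and not any(unicodedata.combining(ch) for ch in text)
    )


print(is_fully_composed("\u00e9"))   # precomposed "e with acute"
print(is_fully_composed("e\u0301"))  # "e" followed by COMBINING ACUTE ACCENT
print(is_fully_composed("x\u0301"))  # no precomposed form exists for this pair
```

The second check matters for the last case: NFC leaves "x" plus a combining accent untouched, so comparing against the normalized form alone would not flag it.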
Is [:print:] dependent upon locale in any regular expression implementation that incorporates this character class? For example, will it match Japanese character 'あ', since it is not a control character, or is [:print:] always going to be ASCII codes 0x20...
I've been creating a simple program using VBA that I can use to review vocabulary in Chinese.
I've gotten a fair bit working so far, but have run into a huge problem with inputting a macron character such as "ā" (Unicode 257). The specific application I am working on right now involves changing the contents of the text-box form so that an "...
I am writing a small app which I need to test with UTF-8 characters of different byte lengths.
I can input Unicode characters to test that are encoded in UTF-8 with 1, 2, and 3 bytes just fine by doing, for example:
string in = "pi = \u03a0";
But how do I get a unicode character that is encoded with 4-bytes? I have tried:
str...
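For reference, a 4-byte UTF-8 sequence is needed for any code point above U+FFFF, which no longer fits in a four-digit \u escape. A minimal sketch in Python 3 (not the language of the snippet above) that prints the encoded length of one sample character per size; the sample characters are arbitrary choices.

```python
# One sample character for each UTF-8 length.
samples = {
    "latin A": "\u0041",           # U+0041, 1 byte
    "greek Pi": "\u03a0",          # U+03A0, 2 bytes
    "euro sign": "\u20ac",         # U+20AC, 3 bytes
    "musical clef": "\U0001D11E",  # U+1D11E, 4 bytes (eight-digit escape)
}

lengths = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}
for name, size in lengths.items():
    print(name, size)
```

In Python the eight-digit form is written with a capital \U; whether the language in the question offers an equivalent escape is something to check in its own documentation.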
Does anyone know why CMapStringToOb::Lookup doesn't work in Japanese? The code loads a string from the string table, and puts it into a CMapStringToOb object. Later it loads the same string from the string table (so it is guaranteed to be exactly the same) and calls CMapStringToOb::Lookup to find it. It works in all languages that we'v...
Recently, my junk mail folder has been filling up with messages composed in what appears (to me) to be the Cyrillic alphabet. If a message body or a message subject is in Cyrillic, I want to permanently delete it.
On my screen I see Cyrillic characters, but when I iterate through the messages in VBA within Outlook, the "Subject" proper...
I'm running a console app (myApp.exe) which outputs a pseudo-localized (Unicode) string to the standard output.
If I run this in a regular command prompt (cmd.exe), the Unicode data gets lost.
If I run this in a Unicode command prompt (cmd.exe /u) or set the properties of the console to "Lucida Console" then the Unicode string is maintaine...
I'm trying to write a wstring to file with ofstream in binary mode, but I think I'm doing something wrong. This is what I've tried:
ofstream outFile("test.txt", std::ios::out | std::ios::binary);
wstring hello = L"hello";
outFile.write((char *) hello.c_str(), hello.length() * sizeof(wchar_t));
outFile.close();
Opening test.txt in for ...
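To see what bytes such a raw dump would contain, here is a small Python 3 sketch. It assumes a little-endian machine and that wchar_t is either 2 bytes (typical on Windows) or 4 bytes (typical on Linux); both are assumptions about the platform, not something stated above.

```python
text = "hello"

# With a 2-byte wchar_t, the in-memory layout matches UTF-16LE without a BOM.
as_utf16 = text.encode("utf-16-le")
# With a 4-byte wchar_t, it matches UTF-32LE without a BOM.
as_utf32 = text.encode("utf-32-le")

print(as_utf16)
print(len(as_utf16), len(as_utf32))
```

Either way, every ASCII letter is followed by one or three zero bytes, which is why a viewer expecting an 8-bit encoding shows gaps or NUL characters between the letters.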
I need to use UTF-8 characters in my Perl documentation.
If I use:
perldoc MyMod.pm
I see strange characters. If I use:
pod2text MyMod.pm
everything is fine.
I use Ubuntu/Debian.
$ locale
LANG=de_DE.UTF-8
LC_CTYPE="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_COLLATE="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_ME...
I have some data with messed-up accented characters. For example in the data we have things like
ClΘmentine
that should read
Clémentine
I'd like to clean it up with a script, but when I do this for example
Select Replace('ClΘmentine', 'Θ', 'é')
this is what I get:
Clémenéine
Apparently Θ matches both Θ and t. Any ideas ...
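One observation that may help: byte 0xE9 is "é" in Latin-1 and "Θ" in the old DOS code page 437, so this particular damage looks like Latin-1 data read as CP437. Assuming that is what happened (an assumption, not something confirmed above), the data could be repaired outside the database by reversing the mis-decoding. A minimal Python 3 sketch, with a made-up helper name:

```python
def repair_cp437_mojibake(text):
    """Undo 'Latin-1 bytes decoded as CP437' (assumed cause of the damage)."""
    # Re-encode with the wrong code page to get the original bytes back,
    # then decode those bytes with the encoding they were written in.
    return text.encode("cp437").decode("latin-1")


broken = "Cl\u0398mentine"  # U+0398 GREEK CAPITAL LETTER THETA
print(repair_cp437_mojibake(broken))
```

Because this works on bytes rather than on character comparison, it does not depend on how the database collation treats Θ and t.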
When you use DllImport to import a function you can specify a CharSet to use. I noticed that in C#, C++ and Visual Basic the .NET runtime defaults to using Ansi instead of Unicode for this. So for any system call that has an A and a W version the A version will be called by default. .NET uses Unicode internally and if I'm not mistaken ne...
I've read the documentation here:
http://msdn.microsoft.com/en-us/library/ms776420(VS.85).aspx
I'm stuck on this parameter:
lpMultiByteStr
[out] Pointer to a buffer that receives the converted string.
I'm not quite sure how to properly initialize the variable and feed it into the function
...
I am looking for a (simple) text editor that can handle text in different encodings in the same document.
I need to develop some sites with mixed Japanese and English text and the editors I have now (on an English Windows system) are unable to display the Japanese text.
jEdit files don't display the Japanese text I have entered but whe...