unicode

SQL Server 2005 / XML Stored Proc - Unicode to ASCII? (Exception 0xc00ce508)

Hello! I have an MSSQL2005 stored procedure here, which is supposed to take an XML message as input and store its content in a table. The table fields are varchars, because our Delphi backend application cannot handle Unicode. The incoming messages are encoded as ISO-8859-1. All is fine until characters over the > 128 stan...

GetPrivateProfileString Oddity

I was just tinkering around with calling GetPrivateProfileString and GetPrivateProfileSection in kernel32 from .NET and came across something odd I don't understand. Let's start with this incantation: Private Declare Unicode Function GetPrivateProfileString Lib "kernel32" Alias "GetPrivateProfileStringW" ( _ ByVal lpApplication...

Delphi 2009 + Unicode + Char-size

Hello! I just got Delphi 2009 and have previously read some articles about modifications that might be necessary because of the switch to Unicode strings. Mostly, it is mentioned that sizeof(char) is not guaranteed to be 1 anymore. But why would this be interesting regarding string manipulation? For example, if I use an AnsiString:='Tes...

Unicode in PDF

My program generates relatively simple PDF documents on request, but I'm having trouble with Unicode characters, like kanji or odd math symbols. To write a normal string in PDF, you place it in parentheses: (something) There is also the option to escape a character with octal codes: (\527) but this only goes up to 512 characters. How ...

Do UTF-8, UTF-16, and UTF-32 Unicode encodings differ in the number of characters they can store?

Okay. I know this looks like the typical "Why didn't he just Google it or go to www.unicode.org and look it up?" question, but for such a simple question the answer still eludes me after checking both sources. I am pretty sure that all three of these encoding systems support all of the Unicode characters, but I need to confirm it before I...
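One way to sanity-check the asker's assumption is to round-trip code points from the edges of the Unicode range through each encoding. The following is only a minimal sketch in Python (the question itself names no language); the helper names `round_trips` and `encoded_length` and the choice of sample code points are illustrative assumptions, not part of the original question.

```python
# Sample code points at the boundaries where the encodings change length.
# Surrogates (U+D800..U+DFFF) are left out: they are not valid scalar values.
SAMPLES = [0x0000, 0x007F, 0x0080, 0x07FF, 0x0800, 0xFFFF, 0x10000, 0x10FFFF]
ENCODINGS = ["utf-8", "utf-16-le", "utf-32-le"]


def round_trips(code_point, encoding):
    """Return True if the code point survives encode followed by decode."""
    ch = chr(code_point)
    return ch.encode(encoding).decode(encoding) == ch


def encoded_length(code_point, encoding):
    """Return how many bytes the code point occupies in the given encoding."""
    return len(chr(code_point).encode(encoding))


# Maps each encoding to whether every sample code point round-trips.
results = {enc: all(round_trips(cp, enc) for cp in SAMPLES) for enc in ENCODINGS}

for enc in ENCODINGS:
    lengths = [encoded_length(cp, enc) for cp in SAMPLES]
    print(enc, results[enc], lengths)
```

The three encodings cover the same set of code points; what differs is the number of bytes each code point takes, which is what the printed lengths show.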

How Do You Write Code That Is Safe for UTF-8?

We have a set of applications that were developed for the ASCII character set. Now we're trying to install them in Iceland and are running into problems where the Icelandic characters get mangled. We are working through our issues, but I was wondering: Is there a good "guide" out there for writing C++ code that is designed ...

Type double byte character into vbscript file

I need to convert (&rarr) to a symbol I can type into an ANSI VBScript file. I am writing a script that translates a select set of HTML codes to their actual double-byte symbols using a regex. Many languages accomplish this using "\0x8594;"... what is the equivalent in VBScript? ...

How to convert a Unicode character to its ASCII equivalent

Here's the problem: In C# I'm getting information from a legacy Access database. .NET converts the content of the database (in the case of this problem, a string) to Unicode before handing the content to me. How do I convert this Unicode string back to its ASCII equivalent? Edit: Unicode char 710 is indeed MODIFIER LETTER CIRCUMFLEX A...

Saving "tree /f /a" results to a text file with Unicode support

I'm trying to use the tree command in a Windows command line to generate a text file listing the contents of a directory, but when I redirect the output the Unicode characters get mangled. Here is the command I am using: tree /f /a > output.txt The results in the console window are fine: \---Erika szobája cover.jpg Eri...

Has anyone read Robert Martin's latest book, "Clean Code"?

Has anyone read Uncle Bob's latest book, "Clean Code"? UPDATE: Is it similar to Martin Fowler's "Refactoring"? ...

Java, Unicode and fonts

I've looked at the Java documentation and scoured the net for information on Java's support for international characters with specific fonts (such as Monospace), but haven't been able to get a clear, concrete answer. There has been a change between Java 1.4 and Java 1.5/1.6. For example, in Java 1.4 if you set the font on a JTextArea to ...

Ruby: How to break a potentially Unicode string into bytes

I'm writing a game that takes user input and renders it on-screen. The engine I'm using for this is entirely Unicode-friendly, so I'd like to keep that if at all possible. The problem is that the rendering loop looks like this: "string".each_byte do |c| render_this_letter(c) end I don't know a whole lot about i18n, but I ...

What are the best practices for handling Unicode strings in C#?

Can somebody please point out the important aspects I should be aware of when handling Unicode strings in C#? ...

How does the UTF-8 support of TinyXML work

I'm using TinyXML (http://www.grinninglizard.com/tinyxml/) to parse/build XML files. Now according to the documentation (http://www.grinninglizard.com/tinyxmldocs/) this library supports multibyte character sets through UTF-8. So far so good I think. But, the only API that the library provides (for getting/setting element names, attribut...

Character reading from file in Python

In a text file, there is a string "I don't like this". However, when I read it into a string, it becomes "I don\xe2\x80\x98t like this". I understand that \u2018 is the Unicode representation of "'". I use the f1 = open (file1, "r") text = f1.read() commands to do the reading. Now, is it possible to read the string in such a way that wh...
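The bytes `\xe2\x80\x98` in the excerpt are the UTF-8 encoding of U+2018, so one common approach is to tell Python the file's encoding when opening it. The sketch below assumes the file is UTF-8 (the excerpt does not confirm this) and writes a temporary file purely so the example is self-contained; `raw`, `path`, and the temporary file are illustrative stand-ins for the asker's `file1`.

```python
import io
import os
import tempfile

# The exact bytes shown in the question: UTF-8 for "I don", U+2018, "t like this".
raw = b"I don\xe2\x80\x98t like this"

# Stand-in for the asker's file: write the bytes to a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    # Opening with an explicit encoding decodes the bytes into text,
    # so the three bytes become the single character U+2018.
    with io.open(path, "r", encoding="utf-8") as f1:
        text = f1.read()
finally:
    os.remove(path)

print(repr(text))
```

`io.open` accepts the `encoding` argument on both Python 2.6+ and Python 3, where it is the same function as the built-in `open`.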

JavaScript: How to find whether a particular string has Unicode characters (esp. double-byte characters)

To be more precise, I need to know whether (and if possible, how) I can find out whether a given string has double-byte characters or not. Basically, I need to open a pop-up to display a given text which can contain double-byte characters, like Chinese or Japanese. In this case, we need a different window size than we would use for English ...

UTF8 to/from wide char conversion in STL

Is it possible to convert a UTF-8 string held in a std::string to a std::wstring and vice versa in a platform-independent manner? In a Windows application I would use MultiByteToWideChar and WideCharToMultiByte. However, the code is compiled for multiple OSes and I'm limited to the standard C++ library. ...

Which programming languages are friendly to both web-development and Unicode?

I've been using PHP for a while, but I'm growing tired of the sloppy/awkward Unicode support (among other things). Baked-in Unicode support is very important to me, since I despise debugging character encoding issues, especially between the database and scripting layers. What languages work well with both Unicode and web development? He...

Unicode Characters that can be used to trick a string sorter?

Since Unicode lacks a series of zero width sorting characters, I need to determine equivalent characters that will allow me to force a certain order on a list that is automatically sorted by character values. Unfortunately the list items are not in an alphabetical order, nor is it acceptable to prefix them with visible characters to ensu...

Are there guidelines for updating C++Builder applications for C++Builder 2009?

I have a range of Win32 VCL applications developed with C++Builder from BCB5 onwards, and want to port them to ECB2009 or whatever it's now called. Some of my applications use the old TNT/TMS Unicode components, so I have a good mix of AnsiStrings and WideStrings throughout the code. The new version introduces UnicodeString, and a bunch...