unicode

What's the best option to display Unicode text (Hebrew, etc.) in VB6?

I have some customers who want to use our speech therapy software in Hebrew. The programs are in VB6. The best options I'm aware of are: use the Forms 2.0 controls from MS Office, but you can't distribute them; http://www.hexagora.com/en_dw_unictrl.asp ($899); http://www.iconico.com/UniToolbox/ ($499). Any other options? ...

How to avoid tripping over UTF-8 BOM when reading files

I'm consuming a data feed that has recently added a Unicode BOM header (U+FEFF), and my rake task is now messed up by it. I can skip the first 3 bytes with file.gets[3..-1] but is there a more elegant way to read files in Ruby which can handle this correctly, whether a BOM is present or not? ...
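
A minimal sketch of BOM-tolerant decoding, written in Python rather than the Ruby the question asks about. It assumes the feed is UTF-8; `decode_feed` and the sample data are hypothetical names used only for illustration. Python's `utf-8-sig` codec drops one leading BOM when it is present and otherwise behaves like plain UTF-8, so the same call covers both cases.

```python
import codecs


def decode_feed(raw: bytes) -> str:
    """Decode UTF-8 bytes, dropping a leading BOM (EF BB BF) if there is one."""
    # "utf-8-sig" strips a single leading BOM on decode and is a no-op otherwise.
    return raw.decode("utf-8-sig")


# Hypothetical feed contents, once with and once without the BOM.
with_bom = codecs.BOM_UTF8 + "id,name\n".encode("utf-8")
without_bom = "id,name\n".encode("utf-8")

print(repr(decode_feed(with_bom)))
print(repr(decode_feed(without_bom)))
```

The same codec name can be passed as the `encoding` argument of `open()` when reading from a file instead of from bytes already in memory.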

ASP.NET: Will Saving an XmlDocument to the Response.OutputStream honor the encoding?

I want to send the XML of an XmlDocument object to the HTTP client, but I'm concerned that the suggested solution might not honor the encoding that the Response has been set to use: public void ProcessRequest(HttpContext context) { XmlDocument doc = GetXmlToShow(context); context.Response.ContentType = "text/xml"; context.Resp...

What is the difference between these two versions of code (pointer arithmetic & Unicode)?

I'm debugging some open-source code on a 64-bit Solaris system, using GCC, that converts 2-byte characters (wchar_t) to 4-byte characters (wchar_t), because Solaris, like some other Unixes, defines wchar_t as 4 bytes, not 2 bytes as on Windows. I fixed the problem by laying out the pointer arithmetic over two lines, but I'm not sur...

How do I encode the ugly string?

I have a string that is: !"#$%?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]\^_`abcdefghijklmnopqrstuvwxyz{|}~¡¢£¤¥¦§¨©ª« ®¯°±²³´µ¶•¸¹º»¼½¾¿ÀÁÂÃÄÅàáâäèçéêëìíîïôö÷òóõùúý I post that to service and used Htmlencode, then I get a result: !#$%&'()* ,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{|}~����������� ����...

Displaying a Downward Triangle in VB.NET ▼ (U+25BC)

Hey, I'm trying to figure out how to display the ▼ character properly in a .NET winform application. I am creating a custom control, and for the button, I want this character to appear. I am able to set the text to this character, but it appears as a blank square. Any ideas on what I need to do to make this character appear properl...

Conversion of a unicode character from byte

In our API, we use byte[] to send over data across the network. Everything worked fine, until the day our "foreign" clients decided to pass/receive Unicode characters. As far as I know, Unicode characters occupy 2 bytes, however, we only allocate 1 byte in the byte array for them. Here is how we read the character from the byte[] ...
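
The "2 bytes per character" assumption above only holds for some characters in some encodings. As a rough illustration (in Python, not the language of the API in the question; `encoded_lengths` and the sample characters are hypothetical), this sketch counts how many bytes one character needs in three common Unicode encodings.

```python
def encoded_lengths(ch: str) -> dict:
    """Return the number of bytes one character occupies in each encoding."""
    # The "-le" variants are used so that no BOM is added to the byte count.
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


# Sample characters: ASCII, Latin-1 range, Hebrew, a BMP symbol, a non-BMP emoji.
for ch in ["A", "é", "א", "▼", "😀"]:
    print(f"U+{ord(ch):04X}", encoded_lengths(ch))
```

The byte array therefore has to be sized from the encoded length of the text, not from the character count.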

PHP UTF-8 questions - If I create a string in PHP... is it in UTF-8?

In PHP, if I create a string like this: $str = "bla bla here is my string"; Will I then be able to use the mbstring functions to operate on that string as UTF8? // Will this work? $str = mb_strlen($str); Further, if I then have another string that I know is UTF-8 (say it was a POSTed form value, or a UTF-8 string from a databa...

SMTP and Unicode/UTF-8 characters...? How do I send them? base64 everything?

Using SMTP, how do you send Unicode/UTF-8 e-mails? Am I expected to base64 encode the UTF-8 body and specify that in the MIME header or...? How about the headers? I'm sure there's a standard somewhere that describes this... but apparently I'm too tired/still too sick to find it... Thanks! ...

Problem with generating images from unicode strings using imagemagick

I'm generating text images with the following command sequence: convert -background "rgb(233, 231, 218)" -fill black \ -font media/fonts/FuturaStd-Medium.otf \ -pointsize 13 label:"ğüşıöçĞÜŞİÖÇ" -size 88x18 \ media/images/category_images/food-drink/category-top-row/tr_food-drink.png which generates the following image. ğşĞİŞ are proble...

What factors make PHP Unicode-incompatible?

I am able to use UTF-8 characters just fine in my scripts. As a matter of fact, it is possible to have names of variables and functions contain Unicode characters. There is also the mbstring extension, which deals with multi-byte strings. Yet in countless articles, PHP is criticized for its lack of Unicode support. I don't get it; why is PH...

Convert from Codepage 1252 (Windows) to Java, in Java

Hi! I have some strings in Java (originally from an Excel sheet) that I presume are in the Windows-1252 codepage. I want them converted to Java's own Unicode format. The Excel file was parsed using the JXL package, in case that matters. I will clarify: apparently the strings obtained from the Excel file look pretty much like they already are some...

Unicode-aware strings(1) program

Hello, does anybody have a code sample for a Unicode-aware strings program? Programming language doesn't matter. I want something that essentially does the same thing as the Unix command "strings", but that also functions on Unicode text (UTF-16 or UTF-8), pulling runs of English-language characters and punctuation. (I only care about...
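
One possible starting point, sketched in Python. It is not a full strings(1) replacement: it assumes "English-language characters and punctuation" means printable ASCII (0x20 to 0x7E), handles only little-endian UTF-16, and uses a hypothetical function name `find_strings` with a hypothetical default minimum run length of 4. Narrow runs cover ASCII and the ASCII subset of UTF-8; wide runs are the same characters each followed by a zero byte.

```python
import re


def find_strings(data: bytes, min_len: int = 4) -> list:
    """Return printable-ASCII runs stored as 8-bit text or as UTF-16LE text."""
    # 8-bit runs: at least min_len consecutive printable ASCII bytes.
    narrow = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # UTF-16LE runs: each printable ASCII byte is followed by a zero byte.
    wide = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)

    found = [m.group().decode("ascii") for m in narrow.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide.finditer(data)]
    return found


# Hypothetical binary blob mixing both kinds of text with non-printable bytes.
blob = b"\x00\x01hello world\xff" + "wide text".encode("utf-16-le") + b"\x02ab\x03"
print(find_strings(blob))
```

Big-endian UTF-16 would need a third pattern with the zero byte in front of each character.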

East Asian Characters rendered as squares with PHP imagettftext()

Hi, I'm trying to render images with Verdana text using PHP's imagettftext function. However, East Asian characters are not being rendered correctly. I tried using other fonts like Tahoma and Lucida Grande, but neither works. Arial Unicode, however, works perfectly. The problem is that I don't want to use Arial as my font. Is there a...

JSON character encoding

I am writing a web service that uses JSON to represent its resources, and I am a bit stuck thinking about the best way to encode the JSON. Reading the JSON RFC (http://www.ietf.org/rfc/rfc4627.txt), it is clear that the preferred encoding is UTF-8. But the RFC also describes a string escaping mechanism for specifying characters. I assume t...
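
The two mechanisms mentioned above, UTF-8 bytes and `\uXXXX` string escapes, can be compared with a short Python sketch. The payload is a hypothetical example; `json.dumps` and its `ensure_ascii` flag are from Python's standard library. Both forms are valid JSON and parse back to the same value; they differ only in which bytes go on the wire.

```python
import json

# Hypothetical resource containing non-ASCII text.
value = {"name": "שלום"}

# Default: every non-ASCII character is written as a \uXXXX escape,
# so the result is pure ASCII (and therefore also valid UTF-8).
escaped = json.dumps(value)

# ensure_ascii=False keeps the characters literal; the encoding is
# chosen when the text is turned into bytes for the response.
literal = json.dumps(value, ensure_ascii=False)
literal_bytes = literal.encode("utf-8")

print(escaped)
print(literal)
print(json.loads(escaped) == json.loads(literal_bytes) == value)
```

Escaping everything trades a larger payload for robustness against a mislabelled charset somewhere between server and client.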

complete, monospaced Unicode font?

I'm looking for a good programming font that lets me add comments and string literals in Unicode, usually Japanese and Chinese along with some Latin and Cyrillic languages. So far the situation seems to be "complete, monospace, free, pick 2" and Google is failing me with this (maybe because there are no good ones?). The best I found is...

File names containing non-ascii international language characters

Has anyone had experience generating files that have filenames containing non-ASCII international language characters? Is doing this an easy thing to achieve, or is it fraught with danger? Is this functionality expected by Japanese/Chinese-speaking web users? Should file extensions also be international language characters? Info: W...

Check that varchar2 value contains unicode characters

I want to make an Oracle function to remove 'garbage' from user input values, but there's also a requirement that users may enter Unicode text, which I'm supposed to leave as is. REGEXP_REPLACE (search_text, '[^0-9A-Za-z]', '') takes care of non-Unicode input, but how can I check whether a varchar2 value contains Unicode characters? Looks like I could...

How can I use _stprintf in my programs, with and without UNICODE support?

Microsoft's <tchar.h> defines _stprintf as 'swprintf' if _UNICODE is defined, and 'sprintf' if not. But these functions take different arguments! In swprintf, the second argument is the buffer size, but sprintf doesn't have this. Did somebody goof? If so, this is a big one. How can I use _stprintf in my programs, and have them work with...

Python string prints as u['String']

Hi there, this will surely be an easy one but it is really bugging me. I have a script that reads in a webpage and uses BeautifulSoup to parse it. From the soup I extract all the links, as my final goal is to print out the link.contents. All of the text that I am parsing is ASCII. I know that Python treats strings as Unicode, and I am...