unicode

Unicode Problem with SQLAlchemy

I know I'm having a problem with a conversion from Unicode but I'm not sure where it's happening. I'm extracting data about a recent European trip from a directory of HTML files. Some of the location names have non-ASCII characters (such as é, ô, ü). I'm getting the data from a string representation of the file using regex. If I ...

How to force Internet Explorer to use the encoding given in the meta tag?

I'm trying to prepare a demo HTML page with mixed English and Arabic content. Basically it contains a small table with English phrases on the left, and the Arabic translation on the right side. Because I don't understand Arabic, I took the first three characters of the Arabic alphabet from the Unicode reference. First attempt, using th...

Passing strange text as variables via the POST method in PHP

I have an odd problem. Our company collects data and we use a HORRIBLE piece of software to handle all of our phone interviewing. It uses binary files instead of SQL and uses no compression. As of right now we have to manually run all reports for the clients. I am working on building a web interface to our data and common reports. Now I...

Are there any problems converting between SHIFT_JIS and Unicode encodings?

I've heard there are (used to be?) ambiguous mappings between Unicode and SHIFT_JIS codes. This KB article somewhat proves this. So the question is: will I lose any data if I take SHIFT_JIS-encoded text, convert it to Unicode and back? Details: I'm talking about Windows (XP and on) and .NET (which in theory relies on NLS API). ...

HTML encoding of a string pasted from Word

See http://pilot.whatpub.org/Guide/002000/Pub002687.htm and have a look at the source. The text in the description ("Refurbished in 2005...") has been pasted from a Word document into a System.Web.UI.WebControls.TextBox and then saved into a database as unicode. It's obviously got some non-ASCII characters in there that IE interprets s...

How do I convert an NSString into something I can use with FSCreateDirectoryUnicode?

I'm new to Mac and Objective-C, so I may be barking up the wrong tree here and quite possibly there are better ways of doing this. I have tried the code below and it doesn't seem right. It seems I don't get the correct length in the call to FSCreateDirectoryUnicode. What is the simplest way to accomplish this? NSString *theString = @"M...

Why isn't everything we do in Unicode?

Given that Unicode has been around for 18 years, why are there still apps that don't have Unicode support? Even my experiences with some operating systems and Unicode have been painful to say the least. As Joel Spolsky pointed out in 2003, it's not that hard. So what's the deal? Why can't we get it together? ...

Read Unicode text files with Java

Real simple question really. I need to read a Unicode text file in a Java program. I am used to using plain ASCII text with a BufferedReader/FileReader combo, which is obviously not working :( I know that I can read a String in the 'traditional' way using a BufferedReader and then convert it using something like: temp = new String(tem...

Python 3 doesn't read unicode file on a new server

My webpages are served by a script that dynamically imports a bunch of files with try: with open (filename, 'r') as f: exec(f.read()) except IOError: pass (actually, can you suggest a better method of importing a file? I'm sure there is one.) Sometimes the files have strings in different languages, like # contents of lan...

How can I handle unicode with Perl's DBI?

My delicious-to-wp Perl script works but gives even weirder output for all "weird" characters. So I tried $description = decode_utf8( $description ); but that doesn't make a difference. I would like e.g. “go live” to become “go live” and not “go live” How can I handle Unicode in Perl so that this works? UPDATE: I found the prob...

Python 3, is using sys.stdout.buffer.write() good style?

After I learned about reading unicode files in Python 3.0 web script, now it's time for me to learn using print() with unicode. I searched for writing unicode, for example this question explains that you can't write unicode characters to non-unicode console. However, in my case, the output is given to Apache and I am sure that it is cap...

Returning pure Django form errors in JSON

Hi I have a Django form which I'm validating in a normal Django view. I'm trying to figure out how to extract the pure errors (without the HTML formatting). Below is the code I'm using at the moment. return json_response({ 'success' : False, 'errors' : form.errors }) With this, I get the infamous proxy object e...

How do I convert Unicode escape sequences to Unicode characters in a Python string

When I tried to get the content of a tag using "unicode(head.contents[3])" I get output similar to this: "Christensen Sk\xf6ld". I want the escape sequence to be returned as a string. How do I do it in Python? ...

How to set up Win32 tooltips control with dynamic unicode text?

I am having some trouble providing a Win32 tooltips control with dynamic text in Unicode format. I use the following code to set up the control: INITCOMMONCONTROLSEX icc; icc.dwSize = sizeof(INITCOMMONCONTROLSEX); icc.dwICC = ICC_WIN95_CLASSES; InitCommonControlsEx(&icc); HWND hwnd_tip = CreateWindowExW(0, TOOLTIPS_CLASSW, NULL, WS_POPUP |...

Writing text with diacritics ("nikud", vocalization marks) using PIL (Python Imaging Library)

Writing simple text on an image using PIL is easy. draw = ImageDraw.Draw(img) draw.text((10, y), text2, font=font, fill=forecolor ) However, when I try to write Hebrew punctuation marks (called "nikud" or ניקוד), the characters do not overlap as they should. I guess this question is relevant also to Arabic and other similar lang...

Is it possible to print DOS characters on a website?

Hi, I would like to print some kind of ASCII "art" on a web page in pre tags. These graphics use DOS characters to show a map like old maze games did. I didn't find anything in the HTML special character reference. Is there a way to use these characters in HTML? Thanks in advance. ...

Getting unicode string from its code - C#

I know the following is the way to use Unicode in C#: string unicodeString = "\u0D15"; In my situation, I will not get the character code (0D15) at compile time. I get this from an XML file at runtime. I wonder how do I convert this code to a Unicode string? I tried the following // will not compile as unrecognized escape sequence string uni...

Why use Unicode if your program is English only?

So I've read Joel's article, and looked through SO, and it seems the only reason to switch from ASCII to Unicode is for internationalization. The company I work for, as a policy, will only release software in English, even though we have customers throughout the world. Since all of our customers are scientists, they have functional eno...

Objective C unicode character comparisons

How are unicode comparisons coded? I need to test exactly as below, checking for specific letters in a string. The code below chokes: warning: comparison between pointer and integer for (charIndex = 0; charIndex < [myString length]; charIndex++) { unichar testChar = [myString characterAtIndex:charIndex]; if (testChar == "A") ...

Want to clean up and change Unicode form fields using a Rails model in a more DRY way

Currently I am using this to strip out whitespace. class Newsletter < ActiveRecord::Base before_validation :clean_up_whitespace end def clean_up_whitespace fields_to_strip = ['title','notes'] fields_to_strip.each { |f| unless self.attributes[f].nil? self.attributes[f].strip! end } end I want to do something simi...