unicode

How do I sort alphabetically in Python?

Python sorts by byte value by default, which means é comes after z and other equally funny things. What is the best way to sort alphabetically in Python? Is there a library for this? I couldn't find anything. Preferably, sorting should have language support so it understands that åäö should be sorted after z in Swedish, but that ü shoul...
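
A minimal sketch of one approach, not a full answer to the question: it sorts accented letters next to their base letters by stripping combining marks before comparing. The helper name accent_insensitive_key is hypothetical. This is not language-aware, so it would not place åäö after z for Swedish; that would need locale-based collation (for example locale.strxfrm with a suitable locale installed) or an ICU binding.

```python
import unicodedata


def accent_insensitive_key(text):
    """Sort key (hypothetical helper): compare by base letters, ignoring accents and case."""
    # NFKD splits "é" into "e" plus a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks so only the base letters remain.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Use the original text as a tie-breaker to keep the order deterministic.
    return (stripped.casefold(), text)


words = ["zebra", "éclair", "apple", "Zürich"]
sorted_words = sorted(words, key=accent_insensitive_key)
print(sorted_words)
```

With a plain sorted(words) call, "éclair" would land after "zebra" because é has a higher code point than z.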

Unicode problem Django-Python-URLLIB-MySQL

I am fetching a webpage (http://autoweek.com) and trying to process it, but I am getting an encoding error. Autoweek declares "iso-8859-1" encoding and has the word "Nürburgring" (u with umlaut). I do: # -*- encoding: utf-8 -*- import urllib webpage = urllib.urlopen(feed.crawl_url).read() webpage.decode("utf-8") it gives me the following err...
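
A small sketch of the likely cause, assuming the page bytes really are ISO-8859-1 as declared: the byte for ü in that encoding is not valid UTF-8, so decoding as UTF-8 fails, while decoding with the declared charset works. The function name decode_page and the sample bytes are illustrative, and no network access is used here.

```python
def decode_page(raw_bytes, declared_charset="iso-8859-1"):
    """Decode fetched bytes using the charset the page declares (hypothetical helper)."""
    # decode() returns a new string; the result has to be kept by the caller.
    return raw_bytes.decode(declared_charset)


# Stand-in for the fetched page body: "Nürburgring" as ISO-8859-1 bytes.
sample = "Nürburgring".encode("iso-8859-1")

text = decode_page(sample)

# Decoding the same bytes as UTF-8 fails, because 0xFC is not valid UTF-8.
try:
    sample.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

print(text, utf8_failed)
```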

python - Problem storing Unicode character to MySQL with Django

I have the string u"Played Mirror's Edge\u2122", which should be shown as Played Mirror's Edge™, but that is another issue. My problem at hand is that I'm putting it in a model and then trying to save it to a database. AKA: a = models.Achievement(name=u"Played Mirror's Edge\u2122") a.save() And I'm getting : 'ascii' codec can...

Converting Unicode code points to UTF-8

Currently I have something like this \u4eac\u90fd and I want to convert it to UTF-8 so I can insert it into a database. ...
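
A minimal sketch, assuming the input is literal text containing backslash-u escape sequences (six ASCII characters per code point) rather than an already-decoded string. It first turns the escapes into real characters and then encodes them as UTF-8. The variable names are illustrative.

```python
# Literal backslash-u escapes, as they might arrive in a text file or response.
escaped = r"\u4eac\u90fd"

# Interpret the escape sequences to get the actual characters.
text = escaped.encode("ascii").decode("unicode_escape")

# Encode the characters as UTF-8 bytes, ready to hand to a database driver.
utf8_bytes = text.encode("utf-8")

print(text, utf8_bytes)
```

If the string is already a decoded Unicode string, only the final encode step would be needed.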

Java - Statistics Symbols

What's the best way to insert statistics symbols in a JLabel's text? For example, the x-bar? I tried assigning the text field the following with no success: <html>x&#772; Thanks. ...

"Cannot decode string with wide characters" appears on a weird place

Hello, I am trying to use the XML::RAI Perl module on UTF-8 encoded text, and I still get an error I don't really understand. Here is the code (it shouldn't do anything useful yet): use HTTP::Request; use LWP::UserAgent; use XML::RAI; use Encode; my $ua = LWP::UserAgent->new; sub readFromWeb{ my $address = shift; my $request = HTTP:...

Convert Memo to Text

Hi, I've got an MS Access database that was created in Access 2002. I only have Access 2003 and 2008 on my computer, so I've converted the database into Access 2003 format. The problem I have is that I have a table named 'tblItms_F001' in the database with a column named 'stemtext', which is of the memo data type. I just want to be a...

Convert Memo to Text

Exact Duplicate of : http://stackoverflow.com/questions/1112954/convert-memo-to-text Hello, I've got an MS Access database that was created in MS Access 97. I only have Access 2003 and 2008 on my computer, so I've converted the database into Access 2003 format. The problem I have is that I have a table named 'tblItms_F001' in...

WebPageContent problem

I have parsed some data from webpage content and stored it as an NSString. In that string there is a Unicode character (\u0097). How can I remove or replace this and avoid all Unicode characters? I tried using this line but it hasn't worked: [webpagecontent stringByReplacingOccurrencesOfString:@"\\u0097" withString:@" "]; Can anyone ...

Visual C++ argv question

I'm having some trouble with Visual Studio 2008. Very simple program: printing strings that are sent in as arguments. Why does this: #include <iostream> using namespace std; int _tmain(int argc, char* argv[]) { for (int c = 0; c < argc; c++) { cout << argv[c] << " "; } } For these arguments: program.exe testing on...

Delphi 2009 - Implicit string to RawByteString conversion warnings

Hello there, I have just got my hands on D2009 and using it with one of our existing projects - it all compiles fine however I have just picked up DIRegEx to use some regex in the project. However it's always giving warnings about String to RawByteString and vice versa. Eg var Response : string; begin Response := idHTTP.Get('http...

How to parse xml in ANSI or Unicode in javascript?

I'm trying to fetch and parse an XML-file through javascript. I don't control the XML-file. Now somehow the encoding of some XML-files changed, which results in the code not being able to parse the file as far as I can tell. It used to be ANSI, some are Unicode now (and those are failing). Is there a way for me to correctly get the cont...

Should I use Unicode string by default?

Is it considered good practice to pick Unicode strings over regular strings when coding in Python? I mainly work on the Windows platform, where most of the string types are Unicode these days (i.e. .NET String, '_UNICODE' turned on by default on a new C++ project, etc ). Therefore, I tend to think that the case where non-Unicode strin...

What are the limitations of primitive character types in D?

I am currently exploring the specification of the Digital Mars D language, and am having a little trouble understanding the complete nature of the primitive character types. The book Learn to Tango With D is similarly vague on the capabilities and limitations of the language in this area. The types are given on the website as: char; ...

Concatenating 2 Unicode strings - how to do that?

Hello everyone, I have 2 Unicode strings which I would like to concatenate. Every time I try to concatenate them using RtlAppendUnicodeStringToString, it returns "STATUS_BUFFER_TOO_SMALL", even though I'm increasing my destination unicodestring.length to big numbers. What is the method to concatenate 2 Unicode strings? Thanks ...

D2009 TStringlist ansistring

The businesswise calm of the summer has started, so I picked up the migration to D2009. I roughly determined for every subsystem of the program whether it should remain ASCII or can be Unicode, and started porting. It went pretty OK, all components were there in D2009 versions (some, like VSTView, slightly incompatible though) but I now ha...

Fulltext search (sql server 2005) works only on some fields..

OK this is the situation.. I am enabling fulltext search on a table but it only works on some fields.. CREATE FULLTEXT CATALOG [defaultcatalog] CREATE UNIQUE INDEX ui_staticid on static(id) CREATE FULLTEXT INDEX ON static(title_gr LANGUAGE 19,title_en,description_gr LANGUAGE 19,description_en) KEY INDEX staticid ON [defaultcatalog] WIT...

Visual C++ Unicode String Literal is giving error: 'L': undeclared identifier

I'm working on getting a Visual C++ 2005 solution to compile in Unicode. However, in some of my projects (but not all), I get errors in the form: 1>.\CBitFlags.cpp(25) : error C2065: 'L' : undeclared identifier and the line of code in question is: LOGERROR(UTILITY, L"Tried to use object to store %d flags, when max is %d", I am BAF...

Newlines in string not writing out to file

I'm trying to write a program that manipulates Unicode strings read in from a file. I thought of two approaches - one where I read the whole file containing newlines in, perform a couple regex substitutions, and write it back out to another file; the other where I read in the file line by line and match individual lines and substitute o...

Unicode versions in .NET

The documentation of CharUnicodeInfo.GetUnicodeCategory says: Note that CharUnicodeInfo.GetUnicodeCategory does not always return the same UnicodeCategory value as the Char.GetUnicodeCategory method when passed a particular character as a parameter. The CharUnicodeInfo.GetUnicodeCategory method is designed to reflect the curren...