unicode

Joomla 1.5 & Indic Unicode Fonts - How-to?

I am using the Inscript keyboard to type directly into TinyMCE. However, when I click on save, all the characters appear as question marks on the website and even in the article list on the admin side. How should I solve the problem? I am specifically talking about Marathi, but the solution might be the same for all Devanagari fonts. Thanks in advan...

Windows C API for UTF8 to 1252

I'm familiar with the WideCharToMultiByte and MultiByteToWideChar conversions and could use these to do something like: UTF8 -> UTF16 -> 1252. I know that iconv will do what I need, but does anybody know of any MS libs that will allow this in a single call? I should probably just pull in the iconv library, but am feeling lazy. Thanks ...
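For comparison, the two-step route described in the question (UTF-8 to an intermediate Unicode form, then to code page 1252) can be sketched in Python. This only illustrates the conversion itself and is not a Windows API call; the helper name `utf8_to_cp1252` and the policy of replacing unmappable characters with `?` are assumptions made for the example.

```python
def utf8_to_cp1252(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as Windows-1252 (hypothetical helper)."""
    # Step 1: decode the UTF-8 bytes into a Unicode string.
    text = data.decode("utf-8")
    # Step 2: encode to code page 1252. Characters with no 1252 mapping
    # become "?" (an assumed policy, not something the question specifies).
    return text.encode("cp1252", errors="replace")


# Sample input: e-acute, an en dash, and i-diaeresis all exist in 1252.
sample = "caf\u00e9 \u2013 na\u00efve".encode("utf-8")
converted = utf8_to_cp1252(sample)
```

Any character outside code page 1252 is lossy under this policy, so a caller that cannot tolerate `?` substitutions would likely want `errors="strict"` instead.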

How to represent Unicode characters in an API

This is more an MBCS question than a Unicode question. I need to create an API that returns a list of structs, each of which holds a Unicode character as one of its members. This is in .NET so you'd think I'd want UTF-16, but then for Asian characters, there'd likely be two characters required. What's the best practice when returnin...

Showing Asian Unicode string in UILabel

Hi, how can we show Asian Unicode values in a UILabel? \U2013\U00ee\U2013\U00e6\U2013\U2202\U2013\U220f\U2013\U03c0 \U2013\U00ee\U2013\U220f\U2013\U03c0\U2013\U00aa\U2013\U221e\U2014\U00c5 Thanks ...

How do I create a Unicode filename in Linux?

I heard fopen supports UTF-8, but I don't know how to convert an array of shorts to UTF-8. How do I create a file with Unicode letters in its name? I prefer to use only built-in libraries (no Boost, which is not installed on the Linux box). I do need to use fopen but it's pretty simple to. ...

UnicodeDecodeError problem with mechanize

Hi, I receive the following string from one website via mechanize: 'We\x92ve'. I know that \x92 stands for the ’ character. I'm trying to convert that string to Unicode: >> unicode('We\x92ve','utf-8') UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position 2: unexpected code byte. What am I doing wrong? Edit: The reason I t...
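A minimal sketch of why the decode fails, written for Python 3 (the question uses Python 2's `unicode()`). It assumes the site actually serves Windows-1252, in which byte 0x92 is the right single quotation mark U+2019; the question does not confirm the page's real encoding.

```python
raw = b"We\x92ve"  # bytes as received; a lone 0x92 is not valid UTF-8

# Decoding as UTF-8 fails because 0x92 is a continuation byte
# that does not follow any lead byte.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Assumption: the source encoding is Windows-1252, where 0x92 maps to U+2019.
text = raw.decode("cp1252")
```

If the page declares a different charset in its headers or markup, that declared encoding would be the better choice than the hard-coded `cp1252` used here.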

Reading unicode characters from text file in Delphi 2009

I have the following piece of code to read Japanese Kanji characters from a UTF-8 text file and then load them into a Memo. Var F:textFile; S:string; Begin AssignFile(F,'file.txt'); Reset(F); While not EoF(F) do Begin Readln(F,S); Memo1.Lines.Add(S); End; CloseFile(F); End; But instead of characters I see some set of totall...

Splitting unicode (I think) using .split in ruby

Hi all- I am currently scraping an RSS feed from last.fm and the title attribute looks like it has a Unicode "-" that comes up as \u2013 in Firebug. Here is the feed for those that are curious: http://ws.audioscrobbler.com/2.0/user/rj/recenttracks.rss When I write something like this feedentry.title.split('-') it won't find the un...
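The character \u2013 is an en dash, which is a different code point from the ASCII hyphen-minus, so splitting on '-' leaves the string intact. A minimal sketch of the idea in Python (the question itself is about Ruby); the sample title is hypothetical and not taken from the feed.

```python
EN_DASH = "\u2013"  # EN DASH, distinct from the ASCII hyphen-minus "-"

# Hypothetical title in the "artist, en dash, track" shape the question describes.
title = "Some Artist \u2013 Some Track"

# Splitting on the ASCII hyphen finds nothing to split on.
ascii_split = title.split("-")

# Splitting on the en dash separates the two parts; strip the padding spaces.
parts = [part.strip() for part in title.split(EN_DASH)]
```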

How can I substitute Unicode characters with ASCII in Perl?

I can do it in vim like so: :%s/\%u2013/-/g How do I do the equivalent in Perl? I thought this would do it but it doesn't seem to be working: perl -i -pe 's/\x{2013}/-/g' my.dat ...

PHP/MySQL, discarding Unicode sent from a client

All of our tables are currently set with a LATIN1 character set. A user is currently capable of putting together Unicode sequences on the client and trying to embed them into our application. What's the best way to discard all Unicode characters before they hit our database? Even better, what's the best way to ensure that only characters ba...

Opening fstream with file with Unicode file name under Windows using non-MSVC compiler

Hello, I need to open a file as std::fstream (or actually any other std::ostream) when the file name is a "Unicode" file name. Under MSVC I have the non-standard extension std::fstream::open(wchar_t const *,...). What can I do with other compilers like GCC (most important) and probably the Borland compiler? I know that CRTL provides _wfopen but it ...

UTF-8 Encoding error, need help converting text

I've been working on a statistical translation system for Haiti (code.google.com/p/ccmts) that uses a C++ backend (http://www.statmt.org/moses/?n=Development.GetStarted), with Python driving the C++ engine/backend. I've passed a UTF-8 Python string into a C++ std::string, done some processing, gotten a result back into Python and here is t...

Unicode Locale Data Markup for (e.g.) "Tue, 23 Feb 2010 06:00:44 PST"

Hi. I was parsing an XML feed and trying to convert it to an NSObject when I noticed that (e.g.) [NSDate dateFromString:@"Tue, 23 Feb 2010 06:00:44 PST"] returned nil. Then I tried to convert my string to an NSDate by using the NSDateFormatter. NSDateFormatter *df = [[NSDateFormatter alloc] init]; [df setDateFormat:@"EEE, dd MMM YYYY HH:m...

Are there delimiter bytes for UTF8 characters?

If I have a byte array that contains UTF8 content, how would I go about parsing it? Are there delimiter bytes that I can split off to get each character? ...
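UTF-8 has no separate delimiter bytes. Instead, the high bits of each lead byte give the length of the sequence, and every continuation byte has the form 10xxxxxx. A minimal sketch of splitting on that rule, assuming the input is already valid, complete UTF-8; the function name `split_utf8` is hypothetical.

```python
def split_utf8(data: bytes) -> list:
    """Split valid UTF-8 bytes into one bytes object per code point."""
    chars = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:               # 0xxxxxxx: single byte (ASCII)
            length = 1
        elif lead >> 5 == 0b110:      # 110xxxxx: 2-byte sequence
            length = 2
        elif lead >> 4 == 0b1110:     # 1110xxxx: 3-byte sequence
            length = 3
        elif lead >> 3 == 0b11110:    # 11110xxx: 4-byte sequence
            length = 4
        else:
            # 10xxxxxx here means a continuation byte with no lead byte.
            raise ValueError(f"unexpected byte 0x{lead:02x} at offset {i}")
        chars.append(data[i:i + length])
        i += length
    return chars


# One character of each length: 1, 2, 3 and 4 bytes.
pieces = split_utf8("a\u00e9\u20ac\U0001F600".encode("utf-8"))
```

This sketch does not validate the continuation bytes or detect truncated input; in practice, decoding the whole buffer with a library decoder and iterating over the resulting string is usually simpler and safer.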

How to display Chinese characters (or how to create Unicode snippets) in AutoHotkey?

I usually expand text like this: ::text::this should expanded But if I use Chinese characters they don't show up properly: ::text::我是 is expanded like: §Ú¬O I think I need some code to make the script support UTF-8. Any suggestions? ...

Defining the character encoding of a JavaScript source file

I would like to print a status message to my German users, which contains umlauts (ä/ü/ö). I would also like them to be in the source file rather than having to download and parse some extra file just for the messages. However, I can't seem to find a way to define the encoding of a JS source file. Is there something like HTML's http-equiv?...

How are non-English programming/scripting languages developed?

How are non-English programming/scripting languages developed? Do you need to be a computer scientist? ...

Unicode - generally working with it in C++

Suppose we have an arbitrary string, s. s has the property of being from just about anywhere in the world. People from USA, Japan, Korea, Russia, China and Greece all write into s from time to time. Fortunately we don't have time travellers using Linear A, however. For the sake of discussion, let's presume we want to do string operati...

model.__unicode__() returning russian string cause TemplateSyntaxError

code: class Gallery(models.Model): title = models.CharField(max_length=100) description = models.TextField(blank=True) created = models.DateField(auto_now_add=True) class Meta: verbose_name = 'галерея' verbose_name_plural = 'галереи' def __unicode__(self): return 'Галерея %s' % self.title ...

Python - pyparsing unicode characters

Hi..:) I tried using w = Word(printables), but it isn't working. How should I give the spec for this? 'w' is meant to process Hindi characters (UTF-8). The code specifies the grammar and parses accordingly. 671.assess :: अहसास ::2 x=number + "." + src + "::" + w + "::" + number + "." + number If there are only English characters it ...