questions about unicode | ansaurus

unicode

Detect Unicode characters in NSString on iPhone

I am working on an SMS application for the iPhone. I need to detect whether the user has entered any Unicode characters in the NSString they wish to send. I need to do this because Unicode characters take up more space in the message, and also because I need to convert them into their hexadecimal equivalents. So my question is how d...

Java: how to write the 0x13 Unicode character?

Hi, how do I print the 0x13 Unicode character in Java? ...

How do I make Emacs display a multi-byte encoded file properly? Is it MULE?

When I open a multi-byte file, I get this: ...

Powershell: using redirection within the script produces a unicode output. How to emit single-byte ASCII text?

I am using Sandcastle Helpfile Builder to produce a helpfile (.chm). The project is a .shfbproj file, which is XML format, works with msbuild. I want to automatically update the Footer text that appears in the generated .chm file. I use this snippet: $newFooter = "<FooterText>MyProduct v1.2.3.4</FooterText>"; get-content -Encodin...

Fast way to filter illegal xml unicode chars in python?

XML specification lists a bunch of Unicode characters that are either illegal or "discouraged". Now, given a string, what is the best way to remove all those illegal chars from it? Right now, my best bet is a regular expression, but it's a bit of a mouthful: illegal_xml_re = re.compile(u'[\x00-\x08\x0b-\x1f\x7f-\x84\x86-\x9f\ud800-\udf...
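One way to approach the filtering described in this question is to invert the problem: rather than listing every illegal range, match anything outside the `Char` production of XML 1.0 (tab, LF, CR, U+0020 to U+D7FF, U+E000 to U+FFFD, and U+10000 to U+10FFFF). The sketch below assumes Python 3, where `str` holds full code points; the function name `strip_illegal_xml_chars` is hypothetical. It removes only the illegal characters, not the "discouraged" ones the question also mentions.

```python
import re

# Negated character class: anything NOT allowed by the XML 1.0 "Char"
# production. Assumes Python 3 (str holds full Unicode code points).
_ILLEGAL_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text: str) -> str:
    """Return text with every character that XML 1.0 forbids removed.

    Hypothetical helper for illustration; "discouraged" characters such as
    U+007F to U+0084 are legal in XML 1.0 and are left untouched here.
    """
    return _ILLEGAL_XML_CHARS.sub("", text)


if __name__ == "__main__":
    sample = "a\x00b\x0bc\td"
    print(repr(strip_illegal_xml_chars(sample)))
```

A single compiled pattern with `sub` keeps the work inside the regex engine, which is usually faster than checking characters one by one in Python code.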

mysql encoding problems

I am dealing with some external APIs, and when I save to the db, I am receiving some encoding errors. All the content I am dealing with is in Unicode, but the MySQL encoding is set to latin1. It seems to work fine on my local system, but throws errors on the server. The only difference between the environments is that the local runs pyth...

character-encoding

What does 'u' mean in a list?

This is the first time I've come across this. I just printed a list and each element seems to have a u in front of it, i.e. [u'hello', u'hi', u'hey']. What does it mean, and why would a list have this in front of each element? As I don't know how common this is, if you'd like to see how I came across it, I'll happily edit the post...

How to run Django on Windows and cope with Apache not having a daemon mode?

Evolution of this question: This started as an attempt to find other recommendations for running Django on Linux, accessing SQL Server via Django-PyODBC, and supporting Unicode as competently as in installations running Django on Windows. After failing to come up with a good solution for ODBC drivers in Linux that would provide the ...

sql-server-2005

Best way to decode unknown Unicode encoding in Python 2.5

Have I got that all the right way round? Anyway, I am parsing a lot of html, but I don't always know what encoding it's meant to be (a surprising number lie about it). The code below easily shows what I've been doing so far, but I'm sure there's a better way. Your suggestions would be much appreciated. import logging import codecs from ...

character-encoding

How can I display Unicode strings while debugging on linux?

I have been working for some years now as a C++ developer using MS Visual Studio as my working platform. Since I privately prefer to use Linux, I recently took the chance to move my working environment to Linux as well. Since I have been optimizing my Windows environment for several years now, of course it turns out several things are missin...

String to Unicode conversion with VB.NET

How could I convert a Greek string to Unicode with VB.NET without knowing the source encoding? ...

preg_match and UTF-8 in PHP

I'm trying to search a UTF8-encoded string using preg_match. preg_match('/H/u', "\xC2\xA1Hola!", $a_matches, PREG_OFFSET_CAPTURE); echo $a_matches[0][1]; This should print 1, since "H" is at index 1 in the string "¡Hola!". But it prints 2. So it seems like it's not treating the subject as a UTF8-encoded string, even though I'm passing...
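The discrepancy in this question is consistent with the offset being counted in bytes rather than characters: in UTF-8, "¡" occupies two bytes (`\xC2\xA1`), so "H" sits at byte offset 2 but character offset 1. The sketch below illustrates the difference in Python 3 rather than PHP; `byte_to_char_offset` is a hypothetical helper, and it assumes the byte offset falls on a character boundary.

```python
text = "¡Hola!"
encoded = text.encode("utf-8")  # b'\xc2\xa1Hola!'

# Offset counted in code points (what the question expects).
char_offset = text.index("H")

# Offset counted in UTF-8 bytes (what a byte-oriented search reports).
byte_offset = encoded.index(b"H")


def byte_to_char_offset(data: bytes, offset: int) -> int:
    """Convert a byte offset into UTF-8 data to a character offset.

    Hypothetical helper: decodes the prefix and counts its characters.
    Assumes `offset` lies on a character boundary; otherwise decode()
    raises UnicodeDecodeError.
    """
    return len(data[:offset].decode("utf-8"))


if __name__ == "__main__":
    print(char_offset, byte_offset, byte_to_char_offset(encoded, byte_offset))
```

Decoding only the prefix up to the match keeps the conversion proportional to the offset instead of the whole string.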

Outputting unicode characters in windows terminal

Over the past week I've been working on a roguelike game in C++ along with a friend, mostly to learn the language. I'm using pdcurses, Windows 7, and Visual Studio C++ to output wchar_t's wherever I want to in the console. I have succeeded in outputting some Unicode characters such as \u263B (☻), but others such as \u2638 (☸) will just ...

Having difficulty with character encoding on website.

I have a website that allows users from around the world to submit profiles. Somewhere between storing/retrieving/displaying the characters, they are not rendering correctly. I'm not sure which step is having problems, but here is a breakdown of what is happening. When I do a SELECT from my PostgreSQL DB via the psql command line inte...

internationalization

character-encoding

Python - Get a list of all the encodings python can encode to

I am writing a script that will try encoding bytes into many different encodings in python 2.6. This page http://www.python.org/doc/2.6/library/codecs.html?highlight=cp1250#id3 lists all the encodings that python can encode to. Rather than copy & paste that, is there some way to get a list of available encodings that I can iterate over? ...
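One common approach to the question above is to read the alias table that ships with the standard library's `encodings` package and collect its target codec names. This is only a sketch: `available_encodings` is a hypothetical name, the result covers only codecs that have at least one registered alias, and in Python 3 it may also include codecs that are not text encodings. Each name is passed through `codecs.lookup` so that codecs unavailable on the current platform are skipped.

```python
import codecs
import encodings.aliases


def available_encodings():
    """Return a sorted list of codec names that can be looked up here.

    Hypothetical helper: names come from the values of the standard
    alias table, so codecs with no alias entry are not discovered.
    """
    names = set(encodings.aliases.aliases.values())
    usable = []
    for name in sorted(names):
        try:
            codecs.lookup(name)  # raises LookupError if not available
        except LookupError:
            continue
        usable.append(name)
    return usable


if __name__ == "__main__":
    for name in available_encodings():
        print(name)
```

The resulting list can be iterated directly, wrapping each encode attempt in its own `try`/`except` since many codecs cannot represent arbitrary input.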

character-encoding

How to properly handle international characters in PHP / MySQL / Apache

I need to create an application in PHP that can handle all Unicode characters in all places: edit fields, static HTML, database. Can somebody tell me the complete list of all parameters / functions that need to be set / used to achieve this goal? ...

How To HTML Escape Curly Quotes in a Java String

I've got a string that has curly quotes in it. I'd like to replace those with HTML entities to make sure they don't confuse other downstream systems. For my first attempt, I just added matching for the characters I wanted to replace, entering them directly in my code: public static String escapeXml(String s) { StringBuilder sb = new...

Netbeans unicode problems

I am switching to Netbeans for PHP programming (I currently use gedit). Some characters from the original source code (à, á, é, è, etc.) are not shown in Netbeans, regardless of the font used, and a little question mark is shown instead. Those files are shown perfectly in both gedit and Firefox. If I modify the file in Netbeans, cha...

UnicodeEncodeError on MySQL insert in Python

I used lxml to parse a web page as below: >>> doc = lxml.html.fromstring(htmldata) >>> element = doc.cssselect(sometag)[0] >>> text = element.text_content() >>> print text u'Waldenstr\xf6m' Why does it print u'Waldenstr\xf6m' and not "Waldenström" here? After that, I tried to add this text to a MySQL table with UTF-8 character set a...

How do I include unicode strings in Python doctests?

I am working on some code that has to manipulate unicode strings. I am trying to write doctests for it, but am having trouble. The following is a minimal example that illustrates the problem: # -*- coding: utf-8 -*- def mylen(word): """ >>> mylen(u"áéíóú") 5 """ return len(word) print mylen(u"áéíóú") First we run the code t...
