unicode

Printing unicode characters in PowerShell via a C++ program

My end goal here is to write some non-Latin text output to the console in Windows via a C++ program. cmd.exe gets me nowhere, so I got the latest, shiny version of PowerShell (that supports Unicode). I've verified that I can type in non-unicode characters and see non-unicode console output from Windows commands (like "dir") for exampl...

Which encoding does Microsoft SSIS use to output flat files in Unicode?

Hi, in the Flat File Connection Manager screen there is a checkbox to specify that the file is encoded as Unicode, but there is no way to tell which encoding will be used (UTF-8, UTF-16, ...). Is there an official Microsoft resource stating which encoding is used? Kind regards ...

MySQL "incorrect string value" error when saving a unicode string in Django

I got a strange error message when I tried to save first_name, last_name to Django's auth_user model. Failed examples: user = User.objects.create_user(username, email, password) user.first_name = u'Rytis' user.last_name = u'Slatkevičius' user.save() >>> Incorrect string value: '\xC4\x8Dius' for column 'last_name' at row 104 user.first_name ...

Using unicode characters in generated pdf (java, iText)

Hi. I have a problem with unicode characters in a generated pdf. Everything works fine on my own workstation, but in the test environment things go wrong. The code inserting the value is as follows: Font boldDefaultFont = FontFactory.getFont(FontFactory.HELVETICA, 10, Font.BOLD); // ... PdfPCell headerCell = new PdfPCell(); // unit.getName() ...

Data loss when converting UTF-8 XML to Latin-1?

If I convert a UTF-8-encoded XML document (which has an XML prolog declaring the encoding to be UTF-8) to Latin-1 using xmllint, will there be any data loss? xmllint --encode iso-8859-1 --output test-latin1.xml test-utf8.xml (the data will eventually be displayed as ISO-8859-1-encoded HTML) ...
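Whether anything is lost depends on whether the document contains characters outside the ISO-8859-1 range (U+0000 to U+00FF). As a rough way to check that before converting, here is a minimal Python 3 sketch; the helper name `latin1_losses` and the sample string are made up for illustration, and the `xmlcharrefreplace` fallback shown is Python's own codec error handler, not a statement about what xmllint does.

```python
def latin1_losses(text):
    """Return the distinct characters in text that ISO-8859-1 cannot represent."""
    lost = []
    for ch in text:
        try:
            ch.encode("iso-8859-1")
        except UnicodeEncodeError:
            # Character lies outside U+0000..U+00FF, so Latin-1 has no byte for it.
            if ch not in lost:
                lost.append(ch)
    return lost


# Hypothetical sample text: "é" fits in Latin-1, the other non-ASCII characters do not.
sample = "café €5 日本"
unrepresentable = latin1_losses(sample)

# One lossless fallback: emit numeric character references for anything
# Latin-1 cannot hold, which an XML or HTML parser can turn back into text.
escaped = sample.encode("iso-8859-1", errors="xmlcharrefreplace")
```

If `latin1_losses` returns an empty list for the document's text, a plain Latin-1 encoding of that text round-trips without loss.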

A Unicode Maven ArtifactId

I just tried creating a project in Maven whose artifactId is made up entirely of non-English characters ("日本国"). I get the following feedback from Maven: [ERROR] FATAL ERROR [INFO] ------------------------------------------------------------------------ [INFO] Error building POM (may not be this project's POM). Project ID: com.worlde...

Problem when using python logging in django and unicode

Hi there - totally confused by now... I am developing in python/django and using python logging. All of my app requires unicode, and all my models implement only a __unicode__() method that returns u'..'. Now when logging I have come upon a really strange issue, and it took a long time before I could reproduce it. I have tried b...

I successfully called advapi32's LsaEnumerateAccountRights() from C#. Now how do I unmarshal the array of LSA_UNICODE_STRING it returns?

It's a pointer to an array of LSA_UNICODE_STRING structures. I found some code that does the inverse, i.e., creates an LSA_UNICODE_STRING from a C# string. You can see that in the helper code section below. What I have up to and including the call to LsaEnumerateAccountRights() seems to work just fine. Sensible values are returned for the...

Unicode rendering in C#

I have a C# application where I am storing the code point value of a Unicode character to be displayed when the user correctly matches a normal specific string. The thing is that when I am storing the code point value directly (say, \uFB80) the application works fine. But when I am reading from a file or a variable that has the code po...

rendering Unicode glyphs

I want to render Unicode glyphs to JPG format. Should I change the font for each Unicode block in my Java code? I tried not to do that, but it did not work. However, changing the font for each block takes a lot of time. Do you know a better way? ...

Java sort strings in codepoint (UTF-32) order

Other than to convert to UTF-8 bytes, or write a compare function that iterates and compares, is there some method I'm missing in JDK 1.6 that compares two strings in full Unicode codepoint order instead of in UCS-2 codepoint order? I appreciate that this is not a hard thing to code. I was puzzled, however, that 1.6 has the various 'cod...

Problem with Indy IdHttp Post in Delphi 2010

I have a problem with the Indy IdHttp Post method. Function CallRpc() compiled with Delphi 2007 works fine, but the same code compiled with Delphi 2010 raises an exception. What do I have to consider when I change Delphi 2007 Indy TIdHttp to Delphi 2010 Indy TIdHttp? function CallRpc(const sURL, sXML: string): string; var SendStream : TStream; ...

Conversion for Delphi 2009 unicode issue

I am converting a legacy app from Delphi 7 to Delphi 2009. I got this error: E2010 Incompatible types: 'Char' and 'AnsiChar'. How can I fix it? I tried to declare Alphabet: AnsiString[AlphabetLength] but that failed. const AlphabetLength = 64; Alphabet: string[AlphabetLength] = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz01234...

(grep) Regex to match non-ascii characters?

On Linux I have a directory with lots of files. Some of them have non-ASCII characters, but they are all valid UTF-8. One programme has a bug that prevents it working with non-ASCII filenames, and I have to find out how many are affected. I was going to do this with find and then do a grep to print the non-ASCII characters, and then do a wc -l t...

Why won't Delphi 2009 let me Include a Char in a set?

Here is another question about converting old code to D2009 and Unicode. I'm certain that it is simple, but I don't see the solution... CharacterSet is a set of Char and s[i] should also be a Char. But the compiler still thinks there is a conflict between AnsiChar and Char. The code: TSetOfChar = Set of Char; procedure aFunc; var Char...

How to properly escape output (for XHTML) in mako?

Despite offering a nice way to escape output using filters, none of them do the right thing. Taking the string: x=u"&\u0092" The filters do the following: x Turns the & into an entity but not the \u0092 (valid XML but not XHTML) h Exactly the same u Escapes both, but obviously uses url escaping ent...

Java: Convert String "\uFFFF" into char

Is there a standard method to convert a string like "\uFFFF" into a character, given that the six-character string contains a representation of one Unicode character? ...

An equivalent to string.ascii_letters for unicode strings in python 2.x?

In the "string" module of the standard library, string.ascii_letters ## Same as string.ascii_lowercase + string.ascii_uppercase is 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ' Is there a similar constant which would include everything that is considered a letter in unicode? ...
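One way to approximate such a constant is to collect every code point whose Unicode general category starts with "L" (Lu, Ll, Lt, Lm, Lo). The sketch below assumes Python 3, where `chr` covers the full code point range; on Python 2.x the equivalent would use `unichr` and be limited by narrow builds. The function name `unicode_letters` is made up for illustration, and "letter" here means only "general category L*", which may not match every definition of a letter.

```python
import sys
import unicodedata


def unicode_letters():
    """Build a string of all code points whose general category is a letter (L*)."""
    return "".join(
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)).startswith("L")
    )


def is_letter(ch):
    """Cheaper per-character check that avoids building the full string."""
    return unicodedata.category(ch).startswith("L")
```

The resulting string is large and depends on the Unicode version bundled with the interpreter, so for membership tests `is_letter` is usually the more practical of the two.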

Choosing an upgrade path for Delphi from 2007

I'm working with a team on a larger application in Delphi 2007. It uses a larger legacy framework to access the data. Both the app and the framework use String as the datatype for strings. I have started to modify the code in the framework to support Delphi 2009 strings, see my previous questions about this. I see 2 alternatives now: Alt 1 - Con...

Using Unicode in fancyvrb’s VerbatimOut

Problem: VerbatimOut from the “fancyvrb” package doesn’t play nicely with UTF-8 characters. Minimal working example: \documentclass{minimal} \usepackage[utf8]{inputenc} \usepackage[T1]{fontenc} \usepackage{fancyvrb} \begin{document} \begin{VerbatimOut}{\jobname.test} é \end{VerbatimOut} \input{\jobname.test} \end{document} Error me...