utf-16

pyODBC and Unicode Problem

Hey guys, I'm using pyODBC to communicate with an MS SQL 2005 Express server. The table to which I'm trying to save the data consists of nvarchar columns.

    query = u"INSERT INTO tblPersons (name, birthday, gender) VALUES('"
    query = query + name + u"', '"
    query = query + birthday + u"', '"
    query = query + gender + u"')"
    cur.exe...
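
A minimal sketch of the usual alternative to building the INSERT by string concatenation: pass the values as parameters and let the driver handle quoting and Unicode. The helper name `insert_person` and the sample values are hypothetical. The sketch runs against an in-memory SQLite database as a stand-in for SQL Server; pyODBC cursors accept the same `?` placeholder style, but the connection setup would differ.

```python
import sqlite3

def insert_person(cur, name, birthday, gender):
    """Insert one row using '?' placeholders instead of string concatenation."""
    cur.execute(
        "INSERT INTO tblPersons (name, birthday, gender) VALUES (?, ?, ?)",
        (name, birthday, gender),
    )

# Stand-in database (assumption): in-memory SQLite, not MS SQL Server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE tblPersons (name TEXT, birthday TEXT, gender TEXT)")

# The name contains a quote and non-ASCII characters; no manual escaping needed.
insert_person(cur, "Zo\u00eb O'Neil \u674e", "1980-01-31", "f")
conn.commit()

row = cur.execute("SELECT name, birthday, gender FROM tblPersons").fetchone()
print(row)
```

Because the values never become part of the SQL text, an apostrophe in a name cannot break the statement.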

Query MySQL with unicode char code.

Hi, I have been having trouble searching through a MySQL table, trying to find entries with the character (UTF-16 code 200E) in a particular column. This particular code doesn't have a glyph, so it doesn't seem to work when I try to paste it into my search term. Is there a way to specify characters as their respective code point inst...
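
One way to avoid pasting an invisible character is to write it by code point in the client code and pass it to the query as a parameter. The sketch below only shows the client side in Python; the sample rows and the `like_pattern` name are made up for illustration, and no MySQL connection is involved.

```python
import unicodedata

# U+200E written by code point, so nothing invisible has to be pasted.
LRM = chr(0x200E)  # same as the literal "\u200e"

# Hypothetical sample data standing in for the table column.
rows = ["plain text", "ends with mark" + LRM, LRM + "starts with mark"]
matches = [r for r in rows if LRM in r]

# A pattern like this could be passed as a query parameter to a LIKE clause.
like_pattern = "%" + LRM + "%"

print(unicodedata.name(LRM))  # LEFT-TO-RIGHT MARK
print(len(matches))           # 2
```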

Finding the attributes of Chinese filenames using NewLISP?

The following NewLISP code shows me the file attributes of files under Win32. However, some of the filenames retrieved have Chinese characters in the name. When the GetFileAttributesA function encounters them, it gives me a -1 for the attribute. I looked at GetFileAttributesW but don't know how to make the contents of the fname available...

Ruby works well with Unicode characters in filenames on Mac OS X and on Linux, so why did it take at least 2 years to make it work on Windows?

Ruby works well with Unicode characters in file paths and filenames on Mac OS X and on Linux, so why did it take more than 2 years to make it work on Windows? I was just looking at Google Code Jam. People are solving non-trivial problems within a few hours. At work, I can imagine solving a filename or path issue having unicode characters ...

Converting from utf-16 to utf-8 in Python 3

I'm programming in Python 3 and I'm having a small problem for which I can't find any reference on the net. As far as I understand, the default string is utf-16, but I must work with utf-8, and I can't find the command that will convert from the default one to utf-8. I'd appreciate your help very much. ...
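
For context, a short sketch of how this usually looks in Python 3: a `str` is a sequence of Unicode code points rather than UTF-16 bytes, and an encoding only comes into play when converting to or from `bytes`. The sample text is arbitrary.

```python
text = "na\u00efve \u20ac"  # a str: Unicode code points, no byte encoding yet

# str -> UTF-8 bytes
utf8_bytes = text.encode("utf-8")

# If the input really is UTF-16 bytes (e.g. read from a file), decode first,
# then encode to UTF-8. The "utf-16" codec writes and consumes a BOM.
utf16_bytes = text.encode("utf-16")
converted = utf16_bytes.decode("utf-16").encode("utf-8")

print(utf8_bytes)
print(converted == utf8_bytes)  # True
```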

Advice on marshalled string that can be either ASCII or UTF-16

Welcome to unsafe land. I'm doing P/Invoke to a legacy lib that gives me a 0-terminated C-style string in the form of an unknown-length unmanaged byte buffer that can be either ASCII or UTF-16, but without giving any indication whatsoever thereof - other than the byte stream itself that is... Right now I have a bad scheme, based on che...
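
A rough sketch of one possible byte-pattern heuristic for this situation, written in Python rather than C# and assuming the buffer has already been copied into a `bytes` object. The function name `decode_c_string` is hypothetical. It assumes little-endian UTF-16 whose first character is below U+0100, so it is a guess, not a reliable detector: a one-character ASCII string followed by its terminator looks the same as the start of a UTF-16 string.

```python
def decode_c_string(buf: bytes) -> str:
    """Guess between ASCII and UTF-16LE for a 0-terminated buffer (heuristic)."""
    if len(buf) >= 2 and buf[0] != 0 and buf[1] == 0:
        # Second byte is zero: treat as UTF-16LE and scan 2-byte units
        # until a 0x0000 terminator unit is found.
        end = len(buf) - len(buf) % 2
        for i in range(0, end, 2):
            if buf[i] == 0 and buf[i + 1] == 0:
                end = i
                break
        return buf[:end].decode("utf-16-le")
    # Otherwise treat as ASCII and stop at the first zero byte.
    end = buf.find(b"\x00")
    if end == -1:
        end = len(buf)
    return buf[:end].decode("ascii")

print(decode_c_string(b"hello\x00junk"))
print(decode_c_string("h\u00e9llo".encode("utf-16-le") + b"\x00\x00xx"))
```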

Replace string that contain #0?

I use this function to read a file into a string:

    function LoadFile(const FileName: TFileName): string;
    begin
      with TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite) do
      begin
        try
          SetLength(Result, Size);
          Read(Pointer(Result)^, Size);
        except
          Result := '';
          Free;
          raise;
        end;
        Free; ...

How to best deal with Windows' 16-bit wchar_t ugliness?

I'm writing a wrapper layer to be used with mingw which provides the application with a virtual UTF-8 environment. Functions which deal with filenames are wrappers which convert from UTF-8 and call the corresponding "_w" functions, and so on. The big problem I've run into is that Windows' wchar_t is 16-bit. For filesystem operations, it...
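
To illustrate why a 16-bit `wchar_t` is awkward: code points above U+FFFF do not fit in one UTF-16 code unit and are stored as a surrogate pair. The sketch below uses Python only to show the code units; the helper name `utf16_units` is made up.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]

# A BMP character needs one 16-bit unit.
print([hex(u) for u in utf16_units("A")])           # ['0x41']

# U+1F600 is outside the BMP and needs two units (a surrogate pair).
print([hex(u) for u in utf16_units("\U0001F600")])  # ['0xd83d', '0xde00']
```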

WideCharToMultiByte problem

I have the lovely functions from my previous question, which work fine if I do this:

    wstring temp;
    wcin >> temp;
    string whatever( toUTF8(getSomeWString()) );
    // store whatever, copy, but do not use it as UTF8 (see below)
    wcout << toUTF16(whatever) << endl;

The original form is reproduced, but the in-between form often contains extr...

Cleaning up UTF-16/CJK characters using PHP?

I have some files on my computer that are in UTF-16, though this seems to be because of errors or corruption of the files rather than intent - they're supposed to be plain English. I uploaded one of these (here). If I leave the encoding in Firefox (View > Character Encoding) at UTF-8 then I get tons of gibberish (see screenshot). If I chan...

Sql server 2005 with ASP .NET encoding issue

Hi, I'm writing once again about my encoding issue... Now with some code samples. In a nutshell: when saving input data to the database, some language-specific characters like Polish 'ń' won't save - instead 'n' is saved. On the other hand, the string "Adams æbler", with æ, saves correctly. Here is the code-behind code that does save stuff and displays d...

How to Determine "Lowest" Encoding Possible?

Scenario: You have lots of XML files stored as UTF-16 in a database or on a server where space is not an issue. You need to send the large majority of these files to other systems as XML files, and it is critical that you use as little space as you can. Issue: In reality only about 10% of the files stored as UTF-16 ne...
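
One simple way to approach this is to try each candidate encoding and keep the smallest one that can represent the text. The sketch below assumes the document is already available as a Python string; the function name `smallest_encoding` and the candidate list are illustrative, and a real XML file would also need its encoding declaration updated to match.

```python
def smallest_encoding(text, candidates=("ascii", "utf-8", "utf-16-le")):
    """Return (encoding, size_in_bytes) for the smallest workable candidate."""
    best = None
    for name in candidates:
        try:
            size = len(text.encode(name))
        except UnicodeEncodeError:
            continue  # this encoding cannot represent the text
        if best is None or size < best[1]:
            best = (name, size)
    return best

print(smallest_encoding("hello"))                     # ('ascii', 5)
print(smallest_encoding("\u6f22\u5b57\u6f22\u5b57"))  # ('utf-16-le', 8)
```

Mostly-ASCII text comes out smaller in ASCII or UTF-8, while text dominated by CJK characters can be smaller in UTF-16.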

How to serialize object into UTF-8

Hi, I'm trying to insert into an XML column (SQL Server 2008 R2), but the server's complaining: System.Data.SqlClient.SqlException (0x80131904): XML parsing: line 1, character 39, unable to switch the encoding. I found out that the XML column has to be UTF-16 in order for the insert to succeed. The code I'm using is: XmlSerializer se...

SQLite - Insert special symbols (trademark, ...) into table

How can I insert special symbols like trademark into SQLite table? I have tried to use PRAGMA encoding = "UTF-16" with no effect :( ...
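
A small sketch, using Python's built-in `sqlite3` module, suggesting that the database text encoding is usually not the deciding factor: when the symbol is passed as a bound parameter, it round-trips under either setting. The table name `products` and the helper `roundtrip` are hypothetical.

```python
import sqlite3

def roundtrip(symbol, encoding):
    """Store and read back one string in a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    # Must run before any table is created to take effect.
    conn.execute(f'PRAGMA encoding = "{encoding}"')
    conn.execute("CREATE TABLE products (name TEXT)")
    conn.execute("INSERT INTO products (name) VALUES (?)", (symbol,))
    stored = conn.execute("SELECT name FROM products").fetchone()[0]
    conn.close()
    return stored

for enc in ("UTF-8", "UTF-16"):
    print(enc, roundtrip("Widget\u2122", enc) == "Widget\u2122")
```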

Should I change from UTF-8 to UTF-16 to accommodate Chinese characters in my HTML?

I am using ASP.NET MVC, MS SQL and IIS. I have a few users that have used Chinese characters in their profile info. However, when I display this information it shows up as æŽå¼·è¯ but they are correct in my database. Currently the encoding for my HTML pages is set to UTF-8. Should I change it to UTF-16? I und...
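
The garbled output in the excerpt appears consistent with UTF-8 bytes being displayed as a single-byte Windows encoding, which would point to a mislabeled encoding somewhere rather than a need for UTF-16. The sketch below reproduces that effect in Python with one Chinese character; the choice of cp1252 as the wrong encoding is an assumption.

```python
original = "\u5f37"  # the Chinese character 強

# Correct UTF-8 bytes, then misread as cp1252 (the assumed wrong encoding).
utf8_bytes = original.encode("utf-8")
garbled = utf8_bytes.decode("cp1252")
print(garbled)  # three Latin-looking characters instead of one Chinese one

# Reversing the mistake recovers the original text.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired == original)  # True
```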

git gui - can it be made to display UTF16?

Is there any way to make git gui display and show diffs for UTF16 files somehow? I found some information here: http://stackoverflow.com/questions/777949/can-i-make-git-recognize-a-utf-16-file-as-text but this is mostly referring to the command line rather than the gui. ...

Javascript and HTML: Saving file as UTF-8 without BOM

I'm trying to write an MSIE-only HTML page (which I'll call the "Title Page") that allows someone to save a generated HTML webpage (which I'll call "New Page") with the click of a button. What I found out is that the "Save As" dialog box that appears does not allow for the "New Page" to be saved as UTF-8 without BOM. It is instead being...
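
For reference, the UTF-8 BOM is the three bytes EF BB BF at the start of the file. The sketch below shows the difference in Python, not in the browser scripting the question uses, and could only help as a post-processing step on a saved file; the helper names are hypothetical.

```python
import codecs

def encode_html(text, with_bom=False):
    """Encode text as UTF-8, optionally prefixed with the BOM."""
    return text.encode("utf-8-sig" if with_bom else "utf-8")

def strip_utf8_bom(data):
    """Remove a leading UTF-8 BOM from bytes, if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

page = "<html><body>New Page</body></html>"
print(encode_html(page, with_bom=True)[:3])  # b'\xef\xbb\xbf'
print(strip_utf8_bom(encode_html(page, with_bom=True)) == encode_html(page))  # True
```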

Efficient binary-to-string formatting (like base64, but for UTF8/UTF16)?

I have many bunches of binary data, ranging from 16 to 4096 bytes, which need to be stored to a database and which should be easily comparable as a unit (e.g. two bunches of data match only if the lengths match and all bytes match). Strings are nice for that, but converting binary data blindly to a string is apt to cause problems due to...
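
As a baseline for comparison, here is a sketch of the plain base64 approach the title alludes to, using Python's standard `base64` module. It is not the denser UTF-8/UTF-16 scheme the question is asking about; it only shows the property that two blobs are equal exactly when their encoded strings are equal. The helper names are made up.

```python
import base64

def to_text(blob: bytes) -> str:
    """Encode arbitrary bytes as an ASCII-only string."""
    return base64.b64encode(blob).decode("ascii")

def from_text(text: str) -> bytes:
    """Recover the original bytes from the encoded string."""
    return base64.b64decode(text)

blob = bytes(range(256))
encoded = to_text(blob)
print(len(blob), len(encoded))     # 256 344
print(from_text(encoded) == blob)  # True
```

Base64 output is about one third larger than the input, which is the overhead a more compact encoding would try to reduce.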