unicode-string

String To Lower/Upper in C++

What is the best way people have found to do String to Lower case / Upper case in C++? The issue is complicated by the fact that C++ isn't an English only programming language. Is there a good multilingual method? ...

UNICODE_STRING to Null terminated

I need to convert a UNICODE_STRING structure to a simple NULL TERMINATED STRING. typedef struct _UNICODE_STRING { USHORT Length; USHORT MaximumLength; PWSTR Buffer; } UNICODE_STRING, *PUNICODE_STRING; I can't find a clean solution on MSDN about it. Anyone been there? I am not using .net so I need a native API s...

Why the Excess Memory for Strings in Delphi?

I'm reading in a large text file with 1.4 million lines that is 24 MB in size (average 17 characters a line). I'm using Delphi 2009 and the file is ANSI but gets converted to Unicode upon reading, so you can fairly say the text, once converted, is 48 MB in size. ( Edit: I found a much simpler example ... ) I'm loading this text into a s...

conversion of unicode string in python

I need to convert unicode strings in Python to other types such as unsigned and signed int 8 bits, unsigned and signed int 16 bits, unsigned and signed int 32 bits, unsigned and signed int 64 bits, double, float, string, unsigned and signed 8 bit, unsigned and signed 16 bit, unsigned and signed 32 bit, unsigned and signed 64 bit. I need help fro...
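Python itself has no fixed-width integer types, so one way to approach the conversion described above is to parse the text with `int()` or `float()` and then check the range by hand. The sketch below assumes the strings hold plain decimal numbers; the helper name `parse_fixed_int` is hypothetical and not part of any library.

```python
def parse_fixed_int(text: str, bits: int, signed: bool) -> int:
    """Parse decimal text and check that the value fits in the given width.

    Hypothetical helper for illustration; assumes plain decimal input.
    """
    value = int(text.strip())
    if signed:
        low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        low, high = 0, (1 << bits) - 1
    if not low <= value <= high:
        kind = "signed" if signed else "unsigned"
        raise OverflowError(f"{value} does not fit in {kind} {bits}-bit")
    return value


# Unsigned 8-bit accepts 0..255, signed 8-bit accepts -128..127.
small_unsigned = parse_fixed_int("255", 8, signed=False)
small_signed = parse_fixed_int("-128", 8, signed=True)

# Floating-point text goes through float(); Python floats are doubles.
as_double = float("3.5")
```

If the values later need to be written out as binary data of a specific width, the standard `struct` module is the usual next step.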

Escaped unicode to unicode character in Cocoa

From an NSURLConnection I get an NSData object, which I convert with [[NSMutableString alloc] initWithData:[self urlData] encoding:NSUTF8StringEncoding] to an NSMutableString. After some "revision" I display it in an NSTextField. But when the response contains a more-than-utf8-string this is displayed: This "&#x27A1" should be one unico...

Weird SQL Behavior, why is this query returning nothing?

Assume there is a table named "myTable" with three columns: {**ID**(PK, int, not null), **X**(PK, int, not null), **Name**(nvarchar(256), not null)}. Let {4, 1, аккаунт} be a record on the table. select * from myTable as t where t.ID=4 AND t.X = 1 AND ( t.Name = N'аккаунт' ) select * from myTable as t ...

How to work with unicode in Python

I am trying to clean all of the HTML out of a string so the final output is a text file. I have done some research on the various 'converters' and am starting to lean towards creating my own dictionary for the entities and symbols and running a replace on the string. I am considering this because I want to automate the process and ther...
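As an alternative to a hand-built entity dictionary, the standard library's `html.parser` can drop the tags and decode entities in one pass. This is a minimal sketch in Python 3 (the question predates it); the names `TextExtractor` and `strip_html` are made up for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes and ignores the tags around them."""

    def __init__(self):
        # convert_charrefs=True makes the parser decode entities such as
        # &amp; and &eacute; before handing text to handle_data().
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(markup: str) -> str:
    """Return the text content of markup with tags removed."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any text still buffered at end of input
    return "".join(parser.parts)


plain = strip_html("<p>Caf&eacute; &amp; bar</p>")
```

Note that this simple version also keeps the contents of `<script>` and `<style>` elements as text, so real pages may need extra filtering.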

Reading "raw" Unicode-strings in Python

Dear all, I am quite new to Python so my question might be silly, but even after reading through a lot of threads I didn't find an answer to my question. I have a mixed source document which contains html, xml, latex and other text formats and which I try to get into a latex-only format. Therefore, I have used python to recognise ...

How to open file in PHP that has unicode characters in its name?

For example, I have a filename like this - проба.xml and I am unable to open it from a PHP script. If I set up the PHP script to be in utf-8 then all the text in the script is utf-8, thus when I pass this to file_get_contents: $fname = "проба.xml"; file_get_contents($fname); I get an error that the file does not exist. The reason for this is that in Win...

Hash method and UnicodeEncodeError

In Python 2.5, I have the following hash function: def __hash__(self): return hash(str(self)) It works well for my needs, but now I started to get the following error message. Any idea of what is going on? return hash(str(self)) UnicodeEncodeError: 'ascii' codec can't encode character u'\ufeff' in position 16: ordinal not in range(...
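The traceback above is consistent with `str(self)` forcing an implicit ASCII encoding of a unicode value that contains U+FEFF (a byte order mark). The sketch below reproduces the same encoding failure in Python 3 and shows one way around it: hash the text itself instead of an ASCII-encoded copy. The `Record` class and `ascii_encodable` helper are hypothetical stand-ins for the asker's code; in Python 2.5 the analogous change would be `hash(unicode(self))`.

```python
def ascii_encodable(text: str) -> bool:
    """Return True if text can be encoded with the 'ascii' codec."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


class Record:
    """Hypothetical class standing in for the one in the question."""

    def __init__(self, text: str):
        self.text = text

    def __str__(self) -> str:
        return self.text

    def __eq__(self, other) -> bool:
        return isinstance(other, Record) and self.text == other.text

    def __hash__(self) -> int:
        # Hash the text directly; no byte encoding is involved.
        return hash(self.text)


with_bom = "\ufeffname"
can_encode = ascii_encodable(with_bom)  # False: U+FEFF is outside ASCII
same_hash = hash(Record(with_bom)) == hash(Record(with_bom))
```

If the U+FEFF comes from reading a file that starts with a BOM, stripping it at read time may be the cleaner fix.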

Displaying a unicode text in C#

My App displays English, Japanese and Chinese characters on a TextBox and a LinkLabel. Currently, I check if there are unicode characters and change the font to MS Mincho or else leave it in Tahoma. Now MS Mincho displays Japanese properly, but for Chinese I have to use Sim Sun. How can I distinguish between the two? How can I ensure t...

LPSTR how to free memory after using

Hi. Suppose I have a LPSTR variable. How do I free the memory after using the variable? Is it LPSTR szFileName = GetSBCSBuffer(sFilePath); // sFilePath is a CString delete szFileName; OR delete []szFileName; Kindly advise ...

What is Causing This Memory Leak in Delphi?

I just can't figure out this memory leak that EurekaLog is reporting for my program. I'm using Delphi 2009. Here it is: Memory Leak: Type=Data; Total size=26; Count=1; The stack is: System.pas _UStrSetLength 17477 System.pas _UStrCat 17572 Process.pas InputGedcomFile 1145 That is all there is in the stack. EurekaLog is ...

[C#] Byte[] to String to Byte[] -- How to do so?

Ignore the reasons why someone would want to do this.... :) I want to be able to take some bytes, convert them to a string, then later back to the same byte array. Same length and everything. I've tried using the ASCIIEncoder class (works for only text files) and Unicode Encoder class (only works so far for arrays 1024*n big. I assume t...
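Text encodings such as ASCII or UTF-16 are not designed to round-trip arbitrary bytes, which would explain results like the ones described above. A common alternative is Base64, which maps any byte sequence to plain ASCII text and back without loss. The sketch below shows the idea in Python rather than C#, with hypothetical helper names; in .NET the corresponding calls are `Convert.ToBase64String` and `Convert.FromBase64String`.

```python
import base64


def bytes_to_text(data: bytes) -> str:
    """Encode arbitrary bytes as an ASCII-only Base64 string."""
    return base64.b64encode(data).decode("ascii")


def text_to_bytes(text: str) -> bytes:
    """Recover the original bytes from a Base64 string."""
    return base64.b64decode(text)


# Every possible byte value, including ones that are invalid as text.
original = bytes(range(256))
restored = text_to_bytes(bytes_to_text(original))
round_trips = restored == original
```

The trade-off is size: the Base64 string is roughly a third longer than the input.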

strcmp or _tcscmp in UNICODE

Hi. For comparing strings in UNICODE versions, is it advisable to use strcmp or _tcscmp? Thanks in advance ...

Visual C++ Automatically Appending A or W to end of Function

In C++ I have defined a class that has this as a member: static const std::basic_string<TCHAR> MyClass_; There is also a getter function for this value: LPCTSTR CClass::GetMyClassName() { return MyClass_.c_str(); } When I create an instance of this class and then try to access it, IntelliSense pops up but the name has been change...

Where can I get started with Unicode-friendly programming in C?

So, I’m working on a plain-C (ANSI 9899:1999) project, and am trying to figure out where to get started re: Unicode, UTF-8, and all that jazz. Specifically, it’s a language interpreter project, and I have two primary places where I’ll need to handle Unicode: reading in source files (the language ostensibly supports Unicode identifiers a...

Is it possible to reliably auto-decode user files to Unicode? [C#]

I have a web application that allows users to upload their content for processing. The processing engine expects UTF8 (and I'm composing XML from multiple users' files), so I need to ensure that I can properly decode the uploaded files. Since I'd be surprised if any of my users knew their files even were encoded, I have very little hop...

like Query using plsql database for unicode string.

This gives the correct result select name_en,address_new,ddress2_new,unique_id from mumbaipropertydetails where mumbaipropertydetails."zone"='\\u092E\\u0928\\u092A\\u093E \\u092D\\u0935\\u0928'; but if I use like it does not select name_en,address_new,ddress2_new,unique_id from mumbaipropertydetails where mumbaipropertydetails....

Python unicode question

I have a list that I need to send through a URL to a third party vendor. I don't know what language they are using. The list prints out like this: [u'1', u'6', u'5'] I know that the u encodes the string in utf-8 right? So a couple of questions. Can I send a list through a URL? Will the u's show up on the other end when going throug...
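On the points raised above: the `u` prefix is only how Python 2 prints unicode string literals. It is not part of the data and does not mean the string is UTF-8 encoded, so it would not appear on the receiving end as long as the values, not the printed list, are sent. One common way to pass a list in a URL is to repeat a query parameter, as in this Python 3 sketch. The parameter name `ids` is an assumption; the vendor's API would dictate the real one.

```python
from urllib.parse import parse_qs, urlencode

values = ["1", "6", "5"]

# doseq=True emits one key=value pair per list element.
query = urlencode({"ids": values}, doseq=True)
# query is now "ids=1&ids=6&ids=5"

# A receiver using the same convention gets the list back.
decoded = parse_qs(query)["ids"]
```

Whether the other side expects repeated parameters, a comma-separated value, or something else depends on the vendor, so their documentation is the deciding factor.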