unicode

Unicode and std::string in C++

If I write a random string consisting of some Unicode characters to a file in C++, my text editor tells me that I have not created a valid UTF-8 file. // Code example const std::string charset = "abcdefgàèíüŷÀ"; file << random_string(charset); // using std::fstream What can I do to solve this? Do I have to do lots of additional manu...

How to portably write std::wstring to file?

I have a wstring declared as such: // random wstring std::wstring str = L"abcàdëefŸg€hhhhhhhµa"; The literal would be UTF-8 encoded, because my source file is. [EDIT: According to Mark Ransom this is not necessarily the case, the compiler will decide what encoding to use - let us instead assume that I read this string from a file enc...

Are BSTRs UTF-16 encoded?

I'm in the process of trying to learn Unicode. For me the most difficult part is the encoding. Can BSTRs (Basic Strings) contain code points U+10000 or higher? If not, what is the encoding for BSTRs? ...

Convert an escaped Unicode string to its characters in Ruby 1.8

I have to read some text files with the following content: \u201CThe Pedlar Lady of Gushing Cross\u201D In a Ruby 1.9 terminal, when I create a string with this content: ruby-1.9.1-p378 > "\u2714 \u2714 my great string \u2714 \u2714" => "✔ ✔ my great string ✔ ✔" In Ruby 1.8, I don't get the Unicode codes converted to their character...

Checking Unicode string for whitespace - byte for byte!

Quick & dirty Q: Can I safely assume that a byte of a UTF-8, UTF-16 or UTF-32 codepoint (character) will not be an ASCII whitespace character (unless the codepoint is representing one)? I'll explain: Say that I have a UTF-8 encoded string. This string contains some characters that take more than one byte to store. I need to find out ...
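One way to probe this assumption is to encode a sample character and inspect the raw bytes. The sketch below is in Python, chosen only for brevity. The helper name `ascii_whitespace_bytes` and the sample character U+0920 are illustrative choices, not part of the question. It suggests the assumption holds for UTF-8, where every byte of a multi-byte sequence is 0x80 or higher, but not for UTF-16 or UTF-32.

```python
# Sketch: which ASCII whitespace bytes show up in each encoding of a character?
ASCII_WHITESPACE = b" \t\n\r\x0b\x0c"

def ascii_whitespace_bytes(text, encoding):
    """Return the encoded bytes of `text` that equal an ASCII whitespace byte."""
    return bytes(b for b in text.encode(encoding) if b in ASCII_WHITESPACE)

sample = "\u0920"  # DEVANAGARI LETTER TTHA, not a whitespace character

utf8_hits = ascii_whitespace_bytes(sample, "utf-8")       # encoded as E0 A4 A0
utf16_hits = ascii_whitespace_bytes(sample, "utf-16-le")  # encoded as 20 09
utf32_hits = ascii_whitespace_bytes(sample, "utf-32-le")  # encoded as 20 09 00 00

print(utf8_hits, utf16_hits, utf32_hits)
```

For UTF-16 and UTF-32 this means a scan has to work on whole code units rather than single bytes, since one non-whitespace character can contain both a space byte and a tab byte.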

How to gsub a Unicode 0083 with Ruby?

Hi guys, I have loaded a string from an HTML file and written it to a YAML file with the ya2yaml plugin: - title: 'What a wonderful day!' body: ... # main contents here I will then load the .yml file with the YAML::parse_file method. But "\n" in the string causes load problems, so I tried to gsub all "\n" to "", but the...

PHP: How to get the Unicode character from the string "U4e9c"?

This doesn't work (it just echoes "U4e9c"): echo mb_convert_encoding("U4e9c","UTF-8","auto"); I guess some sort of casting of "U4e9c" is needed, but I can't figure out how... ...
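The string "U4e9c" looks like a hexadecimal code point with a "U" prefix, so the general idea is to strip the prefix, parse the rest as base 16, and turn that number into a character. Here is a minimal sketch of that idea in Python rather than PHP. The function name `char_from_u_notation` is hypothetical, and the sketch assumes the input is always "U" followed by hex digits.

```python
def char_from_u_notation(token):
    """Turn a token such as 'U4e9c' into the character with that code point."""
    if not token.upper().startswith("U"):
        raise ValueError("expected a token like 'U4e9c'")
    code_point = int(token[1:], 16)  # parse the hex digits after the 'U'
    return chr(code_point)

ch = char_from_u_notation("U4e9c")
print(ch, ch.encode("utf-8"))
```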

SPARQL query on an OWL file

Hi everybody, could I ask you about a SPARQL query on an ontology? I have a family.owl file, an ontology built with Protégé 3.4, containing the data: Lan haschild Tuấn, Tùng haschild Tuấn. I use Java and the CORESE API (http://www-sop.inria.fr/edelweiss/software/corese/v2_4_0/manual/index.php#coreseapi ) to query the family.owl file above. W...

How to deal with character encoding in Obj-C?

Hi, I'm new to Obj-C (my experience is in Java and a little C). My current project is Arabic text encryption. I need to read an Arabic text file character by character, but when I try to store these characters in variables of type char, I can't: it gives me this warning "Multi-character cha...

How can I display a tux character in a shell script?

I realize this is very much of a long shot, but... In shell scripts on Macs I can display an Apple character. Is there any way to display a Tux character (or anything else associated with Linux) on Linux systems? The simplest solution would be if there's something in the Unicode set that symbolizes Linux, whether a Tux or something el...

Encapsulating CTRL-SHIFT-A-D in a java String

Hi, I use an SSH library with Java to connect to a server. I want to detach a GNU Screen session with CTRL-SHIFT-A-D, which is why I need to send this sequence to the server. Can someone tell me how I can write this in a Java String? I looked through the Unicode and ASCII tables but I couldn't find a hint. Sincerely, Heinrich ...