utf-8

Can I print UTF-8 encoded files from Linux command-line?

enscript doesn't support utf-8 and the only other suggestion I've seen is to use lpr: lpr -o document-format=text/utf8 file_to_print but that gives an "Unsupported format" error. (Ubuntu 9.04 / GNOME Terminal 2.26.0) ...

php import utf-8 txt file to latin1 database

I have a UTF-8 encoded txt file and I want to import it into a latin1_general_ci table. The problem is that some characters display as ? in the database and not as they are supposed to. I tried mb_convert_encoding($str, "ISO-8859-1", "UTF-8"); but that didn't do anything. What am I doing wrong? ...

PHP: Convert curl_exec output to UTF8

I would like to only work with UTF8. The problem is I don't know the charset of every webpage. How can I detect it and convert to UTF8? <?php $url = "http://vkontakte.ru"; $ch = curl_init($url); $options = array( CURLOPT_RETURNTRANSFER => true, ); curl_setopt_array($ch, $options); $data = curl_exec($ch); // $data = magic($data); p...

Why do I get an extra newline in the middle of a UTF-8 character with XML::Parser?

I encountered a problem dealing with UTF-8, XML and Perl. The following is the smallest piece of code and data in order to reproduce the problem. Here's an XML file that needs to be parsed: <?xml version="1.0" encoding="utf-8"?> <test> <words>בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת</words> <words>בְּרֵאשִׁ֖ית בָּרָ֣...

Invalid byte 1 of 1-byte UTF-8 sequence

I have a MyFaces Facelets application, where the page coding is a bit rugged. Anyway, it's developed with Eclipse and built with Ant, and kind of runs OK in Tomcat 2.0.26. So far so good. Now, I'd rather build with Maven, so I made a couple of pom files, opened them in NetBeans and built, and now I have a war file that deploys OK. Howeve...

Character encoding issues when generating MD5 hash cross-platform

This is a general question about character encoding when using MD5 libraries in various languages. My concern is: suppose I generate an MD5 hash using a native Python string object, like this: message = "hello world" m = md5() m.update(message) Then I take a hex version of that MD5 hash using: m.hexdigest() and send the message & M...
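MD5 operates on bytes, not characters, so one way to keep digests comparable across languages is to fix the encoding explicitly before hashing. The following is a minimal Python 3 sketch, assuming both sides agree on UTF-8 (the question does not say which encoding the receiving platform uses); the variable names are illustrative only.

```python
import hashlib

# Encode explicitly so the hashed bytes do not depend on platform defaults.
message = "hello world"
digest = hashlib.md5(message.encode("utf-8")).hexdigest()
print(digest)

# For non-ASCII text the chosen encoding changes the bytes, and so the digest.
word = "caf\u00e9"  # "café", written with an escape to avoid source-encoding issues
utf8_digest = hashlib.md5(word.encode("utf-8")).hexdigest()
latin1_digest = hashlib.md5(word.encode("iso-8859-1")).hexdigest()
print(utf8_digest != latin1_digest)
```

For pure ASCII input such as "hello world", UTF-8 and ISO-8859-1 produce identical bytes, so a mismatch only shows up once the message contains characters outside ASCII.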

"incomplete universal character name" with stringWithUTF8String

Hi, when I try to convert from a UTF-8 string to NSString like so: NSString *s = [NSString stringWithUTF8String:"\U0627\U0644\U0641\U0631\U0646"]; NSLog(@"%@", s); I get the compile error: incomplete universal character name. Note that it sometimes just works fine: NSString *UAE = [NSString stringWithUTF8String:"\U0627\U0644\U0641\U0...

Char C question about encoding signed/unsigned.

Hi guys. I read that C does not define whether a char is signed or unsigned, and the GCC page says that it can be signed on x86 and unsigned on PowerPC and ARM. Okay, I'm writing a program with GLib, which defines char as gchar (nothing more than that, only a way of standardizing). My question is, what about UTF-8? It uses more than one block of m...

Contents of a node in Nokogiri

Is there a way to select all the contents of a node in Nokogiri? <root> <element>this is <hi>the content</hi> of my æøå element</element> </root> The result of getting the content of /root/element should be this is <hi>the content</hi> of my æøå element Edit: It seems like the solution is simply to use myElement.inner_html(). Th...

Save text file UTF-8 encoded with VBA

Hello, how can I write UTF-8 encoded strings to a text file from VBA, like Dim fnum As Integer fnum = FreeFile Open "myfile.txt" For Output As fnum Print #fnum, "special characters: äöüß" 'latin-1 or something by default Close fnum Is there some setting on Application level? ...

Any tool to convert bulk php files to UTF-8 without BOM?

Hi, I have a very large script which contains a lot of PHP files, so I need some Windows tool or software which converts all those files to UTF-8 without BOM. I know this can be done with Notepad++, but there you have to convert each file one by one. Thanks ...

Does Perl's Net::Cassandra module support UTF-8?

I've run into a really strange UTF-8 problem with Net::Cassandra::Easy (which is built upon Net::Cassandra): UTF-8 strings written to Cassandra are garbled upon retrieval. The following code shows the problem: use strict; use utf8; use warnings; use Net::Cassandra::Easy; binmode(STDOUT, ":utf8"); my $key = "some_key"; my $column = "s...

Confused about C++'s std::wstring, UTF-16, UTF-8 and displaying strings in a windows GUI

I'm working on an English-only C++ program for Windows where we were told "always use std::wstring", but it seems like nobody on the team really has much of an understanding beyond that. I already read the question titled "std::wstring VS std::string". It was very helpful, but I still don't quite understand how to apply all of that infor...

I don't know how or where to add the correct encoding code to this iPhone code...

Ok, I understand that using strings that have special characters is an encoding issue. However I am not sure how to adjust my code to allow these characters. Below is the code that works great for text that contains no special characters, but can you show me how and where to change the code to allow for the special characters to be used....

Python: UTF-8 problems (again...)

I have a database which is synchronized against an external web source twice a day. This web source contains a bunch of entries, which have names and some extra information about these names. Some of these names are silly and I want to rename them when inserting them into my own database. To rename these silly names, I have a standard d...

What is WordPress doing for content encoding in its MySQL database?

For some convoluted reasons best left behind us, I require direct access to the contents of a WordPress database. I'm using MySQL 5.0.70-r1 on Gentoo with WordPress 2.6, and Perl 5.8.8 ftr. So, sometimes we get high-order characters in the blog, we have quite a few authors contributing too, for the most part these characters end up in Wor...

PHP: Cyrillic characters not displayed correctly

Recently I switched hosting from one provider to the other and I have problems displaying Cyrillic characters. The characters which are read from the database are displayed correctly, but characters which are hardcoded in the php file aren't (they are displayed as question marks). The files which contain the php source code are saved in...

Is the PHP serialize function compatible with UTF-8?

I have a site I want to migrate from ISO to UTF-8. I have a record in database indexed by the following primary key : s:22:"Informations générales"; The problem is, now (with UTF-8), when I serialize the string, I get : s:24:"Informations générales"; (notice the size of the string is now the number of bytes, not string length) So...
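The 22 versus 24 in the two serialized forms can be reproduced by comparing the character count of the key with its encoded byte length. The sketch below uses Python rather than PHP and assumes the old data was stored as ISO-8859-1 (the question only says "ISO"); the variable names are illustrative.

```python
# The key from the question, with "é" written as an escape to avoid
# any dependence on the encoding of this source file.
text = "Informations g\u00e9n\u00e9rales"

char_count = len(text)                         # number of characters
latin1_bytes = len(text.encode("iso-8859-1"))  # one byte per character
utf8_bytes = len(text.encode("utf-8"))         # each "é" takes two bytes

print(char_count, latin1_bytes, utf8_bytes)    # prints: 22 22 24
```

The two accented letters each need one extra byte in UTF-8, which accounts for the difference of two in the length prefix.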

Convert a MySQL database from latin to UTF-8

I am converting a website from ISO to UTF-8, so I need to convert the MySQL database too. On the Internet, I read various solutions, and I don't know which one to choose. Do I really need to convert my varchar columns to binary, then to UTF-8 like that: ALTER TABLE t MODIFY col BINARY(150); ALTER TABLE t MODIFY col CHAR(150) CHARACTER SET ...

SQL Server 2005 Fail: Return Dates As Strings

Hello all, I am using the SQL Server PHP Driver, I think this question can be answered without knowing what this is. I have come across this many times, what does it mean by NAMES? Column names?: SET NAMES utf8 Is there a query similar to the above that will get my dates to be returned as a string? For some reason on my SQL Server 20...