utf-8

decode base64 string as UTF-8

Hi all! I am using the base64 implementation at the bottom of this post. If I use the following code: NSLog(@"decoded:%@",[[[NSString alloc] initWithData:[Base64 decode:@"8fEmIzEyNDA3OyYjMTI0MTE7"] encoding:NSUTF8StringEncoding] autorelease]); I get decoded:(null). However, if I use: NSLog(@"decoded 1:%@",[[[NSString alloc] initWithDat...

How to support UTF-8 (Japanese, Arabic, Spanish, ...) URLs in PHP

For a web application, we need to link to some user-generated content. A user types in a title for e.g. a product and we generate an SEO-friendly URL for that product, like this: title: a nice product www.user.com/product/a-nice-product title: أبجد هوز www.user.com/product/أبجد هوز The problem is that those foreign-language URLs ...

UTF-8 & Unicode, what's with 0xC0 and 0x80?

I've been reading about Unicode and UTF-8 in the last couple of days and I often come across a bitwise comparison similar to this : int strlen_utf8(char *s) { int i = 0, j = 0; while (s[i]) { if ((s[i] & 0xc0) != 0x80) j++; i++; } return j; } Can someone clarify the comparison with 0xc0 and checking if it's the mos...
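As a rough illustration of the comparison in the excerpt above, here is a minimal Python sketch of the same counting idea. The function name `utf8_length` is made up for this example. It assumes the input is already valid UTF-8 and does no validation.

```python
def utf8_length(data: bytes) -> int:
    """Count code points in UTF-8 bytes by skipping continuation bytes."""
    count = 0
    for byte in data:
        # Continuation bytes look like 10xxxxxx. Masking with 0xC0 (11000000)
        # keeps the top two bits, which equal 0x80 (10000000) only for them.
        if (byte & 0xC0) != 0x80:
            count += 1
    return count


sample = "ñandú".encode("utf-8")
print(len(sample), utf8_length(sample))
```

Every byte that is not a continuation byte starts a new character, so counting those bytes gives the number of code points rather than the number of bytes.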

Handling UTF8 strings in C# web service

Hi, I created a simple web service client using the C# tool wsdl.exe. It works fine except for one thing. It seems that UTF8 strings returned in the response are converted to ASCII. Using SOAPUI I can see normal UTF8 encoded strings being returned by the web service. But when I debug the response I received, the UTF8 content seems to have al...

How to make this string manipulation function UTF-8 compatible in PHP?

Good day, I had trouble finding a function that does exactly what I am looking for. Unfortunately, this function isn't UTF-8 compatible. This function is like a basic ucwords, but it also uppercases a character that follows one of the given characters (in my case I need to apply an uppercase on the character found after a -...

Rails 3 invalid multibyte char (US-ASCII)

I found a similar post here but I can't solve the problem anyway. I got this /home/fra/siti/Pensiero/db/seeds.rb:32: invalid multibyte char (US-ASCII) /home/fra/siti/Pensiero/db/seeds.rb:32: invalid multibyte char (US-ASCII) /home/fra/siti/Pensiero/db/seeds.rb:32: syntax error, unexpected $end, expecting ')' ... ed il valore della vita,...

Where can I find a UTF-8 bits-to-char table to convert, for instance, "Ã±" into "ñ"?

Hello. I have been looking thoroughly through the Web and I cannot seem to find a table with that kind of conversion. The ones I find have some mistakes and are not too reliable, so I have looked for some official table or the like, but unfortunately I haven't found one, so here I am. As mentioned in the title, what I want to do is for instance...

charset UTF-8 and character entities

I am proposing to convert my Windows-1252 XHTML web pages to UTF-8. I have the following character entities in my code (all preceded by &#): 39; - apostrophe 9658; - a right pointer 9668; - a left pointer If I change the charset and save the pages as UTF-8 using my editor: - the apostrophe remains in as a character entity; - the p...

accented letters are not displayed correctly on the server, even if the encoding is correct

Hello! I wrote some HTML with the UTF-8 charset. In the head of the HTML there is also a <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> Everything works fine locally, but when I upload the files to the server, I see all my letters àèìòù etc. distorted. Does anybody know what the problem could be? Is it possible that the ...

Silverlight UTF8 encoder produces wacky output...

I've been trying to track down a bug for hours now and it has come down to this: Dim length As Integer = 300 Dim buffer() As Byte = binaryReader.ReadBytes(length) Dim text As String = System.Text.Encoding.UTF8.GetString(buffer, 0, buffer.Length) The problem is the buffer contains 300 bytes but the length of the string 'text' is now 28...

UTF-8 compatible compression in Python

I'd like to include a large compressed string in a json packet, but am having some difficulty. import json,bz2 myString = "A very large string" zString = bz2.compress(myString) json.dumps({ 'compressedData' : zString }) which will result in a UnicodeDecodeError: 'utf8' codec can't decode bytes in position 10-13: invalid data An...
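One common way around this kind of error is to wrap the compressed bytes in base64 before handing them to `json.dumps`, since compressed output is arbitrary binary and not valid UTF-8 text. The sketch below assumes Python 3. The helper names `pack` and `unpack` are made up for illustration and are not from the original post.

```python
import base64
import bz2
import json


def pack(text: str) -> str:
    """Compress text and embed it in a JSON packet as base64 ASCII."""
    compressed = bz2.compress(text.encode("utf-8"))
    # base64 turns arbitrary bytes into plain ASCII, which JSON can carry.
    encoded = base64.b64encode(compressed).decode("ascii")
    return json.dumps({"compressedData": encoded})


def unpack(packet: str) -> str:
    """Reverse of pack(): read the JSON, undo base64, then decompress."""
    encoded = json.loads(packet)["compressedData"]
    return bz2.decompress(base64.b64decode(encoded)).decode("utf-8")


packet = pack("A very large string")
print(unpack(packet))
```

Base64 adds roughly one third to the size of the compressed data, so this only pays off when the compression ratio is good enough to absorb that overhead.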

Python Unicode CSV export (using Django)

Hi All, I'm using a Django app to export a string to a CSV file. The string is a message that was submitted through a front end form. However, I've been getting this error when a unicode single quote is provided in the input. UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 200: ordinal not in rang...
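The error in the excerpt above comes from an implicit ASCII encode of a non-ASCII character (U+2019, the right single quotation mark). As a hedged sketch, assuming Python 3 and no Django-specific code, the CSV can be built as text and then encoded explicitly as UTF-8. The function name `rows_to_csv_bytes` is hypothetical.

```python
import csv
import io


def rows_to_csv_bytes(rows):
    """Write rows to an in-memory CSV and return UTF-8 encoded bytes."""
    # newline="" leaves the csv module's own "\r\n" line endings untouched.
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerows(rows)
    # Encode once, explicitly, instead of relying on a default codec.
    return buffer.getvalue().encode("utf-8")


data = rows_to_csv_bytes([["message"], ["It\u2019s fine"]])
print(data.decode("utf-8"))
```

The resulting bytes can then be written to a file opened in binary mode or returned as a response body with a `charset=utf-8` content type.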

PHP Greek character encoding

I have a URL string which I encode to UTF-8 on the client side. When the data is received on the server by my PHP script, I cannot see the Greek characters! Could you please help me convert them? The data will be saved in a MySQL database. ...

Convert from ß to s in C++

Is there a way in C++ to convert from ö to o, or ß to s, in general from UTF-8 to the corresponding ASCII character? ...
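To illustrate the general idea in the question above (shown in Python rather than C++), one approach is Unicode decomposition: split accented letters into a base letter plus combining marks, then drop the marks. Characters such as ß have no decomposition, so this sketch adds a small hand-written table. The `EXTRA` table and the name `to_ascii` are assumptions for this example, and mapping ß to "ss" rather than "s" is a choice, not a rule.

```python
import unicodedata

# Hypothetical fallback table for characters with no Unicode decomposition.
EXTRA = {"ß": "ss", "ø": "o", "æ": "ae"}


def to_ascii(text: str) -> str:
    """Best-effort ASCII approximation of text; unmapped characters are dropped."""
    replaced = "".join(EXTRA.get(ch, ch) for ch in text)
    # NFKD splits e.g. "ö" into "o" followed by a combining diaeresis.
    decomposed = unicodedata.normalize("NFKD", replaced)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.encode("ascii", "ignore").decode("ascii")


print(to_ascii("Straße"), to_ascii("ö"))
```

This is lossy by design: anything that neither decomposes nor appears in the table is silently removed.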

PHP: Replacing funny characters in PHP

Hi all, I'm trying to replace the string "Red Dwarf (TV Series 1988â€") - IMDb" with "Red Dwarf (TV Series 1988') - IMDb". I have a translation table of these funny characters in an array. I tried to replace them using str_replace, but it did not work. Can anybody suggest a workaround for this? This is a snippet of the code: function repla...

Unicode support in Android NDK

I have a large C/C++ library that I need to use as part of an Android NDK project. This library needs to be able to intelligently process UTF8 strings (for example, conversion to lowercase/uppercase). The library has conditional compilation to punt to an OS API to do the conversion, but there don't seem to be any Android APIs for UTF8....

XMLReader -- Getting problems with UTF characters

Hi, I am parsing a huge XML file, and the file's encoding is declared as <?xml version="1.0" encoding="ISO-8859-1"?> The db encoding is utf8 and I am running this query before anything is saved to the db: $sql='SET NAMES "utf8" COLLATE "utf8_swedish_ci"'; The problem is that sometimes some non-standard characters come in the ...

Create an NSString from the \uXXXX representation

Hi there. I need to display currency signs (€, $, £) inside my UI. Those signs are stored inside a SQLite database with their \uXXXX representations. How can I create their NSString representation? Here is a sample of code: NSString *currency = [[OptionDAO sharedInstance] readStringOption:@"TEST" ...

Characters with accents keep appearing as "�"

I'm using a simple PHP script to scour an RSS feed, store the scoured data in a temporary flat-file cache, then display it along the side of my website. However, all the characters with accents appear as "�". What is causing this and how can I fix it? Thank you! ...

Strings and character encoding in C++

I read a few posts about best practices for strings and character encoding in C++, but I am struggling a bit with finding a general purpose approach that seems to me reasonably simple and correct. Could I ask for comments on the following? I'm inclined to use UTF-8 and UTF-32, and to define something like: typedef std::string string8;...