utf-8

Broken accent characters when Copy / Paste into ASP .Net

I am copy pasting from an MS word document into an ASCX file. When I view the ascx file, the accented characters appear normally. BUT, when the page is rendered through my ASP.net application, the accented characters are broken: Une promenade dans un verger ensoleillé, un peau de pêche délicatement parfumée… Les plaisirs du pr...

How can I download a utf-8-encoded web page with libcurl, preserving the encoding?

I'm trying to get libcurl to download a webpage that is encoded in UTF-8, which is working fine, except for the fact that it converts it to ASCII and screws up some of the characters. Is there an easy way to get it to keep it in UTF-8? ...

How do I convert between ISO-8859-1 and UTF-8 in Java?

Does anyone know how to convert a string from ISO-8859-1 to UTF-8 and back in Java? I'm getting a string from the web and saving it in the RMS (J2ME), but I want to preserve the special chars and get the string from the RMS but with the ISO-8859-1 encoding. How do I do this? ...

Converting UTF-8 to ISO-8859-1 in Java - how to keep it as single byte

Hi, I am trying to convert a string encoded in Java in UTF-8 to ISO-8859-1. Say, for example, in the string 'âabcd', 'â' is represented in ISO-8859-1 as E2. In UTF-8 it is represented as two bytes: C3 A2, I believe. When I do a getBytes(encoding) and then create a new string with the bytes in ISO-8859-1 encoding, I get a two different char...
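The byte values mentioned in this excerpt can be checked with a short sketch. It uses Python rather than the asker's Java, purely for illustration, and the variable names are hypothetical. It shows why decoding UTF-8 bytes as ISO-8859-1 turns 'â' into two characters, and how decoding with the correct charset first keeps it as a single byte.

```python
# Illustrative sketch only; the sample string and byte values come from the question text.
text = "\u00e2abcd"  # 'âabcd'

latin1_bytes = text.encode("iso-8859-1")  # one byte per character
utf8_bytes = text.encode("utf-8")         # 'â' becomes two bytes

print(latin1_bytes.hex(" "))  # e2 61 62 63 64
print(utf8_bytes.hex(" "))    # c3 a2 61 62 63 64

# Decoding UTF-8 bytes with the wrong charset yields two characters for 'â'.
mojibake = utf8_bytes.decode("iso-8859-1")
print(len(text), len(mojibake))  # 5 6

# Transcoding correctly: decode with the real encoding, then re-encode.
transcoded = utf8_bytes.decode("utf-8").encode("iso-8859-1")
print(transcoded == latin1_bytes)  # True
```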

C#: Cycle through encodings

I am reading files in various formats and languages and I am currently using a small encoding library to attempt to detect the proper encoding (http://www.codeproject.com/KB/recipes/DetectEncoding.aspx). It's pretty good, but it still misses occasionally. (Multilingual files) Most of my potential users have very little understandi...

PHP Multibyte String Functions

Today I ran into a problem with the php function strpos(), because it returned FALSE even if the correct result was obviously 0. This was because one parameter was encoded in UTF-8, but the other (origin is a HTTP GET parameter) obviously not. Now I have noticed that using the mb_strpos function solved my problem. My question is now: I...

How do I add UTF-8 support, and an associated font-table, to an embedded project?

Hello, I am currently designing a font engine for an embedded display. The basic problem is the following: I need to take a dynamically generated text string, look up the values from that string in a UTF-8 table, then use the table to point to the compressed bitmap array of all the supported characters. After that is complete, I call...

Using norwegian letters æøå in python

Hello, I'm learning Python and PyGTK now, and have created a simple Music Organizer. http://pastebin.com/m2b596852 But when it edits songs with the Norwegian letters æ, ø, and å, it just changes them to a weird character. So is there any good way of opening or encoding the names into utf-8 characters? Two relevant places from the above ...

Are XHTML entity encodings valid in XML documents as long as they're contained inside CDATA tags?

Is this a valid (well-formed) XML document? <?xml version="1.0" encoding="UTF-8" ?> <outer> <inner>&copy;</inner> </outer> At issue is whether the HTML/XHTML "&copy;" entity encoding is valid in an XML document where there is no DTD or schema to define it. An alternative way of expressing the above would be to say this: <?xml version="1...
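One way to probe this question is to feed the variants to a non-validating XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `inner_text` is hypothetical, and the result reflects how that one parser behaves rather than a ruling on the specification.

```python
import xml.etree.ElementTree as ET

PROLOG = b'<?xml version="1.0" encoding="UTF-8" ?>'


def inner_text(inner_content: bytes):
    """Return the text of <inner>, or None if the parser rejects the document."""
    doc = PROLOG + b"<outer><inner>" + inner_content + b"</inner></outer>"
    try:
        return ET.fromstring(doc).find("inner").text
    except ET.ParseError:
        return None


# Named HTML entity with no DTD to define it: rejected by this parser.
print(inner_text(b"&copy;"))
# Numeric character reference: always available in XML.
print(inner_text(b"&#169;"))
# Inside CDATA the ampersand is not markup, so the text stays literal.
print(inner_text(b"<![CDATA[&copy;]]>"))
```

With this parser, wrapping the entity in a CDATA section makes the document parse, but the entity is not expanded: the element text is the six literal characters `&copy;`.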

Convert unicode representations on incoming string to UTF-8?

Hi, I'm reading some data that has already been converted to html style &#965; code. I now need to convert this back to UTF-8 characters for viewing. Unfortunately I can't use a browser to view the string. I've read around about conversion in java and it seems if you have a string of \uxxxx then the compiler will convert for you; Howeve...
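A minimal sketch of the conversion being asked about, assuming the incoming data uses HTML numeric character references such as `&#965;`. The question concerns Java; Python is used here only for illustration, and the sample input is hypothetical.

```python
import html

# Hypothetical sample input: two HTML numeric character references.
incoming = "&#965;&#960;"

# html.unescape turns character references back into the characters themselves.
decoded = html.unescape(incoming)

# Encoding the resulting string gives the UTF-8 bytes to store or display.
utf8_bytes = decoded.encode("utf-8")

print(decoded)              # υπ
print(utf8_bytes.hex(" "))  # cf 85 cf 80
```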

Is there a way to disable the printing of the additional header and footer information in Firefox?

We were using htmldoc but unfortunately it does not support UTF-8. I tried using the Mozilla Firefox command-line printPDF extension but it placed the URL on the upper right of every page of the PDF which unfortunately isn't acceptable because these files are client-facing. I've also heard of Prince but it simply costs too much. Is th...

How to create a UTF-8 string literal in Visual C++ 2008

In VC++ 2003, I could just save the source file as UTF-8 and all strings were used as is. In other words, the following code would print the strings as is to the console. If the source file was saved as UTF-8 then the output would be UTF-8. printf("Chinese (Traditional)"); printf("中国語 (繁体)"); printf("중국어 (번체)"); printf("Chinês (Tradicio...

Unicode, UTF, ASCII, ANSI format differences

What is the difference between the Unicode, UTF8, UTF7, UTF16, UTF32, ASCII, and ANSI encoding formats in ASP.net? In what ways are these helpful for programmers? ...
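One concrete way to see the difference is to encode the same character under each scheme and compare the byte counts. This is a rough sketch in Python rather than ASP.net, and it assumes "ANSI" means the Windows-1252 code page, which is only one of several possible meanings of that term.

```python
sample = "\u00e9"  # 'é', a character outside the ASCII range

# Assumption: "ANSI" is taken to mean Windows-1252 (cp1252).
codecs_to_try = ("utf-8", "utf-16-le", "utf-32-le", "cp1252")

# Map each encoding name to the number of bytes it needs for the sample.
sizes = {name: len(sample.encode(name)) for name in codecs_to_try}
print(sizes)

# ASCII covers only code points 0-127, so it cannot represent 'é' at all.
try:
    sample.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)  # False
```

Unicode itself is the character set (the numbering of characters); the UTF variants and the single-byte code pages are different ways of turning those numbers into bytes.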

UTF-8 characters mangled in HTTP Basic Auth username

I'm trying to build a web service using Ruby on Rails. Users authenticate themselves via HTTP Basic Auth. I want to allow any valid UTF-8 characters in usernames and passwords. The problem is that the browser is mangling characters in the Basic Auth credentials before it sends them to my service. For testing, I'm using 'カタカナカタカナカタカナカ...

Change text encoding for multiple files at once in Eclipse

I have some UTF-8 HTML templates in my Eclipse project and Eclipse keeps treating them as if they had a different encoding. It says the encoding is "determined from content". I want to force the correct encoding. I can force it for a single file but setting an encoding for the parent folder won't affect the files in it because instead o...

Can a PHP file name (or a dir in its full path) have UTF-8 characters?

I would like to access a PHP file whose name has UTF-8 characters in it. The file does not have a BOM in it. It just contains an echo statement that displays a few unicode characters. Accessing the PHP page from the browser (FireFox 3.0.8, IE7) results in HTTP error 500. There are two entries in the Apache log (file is /க.php; the let...

How can a text file be converted from ANSI to UTF-8 with Delphi 7?

I have written a program with Delphi 7 that searches for *.srt files on a hard drive. This program lists the path and name of these files in a memo. Now I need to convert these files from ANSI to UTF-8, but I haven't succeeded. Please help me... ...
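The conversion itself comes down to decoding the bytes with the source code page and re-encoding them as UTF-8. The sketch below shows that idea in Python, not Delphi 7; the function names are hypothetical, and it assumes "ANSI" means Windows-1252, which depends on the system locale that produced the files.

```python
from pathlib import Path


def convert_to_utf8(path, source_encoding="cp1252"):
    """Rewrite one file in place as UTF-8 (assumes it is in source_encoding)."""
    file_path = Path(path)
    text = file_path.read_bytes().decode(source_encoding)
    file_path.write_bytes(text.encode("utf-8"))


def convert_tree(root, pattern="*.srt", source_encoding="cp1252"):
    """Convert every matching file under root and return the converted paths."""
    converted = []
    for file_path in Path(root).rglob(pattern):
        convert_to_utf8(file_path, source_encoding)
        converted.append(file_path)
    return converted
```

Running the conversion twice on the same file would double-encode it, so a real tool would want to track which files have already been converted.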

UTF-8 and Servlets on Tomcat/Linux

I've had some problems with reading and writing UTF-8 from servlets on Tomcat 6 / Linux. The request and response were utf-8, the browser was utf-8, and URIEncoding was set in server.xml on both connectors and hosts. In short, everything I knew of, in the code itself and in the server configuration, was utf-8. When reading request, I've had to take byte ...

Overcome Encoding Problems with PHP, SoapServer, UTF-8, and non English Characters?

I'm having problems getting PHP to play nicely with SoapServer + UTF-8. Anytime anyone sends a Soap Request with non-English characters (i.e. funny quotes, accented characters, etc.) the SoapServer throws an exception saying "Bad Request." I've tried decoding the request with utf8_decode and even HTML Special Characters encoded the text. ...

Delphi 2009: Search skipping diacritics in unicode utf-8

I have a UTF-8 encoded file containing Arabic text and I have to search it. My problem is diacritics: how do I search while skipping them? For example, if you load that text in Internet Explorer (converting the text to HTML, of course), IE skips those diacritics. Any help? Edit1: The search is simply performed by the following code: var m1 : TMemo; /...
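A common approach to diacritic-insensitive search is to remove combining marks from both the text and the search term before comparing. The sketch below illustrates this in Python, not Delphi 2009; the function names are hypothetical, and it is not a description of how Internet Explorer does it.

```python
import unicodedata


def strip_marks(text):
    """Remove combining marks (e.g. Arabic harakat, Latin accents) from text."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining returns a non-zero class for combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def find_ignoring_marks(haystack, needle):
    """Return the index of needle in the mark-stripped haystack, or -1."""
    return strip_marks(haystack).find(strip_marks(needle))


# Vocalized Arabic word written with escapes so the marks are explicit:
# meem + damma, hah + fatha, meem + shadda + fatha, dal.
vocalized = "\u0645\u064f\u062d\u064e\u0645\u0651\u064e\u062f"
plain = "\u0645\u062d\u0645\u062f"

print(strip_marks(vocalized) == plain)                    # True
print(find_ignoring_marks(vocalized, "\u062d\u0645\u062f"))  # 1
```

The returned index refers to the stripped text, so mapping a hit back to a position in the original string would need an extra offset table.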