character-encoding

.NET Weird character encoding issue

Our globalization mechanism stores error messages in a SQL 2005 DB. Some of the error messages are used as subjects on email messages sent to the development team. Recently, with no clear reason, we started receiving emails with strangely encoded subjects, such as: =?utf-8?B?Qm1mQm92ZXNwYS5Qb3NUcmFkaW5nRXNwZWNpZmljYWNhbyAtIFN1Y2Vzc...
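The subject quoted above has the shape of an RFC 2047 encoded-word (`=?charset?B?...?=`), which is how mail headers carry non-ASCII text. A minimal Python sketch of the round trip follows. The subject string is made up, since the one in the question is truncated, and Python's `email` package may pick either the `B` (Base64) or `Q` (quoted-printable) form.

```python
from email.header import Header, decode_header, make_header

# Hypothetical subject with non-ASCII characters; the real one is cut off.
subject = "Especificação - Sucesso"

# Encoding as UTF-8 yields an RFC 2047 encoded-word: =?utf-8?<b|q>?...?=
encoded = Header(subject, "utf-8").encode()

# A mail client that understands RFC 2047 reverses it for display.
decoded = str(make_header(decode_header(encoded)))

print(encoded)
print(decoded)
```

If a subject shows up in this raw form, one possible reading is that the receiving side is not decoding the header, rather than that the sender corrupted it.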

UTF-8 BOM signature in PHP files

I was writing some commented PHP classes and I stumbled upon a problem. My name (for the @author tag) ends up with a ș (which is a UTF-8 character, ...and a strange name, I know). Even though I save the file as UTF-8, some friends reported that they see that character totally messed up (È™). This problem goes away by adding the BOM sign...
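The garbled È™ is what the two UTF-8 bytes of ș look like when read as Windows-1252. A small Python sketch of that misreading, assuming the viewer fell back to Windows-1252 (the question does not say which encoding the friends' editors used):

```python
name_char = "\u0219"  # ș, LATIN SMALL LETTER S WITH COMMA BELOW

# UTF-8 stores this character as two bytes: C8 99.
utf8_bytes = name_char.encode("utf-8")

# Reading those bytes as Windows-1252 (assumed) yields two characters.
misread = utf8_bytes.decode("cp1252")

# The UTF-8 BOM is the three bytes EF BB BF at the start of a file.
bom = "\ufeff".encode("utf-8")

print(utf8_bytes.hex(), misread, bom.hex())
```

The BOM works as a hint: an editor that sees those three bytes first can treat the file as UTF-8 instead of guessing.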

I get funny characters when reading multipart and text messages using Zend Mail?

Hi guys, I've switched to using the Zend Framework for reading messages from an inbox. However, when reading some HTML messages I see a lot of weird characters: don’t looks like don=92t, plus other weird characters like =20. What's going on? Is it an encoding issue? How do I fix it? ...
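Sequences such as =92 and =20 look like quoted-printable, a MIME content transfer encoding in which `=XX` stands for the byte with hex value XX. A minimal Python sketch of decoding it, assuming the message body is in Windows-1252 (where byte 0x92 is the curly apostrophe); the sample bytes are made up:

```python
import quopri

# Hypothetical fragment of a quoted-printable message body.
raw = b"don=92t=20worry"

# Step 1: undo the transfer encoding (=92 -> byte 0x92, =20 -> space).
decoded_bytes = quopri.decodestring(raw)

# Step 2: interpret the bytes using the charset of the MIME part
# (assumed here to be Windows-1252).
text = decoded_bytes.decode("cp1252")

print(text)
```

In a real message, the transfer encoding and charset come from the part's Content-Transfer-Encoding and Content-Type headers rather than being assumed.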

How can I get Velocity to output a greater than / less than without escaping it?

I'm trying to get Velocity to output the following Javascript code: if ((whichOne+1) <= numCallouts ) { whichOne = whichOne + 1; } else { whichOne = 1; } Whenever I try to get Velocity to print a > or a <, it outputs &gt; or &lt; instead, which doesn't help me since I'm trying to get it to produce Javascript. I've tried: #s...

two byte character or one byte character

Hi, How can I tell whether a character in the input string is a two-byte or a one-byte character, and which encoding it comes from? I am using C# and Silverlight; I assume I could find the encoding the computer is running and then the character? Any code snippet? Thank you, Rune // Get a UTF-32 encoding by codepage.Enco...

How can I convert input to HTML Characters correctly

Let's say I'm including a file which contains HTML. The HTML has characters such as exclamation marks and Spanish accents (á, ó). The included text is rendered as symbols instead of the correct characters. This happens on FF but not on IE (8). I have tried the following functions: htmlspecialchars, htmlentities, utf8_encode include ...

Problems with display of UTF-8 encoded content from a DB

Dear members of the Stackoverflow community, We are developing a web application using the Zend Framework, and we are facing some encoding issues that we hope you might help us solve. The situation goes something like this: There are certain tables on a MySQL database that need to be displayed as html. Because the site is designed using...

What are Windows code pages?

I'm trying to gain a basic understanding of what is meant by a Windows code page. I kind of get the feeling it's a translation between a given 8 bit value and some 'abstraction' for a given character graphic. I made the following experiment. I created a "" character literal with two versions of the letter u with an umlaut. One create...

SHA-1 and Unicode

Hi everyone, Is behavior of SHA-1 algorithm defined for Unicode strings? I do realize that SHA-1 itself does not care about the content of the string, however, it seems to me that in order to pass standard tests for SHA-1, the input string should be encoded with UTF-8. ...
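SHA-1 is defined over bytes, so the digest of a string depends on which encoding turns it into bytes first. A short Python sketch showing that the same text hashes differently under UTF-8 and UTF-16 (the choice of UTF-16LE as the contrasting encoding is only for illustration):

```python
import hashlib

text = "abc"

# The well-known "abc" test vector is reproduced when the string is
# encoded as ASCII/UTF-8 before hashing.
digest_utf8 = hashlib.sha1(text.encode("utf-8")).hexdigest()

# The same characters in another encoding are different bytes,
# hence a different digest.
digest_utf16 = hashlib.sha1(text.encode("utf-16-le")).hexdigest()

print(digest_utf8)
print(digest_utf16)
```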

MySQL Character Mapping

Hi, I would like to map foreign characters, especially Turkish characters, to their Latin-1 equivalents in MySQL. For example, Select name FROM users WHERE id = 1 Result = Çakır but I would like to get it as: Cakir or Özel -> Ozel There are a couple of Turkish characters and they all have Latin-1 equivalents. ( http://webdesign.ab...

Which encoding does Alt+Numpad keys generate?

In short: For this code: Encoding.ASCII.GetBytes("‚") I want the output to be 130, but this gives me 63. I am typing the string using Alt+0130. ...
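The value 63 is the code for "?", which is what an ASCII encoder substitutes for a character it cannot represent. A Python sketch of the same effect, assuming the character typed with Alt+0130 is U+201A and that the system's ANSI code page is Windows-1252:

```python
ch = "\u201a"  # SINGLE LOW-9 QUOTATION MARK, assumed result of Alt+0130

# ASCII has no such character, so a replacing encoder emits "?" (63).
ascii_bytes = ch.encode("ascii", errors="replace")

# Windows-1252 (assumed code page) maps it to the single byte 0x82 (130).
cp1252_bytes = ch.encode("cp1252")

print(ascii_bytes[0], cp1252_bytes[0])
```

This suggests the expected 130 only appears when encoding with the code page the keystroke was defined in, not with ASCII.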

Html Entity code for ž

What is the HTML entity code for ž? I am looking for something similar to &raquo; instead of something like &#x17E;. ...
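One way to check which named entities exist is to consult Python's entity tables: `html.entities.name2codepoint` covers the HTML 4.01 names, while `html.unescape` follows the HTML5 list. A small sketch:

```python
import html
from html.entities import name2codepoint

# HTML5 defines a named reference for ž; html.unescape resolves it.
char = html.unescape("&zcaron;")

# The HTML 4.01 table has no such name, so older documents
# would need the numeric form instead.
in_html4 = "zcaron" in name2codepoint

print(char, in_html4)
```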

html-encode output && incorrect string error

My data includes Arabic characters, which look like garbage in MySQL but display correctly in the browser. My questions: How do I HTML-encode the output? If I add this to all my files: <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> I get this error: Error: Incorrect string value: '\xE4\xEE\xC3\xD8\xEF\xE6...'...

How to ANSI-C cast from unsigned int * to char *?

I want these two print functions to do the same thing: unsigned int Arraye[] = {0xffff,0xefef,65,66,67,68,69,0}; char Arrage[] = {0xffff,0xefef,65,66,67,68,69,0}; printf("%s", (char*)(2+ Arraye)); printf("%s", (char*)(2+ Arrage)); where Array is an unsigned int. Normally, I would change the type but, the proble...

Decoding not reversing unicode encoding in Django/Python

Ok, I have a hardcoded string I declare like this: name = u"Par Catégorie". I have a # -*- coding: utf-8 -*- magic header, so I am guessing it's converted to UTF-8. Down the road it's output to XML through xml_output.toprettyxml(indent='....', encoding='utf-8'), and I get a UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3...

Calling MSBuild from PHP - Wrong Codepage and Culture

I have a PHP script that calls MSBuild via system(): <?php system( "msbuild umlaut.proj" ); ?> This is the project file: <?xml version="1.0" encoding="UTF-8"?> <Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003" DefaultTargets="EchoUmlaut" ToolsVersion="3.5"> <Target Name="EchoUmlaut"> <Message Text="Umla...

How to enable reading non-ascii characters in Servlets

How can I make the servlet accept non-ASCII (Arabic, Chinese, etc.) characters passed from JSPs? I've tried adding the following to the top of the JSPs: <%@page language="java" contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"%> and adding the following in each post/get method in the servlet: request.setCharacterEncoding("UTF-8"); res...

Character Encoding, UTF or ANSI?

I'm using Eclipse in Ubuntu to edit PHP files. Unfortunately, some of these PHP files were created in Notepad++ in Windows XP, with ANSI encoding selected. These files also generate HTML with charset=ISO-8859-1. When I configured Eclipse to ISO-8859-1, many special characters were lost and changed to '???', and when I try ...

Oracle DUMP procedure returns question marks for Chinese characters

I am using Oracle 10g and am performing the following query: SELECT DUMP('炫耀他的', 1017) FROM DUAL; This outputs: Typ=96 Len=4 CharacterSet=AL32UTF8: ?,?,?,? Why have the Chinese characters been replaced with question marks? How do I get it to return the correct characters? ...

Help with proper character encoding.

I have an HTML form that is sometimes submitted with accented characters: à, è, ì, ò, ù. I have a PHP script that exports these form submissions into CSV format. When I look at the CSV in a text editor (vim or Notepad, for example) the characters look fine, but when it is opened with Open Office or Word, I get some funky results: ����� I...