character-encoding

XSLT character padding for European characters to fixed width output.

I've got a requirement to take some XML and transform it into a fixed-width load file for loading into an SAP system. My algorithm works fine except for some weird European characters such as Ã, which, when present in a string, increase the reported string length by 1 for each instance of the character. So, for example, the text Ãbcd would have a string-length($va...

Submitted character encoding -- _charset_ hidden field

For our web app, we have multiple HTML pages containing text areas. All of our pages are rendered with an ISO-8859-1 charset. When the page is accessed through IE6 on a Windows machine and special characters such as a "smart quote" are copied into the text area, some of our pages submit the page using the Windows-1252 character encodi...
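
A minimal sketch of why the two charsets named in this excerpt disagree about a smart quote. It assumes the character in question is U+201C (left double quotation mark), which the excerpt does not confirm, and all variable names are illustrative only.

```python
# Assumption: the "smart quote" is U+201C LEFT DOUBLE QUOTATION MARK.
smart_quote = "\u201c"

# Windows-1252 assigns this character a single byte in the 0x80-0x9F range.
cp1252_bytes = smart_quote.encode("cp1252")

# ISO-8859-1 has no code point for it, so strict encoding fails.
try:
    smart_quote.encode("iso-8859-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False

# The same byte read back as ISO-8859-1 is a C1 control character, not a quote.
as_latin1 = cp1252_bytes.decode("iso-8859-1")

print(cp1252_bytes, fits_latin1, repr(as_latin1))
```

A server that decodes the submitted bytes as ISO-8859-1 would therefore see a control character where the user typed a quote.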

Character Encoding Mismatch

My scripts are definitely saved in UTF-8. I'm instantiating PDO with "{$this->engine}:host={$this->host};dbname={$this->name};charset=UTF-8". My tables use InnoDB and are collated using utf8_general_ci. My pages are sent either with the Content-Type: text/html; charset=UTF-8 header or the <meta> equivalent. When using PDO to store a € c...

What is the character encoding?

I have several characters that aren't recognized properly, such as º á ó (etc.). This means that the character encoding is not UTF-8, right? Can you tell me what character encoding it could be, please? ...

How to store unicode data in a format that doesn't support utf-8

Okay, here's yet another character encoding question, demonstrating my ignorance of all things Unicode. I am reading data out of Microsoft Excel .xls files and storing it in ESRI shapefiles (.shp). For versions of Excel > 5.0, text in Excel files is stored as Unicode. However, Unicode (and specifically UTF-8 support for shapefiles is i...

inserting latin1-encoded text into utf8 tables (forgot to use mysql_set_charset)

I have a PHP web app with MySQL tables taking utf8 text. I recently converted the data from latin1 to utf8 along with the tables and columns accordingly. I did, however, forget to use mysql_set_charset and the latest incoming data I would assume came through the MySQL connection as latin1. I don't know what happens when latin1 comes in t...

Arabic Encoding With Windows

Hello, I am trying to write a CSV file containing Arabic data using Java, as PrintWriter out = new PrintWriter("file.csv", "UTF8"); When I open the file on a Linux machine, the Arabic is displayed fine, but it doesn't work on a Windows machine. And when I set the encoding to "Cp1256", as PrintWriter out = new PrintWriter("file.csv",...

Encoding problems with hpricot

I am getting the following encoding error when trying to scrape web pages with hpricot in Ruby 1.9: Encoding::CompatibilityError: incompatible character encodings: ASCII-8BIT and UTF-8. I can reproduce the error by doing the following: ska:~ sam$ rvm 1.9.2@hpricot ska:~ sam$ ruby -v ruby 1.9.2dev (2010-05-31 revision 28117) [x86_64-dar...

Broken encoding after postback

I have a query string with a parameter value that contains the Norwegian character å encoded as %e5. The page contains a form with an action attribute which is automatically filled by ASP.Net. When the URL is output into said attribute, it is printed with a full two-byte encoding: %u00e5. When posting back this seems to be ok when debug...

Handling Spanish characters in Java/JSP

I have a small webapp which handles a lot of Spanish text. At one point in the code, a JSP page responds with a JSON string containing some of this text. If I print the string to the console, it looks like gibberish. But if I examine the header/content of the response in Chrome Developer Tools, it looks correct. It is transferred in the...

BeautifulSoup doesn't give me Unicode

I'm using Beautiful Soup to scrape data. The BS documentation states that BS should always return Unicode, but I can't seem to get Unicode. Here's a code snippet: import urllib2 from libs.BeautifulSoup import BeautifulSoup # Fetch and parse the data url = 'http://wiki.gnhlug.org/twiki2/bin/view/Www/PastEvents2007?skin=print.pattern' dat...

How do I transform "ТеÑ" (it is a Russian word) into something readable?

Hello, I have a MySQL DB which contains a UTF8 column with records like "ТеÑ". PHP's mb_detect_encoding() told me that this is UTF-8. How can I transform this "horror" into something readable? Thank you ...
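
A sketch of one common way text like this arises and can be reversed, under the assumption that UTF-8 bytes were decoded as ISO-8859-1 and then stored again as UTF-8. The excerpt does not confirm that this is what happened here. The helper `repair_mojibake` and the sample word are hypothetical.

```python
def repair_mojibake(text, wrong="iso-8859-1", right="utf-8"):
    """Undo one round of 'decoded with the wrong charset' damage.

    Assumes `text` was produced by decoding `right`-encoded bytes as `wrong`.
    """
    return text.encode(wrong).decode(right)


# Hypothetical sample: a Russian word, deliberately garbled the same way.
original = "Тест"
garbled = original.encode("utf-8").decode("iso-8859-1")

print(repr(garbled))
print(repair_mojibake(garbled))
```

The garbled form begins with the same "Ð¢Ðµ" sequence shown in the question, which is why this particular mix-up is a plausible first guess. If the bytes were misread as Windows-1252 instead, or were garbled more than once, the call would need different arguments or repeated application.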

Force encoding of all files within a Visual Studio solution directly from Visual Studio

Is there any way to convert the encoding of all the files within a VS solution (*.sln) directly inside Visual Studio? (I am using 2008.) Is there an add-in for this? ...

Safari encodes already encoded URL on request

I do an HTTP GET request for a page using the following URL in Safari: mysite.com/page.aspx?param=v%e5r The page contains a form which posts back to itself. The HTML form tag looks like this when output by asp.net: <form method="post" action="page.aspx?param=v%u00e5r" id="aspnetForm" > When Safari POSTs this back it somehow converts t...

How to force a specific code page for a website?

Hi, I have the following (apparently simple) problem: I have to install a simple website, made by someone else, on a web hosting account. The site consists of lots and lots of HTML pages, with no dynamic content, some created in MS Word and saved as HTML, some in FrontPage, etc. A mixed bag. I uploaded initially on a test account on my server ...

Rails character encoding problem view to controller

The character encoding is starting to irritate me. It took me a while to get everything from the DB in the right encoding on the screen, but with help from the i18n helper, this worked out. Now I only have one more problem: saving text... If I add some letters with accents (e.g. é ç ...) in a text field and want to save it, already in my contro...

How to "force" a file's ISO-8859-1ness?

I remember when I used to develop websites in Japan - where there are three different character encodings in common use - the developers had a trick to "force" the encoding of a source file (so it would always open in their IDEs in the correct encoding, etc.). What they did was to put a comment at the top of the file containing a Japanese ch...

Character encoding error!

Hi! I am trying to get the content of this URL: http://www.chromeball.com, but the character encoding is wrong. I have this code: $url = 'http://www.chromeball.com'; $ch = curl_init($url); curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); $data = curl_exec($ch); curl_close($ch); $dom = new DOMDocu...

Python: Sanitize a string for unicode?

I have a string that I'm trying to make safe for the unicode() function: >>> s = " foo “bar bar ” weasel" >>> s.encode('utf-8', 'ignore') Traceback (most recent call last): File "<pyshell#8>", line 1, in <module> s.encode('utf-8', 'ignore') UnicodeDecodeError: 'ascii' codec can't decode byte 0x93 in position 5: ordinal not in ran...

How to encode a text file using ASMO449+? .NET

Dear All, how can I encode a text file to ASMO449+? Thanks ...