encoding

How do I decode garbled text from the Library of Congress?

I am making a Z39.50 search in Python, but have a problem decoding the search results. The first search result for "harry potter" is apparently a Hebrew version of the book. How can I convert this into Unicode? This is the minimal code I use to get a post: #!/usr/bin/env python # encoding: utf-8 from PyZ3950 import zoom from PyZ3950 i...

encoding/decoding string and a special character to byte array

Hi, I had a requirement to encode a 3-character string (always letters) into a 2 byte[] array of 2 integers. This was done to save space and for performance reasons. Now the requirement has changed a bit. The string will be of variable length. It will either be of length 3 (as above) or of length 4 and will have 1 spe...
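For the fixed 3-letter case, a minimal sketch of one possible packing scheme, written in Python for brevity. It assumes the letters are uppercase A-Z only, which the question does not state, and the function names pack3 and unpack3 are hypothetical. Each letter gets 5 bits, so three letters need 15 bits and fit in 2 bytes. The 4-character variant is cut off in the excerpt, so it is not covered here.

```python
def pack3(text):
    """Pack three uppercase letters (assumed A-Z) into 2 bytes, 5 bits each."""
    if len(text) != 3 or not all("A" <= ch <= "Z" for ch in text):
        raise ValueError("expected exactly three letters A-Z")
    value = 0
    for ch in text:
        # Shift the previous letters left and append this letter's 5-bit index.
        value = (value << 5) | (ord(ch) - ord("A"))
    return value.to_bytes(2, "big")


def unpack3(data):
    """Reverse pack3: read three 5-bit indexes back out of 2 bytes."""
    value = int.from_bytes(data, "big")
    return "".join(
        chr(((value >> shift) & 0x1F) + ord("A")) for shift in (10, 5, 0)
    )


packed = pack3("ABC")
print(len(packed))      # 2
print(unpack3(packed))  # ABC
```

One bit of the 16 stays unused in this layout, which could serve as a flag if a second format is needed.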

.NET DataSet.GetXml() - what's the default encoding?

An existing app passes XML to a sproc in SQL Server 2000; the input parameter data type is TEXT. The XML is derived from DataSet.GetXml(), but I notice it doesn't specify an encoding. So when the user sneaks in an inappropriate character into the dataset, specifically ASCII 146 (which appears to be an apostrophe) instead of ASCII 39 (single q...

How can I see raw bytes stored in a MySQL column?

I have a MySQL table properly set to the UTF-8 character set. I suspect some data inserted into one of my columns has been double encoded. I am expecting to see a non-breaking space character (UTF-8 0xC2A0), but what I get when selecting this column out of this table is four octets (0xC3A2 0xC2A0). That's what I would expect to see if...
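A minimal Python sketch of what double encoding does to a non-breaking space. It assumes the bytes were misread as Latin-1 somewhere along the way, which is only one possible cause. Under that assumption the result is C3 82 C2 A0, so the stored bytes are worth comparing against it. On the MySQL side, the HEX() function applied to the column shows the stored bytes directly.

```python
# Assumption: UTF-8 bytes were misread as Latin-1 and then encoded to UTF-8 again.
nbsp = "\u00a0"

encoded_once = nbsp.encode("utf-8")
encoded_twice = encoded_once.decode("latin-1").encode("utf-8")

print(encoded_once.hex())   # c2a0
print(encoded_twice.hex())  # c382c2a0

# Reversing the same steps recovers the original character.
repaired = encoded_twice.decode("utf-8").encode("latin-1").decode("utf-8")
print(repaired == nbsp)     # True
```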

English/Arabic Encoding Problem

Hi, I am designing a web page that should support both English and Arabic. My problem is that Arabic characters don't display correctly; they appear as something like "أهلاً يا معلم ". I have tried to change the encoding of the page with the following tag: <META CONTENT="text/html; charset=windows-1256" HTTP-E...

Jasper Report PDF Encoding

Hi. I am trying to generate or export a Jasper report to PDF, but I can't display nihongo (Japanese) characters. How do I fix this? ...

Java: Readers and Encodings

Hi, maybe this isn't a good or relevant question, so please don't kill me. Java's default encoding is ASCII. Yes? (See my edit) When a text file is encoded in UTF-8, how does a Reader know that it has to use UTF-8? The Readers I am talking about are: FileReaders, BufferedReaders from Sockets, a Scanner from System.in ... EDIT: So the encoding ...

c# HttpWebResponse Header encoding

Hi, I have the following problem. I contact an address which I know employs a 301 redirect. using HttpWebRequest loHttp = (HttpWebRequest)WebRequest.Create(lcUrl); and loHttp.AllowAutoRedirect = false; so that I am not redirected. Now I get the header of the response in order to identify the new url. using loWebResponse.GetResponseHe...

Creating files with french characters and encoding.

Hi, I am creating a file like so: FileStream temp = File.Create( this.FileName ); Then I put data in the file like so: this.Writer = new StreamWriter( this.Stream ); this.Writer.WriteLine( strMessage ); That code is encapsulated in a class hierarchy, but that is the meat and potatoes of it. My problem is this. MSDN says that the ...

How to set the mechanize page encoding?

Hi, I'm trying to get a page with an ISO-8859-1 encoding clicking on a link, so the code is similar to this: page_result = page.link_with( :text => 'link_text' ).click So far I get the result with a wrong encoding, so I see characters like: 'T�tulo:' instead of 'Título:' I've tried several approaches, including: Stating the enco...

clean up strange encoding in ruby

I'm currently playing a bit with couchdb. I'm trying to migrate some blog data from redis (key value store) to couchdb (key value store). Seeing as I probably migrated this data a gazillion times from and to different blogging engines (everybody has got to have a hobby :) ), there seem to be some encoding snafus. I'm using CouchREST to a...

Converting UTF-8 PostgreSQL DB into WIN-1255 Shapefile

Hi, I have a PostgreSQL\PostGIS spatial database which contains Hebrew text columns. The system runs on Ubuntu, and everything works flawlessly with UTF-8. I am trying to dump some tables into shapefile for a Windows program which can only read Windows-1255 strings. Unfortunately, pgsql2shp has no encoding option, although shp2pgsql ha...

How to read non-English texts in Java? They are represented in the wrong encoding.

I use Apache HttpClient. When I try to "read a site", all non-English content is represented wrongly. Actually, it's represented in windows-1252 but it should be in UTF-8. How can I fix this? I tried to use InputStreamReader (inputStream, Charset.forName ("UTF-8")), but it didn't help (wrong symbols transformed into ????????). ...

What character set is this?

I received a bunch of CSV files from a client (that appear to be a database dump), and many of the columns have weird characters like this: Alain Lefèvre Angèle Dubeau & La Pietà That seems like an awful lot of characters to represent an é. Does anyone know what encoding would produce that many characters for ...
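One way to test the repeated mis-decoding hypothesis is to reproduce it, sketched here in Python. The sketch assumes each round treated UTF-8 bytes as Windows-1252 and then saved the result as UTF-8 again. The helper names garble and repair are hypothetical. Three rounds turn a single è into eight characters, and reversing the same number of rounds recovers it.

```python
def garble(text, rounds):
    """Simulate UTF-8 bytes being misread as Windows-1252, `rounds` times."""
    for _ in range(rounds):
        text = text.encode("utf-8").decode("cp1252")
    return text


def repair(text, rounds):
    """Undo garble(); raises UnicodeError if the text was not garbled this way."""
    for _ in range(rounds):
        text = text.encode("cp1252").decode("utf-8")
    return text


garbled = garble("\u00e8", 3)            # start from a single è
print(len(garbled))                      # 8
print(repair(garbled, 3) == "\u00e8")    # True
```

Some byte values have no Windows-1252 mapping in Python's cp1252 codec, so real data may need error handling or a more lenient codec.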

accent ajax encoding issue

Source file has: header('Content-type: text/html; charset=iso8859-1'); Source ajax (jQuery) script is: $(document).ready(function() { $.ajaxSetup({ cache: false }); $("#searchfield").keyup(function(){ $("#insert_search") .load('ajax/searchobjects.php', {search_word: $("#searchfield").val()}, function(){ }); }); }); ...

Java: Turkish Encoding Mac/Windows

Hi, I have a problem with Turkish special characters on different machines. The following code: String turkish = "ğüşçĞÜŞÇı"; String test1 = new String(turkish.getBytes()); String test2 = new String(turkish.getBytes("UTF-8")); String test3 = new String(turkish.getBytes("UTF-8"), "UTF-8"); System.out.println(test1); System.out.println(...

Encoding utf-8 to base64 with accents

Hi, I have some data like this: data1 = ['Agos', '30490349304'] data2 = ['Desir\xc3\xa9','9839483948'] I'm using an API that expects the data encoded in base64, so what I do is: data = data1 string = base64.b64encode("Hi, %s! Your code is %s" % (data[0], data[0])) myXMLRPCCall(string) Which works fine with data1. With data2 the enc...
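The excerpt is cut off before the failure is described, so the following is only a sketch of one common approach, written for Python 3, whereas the question appears to use Python 2 byte strings. It builds the message as text, encodes it to UTF-8 bytes explicitly, and only then applies base64. The function name encode_message is hypothetical.

```python
import base64


def encode_message(name, code):
    """Format the message as text, then UTF-8 encode it, then base64 encode it."""
    message = "Hi, %s! Your code is %s" % (name, code)
    return base64.b64encode(message.encode("utf-8")).decode("ascii")


token = encode_message("Desir\u00e9", "9839483948")

# Decoding reverses both steps and restores the accented character.
print(base64.b64decode(token).decode("utf-8"))
```

The receiving side has to decode the bytes as UTF-8 as well; whether the API in the question does so is not stated.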

Rails html encoding

I am using the h helper method in Rails to encode/escape a string that has an apostrophe ('). In my view I am using it like this: <%=h "Mike's computer" %> My understanding is that the HTML when viewing the source should be Mike%27s computer, but the HTML produced has an apostrophe in it: Mike's computer. Am I missing something obvious? Ho...

why would std::string s("??<") output a { instead of ??< as expected???

std::string s("??<"); std::cout << s << std::endl; Why does that output { instead of ??<? I'm using Visual Studio 2008. I assume it's encoding it, but why, and what is the encoding called, if that is what's happening? This little %#$^*! caused me to look for a bug in my (unit test) code for 30 minutes before I figured out my string wa...

C# Convert string from UTF-8 to ISO-8859-1 (Latin1) H

Hello, I know this has been asked before! I have googled this topic and looked at every answer, but I still don't get it. Basically I need to convert a UTF-8 string to ISO-8859-1, and I do it using the following code: Encoding iso = Encoding.GetEncoding("ISO-8859-1"); Encoding utf8 = Encoding.UTF8; string msg = iso.GetString(utf8....