encoding

UnicodeEncodeError with BeautifulSoup 3.1.0.1 and Python 2.5.2

I'm using BeautifulSoup 3.1.0.1 and Python 2.5.2, and trying to parse a web page in French. However, as soon as I call findAll, I get the following error: UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 1146: ordinal not in range(128) Below is the code I am currently running: import urllib2 from BeautifulSoup i...

"unmappable character for encoding" warning in Java

I'm currently working on a Java project that is emitting the following warning when I compile: /src/com/myco/apps/AppDBCore.java:439: warning: unmappable character for encoding UTF8 [javac] String copyright = "� 2003-2008 My Company. All rights reserved."; I'm not sure how SO will render the character before the date, but it ...

 characters appended to the beginning of each file

I've downloaded an HttpHandler class that concatenates JS files into one file, and it keeps appending the  characters at the start of each file it concatenates. Any ideas on what is causing this? Could it be that once the files are processed they are written to the cache, and that's how the cache is storing/rendering them? Any inputs would...
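
The three characters in the excerpt above match what the UTF-8 byte order mark (bytes EF BB BF) looks like when it is decoded as Windows-1252. The following is a minimal Python sketch of that effect, assuming the concatenated files are UTF-8 with a BOM (the excerpt does not confirm this). The helper name `concat_without_bom` is hypothetical and is not part of the HttpHandler in question.

```python
import codecs

def concat_without_bom(chunks):
    """Join byte chunks, dropping a leading UTF-8 BOM from each one.

    Hypothetical helper for illustration only.
    """
    cleaned = []
    for chunk in chunks:
        if chunk.startswith(codecs.BOM_UTF8):
            chunk = chunk[len(codecs.BOM_UTF8):]
        cleaned.append(chunk)
    return b"\n".join(cleaned)

# The BOM bytes, misread as Windows-1252, produce the three stray characters.
bom_as_cp1252 = codecs.BOM_UTF8.decode("cp1252")

files = [codecs.BOM_UTF8 + b"a();", codecs.BOM_UTF8 + b"b();"]
print(repr(bom_as_cp1252))
print(concat_without_bom(files))
```

If the files are read as text rather than bytes, decoding with Python's `utf-8-sig` codec strips the BOM in the same way.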

Converting from a Html encoded string to a 'normal' Url

I have a URL in an XML document which is encoded: <Link>http://www.sample.com/test.asp?goto=HOTWIZ%26eapid=857</Link> I would like to convert that into a URL in the outputted HTML. I can output a link OK, but I need the %26 to be converted to an &. I assume I could use some sort of replace functionality in XSLT, but I imagine ther...

Base64 Encode a PDF in C#?

Can someone shed some light on how to do this? I can do this for regular text or a byte array, but I'm not sure how to approach it for a PDF. Do I stuff the PDF into a byte array first? ...

Converting UTF-8 to WIN1252 using C++Builder 5

I have to import a UTF-8 encoded text file into my C++Builder 5 program. Are there any components or code samples to accomplish that? ...

How to test an application for correct encoding (e.g. UTF-8)

Encoding issues are among the topics that have bitten me most often during development. Every platform insists on its own encoding, and most likely some non-UTF-8 defaults are in the game. (I'm usually working on Linux, defaulting to UTF-8; my colleagues mostly work on German Windows, defaulting to ISO-8859-1 or some similar windows codep...

XSLT encoding problem, question marks in result

I'm trying to run an XSLT transformation, but characters like ëöï are replaced by a literal '?' in the output (I checked with a hex editor). The source file has the correct characters, and the stylesheet has: <xsl:output encoding="UTF-8" indent="yes" method="xml"/> What else am I missing? I'm using Saxon as the transformer, if that ...

ASP.NET C# Email Address with Special Character

Hi all, I am sending emails with the built-in System.Net.Mail, like this: MailAddress abs = new System.Net.Mail.MailAddress("[email protected]", "Web Präsenz", System.Text.Encoding.UTF8); When the email reaches the client, the "ä" character is missing. It seems like an encoding problem. Does anyone know how to fix it? ...

jsp utf encoding

I'm having a hard time figuring out how to handle this problem: I'm developing a web tool for an Italian university, and I have to display words with accents (such as è, ù, ...); sometimes I get these words from a PostgreSQL table (UTF8-encoded), but mostly I have to read long passages from a file. These files are encoded as UTF-8 XML, ...

Setting the correct encoding when piping stdout in python

When piping the output of a Python program, the Python interpreter gets confused about encoding and sets it to None. This means a program like this: # -*- coding: utf-8 -*- print "åäö" will work fine when run normally, but fail with: UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 0: ordinal not in range(...
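
The failure described above comes down to which codec the output stream uses when it turns text into bytes. Below is a minimal sketch in Python 3 (the excerpt itself is Python 2) that stands in for a piped stdout with an in-memory byte buffer; the helper name `write_text` is hypothetical. It only illustrates the mechanism, under the assumption that the piped stream falls back to ASCII as the excerpt reports.

```python
import io

def write_text(text, encoding):
    """Write text through a stream that encodes with the given codec.

    The BytesIO object stands in for a piped stdout (an assumption made
    for illustration); the bytes that reached it are returned.
    """
    sink = io.BytesIO()
    writer = io.TextIOWrapper(sink, encoding=encoding)
    writer.write(text)
    writer.flush()
    return sink.getvalue()

# An explicit UTF-8 stream can carry the characters.
print(write_text("åäö", "utf-8"))

# An ASCII stream cannot represent them and raises UnicodeEncodeError.
try:
    write_text("åäö", "ascii")
except UnicodeEncodeError as exc:
    print("failed:", exc.reason)
```

In a real Python 3 program the same idea would apply to `sys.stdout.buffer` instead of the in-memory buffer used here.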

Two encodings used in RTF string won't display correctly in RichTextBox?

I am trying to parse some RTF that I get back from the server. For most of the text I get back this works fine (and using a RichTextBox control will do the job), however some of the RTF seems to contain an additional "encoding" and some of the characters get corrupted. The original string is as follows (and contains some of the characters u...

Base64 Encode String in VBScript

I have a web service load driver that's a Windows Script File (WSF), which includes some VBScript and JavaScript files. My web service requires that the incoming message is base64 encoded. I currently have a VBScript function that does this, but it's very inefficient (memory intensive, mostly due to VBScript's awful string concatenation) ...

How to encode a large number (in an URL)?

Quite often one has to encode a big (e.g. 128 or 160 bits) number in a URL. For example, many web applications use md5(random()) for UUIDs. If you need to put that value in a URL, the common approach is to just encode it as a hexadecimal string. But obviously hex encoding is not a very tight encoding. What other approaches are there...
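
To make the size difference concrete, here is a minimal Python sketch comparing hex with URL-safe base64 for a 128-bit value. Base64 is only one possible alternative and is swapped in here for illustration; the function names are hypothetical.

```python
import base64

def to_hex(n, bits=128):
    """Fixed-width lowercase hex: 4 bits per character."""
    return format(n, "0{}x".format(bits // 4))

def to_base64url(n, bits=128):
    """URL-safe base64 without '=' padding: 6 bits per character."""
    raw = n.to_bytes(bits // 8, "big")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

def from_base64url(text):
    """Restore the padding and decode back to an integer."""
    padded = text + "=" * (-len(text) % 4)
    return int.from_bytes(base64.urlsafe_b64decode(padded), "big")

n = 2**128 - 1
print(len(to_hex(n)), len(to_base64url(n)))
```

For 128 bits this gives 32 characters in hex and 22 in URL-safe base64, using only letters, digits, `-` and `_`.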

How to convert a string from UTF-8 to ASCII (single byte) in C#?

I have a string object "with multiple characters and even special characters". I am trying to use UTF8Encoding utf8 = new UTF8Encoding(); ASCIIEncoding ascii = new ASCIIEncoding(); objects in order to convert that string to ASCII. May I ask someone to shed some light on this simple task, which is haunting my afternoon? EDIT 1: What...

Are URLs allowed to have a space in them?

Are URIs (specifically HTTP URLs) allowed to have a space in them? If they must be encoded, is '+' just a commonly followed convention, or a legitimate alternative? Thanks! EDIT: Can someone point to an RFC indicating that a URL with a space must be encoded? joe ...
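
As a small illustration of the two conventions the excerpt mentions, Python's standard `urllib.parse` exposes both: `quote` uses percent-encoding (`%20`), while `quote_plus` uses the `+` form associated with HTML form (application/x-www-form-urlencoded) data. This is a sketch of the difference only, not a statement of what any particular server accepts.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

text = "hello world"

percent_form = quote(text)    # space becomes %20
plus_form = quote_plus(text)  # space becomes +

print(percent_form, plus_form)

# Decoding must use the matching convention: plain unquote leaves '+' alone.
print(unquote(plus_form), "|", unquote_plus(plus_form))
```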

Java App : Unable to read iso-8859-1 encoded file correctly.

I have a file which is encoded as ISO-8859-1, and contains characters such as ô . I am reading this file with Java code, something like: File in = new File("myfile.csv"); InputStream fr = new FileInputStream(in); byte[] buffer = new byte[4096]; while (true) { int byteCount = fr.read(buffer, 0, buffer.leng...
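
The general effect behind this kind of problem can be shown in a few lines: the same bytes give different text depending on the charset used to decode them. The sketch below is in Python rather than Java and assumes the file's bytes really are ISO-8859-1; the helper name `decode_bytes` is hypothetical.

```python
def decode_bytes(raw, encoding):
    """Decode raw bytes, substituting U+FFFD for undecodable input."""
    return raw.decode(encoding, errors="replace")

# "ô" is the single byte 0xF4 in ISO-8859-1.
raw = "ô".encode("iso-8859-1")

print(decode_bytes(raw, "iso-8859-1"))   # decoded with the right charset
print(repr(decode_bytes(raw, "utf-8")))  # wrong charset yields a replacement character
```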

Encoding problem using SQLite and WinForms 2.0 C#

Hello all, I am developing a WinForms app using .NET 2.0 and am trying to use SQLite as a DB solution. My main problem is that I have trouble seeing data from the DB in the WinForm when the data is in a non-English language (in my case Greek). For DB administration purposes I use the SQLite administrator, which has no trouble at all retu...

Java : How to determine the correct charset encoding of a stream

With reference to the following thread: http://stackoverflow.com/questions/498636/java-app-unable-to-read-iso-8859-1-encoded-file-correctly What is the best way to programmatically determine the correct charset encoding of an inputstream/file? I have tried using the following: File in = new File(args[0]); InputStreamReader r = ne...

embedded script displaying gibberish depending on encoding type (utf-8)...

I have a widget that people can put in their site. The widget is generated via a PHP script that echoes the populated string using: document.write('$widget_output'). The hosting sites call the widget using a JavaScript tag: <script type="text/javascript" src="http://www.link.com/page.php?param=1"></script> The problem is t...