How can I encode or decode the escape sequence character '\x13' in Python into a character that is valid in RSS or XML?
My use case: I am getting data from arbitrary sources and building an RSS feed from that data. The data source sometimes contains escape sequence characters, which break my RSS feed.
So how can I sanitize the input data with e...
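One possible approach, as a minimal sketch rather than a definitive answer: XML 1.0 only permits tab, newline, carriage return, and the code points U+0020-U+D7FF, U+E000-U+FFFD, and U+10000-U+10FFFF, so a control character such as '\x13' can be stripped (or replaced) before the text goes into the feed. The helper name `strip_invalid_xml_chars` and its `replacement` parameter are hypothetical, chosen only for illustration.

```python
import re

# Anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text, replacement=""):
    """Return text with characters that XML 1.0 forbids removed or replaced.

    Hypothetical helper for illustration; it does not escape &, < or >.
    """
    return _INVALID_XML_CHARS.sub(replacement, text)


if __name__ == "__main__":
    print(repr(strip_invalid_xml_chars("title\x13 with a control char")))
```

This only removes characters that can never appear in XML 1.0; markup characters such as `&` and `<` still need escaping, for example with `xml.sax.saxutils.escape` from the standard library.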
Hi friends,
I have an index.html file and a global.css file. When I open these files in Coda, TextMate, etc., everything looks fine. Then I try Firefox: index.html loads the CSS from the right path, but it doesn't take effect. I then tried to view the CSS code in Firefox, and I see characters like:
ॵ氮扵汬整筰慤摩湧㨰‵灸‰′㕰硽畬畬汥琠汩筬楳琭獴祬攺摩獣㭰慤摩湧㨲灸紮摲慷汩湥筢潲摥爭扯瑴潭㨱灸慳桥...
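The characters above look like plain ASCII CSS whose bytes were paired up and read as UTF-16 (big-endian). The sketch below reproduces that effect under this assumption; the sample rule `l.bullet{padding:0 5px 0 25px}` is my reading of the garbled excerpt, not text supplied by the asker, and the function names are hypothetical.

```python
def misread_as_utf16_be(data: bytes) -> str:
    """Decode bytes as UTF-16 big-endian, the way a confused viewer might."""
    return data.decode("utf-16-be")


def undo_utf16_be_misread(garbled: str) -> str:
    """Turn the misread text back into the original ASCII text."""
    return garbled.encode("utf-16-be").decode("ascii")


if __name__ == "__main__":
    # Assumed sample: an even number of ASCII bytes taken from a CSS rule.
    css_bytes = b"l.bullet{padding:0 5px 0 25px}"
    garbled = misread_as_utf16_be(css_bytes)
    print(garbled)                         # CJK-looking characters
    print(undo_utf16_be_misread(garbled))  # the original CSS text
```

If this guess is right, the usual suspects are a file saved with the wrong encoding or byte-order mark, or a `charset` declaration that does not match the bytes actually served.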
This is somehow related to my question here.
I process tons of text (mainly HTML and XML) fetched via HTTP. I'm looking for a Python library that can do smart encoding detection based on different strategies and convert text to Unicode using the best possible character-encoding guess.
I found that chardet does auto-detection extrem...
All of our tables are currently set to the LATIN1 character set. A user is currently able to put together Unicode sequences on the client and try to embed them into our application. What's the best way to keep all Unicode characters from hitting our database? Even better, what's the best way to ensure that only characters ba...
I currently have a bunch of tables using the latin1 charset in a MySQL 5.1.x DB. Problem is, we recently had a bunch of users trying to input text using UTF-8 encoding, and that seemed to break things.
Is it safe to blindly update the table's character set? What are some best practices (besides obviously backing everything up) for a sit...
I have my own Twitter API and I've received a couple emails about a problem when trying to post a status update with accent marks and other diacritics. I would like to encode these so that the status update still has them.
I know there are ways to remove the diacritic, but I would like to keep it.
I read the Twitter Counting Character...
Hi,
I'm currently working on a Java host integration. The existing system uses Microsoft SNA Server, where ASCII-EBCDIC conversion is done based on a local COMTBLG Gtable. Do you know the specification of this file? Has anyone written a Java program to read it?
Thanks in advance.
Esteve
...
I have a function that I have used a bunch of times in various files which has a signature like:
Translate("English Message", "Spanish Message", "French Message")
and I want to pull out the English, Spanish and French messages and then output them into a CSV so that people who actually know these languages can tell me what I SHO...
I need to transfer a column from one table to another. The source table has a different collation than the target table (latin1_general_ci and latin1_swedish_ci).
I use
UPDATE target
LEFT JOIN source ON target.artnr = source.artnr
SET target.barcode = source.barcode
I get an "illegal mix of collations".
What is a quick fix to ge...
I am in the U.S. I have the following line in my web page:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
And my MySQL table is MyISAM with the latin1_swedish_ci collation.
But when someone fills out a form with a foreign character, it gets stored in MySQL as garbage. An example would be an e with an accent over it, etc. - something...
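A common way this kind of garbage arises, shown here as a hedged sketch: the browser submits UTF-8 bytes, and somewhere along the path they are interpreted one byte at a time as latin1. Whether that is what happens in the asker's setup is an assumption; the function names below are hypothetical.

```python
def utf8_bytes_read_as_latin1(text: str) -> str:
    """Simulate UTF-8 form data being interpreted as latin1 (assumed cause)."""
    return text.encode("utf-8").decode("latin-1")


def repair_mojibake(garbled: str) -> str:
    """Reverse the mix-up, if the text went through it exactly once."""
    return garbled.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    stored = utf8_bytes_read_as_latin1("é")
    print(stored)                   # two characters instead of one
    print(repair_mojibake(stored))  # back to the accented e
```

The repair step only works when the text was mis-decoded exactly once and no bytes were lost; making the connection, table, and page agree on one encoding avoids the problem in the first place.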
We have a Crystal Reports 2008 report which merges database data with some SurveyMonkey free-text data stored in an Excel spreadsheet.
The free text data looks OK in Excel, looks OK when copied/pasted to Notepad, and looks OK in the Crystal Report. But when we export the crystal report to PDF, a lot of strange box characters get append...
How would you design an 8-bit encoding of a set of 256 characters from western languages (say, with the same characters as ISO 8859-1) if it did not have to be backward-compatible with ASCII?
I'm thinking of rules of thumb like these: if ABC...XYZabc...xyz0123...89 were, in this order, the first characters of the set (codes from 0 to 61), th...
My web application stores URL segments in a database. These URL segments are based on user-submitted content.
What collation should I use for character strings that will appear in URLs?
My assumption is ASCII General CI (?) based on this question: http://stackoverflow.com/questions/1547899/which-characters-make-a-url-invalid
...
So, I'm trying to do some screen scraping off of a certain site using nokogiri, but the site owners failed to specify the proper encoding of the page in a <meta> tag. The upshot of this is that I'm trying to deal with strings that think they're utf-8, but really aren't.
(If you care, here are the files I was using to test this:
main ...
Hi I am trying to store names into an Oracle database and fetch them back using PHP and oci8.
However, if I insert the é directly into the Oracle database and use oci8 to fetch it back, I just receive an e.
Do I have to encode all special characters (including é) into HTML entities (i.e. &eacute;) before inserting into the database ... or am ...
Is there any problem with ASPX rendering French accented characters?
I am using UTF-8 as the encoding.
I have never had a problem like this before, but this is the first time I am working on an ASP server. Is there a fix?
e.g
Événements = Événements
Journées fériées = Journées fériées
Is this an encoding problem? or is there any ...
I'm using all of the below to take a field called 'code' from my database, get rid of all the HTML entities, and print it 'as usual' to the site:
<?php $code = preg_replace('~&#x([0-9a-f]+);~ei', 'chr(hexdec("\\1"))', $code);
$code = preg_replace('~&#([0-9]+);~e', 'chr("\\1")', $code);
$code = html_entity_decode($code); ?>
H...
Hi,
I have read many similar questions, apologies if this is considered a duplicate.
Suppose I am reading a file containing 3 comma-separated numbers. The file was saved with an unknown encoding; so far I am dealing with ANSI and UTF-8. If the file was in UTF-8 and it had 1 row with values 115,113,12 then:
with open(file) as f:
...
I have a PostgreSQL database with some Unicode values, for example "vaishali" in Marathi. I want to run a query SELECT * FROM table WHERE name LIKE vaishali (I type "vaishali" in Marathi, so I first convert it to Unicode in my program). But it matches nothing. Why?
...
I'm trying to use the ASP.NET chart controls for a website that is localised for a number of languages. However, we've had issues with the charts since we recently added a Chinese localisation: all of the labels show squares where we actually want Chinese characters, as shown in my sample below (please note I don't know any Chinese so thi...