character-encoding

How to know the encoding of a file in Python?

Hello. Does anybody know how to get the encoding of a file in Python? I know that you can use the codecs module to open a file with a specific encoding, but you have to know it in advance: import codecs f = codecs.open("file.txt", "r", "utf-8") Is there a way to automatically detect which encoding a file uses? Thanks in advance...
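
An encoding cannot be detected with certainty from the bytes alone, but a byte-order mark or a successful strict decode narrows it down. The following is a minimal sketch in Python 3, not a definitive detector: the function name `guess_encoding` and the candidate list are assumptions made for illustration, and the final fallback to Latin-1 only means "decodes without error", not "is correct".

```python
import codecs

# Byte-order marks, checked longest-first so that UTF-32 LE
# (FF FE 00 00) is not mistaken for UTF-16 LE (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data, candidates=("utf-8", "cp1252")):
    """Return a plausible codec name for the raw bytes in `data`.

    `candidates` is a hypothetical, ordered list of encodings to try;
    adjust it to the encodings you actually expect.
    """
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    for name in candidates:
        try:
            data.decode(name)  # strict decoding raises on invalid bytes
            return name
        except UnicodeDecodeError:
            continue
    # Latin-1 maps every byte to a character, so it never fails.
    return "latin-1"
```

To use it on a file, read the file in binary mode first, for example `guess_encoding(open("file.txt", "rb").read())`, and pass the result to `open(..., encoding=...)`.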

Should I convert every single possible character in my XHTML/HTML code, for both ISO-8859-1 and UTF encodings?

Should I convert every single possible character in my XHTML/HTML code, for both ISO-8859-1 and UTF encodings? If yes, is there any software that converts every character that needs it (which should always be written as an entity code) in my XHTML/HTML? For example, after finishing the XHTML in Dreamweaver or any other editor, I would put all the code into the converter and will...

How to remove  character

Hello, I have a very strange problem. I have one PHP website running on two servers: one on Apache (Linux) and the other on IIS (Windows). The Linux server is only for demos; IIS is the actual hosting I need to use. Even with the same code and database, there is no  character on the Linux server. But on IIS, everywhere ...
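
The three characters in the title are what the UTF-8 byte-order mark (bytes EF BB BF) looks like when it is displayed as Windows-1252 or Latin-1. The excerpt is truncated, so the actual cause on the IIS server is not known; the sketch below, in Python 3 rather than PHP, only demonstrates that correspondence and one way to drop a leading BOM. The helper name `strip_utf8_bom` is hypothetical.

```python
import codecs

bom = codecs.BOM_UTF8  # the three bytes EF BB BF

# Decoding the BOM bytes with a single-byte Western codec yields the
# three stray characters seen on the page.
mojibake = bom.decode("cp1252")


def strip_utf8_bom(data):
    """Return `data` (bytes) without a leading UTF-8 BOM, if one is present."""
    if data.startswith(bom):
        return data[len(bom):]
    return data
```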

From compilation to runtime, how does Java String encoding really work

I recently realized that I don't fully understand Java's string encoding process. Consider the following code: public class Main { public static void main(String[] args) { System.out.println(java.nio.charset.Charset.defaultCharset().name()); System.out.println("ack char: ^"); /* where ^ = 0x06, the ack char */ ...

JavaScript variable to ColdFusion variable

I have a tricky one. Using a <cfoutput query="…"> I list some records on the page from a SQL Server database. At the end of each displayed line I try to add it as a record in a MySQL database. As you can see, this is simple, because I can use the exact variables from the output query in my new INSERT INTO statement. BUT: the rsPick....

Character encoding problem in PHP/MySQL/jQuery

Hello! I built a CMS in PHP that manipulates data from MySQL. In my CMS, I have some input fields in which I would like to implement jQuery's fancy autocomplete. Basically, the idea is to create jQuery arrays from MySQL tables... I'm working with PHP 5.3.0, MySQL 5.0.82 and Eclipse 3.4.2. My PHP project in Eclipse is UT...

String formatting c# decode?

Hi, I have a string that looks like this: '%7B%22id%22%3A1%2C%22name%22%3A%22jim%22%7D' When read from a cookie, it is in fact a JSON object and should look like {"id":1,"name":"jim"} Do I need to HTML decode the string to make it appear in correct JSON notation? Thanks, ...
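
The `%7B`, `%22` and similar sequences are percent-encoding (URL encoding), not HTML entities, so the string needs URL decoding rather than HTML decoding. The sketch below uses Python 3 purely to illustrate the transformation on the string from the question; it is not the C# answer, and the variable names are chosen for illustration.

```python
import json
from urllib.parse import unquote

# The cookie value from the question, still percent-encoded.
raw = "%7B%22id%22%3A1%2C%22name%22%3A%22jim%22%7D"

decoded = unquote(raw)      # percent-decoding gives the JSON text
data = json.loads(decoded)  # parse the JSON text into a dict
```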

Python thinks a 3000-line text file is one line long?

I have a very long text file that I'm trying to process using Python. However, the following code: for line in open('textbase.txt', 'r'): print 'hello world' produces only the following output: hello world It's as though Python thinks the file is only one line long, though it is many thousands of lines long, when viewed in a t...

Java File parsing toolkit design, quick file encoding sanity check

(Disclaimer: I looked at a number of posts on here before asking, I found this one particularly helpful, I was just looking for a bit of a sanity check from you folks if possible) Hi All, I have an internal Java product that I have built for processing data files for loading into a database (AKA an ETL tool). I have pre-rolled stages ...

Javascript convert data from utf-8 to iso-8859-1

Hi, I work on a website which is all done in iso-8859-1 encoding using old ASP 3.0. I use Yahoo YQL to request data (XML) from external websites but which I request to be returned as JSON-P (JSON with a callback function so I can retrieve the data). The problem I am facing is that YQL seems to always return data encoded in utf-8, which...

Convert CharCode to Char?

What I need: OK, I googled this and there are many tutorials on how to get the charCode from a character, but I can't seem to find out how to get the character from the charCode. Basically, I am listening for the KeyDown event on a TextInput. I prevent the char from being typed via event.preventDefault(); Later I need to add the te...

How can I escape HTML character entities when using ColdFusion function XMLFormat()?

I have the following block of HTML: <p>The quick brown fox jumps over the lazy dog &mdash; The quick brown fox jumps over the lazy dog.</p> <p>The quick brown fox jumps over the lazy dog &mdash; The quick brown fox jumps over the lazy dog. <br>The quick brown fox jumps over the lazy dog &mdash; The quick brown fox jumps over the lazy do...

wcstombs: character encoding?

The wcstombs documentation says it "converts the sequence of wide-character codes to multibyte string". But it never says what a "wide character" is. Is it implicit (say, it converts UTF-16 to UTF-8), or is the conversion defined by some environment variable? Also, what is the typical use case of wcstombs? ...

Python: How do I force iso-8859-1 file output?

How do I force Latin-1 (which I guess means iso-8859-1?) file output in Python? Here's my code at the moment. It works, but trying to import the resulting output file into a Latin-1 MySQL table produces weird encoding errors. outputFile = file( "textbase.tab", "w" ) for k, v in textData.iteritems(): complete_line = k + '~~~~~' + v ...
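
Latin-1 and ISO-8859-1 are the same encoding in Python's codec registry. The question's code is Python 2 and is truncated, so the following is only a sketch of one way to force the output encoding in Python 3, by passing `encoding` to `open`. The function name `write_latin1` is hypothetical; the `~~~~~` separator comes from the question.

```python
def write_latin1(path, text_data):
    """Write key/value pairs to `path` as ISO-8859-1, one pair per line.

    Raises UnicodeEncodeError if a character has no Latin-1 equivalent,
    which surfaces bad data here instead of at import time.
    """
    # newline="\n" keeps line endings identical on every platform.
    with open(path, "w", encoding="iso-8859-1", newline="\n") as output_file:
        for key, value in text_data.items():
            output_file.write(key + "~~~~~" + value + "\n")
```

If silently replacing unencodable characters is acceptable, `errors="replace"` can be added to the `open` call; whether that is appropriate depends on the data.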

Python conversion to ISO-8859-5

I'm facing problems when trying to convert a UTF-8 file (containing Russian characters) into an ISO-8859-5 file: 'charmap' codec can't encode character u'\ufeff' in position 0: character maps to <undefined>. Has anyone got an idea of what's wrong(?) given the following: def convert(): try: import codecs data = codecs.open('in.t...
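
The character u'\ufeff' in the error message is the byte-order mark at the start of the input, which ISO-8859-5 cannot represent. A minimal sketch in Python 3, assuming the input really is UTF-8 with an optional BOM: decoding with the `utf-8-sig` codec drops a leading BOM before re-encoding. The function name `utf8_to_iso8859_5` is hypothetical.

```python
import codecs


def utf8_to_iso8859_5(raw):
    """Convert UTF-8 bytes (with or without a BOM) to ISO-8859-5 bytes."""
    # "utf-8-sig" behaves like "utf-8" but skips a leading BOM if present.
    text = raw.decode("utf-8-sig")
    return text.encode("iso-8859-5")
```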

Perl iso-8859-1 string comparison

I wrote a small program to go through /usr/share/dict/words finding palindromes while(<>){ chomp; print "$_\n" if $_ eq reverse; } However, this does not work for a list of Danish words encoded in Latin-1 (ISO-8859-1). Just wondering how I'd go about making it work? ...

MySQL database collation and character set.

I have a mySQL database that has collation and character sets as follows: mysql> show variables like "character_set_database"; +------------------------+-------+ | Variable_name | Value | +------------------------+-------+ | character_set_database | utf8 | +------------------------+-------+ 1 row in set (0.00 sec) mysql> show...

Can I include characters such as "ã" and "ê" in UTF-8 encoded XML, or must it be UTF-16 encoded?

Can I include characters such as "ã" and "ê" in UTF-8 encoded XML, or must it be UTF-16 encoded? ...
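
UTF-8 can encode every Unicode character, so UTF-16 is not required for these. As a small illustration in Python 3 (the sample document is made up for this sketch), both characters become two-byte sequences in UTF-8 and survive a round trip through an XML parser:

```python
import xml.etree.ElementTree as ET

# A made-up UTF-8 XML document containing both characters.
document = '<?xml version="1.0" encoding="UTF-8"?><p>ã ê</p>'.encode("utf-8")

# Parsing the UTF-8 bytes gives the original characters back.
parsed_text = ET.fromstring(document).text
```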

manually converting between ASCII and .NET characters

I am working on writing some code to scrub user input to my ASP.NET site. I need to scrub input to remove all references to ASCII characters 145, 146, 147, 148, which are occasionally getting input from my Mac users who are copying and pasting content they write in a word processor on their Macs. My issue is the following three strings I...

What's the difference between UTF-8 and UTF-8 without a BOM?

What's the difference between UTF-8 and UTF-8 without a BOM? Which is better? ...
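
At the byte level, the only difference is three extra bytes (EF BB BF) at the start of the file. A short sketch in Python 3, using the standard `utf-8` and `utf-8-sig` codecs, shows this; the sample text is arbitrary.

```python
text = "hello"

without_bom = text.encode("utf-8")   # just the encoded characters
with_bom = text.encode("utf-8-sig")  # same bytes, prefixed with EF BB BF

# Decoding BOM-prefixed bytes with plain "utf-8" keeps the BOM as U+FEFF;
# "utf-8-sig" removes it.
kept = with_bom.decode("utf-8")
removed = with_bom.decode("utf-8-sig")
```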