utf-8

Icelandic, utf8 and utf8x in LaTeX

First of all, what's the difference between utf8 and utf8x in \usepackage[utf8]{inputenc} \usepackage[utf8x]{inputenc} when used in LaTeX? Secondly, what packages are required when writing an article in Icelandic using LaTeX? I found: \usepackage[icelandic]{babel} \usepackage[T1]{fontenc} \usepackage[utf8x]{inputenc} after experim...

YUI datatable XHR get not using UTF8

OK, so my datasource is serving JSON and it's UTF-8 encoded. Viewing the JSON result in the browser or Firebug confirms this. But when the YUI datatable displays the results in a table, the UTF-8 encoding is lost, resulting in G�teborg instead of Göteborg. I see, according to the documentation provided by Yahoo, that you can specify the conn...

excel file not generating after UTF-8 encoding chosen

This part of my code was creating the xls file successfully: FileOutputStream fileOut = new FileOutputStream("c:\\Decrypted.xls"); wb.write(fileOut); fileOut.close(); when another part of the code had this statement (which came before the above code): in = new ByteArrayInputStream(theCell_00.getBytes("")); But when I changed it to in ...

communicate with a process in utf-8 on a cp1252 console

I need to control a program by sending commands in utf-8 encoding to its standard input. For this I run the program using subprocess.Popen(): proc = Popen("myexecutable.exe", shell=True, stdin=PIPE, stdout=PIPE, stderr=PIPE) proc.stdin.write(u'ééé'.encode('utf_8')) If I run this from a cygwin utf-8 console, it works. If I run it from ...

Problems with character encodings in LAMP app - UTF-8 or not?

I'm still learning the ropes with PHP & MySQL and I know I'm doing something wrong here with how character sets are set up, but can't quite figure out from reading here and on the web what I should do. I have a standard LAMP installation with PHP 5, MySQL 5. I set everything up with the defaults. When some of my users input comments to ...

PHP: Convert unicode codepoint to UTF-8

I have my data in this format: U+597D or like this U+6211. I want to convert them to UTF-8 (original characters are 好 and 我). How can I do it? ...
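As a rough illustration of the conversion being asked about (sketched in Python rather than PHP, so not the asker's environment), the hexadecimal part of the U+XXXX notation can be parsed as an integer code point and then encoded. The function name `codepoint_to_utf8` is hypothetical.

```python
def codepoint_to_utf8(notation: str) -> bytes:
    """Turn a 'U+XXXX' string into the UTF-8 bytes of that code point.

    Hypothetical helper for illustration only.
    """
    if not notation.upper().startswith("U+"):
        raise ValueError("expected notation like 'U+597D'")
    code_point = int(notation[2:], 16)      # hex digits -> integer code point
    return chr(code_point).encode("utf-8")  # code point -> character -> UTF-8 bytes


if __name__ == "__main__":
    for item in ("U+597D", "U+6211"):
        encoded = codepoint_to_utf8(item)
        print(item, encoded.hex(), encoded.decode("utf-8"))
```

The two sample values from the question decode back to 好 and 我 respectively.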

Why doesn't my typeset function work for non-Latin/Asian characters?

Hi all: I've convinced my boss to do the typesetting stuff using PHP (PHP Version 5.2.8). And this is what I've got so far (set the character encoding to Unicode (UTF-8) if you see misrendered Japanese characters): demo page at my personal website. Basically, if you copy and paste the Latin sample paragraph into the textarea and click the button...

Adding non-escaped Ampersands to HTML with Nokogiri::XML::Builder

I would like to add things like bullet points "•" and such to html using the XML Builder in Nokogiri, but everything is being escaped. How do I prevent it from being escaped? I would like the result to be: <span>&#8226;</span> rather than <span>&amp;#8226;</span> What am I missing? I'm just doing this: xml.span { xml...

Google Geocode: PHP Implementation - character encoding issues

Hello, I'm working with UK address data and also international address data. I need to geocode the address data for use on a Google map. I'm doing this using the HTTP service, i.e. constructing a query string and passing it to file_get_contents($THEURL). I've managed to geocode 80% of the address data perfectly, however those addresse...

How can I set the encoding of shell-command-on-region output?

I have a small elisp script which applies Perl::Tidy on region or whole file. For reference, here's the script (borrowed from EmacsWiki): (defun perltidy-command(start end) "The perltidy command we pass markers to." (shell-command-on-region start end "perltidy" t ...

Batch UTF-8 Validation Tool?

Does anyone know an app/service/method that I could use to validate a bunch of XML files for UTF-8? Basically I have a ton of XML files that are supposed to be UTF-8, and some of them happen to contain some bogus characters, causing them not to render right in the content viewer. I know I can check one at a time with methods found in this answ...
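One possible approach, sketched in Python under the assumption that "valid" simply means "the bytes decode as UTF-8": try to decode each file and record where decoding first fails. The names `first_invalid_utf8_offset` and `find_invalid_utf8` are hypothetical, and this checks byte-level validity only, not XML well-formedness.

```python
from pathlib import Path


def first_invalid_utf8_offset(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


def find_invalid_utf8(paths):
    """Return (path, offset) pairs for every file that is not valid UTF-8."""
    problems = []
    for path in paths:
        offset = first_invalid_utf8_offset(Path(path).read_bytes())
        if offset is not None:
            problems.append((str(path), offset))
    return problems


if __name__ == "__main__":
    # Assumption: the XML files live under the current directory.
    for name, offset in find_invalid_utf8(Path(".").rglob("*.xml")):
        print(f"{name}: invalid UTF-8 at byte {offset}")
```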

Is it correct to write to a database which has 'NLS_CHARACTERSET' and 'NLS_NCHAR_CHARACTERSET' parameter values AL32UTF8 and UTF-8 with UTF-16 code page values?

The value of parameters 'NLS_CHARACTERSET' and 'NLS_NCHAR_CHARACTERSET' is UTF-8 for the source database from which I am reading data, and AL32UTF8 and UTF-8 for the target database to which I am writing data. I am reading data from a text file which has English, European and Asian characters. I am using the UTF-16 code page to read from the source flat fi...

PHP: Make Site Unicode Compatible

Hello, how can I make my site Unicode compatible, so that it supports languages other than English? Thanks ...

Byte order mark screws up file reading in Java

I'm trying to read CSV files using Java. Some of the files may have a byte order mark at the beginning, but not all. When present, the byte order mark gets read along with the rest of the first line, thus causing problems with string compares. Is there an easy way to skip the byte order mark when it is present? Thanks! ...
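The question concerns Java, but the idea is language-neutral: check whether the input starts with the UTF-8 byte order mark (bytes EF BB BF) and drop it before comparing strings. A minimal sketch in Python, where the "utf-8-sig" codec does exactly that; the function name `decode_without_bom` is hypothetical.

```python
import codecs


def decode_without_bom(raw: bytes) -> str:
    """Decode UTF-8 bytes, discarding a leading byte order mark if present."""
    # "utf-8-sig" strips one leading BOM and otherwise behaves like "utf-8".
    return raw.decode("utf-8-sig")


if __name__ == "__main__":
    with_bom = codecs.BOM_UTF8 + b"id,name"
    without_bom = b"id,name"
    # Both inputs yield the same first line once the BOM is removed.
    print(decode_without_bom(with_bom) == decode_without_bom(without_bom))
```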

Converting UTF-8 (or other 8-bit encoding) to 7 or fewer bits.

I wish to take a file encoded in UTF-8 that doesn't use more than 128 different characters, then move it to a 7-bit encoding to save 1/8 of the space. For example, if I have a 16 MB text file that only uses the first 128 (ASCII) characters, I would like to shave off the extra bit to reduce the file to 14 MB. How would I go about doing thi...
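A minimal sketch of the bit packing described, in Python, assuming every input byte is plain ASCII (0 to 127). Each byte contributes 7 bits to an accumulator that is flushed 8 bits at a time, so 16 input bytes become 14. The names `pack7` and `unpack7` are hypothetical, and the original length has to be stored separately because the last byte may contain padding bits.

```python
def pack7(data: bytes) -> bytes:
    """Pack ASCII bytes into a stream using 7 bits per character."""
    acc, nbits, out = 0, 0, bytearray()
    for byte in data:
        if byte > 0x7F:
            raise ValueError("input contains a non-ASCII byte")
        acc = (acc << 7) | byte
        nbits += 7
        while nbits >= 8:                    # flush every complete output byte
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
            acc &= (1 << nbits) - 1
    if nbits:                                # pad the final partial byte with zeros
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)


def unpack7(packed: bytes, count: int) -> bytes:
    """Recover `count` ASCII bytes from a stream produced by pack7."""
    acc, nbits, out = 0, 0, bytearray()
    for byte in packed:
        acc = (acc << 8) | byte
        nbits += 8
        while nbits >= 7 and len(out) < count:
            nbits -= 7
            out.append((acc >> nbits) & 0x7F)
            acc &= (1 << nbits) - 1
    return bytes(out)


if __name__ == "__main__":
    text = b"sixteen bytes!!!"
    packed = pack7(text)
    print(len(text), len(packed), unpack7(packed, len(text)) == text)
```

A general-purpose compressor such as gzip will usually shrink text by more than the fixed 1/8 this scheme saves.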

MD5 Hash of ISO-8859-1 string in Java

I'm implementing an interface for a digital payment service called Suomen Verkkomaksut. The information about the payment is sent to them via an HTML form. To ensure that no one messes with the information during the transfer, an MD5 hash is calculated at both ends with a special key that is not sent to them. My problem is that for some reason...

php htmlspecialchars and utf-8

I am just trying to confirm something about htmlspecialchars. I have just converted my database to UTF-8 and I think I finally have it all working, but throughout my code I have used the PHP htmlspecialchars function: htmlspecialchars($val, ENT_QUOTES,'ISO-8859-1',false); Do I need to worry about changing all the entries to : htmlspec...

UTF-8, and mbstring extension in php

While I was converting my Latin-1 MySQL database to UTF-8, I came across this article (http://developer.loftdigital.com/blog/php-utf-8-cheatsheet). Please note that I have successfully converted my database and my app appears to be working/outputting correctly. The previously mentioned link talks about installing and using the mbstring ...

How to spool UTF-8 format data in an Oracle database into a text file

How can I spool UTF-8 format data in an Oracle database into a text file with all UTF-8 characters (for example, Chinese characters) coming out properly? I am trying to spool data from an Oracle database which is UTF-8 enabled into txt or csv. Instead of the Chinese characters I am getting ????. ...

Python csv library with Unicode/UTF-8 support that "just works"

The csv module in Python doesn't work properly when there's UTF-8/Unicode involved. I have found in the Python documentation (http://docs.python.org/library/csv.html) and on other webpages snippets that work for specific cases, but you have to understand well what encoding you are handling and use the appropriate snippet. Is there any universa...
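For context, a sketch assuming Python 3, where the csv module works on text (str) rather than bytes, so the decoding step can be kept separate from the parsing step. The helper names `read_utf8_csv` and `write_utf8_csv` are hypothetical, and the input is assumed to be UTF-8, optionally with a byte order mark.

```python
import csv
import io


def read_utf8_csv(raw: bytes):
    """Decode UTF-8 bytes (a leading BOM is tolerated) and parse CSV rows."""
    text = raw.decode("utf-8-sig")
    # newline="" leaves line endings untouched so the csv module handles them.
    return list(csv.reader(io.StringIO(text, newline="")))


def write_utf8_csv(rows) -> bytes:
    """Serialize rows as CSV text and encode the result as UTF-8."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue().encode("utf-8")


if __name__ == "__main__":
    rows = [["city", "greeting"], ["Göteborg", "好"]]
    print(read_utf8_csv(write_utf8_csv(rows)) == rows)
```

When reading from disk instead, the equivalent is opening the file with `open(path, encoding="utf-8-sig", newline="")` and passing the file object to `csv.reader`.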