character-encoding

Problem with gzip compressing utf-8 encoded php page. HELP!

I use this at the top of my PHP page: if (substr_count($_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip')) ob_start("ob_gzhandler"); else ob_start(); When the page is saved with ANSI encoding, the page is compressed, but when I change the page encoding to UTF-8, compression fails. Please help! I tested compression on www.gidnetwork.com/tools/gzip-t...

merging my CSS and JS files breaks code (working on mac, not on my server)

I'm merging (and afterwards minify with YUI compressor) my CSS and JS files. My web application works fine when just linking the separate files. Now I want to merge the files as one CSS file, so I just basically do the following: find /myapp/js/ -type f -name "incl_*.js" -exec cat {} + > ./temporary/js_backend_merged.js That merges al...

Django: How can I determine why Django isn't displaying certain data?

I have a Django app that runs a tool and displays the results from the tool back to the user using a Django template. Sometimes Django does not display the results. It doesn't complain about anything, it just doesn't display the results. I'm guessing this is something to do with one or more of the characters in the results being illegal ...

Visual Studio 2010 changes files to wrong encoding

I've been annoyed by this for a long time now. Somehow Visual Studio 2010 (VS2008 too IIRC) changes the encoding of my files from "Unicode (UTF-8 with signature) - Codepage 65001" to "Western European (Windows) - Codepage 1252". I have a faint idea that it's either ReSharper or VisualSVN, that's doing the character encoding changes, but...

git->Process->git round trip text encoding

I'm trying to write a utility to migrate code from our custom source control to git. The 'commit' messages in the old system were added incrementally to a text file in the project path, so I'm using git to diff these files from one version to the next, then using that diff as a commit message. Unsurprisingly, I'm having trouble with ac...

iphone charset howto

Hi all, I have a class (downloaded from the internet) that reads the contents of a file and then splits them like this: NSArray* array = [fileContents componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]]; where fileContents is my full file path. How must I write my string? I'm trying with thi...

how to set string character encoding in android

Hi! I have web page content encoded in ISO-8859-2. How do I convert a stream encoded in this charset to Java's UTF-8? I'm trying the code below, but it does not work. It messes up some characters. Is there some other way to do this? BufferedInputStream inp = new BufferedInputStream(in); byte[] buffer = new byte[8192]; int...

MySql varchar change from Latin1 to UTF8

In a MySQL table I'm using the Latin1 character set to store text in a varchar field. As our website is now supported in more countries, we need UTF8 support instead. What will happen if I change these fields to UTF8? Is it safe to do this or will it mess up the data inside these fields? Is it something I need to think about whe...

Accent support for mails in Spring Framework

I'm sending a mail with the word Òmnium (see the accent) in the sender using Spring Framework. The code is the one I found for Spring: org.springframework.mail.javamail.JavaMailSenderImpl sender = sender(); javax.mail.internet.MimeMessage msg = sender.createMimeMessage(); MimeMessageHelper helper = new MimeMessageHelper(msg...

Character encoding issue between different versions of PHP, Apache and MySQL.

I recently converted an old MySQL database stored as latin1_swedish_ci to utf8_general_ci. I've now got the HTTP header specifying UTF-8, the HTML tag on the page, and the data in the database is correctly encoded as utf8_general_ci. It all works fine on my testing server, so I upload the updated HTML files and PHP scripts to the ...

Question Marks Instead of Chinese Characters

I'm trying to place some Chinese text on a website, but as soon as the page is placed online, instead of Chinese text, I see a row of question marks ?????????? ??????????? I tested the same page on a WAMP server before putting it online (all the pages have a php extension) and the Chinese characters show just fine, it is only when the p...

Weird characters on HTML page.

I am using the Last.fm API to fetch some artist info. I save the info in a DB and then display it on my webpage. But characters like “ (double quote) are shown as “ . Example artist info: http://www.last.fm/music/David+Penn and I got the first line as " Producer, arranger, dj and musician from Madrid-Spain. He has his own record company “Ze...

Protocol charset conflict, ESMTP vs. XML in an email body

We have a process in which XML is transferred to us via ESMTP in an email body. The character set of the email body is specified as ISO-8859-1, and no encoding is specified for the XML. According to the protocol, the default is UTF-8. The problem is our XML parser is throwing an exception when it encounters the ® character because it ...
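
The excerpt above is cut off before it states why the parser fails, so the following is only one plausible reading: the body bytes are ISO-8859-1, but the XML parser falls back to UTF-8 because no encoding is declared. A minimal Python sketch of that situation follows; the `raw_body` payload and the element name are hypothetical, and the asker's actual parser and language are not known.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload: "®" stored as the single ISO-8859-1 byte 0xAE,
# with no XML declaration, as described in the question.
raw_body = b"<product>Acme\xae</product>"

# With no declared encoding the parser assumes UTF-8, where a lone 0xAE
# byte is not a valid sequence, so parsing the raw bytes fails.
try:
    ET.fromstring(raw_body)
    failed_as_utf8 = True if False else False
except ET.ParseError:
    failed_as_utf8 = True

# Decode with the charset named in the email headers first, then parse text.
root = ET.fromstring(raw_body.decode("iso-8859-1"))
parsed_text = root.text
```

Decoding with the transport-level charset before parsing is one option; having the sender add an explicit encoding declaration to the XML is another.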

Grab Kanji webpage using Nokogiri

Hi, I would like to grab a kanji table on a Wikipedia page and I'm having trouble using Nokogiri with special characters. Here is my script: # -*- encoding: utf-8 -*- require 'rubygems' require 'nokogiri' require 'open-uri' link = 'http://en.wikipedia.org/wiki/List_of_j%C5%8Dy%C5%8D_kanji' doc = Nokogiri::HTML(open(link)) doc.encoding = 'U...

When parsing XML, the character é is missing

I have an XML document as input to a Java function that parses it and produces an output. Somewhere in the XML there is the word "stratégie". The output is "stratgie". How should I parse the XML so as to get the "é" character as well? The XML is not produced by myself, I get it as a response from a web service and I am positive that "stratégie" is ...

Converting from utf-16 to utf-8 in Python 3

I'm programming in Python 3 and I'm having a small problem for which I can't find any reference on the net. As far as I understand, the default string encoding is UTF-16, but I must work with UTF-8, and I can't find the command that will convert from the default one to UTF-8. I'd appreciate your help very much. ...
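
As a hedged illustration for the question above: a Python 3 `str` is a sequence of Unicode code points rather than UTF-16 data, so "converting to UTF-8" normally means calling `str.encode`. The sample word is borrowed from another question on this page, and the variable names are illustrative only.

```python
# A Python 3 str holds Unicode code points; it has no byte encoding
# until it is explicitly encoded.
text = "stratégie"

# Encode to UTF-8 bytes, for example before writing to a file or socket.
utf8_bytes = text.encode("utf-8")

# If the input really is UTF-16 bytes (say, read from a file), decode it
# to str first and then re-encode as UTF-8.
utf16_bytes = text.encode("utf-16-le")
roundtrip = utf16_bytes.decode("utf-16-le").encode("utf-8")
```

For file I/O, passing `encoding="utf-8"` to `open()` performs the same conversion implicitly.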

Advice on marshalled string that can be either ASCII or UTF-16

Welcome to unsafe land. I'm doing P/Invoke to a legacy lib that gives me a 0-terminated C-style string in the form of an unknown-length unmanaged byte buffer that can be either ASCII or UTF-16, but without giving any indication whatsoever thereof - other than the byte stream itself that is... Right now I have a bad scheme, based on che...

How to cast wchar_t into int for displaying the code point?

I have a simple function in my program, written while I was messing around with Unicode. In this function, I wanted to display the code value of the character the user entered. It SEEMED possible, here's my function: wstring listcode(wchar_t arg) { wstring str = L""; str += static_cast<int> (arg); //I tried (...

Character encoding problem from Facebook JSON to HTML via PHP

I'm getting a JSON encoded array from Facebook which contains: [{"message":"D\u011bkujeme Zuzana Boh\u00e1\u010dov\u00e1 za na\u0161i novou profilovou fotku :-)\nWe thank Zuzana Boh\u00e1\u010dov\u00e1 for our new profile picture :-)"}] When I decode the JSON and output the contents I get: DÄ›kujeme Zuzana BoháÄová za...
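
The garbled output quoted above is consistent with correctly decoded text being written as UTF-8 and then displayed as Windows-1252. The question concerns PHP; the sketch below uses Python only to reproduce the byte-level effect on the first word of the message, and `payload` is a shortened, hypothetical version of the Facebook response.

```python
import json

# Shortened, hypothetical payload; the raw string keeps \u011b as a JSON escape.
payload = r'[{"message":"D\u011bkujeme"}]'

# JSON decoding turns the \uXXXX escape into the real character "ě".
message = json.loads(payload)[0]["message"]

# Written out as UTF-8, "ě" becomes the two bytes 0xC4 0x9B.
utf8_bytes = message.encode("utf-8")

# A viewer that assumes Windows-1252 renders those two bytes as "Ä" and "›".
misread = utf8_bytes.decode("cp1252")
```

Under that reading the decoded data is fine, and the fix lies in declaring UTF-8 to the browser rather than in changing the JSON handling.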

In Java: why do some Stream methods take int instead of byte or even char?

Hi folks, why do some methods that write bytes/chars to streams take int instead of byte/char? Someone told me, for the case of int instead of char, that it is because a char in Java is just 2 bytes long, which is OK for most character symbols already in use, but for certain character symbols (Chinese or whatever), the character is being represented in ...