I made a PHP script that generates CSV files that were previously generated by another process.
And then, the CSV files have to be imported by yet another process.
The import of the old CSV files works fine, but when importing the new CSV files there are issues with special characters.
When I open old CSVs with Notepad++, it says t...
Interesting question... if I have a MySQL table with CHARSET=utf8, and I open a connection with latin1 encoding, what happens?
I tried this, and even characters such as ß and æ could be stored and retrieved properly. Those characters are represented with different byte sequences in utf8 and in latin1, so I didn't expect it to work.
Is ...
Hello!
I want to output the following string in PHP:
ä ö ü ß €
Therefore, I've encoded it to utf8 manually:
ä ö ü ß €
So my script is:
<?php
header('content-type: text/html; charset=utf-8');
echo 'ä ö ü ß €';
?>
The first 4 characters are correct (ä ö ü ß) but unfortunately the € sign isn't correct:
ä ö ü ß
Here you...
Hello
I am interfacing with a Java application via Python. I need to be able to construct byte sequences which contain UTF-8 strings. Java uses a modified UTF-8 encoding in DataInputStream.readUTF(), which is not supported by Python (yet, at least).
Can anybody point me in the right direction to construct java modified utf-8 strings in py...
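A minimal sketch of one way to build such byte sequences, assuming Python 3 and the format Java documents for DataInput.readUTF(): a two-byte big-endian length prefix, U+0000 written as C0 80, and supplementary characters written as two three-byte surrogates. The function names encode_modified_utf8 and write_utf are hypothetical, not part of any library.

```python
import struct


def encode_modified_utf8(text: str) -> bytes:
    """Encode text the way Java's modified UTF-8 does (no length prefix)."""
    out = bytearray()
    # Iterate over UTF-16 code units so that characters outside the BMP
    # are seen as surrogate pairs, as they are on the Java side.
    utf16 = text.encode("utf-16-be")
    for (unit,) in struct.iter_unpack(">H", utf16):
        if 0x0001 <= unit <= 0x007F:
            out.append(unit)
        elif unit <= 0x07FF:
            # Covers U+0080..U+07FF and also U+0000, which becomes C0 80.
            out.append(0xC0 | (unit >> 6))
            out.append(0x80 | (unit & 0x3F))
        else:
            out.append(0xE0 | (unit >> 12))
            out.append(0x80 | ((unit >> 6) & 0x3F))
            out.append(0x80 | (unit & 0x3F))
    return bytes(out)


def write_utf(text: str) -> bytes:
    """Return the bytes DataInputStream.readUTF() expects: length + payload."""
    body = encode_modified_utf8(text)
    if len(body) > 0xFFFF:
        raise ValueError("encoded string is longer than 65535 bytes")
    return struct.pack(">H", len(body)) + body
```

For plain BMP text without NUL characters the payload is identical to standard UTF-8, so only the length prefix differs.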
I'm attempting to apply a stylesheet to an XML document using Saxon. Given an XML file that was generated in Microsoft Word and that has Microsoft Word-style quotes, such as around FOO in the following document
<?xml version="1.0" encoding="UTF-8"?>
<doc>
<act>
<performer typeCode=“FOO“ />
<performer typeCode="BAR" />
...
I have a question that may be quite naive, but I feel the need to ask, because I don't really know what is going on. I'm on Ubuntu.
Suppose I do
echo "t" > test.txt
if I then
file test.txt
I get test.txt: ASCII text
If I then do
echo "å" > test.txt
Then I get
test.txt: UTF-8 Unicode text
How does that happen? How does file...
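As a rough illustration: the file itself carries no encoding label, so a tool can only guess from the bytes it finds. The sketch below is a simplified guess-by-content check in Python, not the actual logic of file, which applies more heuristics and knows more encodings. The function name classify and the label "non-UTF-8 data" are hypothetical.

```python
def classify(data: bytes) -> str:
    """Guess a text encoding purely from the byte content."""
    if not data:
        return "empty"
    try:
        # Every byte below 0x80 means the content is plain ASCII.
        data.decode("ascii")
        return "ASCII text"
    except UnicodeDecodeError:
        pass
    try:
        # Bytes >= 0x80 that form well-formed multi-byte sequences.
        data.decode("utf-8")
        return "UTF-8 Unicode text"
    except UnicodeDecodeError:
        return "non-UTF-8 data"
```

With a UTF-8 terminal, echo "å" writes the two bytes C3 A5 plus a newline, which is valid UTF-8 but not ASCII, so the second branch matches.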
Should I write my own or is there a library function that already does that? I need this for a pidgin plugin, so if there is something in the pidgin/purple/gnome libraries, that would be ideal. But other sources are fine, too.
...
On a modern Unix or Linux system, how can you tell which code set the /etc/passwd file stores user names in? Are user names allowed to contain accented characters (from the range 0x80..0xFF in, say, ISO 8859-1 or 8859-15)? Can the /etc/passwd file contain UTF-8? Can you tell that it contains UTF-8? What about the plain text of passwo...
I'm getting an HTML file as NSData and need to extract some parts of it. For that I need to convert it to NSString with UTF8 encoding. The thing is that this conversion fails, probably because the NSData contains bytes that are invalid for UTF8. I have tried to get the byte array of the data and go over it, but each time I come across no...
Which widely used programming languages were designed ground-up with Unicode support?
A lot of programming languages have added Unicode support as an afterthought in later versions, but which widely used languages were released with Unicode support from day one?
...
Heya guys,
I'm in desperate need of help.
I have a Java servlet that is accessed by an HTTP GET URL with eight parameters in it.
The problem is that the parameters are not exclusive to English.
Any other language can be in those parameters, like Hebrew, for example.
Now, when I send the data - either from the class that is supposed to...
We are having trouble getting a Unicode string to convert to a UTF-8 string to send over the wire:
// Start with our unicode string.
string unicode = "Convert: \u10A0";
// Get an array of bytes representing the unicode string, two for each character.
byte[] source = Encoding.Unicode.GetBytes(unicode);
// Convert the Unicode bytes to U...
I'm using python 2.6.2's xml.etree.cElementTree to create an xml document:
import xml.etree.cElementTree as etree
elem = etree.Element('tag')
elem.text = (u"Würth Elektronik Midcom").encode('utf-8')
xml = etree.tostring(elem,encoding='UTF-8')
At the end of the day, xml looks like:
<?xml version='1.0' encoding='UTF-8'?>
<tag>WÃ&#...
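A hedged sketch of the usual way around this kind of double encoding, assuming Python 3 and xml.etree.ElementTree rather than the Python 2.6 cElementTree module used above: the element text is left as a text string and the serializer is the only place where encoding happens. The helper name build_xml is hypothetical.

```python
import xml.etree.ElementTree as etree


def build_xml(text: str) -> bytes:
    """Serialize a single <tag> element as UTF-8 encoded XML."""
    elem = etree.Element("tag")
    # Assign the text as a str; do not call .encode() here, otherwise
    # the content ends up encoded twice (or is not text at all).
    elem.text = text
    return etree.tostring(elem, encoding="UTF-8")


xml_bytes = build_xml("Würth Elektronik Midcom")
```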
Hi all,
I'm working on a project that involves Maven, Java and Clojure. The problem is that I have some UTF-8 characters in my Clojure source files, so my source code is not interpreted correctly by the Java compiler. I more or less got it working by setting the environment variable JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF...
I have this character showing up occasionally and I can't seem to find it in the ASCII table. I'd like to run a filter on the data before it's sent to the database but I have to know what it is first. Maybe someone can clue me in. I am using a WYSIWYG editor and this is where it's coming from. The character appears very sporadically but se...
How can I replace invalid characters in a UTF-8 string with whitespace characters (using a regex in PHP 5)?
...
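The question asks for a PHP regex; the sketch below shows the same idea in Python instead, using a custom decode error handler rather than a regex. It assumes the input is available as raw bytes. The handler name "space" and the function replace_invalid_utf8 are hypothetical.

```python
import codecs


def _space_handler(err):
    """Replace each undecodable byte sequence with a single space."""
    if isinstance(err, UnicodeDecodeError):
        # Return the replacement text and the position to resume decoding at.
        return (" ", err.end)
    raise err


# Register the handler under a name usable in bytes.decode(errors=...).
codecs.register_error("space", _space_handler)


def replace_invalid_utf8(data: bytes) -> str:
    """Decode data as UTF-8, turning invalid sequences into spaces."""
    return data.decode("utf-8", errors="space")
```

Valid characters, including legitimate non-ASCII text, pass through unchanged; only sequences the decoder rejects are replaced.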
I use OS X and I am currently cooperating with a Windows user and deploying the scripts on a Linux server. We use git for version control, and I keep getting R scripts from his end where the character encoding is a mix of latin1 and utf8. So I have a couple of questions.
Is there a simple to use editor for windows that h...
I'm at the receiving end of an HTTP POST (x-www-form-urlencoded), where one of the fields contains an XML document. I need to receive that document, look at a couple of elements, and store it in a database (for later use).
The document is in UTF-8 format (and has the appropriate header), and can contain lots of strange characters.
When I...
I have a huge MySQL table which has its rows encoded in UTF-8 twice.
For example "Újratárgyalja" is stored as "Újratárgyalja".
The MySQL .Net connector downloads them this way. I tried lots of combinations with System.Text.Encoding.Convert() but none of them worked.
Sending set names 'utf8' (or other charset) won't solve it.
How can...
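A minimal sketch of the usual repair for text that was UTF-8 encoded twice, shown in Python rather than .NET: re-encode the garbled string with the single-byte codec that was wrongly applied, then decode the resulting bytes as UTF-8. The wrong codec is assumed to be latin-1 here; MySQL's latin1 is based on Windows-1252, so "cp1252" may be the right choice for real data. The function name undo_double_utf8 is hypothetical.

```python
def undo_double_utf8(text: str, wrong_codec: str = "latin-1") -> str:
    """Reverse one extra round of 'decode as wrong_codec, encode as UTF-8'."""
    # Step 1: get back the original UTF-8 bytes.
    raw = text.encode(wrong_codec)
    # Step 2: interpret those bytes as what they really are.
    return raw.decode("utf-8")


original = "Újratárgyalja"
# Simulate the damage: UTF-8 bytes misread as latin-1 characters.
garbled = original.encode("utf-8").decode("latin-1")
repaired = undo_double_utf8(garbled)
```

Rows that were never double-encoded may raise an error or change meaning under this transformation, so it is worth checking each value before overwriting it.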
Just a quick one:
Will a SELECT ... WHERE name LIKE '...' query be faster if the name column is ASCII rather than UTF-8?
Thanks!
...