I have a table like this, where one column is latin1, the other is UTF-8:
Create Table: CREATE TABLE `names` (
`name_english` varchar(255) character set latin1 NOT NULL,
`name_chinese` varchar(255) character set utf8 default NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1
When I do an insert, I have to type _utf8 before values being inserted in...
One of our programs writes program information (window title, memory, etc.) to Java Preferences. On Windows this is stored in the registry. How can I read the values written by the Java program using C (or C++)?
It looks like the API I should use is RegGetValue. Is this guaranteed to work on 32-bit Windows XP?
The string written by Java is UTF-8 ...
I have several documents I need to convert from ISO-8859-1 to UTF-8 (without the BOM of course). This is the issue though. I have so many of these documents (it is actually a mix of documents, some UTF-8 and some ISO-8859-1) that I need an automated way of converting them. Unfortunately I only have ActivePerl installed and don't know muc...
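One common approach to a mixed batch like this is to try a strict UTF-8 decode first and fall back to ISO-8859-1 only when that fails. Below is a minimal sketch in Python rather than Perl; the function name `to_utf8` is made up for illustration. It assumes every file is either valid UTF-8 or ISO-8859-1, and it cannot detect an ISO-8859-1 file whose bytes happen to form valid UTF-8.

```python
def to_utf8(raw: bytes) -> bytes:
    """Return the input re-encoded as UTF-8 without a BOM (hypothetical helper)."""
    try:
        # Strict decode: succeeds only for valid UTF-8.
        # "utf-8-sig" also drops a leading BOM if one is present.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is ISO-8859-1,
        # which maps every byte value to a code point.
        text = raw.decode("iso-8859-1")
    return text.encode("utf-8")


latin1_input = "café".encode("iso-8859-1")
utf8_input = "café".encode("utf-8")
converted = [to_utf8(latin1_input), to_utf8(utf8_input)]
```

Both inputs end up as the same UTF-8 bytes, so files that are already UTF-8 are not double-encoded.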
Well, the subject says everything. I'm using json_encode to convert some UTF-8 data to JSON, and I need to transfer it to a layer that is currently ASCII-only. So I wonder whether I need to make that layer UTF-8 aware, or whether I can leave it as it is.
Looking at the JSON RFC, UTF-8 is also a valid charset for JSON output, although not recommended, i.e. som...
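JSON can represent any non-ASCII character as a `\uXXXX` escape, so an encoder that escapes everything yields pure ASCII output. The sketch below shows this with Python's `json` module, whose `ensure_ascii` option is on by default. It is only an illustration of the idea; PHP's json_encode escapes non-ASCII characters in a similar way unless JSON_UNESCAPED_UNICODE is passed. The sample data is made up.

```python
import json

# Made-up sample data containing non-ASCII characters.
data = {"name": "Pražský hrad"}

# ensure_ascii=True is the default: non-ASCII characters become \uXXXX escapes.
encoded = json.dumps(data)

# The result contains only ASCII characters and decodes back to the same data.
is_ascii = encoded.isascii()
round_trip = json.loads(encoded)
```

If the output is escaped this way, the ASCII-only layer never sees a byte above 0x7F.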
My OS is Debian, my default locale is UTF-8 and my compiler is gcc. By default CHAR_BIT in limits.h is 8 which is ok for ASCII because in ASCII 1 char = 8 bits. But since I am using UTF-8, chars can be up to 32 bits which contradicts the CHAR_BIT default value of 8.
If I modify CHAR_BIT to 32 in limits.h to better suit UTF-8, what do I ...
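UTF-8 is a variable-length encoding built from 8-bit code units: one character occupies one to four bytes, and each of those bytes still fits in an 8-bit char. A small Python sketch (not C, so only an illustration of the byte counts) makes this visible:

```python
# One sample character for each possible UTF-8 length.
samples = ["A", "é", "我", "😀"]

# Number of 8-bit code units (bytes) each character needs in UTF-8.
widths = {ch: len(ch.encode("utf-8")) for ch in samples}

# Every code unit is a value in 0..255, i.e. it fits in 8 bits.
max_unit = max(b for ch in samples for b in ch.encode("utf-8"))
```

Here `widths` maps the four samples to 1, 2, 3 and 4 bytes respectively, while no single byte exceeds 255.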
I'd like to use Unicode symbols on my website (especially Dingbats).
Is there any way to enable this in all (or at least some) browsers on Windows XP, without requiring the user to adjust any settings?
I use the HTML5 doctype with the charset configured to UTF-8:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
...
I have a string that in Unicode is "hao123－－我的上网主页", while as UTF-8 in a C++ string it is "hao123锛嶏紞鎴戠殑涓婄綉涓婚〉", but I need to write it to a file in this format: "hao123\uFF0D\uFF0D\u6211\u7684\u4E0A\u7F51\u4E3B\u9875". How can I do it? I know little about this encoding. Can anyone help? Thanks!
...
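The target format looks like ASCII text in which every non-ASCII character is written as a `\uXXXX` escape of its UTF-16 code unit. A minimal sketch of that conversion in Python (the question is about C++, so this only illustrates the logic; `escape_non_ascii` is a made-up name):

```python
def escape_non_ascii(text: str) -> str:
    """Replace each non-ASCII character with \\uXXXX escapes (hypothetical helper)."""
    out = []
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ch)  # plain ASCII passes through unchanged
        else:
            # Encode as big-endian UTF-16 so that characters outside the
            # Basic Multilingual Plane become two escapes (a surrogate pair).
            units = ch.encode("utf-16-be")
            for i in range(0, len(units), 2):
                out.append("\\u%04X" % int.from_bytes(units[i:i + 2], "big"))
    return "".join(out)


# U+FF0D is the fullwidth hyphen-minus that appears in the expected output.
escaped = escape_non_ascii("hao123\uFF0D\uFF0D我的上网主页")
```

The important step in any language is to decode the UTF-8 bytes into code points first, then format each code point, rather than escaping the raw bytes.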
I want to constrain the input of special signs like £ ¬ ¦ in JavaScript, but they are always displayed as ��� in the page source. How can I make them display correctly so that the page validates? My page uses UTF-8.
Thanks
...
I have a form on a page that sends data to a PHP file via an Ajax request. The data is then collected into a single variable and sent to the email address specified in the PHP file. The data is in Slovenian and uses a lot of letters with diacritics (š, ć, ž). Everything works fine when the form is submitted from any browser that isn't Internet Explorer,...
Good day,
I have a script that scrapes the title/description of remote pages and prints those values into a corresponding charset=UTF-8 encoded page. Here is the problem: whenever a remote page uses an encoding for a non-Latin script (Arabic, Russian, Chinese, Japanese, etc.), the imported values print as garbled text.
I've tr...
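Garbling like this usually means the remote bytes were printed without first being decoded from the encoding the remote page declares. A rough sketch of the decode step in Python, assuming the charset is available in the Content-Type header (real pages may declare it only in a meta tag, or declare it wrongly); `decode_page` is a made-up name:

```python
import re


def decode_page(body: bytes, content_type: str, fallback: str = "utf-8") -> str:
    """Decode a fetched page using the charset named in its Content-Type header."""
    match = re.search(r'charset="?([\w-]+)', content_type, re.IGNORECASE)
    charset = match.group(1) if match else fallback
    try:
        return body.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong charset: fall back and mark undecodable bytes.
        return body.decode(fallback, errors="replace")


# Simulated remote page body in a non-UTF-8 encoding.
remote = "Привет".encode("windows-1251")
title = decode_page(remote, "text/html; charset=windows-1251")
utf8_output = title.encode("utf-8")  # what the UTF-8 page should receive
```

Once the text is decoded correctly, encoding it as UTF-8 for the output page is lossless.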
I'd like to remove all invalid UTF-8 characters from a string in JavaScript.
I've tried using the approach described here (link removed) and came up with the JavaScript:
strTest = strTest.replace(/([\x00-\x7F]|[\xC0-\xDF][\x80-\xBF]|[\xE0-\xEF][\x80-\xBF]{2}|[\xF0-\xF7][\x80-\xBF]{3})|./g, "$1");
It seems that the UTF-8 validation...
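Note that a byte-range pattern of this kind is looser than real UTF-8 validation: for example, the lead bytes C0 and C1 can only begin overlong forms, and F5 to F7 are not valid lead bytes at all. When the data is available as raw bytes, a strict decoder can do the filtering instead. A minimal Python sketch (not JavaScript; `strip_invalid_utf8` is a made-up name):

```python
def strip_invalid_utf8(raw: bytes) -> str:
    """Decode UTF-8 and silently drop any bytes that are not valid UTF-8."""
    return raw.decode("utf-8", errors="ignore")


stray_byte = strip_invalid_utf8(b"abc\xffdef")           # 0xFF never occurs in UTF-8
overlong = strip_invalid_utf8(b"a\xc0\xafb")             # overlong encoding of "/"
truncated = strip_invalid_utf8("é".encode("utf-8") + b"\xc3")  # dangling lead byte
```

Using `errors="replace"` instead would insert U+FFFD at each bad position, which is often easier to debug than silent removal.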
How can I tell SPARQL that ?churchname is UTF-8 encoded? The response looks like: Pražský hrad
PREFIX lgv: <http://linkedgeodata.org/vocabulary#>
PREFIX abc: <http://dbpedia.org/class/yago/>
SELECT ?churchname
WHERE
{
<http://dbpedia.org/resource/Prague> geo:geometry ?gm .
?church a lgv:castle .
?church geo:g...
Hi all,
I have a wchar_t string, for example L"hao123－－我的上网主页". I can convert it to UTF-8
encoding, and the output string is "hao123锛嶏紞鎴戠殑涓婄綉涓婚〉", but in the end I must write this
string to a plain text file in a UTF-16 format (I know this from others): "hao123\uFF0D\uFF0D\u6211\u7684\u4E0A\u7F51\u4E3B\u9875".
Because I must save it in...
I'm developing a Firefox plugin and I fetch web pages to do some analysis for the user. The problem is that when I try to get (XMLHttpRequest) pages that are not UTF-8 encoded, the string I see is messed up. For example, Hebrew pages with windows-1255 or Chinese pages with gb2312.
I already tried the following:
var uDecoder=Components.classe...
How can I get results in UTF-8 format in Jena (Java)?
My code:
Query query= QueryFactory.create(queryString);
QueryExecution qexec= QueryExecutionFactory.sparqlService("http://lod.openlinksw.com/sparql", queryString);
ResultSet results = qexec.execSelect();
List<QuerySolution> list = ResultSetFormatter.toList(results);
System....
Hi,
I am trying to display characters like £ on a device that runs Linux.
It uses the UTF-8 charset.
When I display a string that contains special characters, it displays other characters too.
If I print the string on the console it appears OK, but when I parse the string to load each letter font on the screen i...
I'm writing a PHP script to export MySQL database rows into a .txt file formatted for Adobe InDesign's internal markup.
Exports work, but when I encounter special characters like é or umlauts, I get weird symbols (e.g. ChloÃ« Hanslip instead of Chloë Hanslip). Rather than run a search and replace for every possible weird character, I need...
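Symptoms like this typically appear when UTF-8 bytes are interpreted as Latin-1 (or Windows-1252) somewhere along the way. The Python sketch below reproduces the effect and reverses it. It is an illustration of the mechanism only, not a diagnosis of this particular PHP/MySQL setup.

```python
original = "Chloë Hanslip"

# UTF-8 bytes misread as Latin-1: "ë" (bytes C3 AB) turns into two characters.
garbled = original.encode("utf-8").decode("latin-1")

# Reversing the wrong step recovers the text, provided no bytes were lost.
repaired = garbled.encode("latin-1").decode("utf-8")
```

The longer-term fix is usually to make every stage (connection, export, output file) agree on one encoding rather than to repair strings after the fact.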
We've recently hit a snag where a trademark symbol is copied from one Oracle database to another but arrives as a '?'.
We've tracked the issue to the destination database being configured with a character set of 'US7ASCII'. Unfortunately, rebuilding the database to address this is not something we can do at the p...
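A 7-bit ASCII character set simply has no code for the trademark symbol (U+2122), so a conversion into it has to substitute something. The Python sketch below is an analogy for that lossy step, not Oracle's actual conversion code; the sample text is made up.

```python
# Made-up sample text ending in the trademark symbol U+2122.
text = "Acme\u2122"

# Encoding into 7-bit ASCII with replacement: unrepresentable characters become "?".
lossy = text.encode("ascii", errors="replace")

# The original character cannot be recovered from the substituted byte.
recovered = lossy.decode("ascii")
```

Because the substitution destroys information, any workaround has to act before the data reaches the ASCII-only column.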
I'm trying to search a text for a match and return it with a snippet around it. For this, I want to find the match with a regex, then cut the string using the match index +- the snippet radius (text.mb_chars[start..finish]).
However, I cannot get Ruby's (1.8) regex to return a match index that is multi-byte aware.
I understand that regex is one ...
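When a regex engine reports byte offsets, one workaround is to convert the offset afterwards by counting the characters in the bytes that precede the match. The sketch below shows the conversion in Python using a bytes pattern, as an analogy to the byte-oriented Ruby 1.8 behaviour described above; `char_index` is a made-up name and the sample text is arbitrary.

```python
import re


def char_index(data: bytes, byte_index: int) -> int:
    """Convert a byte offset into UTF-8 data to a character offset."""
    # Assumes byte_index falls on a character boundary, as a match start does.
    return len(data[:byte_index].decode("utf-8"))


text = "žluťoučký kůň"
needle = "kůň"
data = text.encode("utf-8")

# Searching the raw bytes yields a byte offset, not a character offset.
match = re.search(re.escape(needle.encode("utf-8")), data)
byte_start = match.start()
char_start = char_index(data, byte_start)
snippet = text[char_start:char_start + len(needle)]
```

The byte offset is larger than the character offset here because several characters before the match take two bytes each.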
I have these lines in Vim:
a
c
b
e
é
f
g
and when I do :%sort, I get this:
a
b
c
e
f
g
é
Obviously, the "é" line should not be at the end; it should come after the "e" line. Is it possible to get Vim to sort these lines correctly, using the actual character rather than its ASCII code?
I also tried with :!sort (to use G...
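A plain sort compares code point values, and "é" (U+00E9) lies above every unaccented ASCII letter, which explains the result shown. As a rough sketch of accent-aware ordering, the Python below sorts by the base letters first and uses the original string as a tie-breaker. It is an approximation for illustration, not a Vim solution and not a substitute for proper locale collation; `base_letters` is a made-up name.

```python
import unicodedata


def base_letters(s: str) -> str:
    """Strip combining marks so that "é" compares like "e"."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


lines = ["a", "c", "b", "e", "é", "f", "g"]

# Code point order: "é" ends up last.
naive = sorted(lines)

# Base letter first, original string as tie-breaker: "é" follows "e".
ordered = sorted(lines, key=lambda s: (base_letters(s), s))
```

With this key, `ordered` places "é" directly after "e", while `naive` reproduces the order from the question.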