questions about utf-8 | ansaurus

utf-8

Prawn & Prawnto Rails PDF generation - UTF-8 ??

I'm using Ruby, Prawn, and Prawnto to dynamically generate PDFs containing text in other languages. I can't seem to get any text in languages with non-English characters to show up. It doesn't throw any errors; it just shows a bunch of dashes instead of characters. Prawn brags on its homepage about UTF-8 support so I don't see why this is...

Search or compare within a Grapheme Cluster in Korean

In my current implementation of a UISearchBarController I'm using [NSString compare:] inside the filterContentForSearchText:scope: delegate method to return relevant objects based on their name property to the results UITableView as you start typing. So far this works great in English and Korean, but what I'd like to be able to do is se...

Converting Multibyte characters to UTF-8

Hi all, my application has to write data to an XML file which will be read by a SWF file. The SWF expects the data in the XML to be in UTF-8 encoding. I have to convert some multibyte characters in my app (Simplified Chinese, Japanese, Korean, etc.) to UTF-8. Are there any API calls which could allow me to do this? I would pre...

character-encoding

MySQL "incorrect string value" error when saving a Unicode string in Django

I got a strange error message when I tried to save first_name and last_name to Django's auth_user model. Failed examples: user = User.objects.create_user(username, email, password) user.first_name = u'Rytis' user.last_name = u'Slatkevičius' user.save() >>> Incorrect string value: '\xC4\x8Dius' for column 'last_name' at row 104 user.first_name ...
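
The bytes quoted in that error message can be checked against the name itself. The following is a minimal Python 3 sketch (the question uses Python 2 `u''` literals), assuming nothing about the MySQL configuration; it only shows that `\xC4\x8D` is the UTF-8 encoding of "č", so the rejected value starts exactly at the first non-ASCII character.

```python
# The last name from the question, as a Unicode string.
last_name = "Slatkevi\u010dius"  # "Slatkevičius"

# Encode it the way a UTF-8 client connection would send it.
encoded = last_name.encode("utf-8")

# "č" (U+010D) takes two bytes in UTF-8: 0xC4 0x8D.
# Slice from the first of those bytes to the end of the name.
tail = encoded[encoded.index(b"\xc4"):]
print(tail)  # b'\xc4\x8dius'
```

The printed tail matches the value in the error message, which suggests the failure is about the column or connection character set rather than about the Python string.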

Using Unicode characters in generated PDF (Java, iText)

Hi. I have a problem with Unicode characters in a generated PDF. Everything works fine on my own workstation, but in the test environment things go wrong. The code inserting the value is the following: Font boldDefaultFont = FontFactory.getFont(FontFactory.HELVETICA, 10, Font.BOLD); // ... PdfPCell headerCell = new PdfPCell(); // unit.getName() ...

VS2005 UTF-8 generic HTTP handler: problem with certain chars in query string (e.g. þ æ)

I am developing a generic HTTP handler in VS2005 and testing it in Debug Mode. It works well except when the query string contains higher-bit characters, e.g. Latin Small Letter Thorn (U+00FE, þ) and Latin Small Letter Ae (U+00E6, æ). IE8 on my machine is set to send UTF-8 URLs. I am typing the following into the IE8 address bar when debuggi...

visual-studio-2005

contentencoding

How to read a Unicode (UTF-8) / binary file line by line

Hi programmers, I want to read, line by line, a Unicode (UTF-8) text file created by Notepad. I don't want to display the Unicode strings on the screen; I just want to read and compare the strings. This code reads an ANSI file line by line and compares the strings. What I want: read test_ansi.txt line by line; if the line = "b" print "YES!" else pri...
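
One detail that often matters for this kind of comparison is the byte order mark that Notepad has traditionally written at the start of UTF-8 files. Below is a minimal Python sketch, not the asker's code: the helper name `read_lines` and the sample file contents are assumptions made for illustration. It reads the file as UTF-8, drops a leading BOM if one is present, and compares each line to "b".

```python
import os
import tempfile


def read_lines(path):
    """Return the lines of a UTF-8 text file, without line endings or a leading BOM."""
    # "utf-8-sig" decodes UTF-8 and silently drops a leading byte order mark.
    with open(path, encoding="utf-8-sig") as handle:
        return [line.rstrip("\r\n") for line in handle]


# Hypothetical sample file: BOM, then three lines with Windows line endings.
sample = b"\xef\xbb\xbfa\r\nb\r\n\xc3\xa9\r\n"
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(sample)
    sample_path = tmp.name

lines = read_lines(sample_path)
os.remove(sample_path)

# Compare each decoded line against the target string.
matches = [line == "b" for line in lines]
print(matches)  # [False, True, False]
```

Without the BOM handling, the first line would decode as "\ufeffa" and would never compare equal to a plain "a".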

LESSCHARSET=utf-8 less doesn't seem to work

I'm trying to view a UTF-8 text file/stream in less, and even if I invoke it like this: cat file | LESSCHARSET=utf-8 less the non-ASCII UTF-8 characters don't display correctly. Instead, their hex values appear highlighted in brackets, e.g. <F4>. Reading the same text in vim with UTF-8 encoding poses no problems. So I'...

Change encoding from UTF-8 to ISO-8859-2 in Javascript

I would like to change string encoding from UTF-8 to ISO-8859-2 in Javascript. How can I do it? I need it because I've designed a widget. The user just copies a <script> tag from my site and puts it on theirs. This script creates a div and fills it with the widget contents, including text. If the target website is in UTF-8 encoding, it works fine. But when ...

character-encoding

internationalization

Reading and outputting UTF-8 strings in C/Cocoa

In an Objective-C/Cocoa app, I am using C functions to open a text file, read it line by line and use some lines in a third-party function. In pseudo-code: char *line = fgets(aFile); library_function(line); // This function calls for a utf-8 encoded char * string This works fine until the input file contains special characters (such ...

How do I print UTF-8 characters in C++?

How do I print these UTF-8 characters in C++? ...

character-encoding

UTF-8 problem when using jQuery autocomplete tags

Hey mates. Recently I used the jQuery auto-complete tag plugin http://devthought.com/projects/jquery/textboxlist/ and everything goes fine except for UTF-8 tag suggestions: only English tags are suggested. I guess something goes wrong in the script lines. It works fine with English tags but not with multibyte languages like Persian ...

PHP and Russian Letters

What happens to Russian letters when they are sent via a PHP request to ... a mail, for example? The hardcoded Russian letters are displayed properly, but the ones from the form's textboxes come out as hieroglyphs: HTML page: <tr> <td style="width: 280px">Содержание работ</td> <td><input type="text" id="workContent"/></td> </tr> PHP page: $WorkCont...

Importing bulk CSV data in UTF-8 into MySQL

Hi, I'm trying to import about 10K rows of UTF-8 encoded data into a new MySQL table. I can do so successfully with LOAD DATA INFILE via MySQL Workbench but the UTF-8 characters get mangled. I've tested the database otherwise via PHP and it accepts and stores UTF-8 characters fine. The problem seems to be with LOAD DATA INFILE, and I've ...

load-data-infile

PHP filter_var() - FILTER_VALIDATE_URL

The FILTER_VALIDATE_URL filter seems to have some trouble validating non-ASCII URLs: var_dump(filter_var('http://pt.wikipedia.org/wiki/', FILTER_VALIDATE_URL)); // http://pt.wikipedia.org/wiki/ var_dump(filter_var('http://pt.wikipedia.org/wiki/Guimarães', FILTER_VALIDATE_URL)); // false Why isn't the last URL correctly validated? And ...
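
The generic URL syntax (RFC 3986) only allows ASCII characters, so a path segment like "Guimarães" is normally percent-encoded before it goes on the wire. As an illustration only, here is a Python sketch of that encoding step; the helper name `percent_encode_path` is hypothetical, and this is not a statement about how PHP's filter works internally.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def percent_encode_path(url):
    """Percent-encode the non-ASCII characters in the path of a URL as UTF-8."""
    parts = urlsplit(url)
    # quote() encodes as UTF-8 by default and leaves "/" untouched.
    encoded_path = quote(parts.path)
    return urlunsplit(
        (parts.scheme, parts.netloc, encoded_path, parts.query, parts.fragment)
    )


print(percent_encode_path("http://pt.wikipedia.org/wiki/Guimarães"))
# http://pt.wikipedia.org/wiki/Guimar%C3%A3es
```

The resulting URL contains only ASCII characters, which is the form an ASCII-only validator would be expected to accept.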

Django dumpdata UTF-8 (Unicode)

Is there an easy way to dump UTF-8 data from a database? I know this command: manage.py dumpdata > mydata.json But in the file mydata.json, the Unicode data looks like: "name": "\u4e1c\u6cf0\u9999\u6e2f\u4e94\u91d1\u6709\u9650\u516c\u53f8" I would like to see a real Unicode string like 全球卫星定位系统 (Chinese). ...
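
The `\uXXXX` sequences are ordinary JSON escapes, so the data is intact; only the serialization is ASCII-safe. A small sketch with Python's standard `json` module shows the difference between the escaped and the readable form. Whether `dumpdata` exposes an equivalent switch is not stated in the question, so this only illustrates the serializer behaviour.

```python
import json

# The escaped value exactly as it appears in mydata.json in the question.
escaped = r'"\u4e1c\u6cf0\u9999\u6e2f\u4e94\u91d1\u6709\u9650\u516c\u53f8"'

# Parsing the JSON turns the escapes back into real characters.
name = json.loads(escaped)

# Default serialization escapes every non-ASCII character again.
default_dump = json.dumps({"name": name})

# ensure_ascii=False keeps the characters as they are.
readable_dump = json.dumps({"name": name}, ensure_ascii=False)

print(default_dump)
print(readable_dump)
```

Both outputs decode to the same data; when writing the readable form to a file, the file should be opened with an explicit UTF-8 encoding.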

Working with UTF-8 instead of windows-1255

In the past, I worked with windows-1255. Now my new page is written in UTF-8. When I send a query to the DB (MS-Access), I get no results. The query in the URL looks the same as when I type it in myself, but in that case (typing) I do get results. How can it happen that I see the same URL on my IE and get the results and the other (that come...

Protocol buffers and UTF-8

The history of encoding schemes, multiple operating systems, and endianness has led to a mess in terms of encoding all forms of string data (i.e., all alphabets); for this reason protocol buffers only deals with ASCII or UTF-8 in its string types, and I can't see any polymorphic overloads that accept the C++ wstring. The question then...

protocol-buffers

Character encoding changes after window.open()

The site from which I'm calling the window.open() function is in UTF-8, using the <meta> tag, and everything works well, but once I call the function and open another window with the same tag, the new window shows weird characters even though in page info it shows that the encoding stays the same (UTF-8). This is the same problem as mine: ...

character-encoding

Converting HTML character encoding in Java

We are trying to download the source of webpages, however we cannot see some specific characters, like ü, ö, ş, ç, properly due to character encoding. We tried the following code in order to convert the encoding of the string ("text" variable): byte[] xyz = text.getBytes(); text = new String(xyz,"windows-1254"); We observed that if encoding is...
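
A general point behind this kind of problem is that a charset has to be applied when the raw bytes are first decoded; once a string has been built with the wrong charset, re-encoding it does not reliably bring the original characters back. The sketch below uses Python rather than Java, and the sample bytes are an assumption standing in for a downloaded page encoded as windows-1254.

```python
# Hypothetical raw bytes of a page that was served as windows-1254 (Turkish).
raw = "ü,ö,ş,ç".encode("cp1254")

# Decoding with the wrong charset produces a different string.
wrong = raw.decode("latin-1")

# Decoding the same bytes with the page's real charset recovers the text.
right = raw.decode("cp1254")

print(wrong == right)  # False
print(right)           # ü,ö,ş,ç
```

In this sample only "ş" comes out differently under latin-1, which matches the common experience that some accented characters look fine while others are garbled.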
