This is my code:
import re
from whoosh.analysis import RegexAnalyzer
# One token per CJK character, or a run of word characters with optional inner dots (Python 2 syntax)
rex = RegexAnalyzer(re.compile(ur"([\u4e00-\u9fa5])|(\w+(\.?\w+)*)"))
a = [token.text for token in rex(u"hi 中 000 中文测试中文 there 3.141 big-time under_score")]
self.render_template('index.html', {'a': a})
And it shows this on the web page:
[u'hi', u'\u4e2d', u'000', u'...
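The output above looks like the repr of a Python 2 list of unicode strings rather than the text itself, which is presumably why escapes such as \u4e2d reach the page. The following is a minimal sketch in Python 3, not the asker's setup: the standard `re` module stands in for Whoosh's RegexAnalyzer to keep the example self-contained, and `tokenize` and `as_display_text` are hypothetical helper names. Joining the tokens into one string before handing it to the template avoids rendering the list's repr.

```python
import re

# Same pattern as in the question; in Python 3, \w is Unicode-aware by default.
TOKEN_RE = re.compile(r"([\u4e00-\u9fa5])|(\w+(\.?\w+)*)")


def tokenize(text):
    """Return the matched tokens as a list of str (stand-in for the analyzer)."""
    return [m.group(0) for m in TOKEN_RE.finditer(text)]


def as_display_text(tokens, sep=" "):
    """Join tokens into one string so a template shows text, not a list repr."""
    return sep.join(tokens)


tokens = tokenize("hi 中 000 中文测试中文 there 3.141 big-time under_score")
display = as_display_text(tokens)
```

Passing `display` (or iterating over `tokens` inside the template) instead of the raw list is one way to get readable characters on the page.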
I have a PHP web app with MySQL tables taking utf8 text. I recently converted the data from latin1 to utf8, along with the tables and columns accordingly. I did, however, forget to use mysql_set_charset, so I assume the latest incoming data came through the MySQL connection as latin1. I don't know what happens when latin1 comes in t...
I have a UTF-8 encoded string in an email and I want to decode it. I use Java's MimeUtility.decodeText, but it doesn't decode properly.
Example string:
=?UTF-8?B?0J/RgNC40LLQtdGC?==?UTF-8?B?0JfQtNGA0LDQstGB0YLQstGD0LnRgtC1?=
When I used
MimeUtility.decodeText("=?UTF-8?B?0J/RgNC40LLQtdGC?==?UTF-8?B?0JfQtNGA0LD...
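For reference, the example string consists of two encoded words in the `=?charset?encoding?payload?=` form from RFC 2047, placed back to back. RFC 2047 requires adjacent encoded words to be separated by whitespace, which may be relevant to the decoding problem, though that is a guess. The sketch below is in Python rather than Java and does not reproduce JavaMail's behaviour; `decode_encoded_words` is a hypothetical helper that simply decodes each encoded word it finds and does not implement the rule about dropping whitespace between adjacent encoded words.

```python
import base64
import quopri
import re

# =?charset?encoding?payload?= as described in RFC 2047
ENCODED_WORD = re.compile(r"=\?([^?]+)\?([bBqQ])\?([^?]*)\?=")


def decode_encoded_words(header):
    """Decode every encoded word in `header`, leaving other text untouched."""
    def _decode(match):
        charset, encoding, payload = match.groups()
        if encoding.upper() == "B":
            raw = base64.b64decode(payload)
        else:
            # "Q" encoding: quoted-printable with "_" standing for a space
            raw = quopri.decodestring(payload.encode("ascii"), header=True)
        return raw.decode(charset)

    return ENCODED_WORD.sub(_decode, header)


# The example string from the question, with no whitespace between the words
sample = "=?UTF-8?B?0J/RgNC40LLQtdGC?==?UTF-8?B?0JfQtNGA0LDQstGB0YLQstGD0LnRgtC1?="
decoded = decode_encoded_words(sample)
```

Here `decoded` holds the two Russian words joined without a space, which shows that both payloads are valid Base64-encoded UTF-8.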
I get a 'Malformed UTF-8 character' error when I put some scalar data into XML::Simple or Data::Dumper. There are regular expressions on the lines where the error occurs.
Malformed UTF-8 character (fatal) at /usr/share/perl5/XML/Simple.pm line 1690.
Malformed UTF-8 character (fatal) at /usr/lib/perl/5.10/Data/Dumper.pm line 682.
At...
I'm using Rails to generate a PDF with the executable wkhtmltopdf and then using send_data to send the result back to the user as a PDF file.
view = ActionView::Base.new(ActionController::Base.view_paths, {})
html = "<h1>A heading</h1>"
pdfdata = `echo '#{html}' | #{RAILS_ROOT}/lib/pdf/wkhtmltopdf-i386 - -`
send_data pdfdata, :file...
Hello,
I'm fetching data from an external database (I cannot edit it, so please don't suggest that) whose internal encoding is set to cp1250_general_ci.
I need to display that data as UTF-8, but I cannot get it to work. I'm using this to fetch the data:
$dsn = 'mysql:dbname=eklient;host=127.0.0.1';
$user = 'root';
$password = 'root'...
I'm writing a bash script that needs to parse HTML that includes special characters such as @!'ó. Currently I have the entire script running, but it ignores or trips on these queries because the characters are returned from the server as decimal Unicode references like this: &#39;. I've figured out how to parse and convert to hexadecimal and load these into py...
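Decimal references of the form `&#NNN;` are HTML numeric character references, and Python's standard library can decode them directly without a detour through hexadecimal. A minimal sketch, assuming Python 3 is available to the script; `decode_character_references` is a hypothetical wrapper name and the sample text is made up for illustration.

```python
import html


def decode_character_references(text):
    """Replace decimal, hex, and named character references with characters."""
    return html.unescape(text)


# Made-up sample: &#39; is an apostrophe, &#243; is the letter o with acute
decoded = decode_character_references("It&#39;s an &#243;pera")
```

From a bash script this could be called as a one-off `python3 -c` command that reads the text from standard input, though how it fits into the existing script is not shown in the question.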
Hi,
I tried the Ruby hacks for UTF-8 (from http://gist.github.com/273741) and I'm still getting the following error:
ActionView::TemplateError (incompatible character encodings: ASCII-8BIT and UTF-8)
What is bizarre to me is that the same content, if retrieved with a POST action (searching the app with an HTML form), is display...
Hi,
I have a custom Java library which calls getResource() on a UTF-8 encoded text file in the package.
keyWordPairs = new Hashtable<String, Vector<String>>();
try {
File pinYinDatabase = new File(this.getClass().getClassLoader().getResource("myCustomLibrary/NewPinYin.utf").getFile());
BufferedReader br = new BufferedReader(new Fi...
Hiya,
I am adding UTF-8 data to a database in Django.
As the data goes into the database, everything looks fine - the characters (for example): “Hello” are UTF-8 encoded.
My MySQL database is UTF-8 encoded. When I examine the data from the DB by doing a select, my example string looks like this: ?Hello?. I assume this is showing the c...
So, I have set up the PEAR Mail_queue package on my server, and I have it running fine, and sending emails out. I have it set to run by a cron-job every 15 minutes.
Everything works fine, except the problem is that I need to send emails in Chinese, and when I send them using the Mail_queue package, I only get gibberish. I'm assuming that...
The URLs exhibiting this behavior are here:
http://culturewithinaculture.org/introduction.php
http://culturewithinaculture.org/about.php
user: cwac
pass: cwac2112
The site has not been launched officially. But my problem is on the right side, Japanese copy. I have my document type set to UTF-8 which is what I thought it should be. On...
I am working on a CakePHP site saved in MacRoman char encoding. I want to change all the files to UTF-8 for internationalisation. For all the other files in the site this works fine. However, in the core.php file there is a security salt, which is a string with special characters ("!:* etc.). When I save this file as UTF-8 the salt g...
I have a JavaFX/Groovy application that I'm trying to localize.
It turns out that when I use JavaFX standard execution with the Java VM arg "-Dfile.encoding=UTF-8" locally, all of my international characters (for example, ü) display correctly.
However, if I invoke the app via a JNLP file, using java-vm-args="-Dfile.encoding=UTF-8" e.g....
What's the best way to manage i18n urls?
It's strange, because Google and Facebook encode UTF-8 URLs,
e.g. search for ★。SмAck%2BтнAт。★ on Google,
while Yahoo doesn't,
e.g. search for ★。SмAck%2BтнAт。★ on Yahoo.
How do you manage UTF-8 URLs, and which libs do you use?
-- edit
I tried on Firefox and the behavior is the same, so the question is: Do you have ...
Hi guys
I have just encountered something rather strange. I use the Zend Framework 1.10 with the Zend_Db_Table module to read some data from a database. The database itself, the table and the fields in question all have their collation set to "utf8_general_ci" and all special chars appear correctly formatted in the DB when checked with p...
I'm trying to migrate a Sinatra application to Ruby 1.9.
I'm using Sinatra 1.0, Rack 1.2.0 and ERB templates.
When I start Sinatra it works, but when I request the web page from the browser I get this error:
Encoding::CompatibilityError at /
incompatible character encodings: ASCII-8BIT and UTF-8
All .rb files have this header:
#!/usr/b...
I'm writing a wrapper layer to be used with mingw which provides the application with a virtual UTF-8 environment. Functions which deal with filenames are wrappers which convert from UTF-8 and call the corresponding "_w" functions, and so on. The big problem I've run into is that Windows' wchar_t is 16-bit.
For filesystem operations, it...
Directory listing with broken filename encoding:
C:\Downloads\1>dir
18.01.2010 10:45 <DIR> РЎР?Р>Р?Р?С?Р+
18.01.2010 10:45 <DIR> Р?Р?С'Р?Р>Р?Рє
18.01.2010 10:45 <DIR> Р"Р?С?Р?Р°С╪Р°-Р>РчС╪РчР+Р?Рё РєР?С?РїС?С?
18.01.2010 10:45 <DIR> Р•Р>Р•Р?РўР Р?Р?Р?
Are there any tools for Windows t...
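The names in the listing resemble UTF-8 bytes that were interpreted as a Cyrillic single-byte code page. The sketch below illustrates that kind of damage and its reversal in Python, under the assumption that the wrong code page is cp1251 (a guess from the look of the listing, not something the question states); `garble` and `repair` are hypothetical helper names, and the sample word is made up.

```python
def garble(text, wrong="cp1251"):
    """Simulate the damage: UTF-8 bytes misread as a single-byte code page."""
    return text.encode("utf-8").decode(wrong)


def repair(garbled, wrong="cp1251"):
    """Undo it: re-encode with the wrong code page, then decode as UTF-8."""
    return garbled.encode(wrong).decode("utf-8")


original = "Привет"          # made-up sample name
broken = garble(original)    # twice as many characters, mostly "Р" and "С" pairs
fixed = repair(broken)
```

This only works while every byte survives. The `?` characters in the listing suggest that some bytes were already replaced when the names were displayed, so the repair would have to run on the real filenames from the filesystem rather than on the pasted text.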
In Django, how do I use Unicode when inserting into the DB?
Example:
name = request.POST["name"]  # This may be in Chinese or any other language
usr = Users(name=name)
usr.save()
The Python version used on CentOS is Python 2.4.3, and the mod_python version is 1.2.1_p2-1.
...