utf-8

node.js Nerve framework unicode response

Code: var nerve = require("./nerve"); var sitemap = [ ["/", function(req, res) { res.respond("Русский"); }] ]; nerve.create(sitemap).listen(8100); Shown in browser: CAA:89 What is the correct way to do this? ...

Problems inserting a UTF-8 string into a database and then outputting it to a web page

I am learning PHP programming, so I have set up a test database and am trying various things with it. The situation is this: the database collation is utf8_general_ci. There is a table "books" created by the query create table books ( isbn char(13) not null primary key, author char(50), title char(100), price float(4,2) ); Then ...

UTF-8 to Unicode using C#

Help me please. I have a problem with the encoding of the response string after a GET request: var m_refWebClient = new WebClient(); var m_refStream = m_refWebClient.OpenRead(this.m_refUri); var m_refStreamReader = new StreamReader(this.m_refStream, Encoding.UTF8); var m_refResponse = m_refStreamReader.ReadToEnd(); After calling this code my string ...

How to create this string encoding thing?

I have an NSURL object pointing to a text file, but the simple reader method of NSString is deprecated. Now, there's this complex one: + (id)stringWithContentsOfURL:(NSURL *)url encoding:(NSStringEncoding)enc error:(NSError **)error How can I provide the correct encoding here? I have a text file which I created in Smultron and edited in Xcod...

UTF-8 Corrupted from MySQL to SQLite

I'm porting a PHP Web application I wrote from MySQL 5 to SQLite 3. The text encoding for both is UTF-8 (for all fields, tables, and databases). I'm having trouble transferring a geo database with special characters. mb_detect_encoding() detects both as returning UTF-8 data. For example, Raw output: MySQL (correct): Dārāb, Iran SQLit...

parsing \\xc3\\xb6 from a url

Hi, I'm trying to support an API where a remote server pushes data to me. When passing data, reading directly from $_GET['value'], I get strings with this in them: \xc3\xb6 They also needed a very specific URL structure, which I've had to use curl to work around. When using curl on the API I get this instead: \xc3\xb6 Is the problem on...
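
If the value really contains the literal characters backslash, x, c, 3 and so on (rather than the two raw bytes 0xC3 0xB6), one possible way to recover the text is to interpret the escapes as bytes and then decode those bytes as UTF-8. The sketch below uses Python purely for illustration (the question itself concerns PHP). The sample value and the helper name decode_escaped_utf8 are hypothetical, and the code assumes the input is otherwise plain ASCII.

```python
def decode_escaped_utf8(value: str) -> str:
    """Turn literal \\xNN escape sequences into the UTF-8 text they spell out."""
    # Step 1: interpret the backslash escapes; each \xNN becomes one code point 0-255.
    unescaped = value.encode("ascii").decode("unicode_escape")
    # Step 2: map those code points back to raw bytes, then decode them as UTF-8.
    return unescaped.encode("latin-1").decode("utf-8")


sample = r"K\xc3\xb6ln"  # hypothetical value, as it might arrive in the query string
decoded = decode_escaped_utf8(sample)
print(decoded)  # Köln
```

The bytes C3 B6 are the UTF-8 encoding of "ö", which suggests the sender serialized a UTF-8 string with its non-ASCII bytes escaped.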

Multiple character encodings inside one HTML page possible?

I have a webpage that is set to UTF-8. But parts of its content (built in PHP) come from ISO-8859-1 files and are thus not displayed correctly. Is it possible to set a specific encoding for a particular page element? ...
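
A browser decodes an HTML document with a single character encoding, so the usual approach is to convert the ISO-8859-1 fragments to UTF-8 before they are written into the page. The sketch below shows that conversion in Python for illustration only (the question uses PHP); the sample string stands in for the contents of one of the ISO-8859-1 files and is an assumption.

```python
# Hypothetical fragment as it would be stored in an ISO-8859-1 file.
latin1_bytes = "café".encode("iso-8859-1")  # b'caf\xe9'

# Decode with the encoding the file was actually saved in...
text = latin1_bytes.decode("iso-8859-1")

# ...then encode as UTF-8 so it matches the encoding declared by the page.
utf8_bytes = text.encode("utf-8")  # b'caf\xc3\xa9'

print(latin1_bytes, utf8_bytes)
```

The same byte 0xE9 that means "é" in ISO-8859-1 is an incomplete sequence in UTF-8, which is why the unconverted fragments display incorrectly.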

Does MySQL handle a single utf-8 character key as well as an integer?

I'm working on a Chinese/Japanese learning web app where many tables are indexed by the characters (the "glyphs") of those languages. I'm wondering if the integer codepoint value of the glyph would be better for performance than using a single utf8 character (for primary key and indexes)? Using a single utf8 character would be very usef...

Multi-byte safe wordwrap() function for UTF-8

PHP's wordwrap() function doesn't work correctly for multi-byte strings like UTF-8. There are a few examples of mb safe functions in the comments, but with some different test data they all seem to have some problems. The function should take the exact same parameters as wordwrap(). Specifically be sure it works to: cut mid-word if ...

Why do we need sys.setdefaultencoding("utf-8") in a py script?

I have seen a few scripts use this at the top of a py script, and I am curious when I need to use it: import sys reload(sys) sys.setdefaultencoding("utf-8") ...

Java PreparedStatement UTF-8 character problem

Hi all, I have a prepared statement: PreparedStatement st; and in my code I try to use the st.setString method. st.setString(1, userName); The value of userName is şakça. The setString method changes 'şakça' to '?akça'. It doesn't recognize UTF-8 characters. How can I solve this problem? Thanks. ...

Intermittent problem with UTF-8 characters

I am running a fairly standard LAMP stack. The problem is that UTF-8 characters render correctly only intermittently. About 50% of the time the non-ASCII UTF-8 characters render correctly (e.g. with appropriate diacritical marks), but about 50% of the time I get the '?' rendition instead. If I reload the page, sometimes it corrects the pro...

How do I fix invalid HTML characters in pages served with different encoding?

I have a number of websites that are rendering invalid characters. The pages' meta tags specify UTF-8 encoding. However, a number of pages contain characters that can't be interpreted by UTF-8, probably because the files were saved with another encoding (such as ANSI). The one in particular I'm concerned about right now is a fancy apostr...
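
The symptom described is what happens when bytes saved as Windows-1252 (often labelled "ANSI") are read as UTF-8. The sketch below illustrates this in Python with a curly apostrophe; the sample bytes are an assumption, since the question does not show the actual file contents.

```python
# Hypothetical page fragment saved as Windows-1252: 0x92 is the curly apostrophe.
raw = b"It\x92s"

# Read as UTF-8, 0x92 is a stray continuation byte, so decoding fails.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Decoding with the encoding the file was really saved in recovers the text.
fixed = raw.decode("cp1252")        # "It’s" with U+2019
utf8_bytes = fixed.encode("utf-8")  # b'It\xe2\x80\x99s', safe to serve as UTF-8

print(valid_utf8, fixed, utf8_bytes)
```

Re-saving the affected files as UTF-8, as in the last step, keeps the declared encoding and the real encoding in agreement.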

Isn’t UTF-8's byte order different on big-endian machines than on little-endian machines? So why doesn’t UTF-8 require a BOM?

UTF-8 can contain a BOM. However, it makes no difference as to the endianness of the byte stream. UTF-8 always has the same byte order. If UTF-8 stored all code points in a single byte, then it would make sense why endianness doesn’t play any role and thus why a BOM isn’t required. But since code points 128 and above are stored ...
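
A short Python sketch may make the distinction concrete. UTF-8 is defined as a sequence of single bytes in a fixed order, so there is no multi-byte integer whose halves could be swapped; UTF-16 uses 16-bit code units, and those do have two possible byte orders. The sample string is an arbitrary choice for illustration.

```python
text = "é€"  # U+00E9 (two bytes in UTF-8) and U+20AC (three bytes in UTF-8)

utf8 = text.encode("utf-8")          # one defined byte sequence, on any machine
utf16_le = text.encode("utf-16-le")  # 16-bit units, low byte first
utf16_be = text.encode("utf-16-be")  # 16-bit units, high byte first

print(utf8.hex(" "))      # c3 a9 e2 82 ac
print(utf16_le.hex(" "))  # e9 00 ac 20
print(utf16_be.hex(" "))  # 00 e9 20 ac

# A UTF-8 "BOM" is only a signature (EF BB BF); it carries no byte-order information.
bom = text.encode("utf-8-sig")[:3]
print(bom.hex(" "))       # ef bb bf
```

The two UTF-16 outputs differ only in the order of the bytes within each 16-bit unit, which is exactly the ambiguity a BOM resolves there and which cannot arise in UTF-8.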

UTF-8 content type meta tag is slowing down the page loading, why?

I'm setting the following meta tag to set the content type and in doing so the page load time jumps by about 30% (350 --> 500 msec using chrome dev tools and firefox firebug). Note: I have it placed first thing inside the tag to prevent re-rendering of page content. Also, the size of the page in kb is essentially the same, so that is no...

File encoding generating a blank character in Ruby: why?

I'm using this little bit of Ruby: File.open(ARGV[0], "r").each_line do |line| puts "encoding: #{line.encoding}" line.chomp.split(//).each do |char| puts "[#{char}]" end end I have a sample file that I'm feeding in; the file just contains three periods and a newline. When I save this file with a fileencoding of utf-8 ...

MS Word misidentifies the language when reading a UTF-8 text file

Hi, I have to merge a document in MS Word 2003 that includes some Thai characters. To do it, I dump the information using the UTF-8 charset. That kind of works OK. The problem is that when I open this text file in MS Word (as it is my data source for the merge), it identifies some English characters as if they were a foreign language (so not bein...

FCKeditor character encoding issue

I am using FCKeditor inside the CodeIgniter framework. When I retrieve data from the database, it shows unrecognized characters instead of special characters (French characters like é è ç ..). In the database the data is converted to HTML entities, and I can show it without a problem on the front pages, but in the backend I have a problem with the editor...

Blank space below the <body> tag, and script and link tags move from the head tag to under the body tag

Hey guys, I wrote a PHP app using PHP and Smarty. When I view the web page source in Firebug, I find that the link and script tags end up under the body tag, but they should be under the head tag. There is also some space below the body tag, and there's blank space at the top of my web page. So, what's the problem? ...

Rails, MySQL, Unicode data and latin1 tables - Where to go from here?

I'm not 100% sure on the particulars, so I'd love someone straightening me out, but I'll forge ahead with what I think is going on... When I first set up my database, I used the default character encoding of the system without even thinking, and it was latin1. I never even thought about i18n/l10n. It just didn't occur to me. I just accep...