character-encoding

Encoding UTF8 string to ISO-8859-1 String (VB.NET)

Hi, I need to convert a UTF-8 string to an ISO-8859-1 string using VB.NET. Is there an example? Thanks in advance. ...

How can I detect Japanese text in a Java string?

I need to be able to detect Japanese characters in a Java string. Currently I'm getting the UnicodeBlock and checking whether it's equal to Character.UnicodeBlock.KATAKANA or Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS, but I'm not 100% sure that's going to cover everything. Any suggestions? ...
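A minimal sketch of the block-based check described above, written in Python rather than Java so the ranges are easy to see. The function name and the choice of ranges are assumptions for illustration: it adds Hiragana and CJK Unified Ideographs to the two blocks the question mentions. Kanji share code points with Chinese, so a hit in the ideograph range alone does not prove the text is Japanese.

```python
# Unicode block ranges commonly used for Japanese text (inclusive).
# The selection is an assumption; extend it if your data needs more blocks.
JAPANESE_RANGES = (
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs (kanji, shared with Chinese)
    (0xFF00, 0xFFEF),  # Halfwidth and Fullwidth Forms
)


def contains_japanese(text: str) -> bool:
    """Return True if any character falls inside one of the ranges above."""
    return any(
        low <= ord(ch) <= high
        for ch in text
        for low, high in JAPANESE_RANGES
    )
```

The same idea carries over to Java by comparing Character.UnicodeBlock.of(c) against HIRAGANA, KATAKANA, CJK_UNIFIED_IDEOGRAPHS and HALFWIDTH_AND_FULLWIDTH_FORMS.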

How to detect and fix character encoding in a MySQL database via PHP?

I have received this database full of people's names and data in French, which means it uses characters such as é, è, ö, û, etc. There are around 3000 entries. Apparently, the data inside has sometimes been encoded using utf8_encode(), and sometimes not. This results in messed-up output: in some places the characters show up fine, in others they don't...
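One way to approach mixed data like this, sketched in Python rather than PHP: a value that was already UTF-8 and then went through utf8_encode() again can often be repaired by reversing that step, that is, encoding back to ISO-8859-1 and decoding as UTF-8. If the round trip fails, the value was probably fine and is left alone. The function name is hypothetical and the heuristic is an assumption, not a guaranteed fix. Try it on a copy of the data first.

```python
def repair_double_encoding(value: str) -> str:
    """Undo one accidental Latin-1 -> UTF-8 re-encoding, if there is one.

    "RenÃ©" (double-encoded) becomes "René"; "René" is returned unchanged
    because its Latin-1 bytes are not valid UTF-8.
    """
    try:
        return value.encode("latin-1").decode("utf-8")
    except UnicodeError:
        # Either the text contains characters outside Latin-1, or the bytes
        # are not valid UTF-8: treat the value as already correct.
        return value
```

Pure ASCII values pass through unchanged, so the function can be applied to every row without first sorting the entries into good and bad.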

C# method to do URL encoding?

Hi, what C# class method can I use to URL-encode a URL string? In my use case I want to pass a URL string as a URL parameter itself, so it's like burying a URL within a URL. Without some encoding, the "&" and "?" characters in the inner URL can get picked up when the parameters for the outer URL are processed. Thanks ...

Encode UNC path for Firefox in ASP.NET

Hi! I need to encode a path so Firefox can open it directly. I've tried HttpUtility.UrlEncode, HttpUtility.HttpEncode and HttpUtility.HtmlEncode, but none of these seem to work. Any ideas? ...

What's the proper technical term for "high ascii" characters?

What is the technically correct way of referring to "high ascii" or "extended ascii" characters? I don't just mean the range of 128-255, but any character beyond the 0-127 scope. Often they're called diacritics, accented letters, sometimes casually referred to as "national" or non-English characters, but these names are either imprecis...

C#/.NET - Method for converting character codes to equivalent chars

After extracting a piece of text in my application, I might end up with a string like this: "More kitchen supplies for the people" Which in plain text would be: "More kitchen supplies for the people" Is there a component/method in .NET I can use to "process" the string into its plain text equivalent? I'm able to assume r...

Why isn't WPF displaying an accented character correctly?

I'm downloading a webpage, and then loading strings from the page into a WPF UI. One string has an accented character: "Áine". In the debugger, the string looks fine, but when added to a WPF ListBox, it appears like this: Á[]ine, where [] is a single rectangular symbol. When I copy the text from the debugger UI and paste it, a space ap...

Classic ASP, SQL Server and character encodings

I have a classic ASP page that gets POSTed to. The data gets POSTed as UTF-8 (I can see this in Fiddler). I then open an ADODB connection to a database and store the data in a VARCHAR field. If the data can be represented by 8859-1 (e.g. iñtërnâtiônàlizætiøn) it is stored correctly in the varchar field. If I try strings that can't be...

Internationalisation - character set to support all languages?

Regarding MySQL, is there a character set that supports all or the vast majority of languages? ...

Can someone help me figure this out? It's about Unicode.

hibyte  lobyte  makeunicode
250     65      57345

I got this table, where hibyte and lobyte are the bytes of a Chinese character that may use Big5 or GBK encoding; hibyte is the high byte and lobyte is the low byte. I think the unicode value might be some Unicode encoding that corresponds to the Big5/GBK character with that hibyte and lobyte....

View webpage in Hebrew language

Hi, I have a website with a few HTML pages. How can I display them in Hebrew? What steps should I follow to ensure the site is shown in different languages for different countries? Thanks ...

Fast ESP character normalization

Hi, I'm running a search application on a FAST ESP server. Now I have this problem with character normalization. What I want is to search for 'wurth' and get a hit in 'würth'. I've tried configuring the following in esp/etc/tokenizer/tokenization.xml: <normalizationlist name="German to Norwegian"> <normalization description="Germ...

Please help me trace how charsets are handled every step of the way

We all know how easy character sets are on the web, yet every time you think you got it right, a foreign charset bites you in the butt. So I'd like to trace the steps of what happens in a fictional scenario I will describe below. I'm going to try and put down my understanding as well as possible but my question is for you folks to correc...

Does python's print function handle unicode differently now than when Dive Into Python was written?

I'm trying to work my way through some frustrating encoding issues by going back to basics. In Dive Into Python example 9.14 (here) we have this: >>> s = u'La Pe\xf1a' >>> print s Traceback (innermost last): File "<interactive input>", line 1, in ? UnicodeError: ASCII encoding error: ordinal not in range(128) >>> print s.encode('latin-...

How does UTF-8 "variable-width encoding" work?

The Unicode standard has enough code points in it that you need 4 bytes to store them all. That's what the UTF-32 encoding does. Yet the UTF-8 encoding somehow squeezes these into much smaller spaces by using something called "variable-width encoding". In fact, it manages to represent the first 128 characters of US-ASCII in just one...
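To make the variable-width idea concrete, here is a small sketch in Python that encodes a single code point by hand. The function name is hypothetical. The bit layout follows the standard UTF-8 scheme: one byte for code points below U+0080, then two, three, or four bytes, where the lead byte announces the length and each continuation byte starts with the bits 10 and carries 6 bits of payload.

```python
def encode_code_point(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8, byte by byte."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a valid Unicode scalar value")
    if cp < 0x80:
        # 0xxxxxxx: 7 bits, identical to US-ASCII
        return bytes([cp])
    if cp < 0x800:
        # 110xxxxx 10xxxxxx: 11 bits
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 1110xxxx 10xxxxxx 10xxxxxx: 16 bits
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx: 21 bits
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])
```

Comparing encode_code_point(ord(ch)) with ch.encode("utf-8") for a few characters of different widths is a quick way to check the layout against Python's built-in encoder.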

How To Store Hmong Characters In MySQL Database

I've read a number of articles on storing multi-language strings in MySQL, but I can't seem to find anything specific (or credible) on Hmong. I have no trouble with Latin (European) languages, but if someone could enlighten me on Hmong, that would be terrific. Thanks! P.S. Using PHP for the scripting, if anyone cares. ...

iText encoding problem

I have an encoding problem with iText (http://www.lowagie.com/iText/). I load data from a database and insert it as HTML into a PDF with iText, but for some reason my non-English characters (Finnish ä, ö, etc.) don't show up correctly. The following example shows how I insert the text as HTML: text = "<p>" + data + "</p>"; HTMLWorker htmlWorker = new HTMLWorke...

Preventing invalid characters from being written to an RSS Feed

I am working on blogging software. Occasionally users manage to paste control characters into their blog posts (for example, someone recently managed to paste in the vertical tab character, U+000B). When we render the posts in an RSS feed, XML parsers fail to parse the control character and declare the feed invalid. One way to fix this wo...
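A minimal sketch of one possible filter, in Python: drop every character that falls outside the Char production of XML 1.0 (tab, line feed, carriage return, U+0020 to U+D7FF, U+E000 to U+FFFD, and U+10000 to U+10FFFF) before the post is written to the feed. The function name is hypothetical, and removing the characters silently (rather than replacing or rejecting them) is an assumption about what the feed should do.

```python
import re

# Matches anything NOT allowed by the XML 1.0 "Char" production.
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Remove characters that an XML 1.0 parser would reject."""
    return _INVALID_XML_CHARS.sub("", text)
```

Running the filter when the post is saved, instead of only when the feed is rendered, would keep the stored data clean as well, at the cost of altering what the user submitted.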

How to overcome a font problem on BlackBerry?

I am reading data from a .csv file and displaying it. When I encounter the micro character (µ) some special symbols are displayed instead. How can I display the micro character? ...