utf-8

Is there a standard literal constant that I can use instead of "utf-8" in C# (.Net 3.5)?

Hi, I would like to find a better way to do this: XmlNode nodeXML = xmlDoc.AppendChild( xmlDoc.CreateXmlDeclaration( "1.0", "utf-8", String.Empty) ); I do not want to think about "utf-8" vs "UTF-8" vs "UTF8" vs "utf8" as I type code. I would like to make my code less prone to typos. I am sure that some standard library has declared ...

How to properly encode this local file Uri with utf8 characters

I'm trying to have libcurl download a local file with this name: C:\Users\Lucas Meijer\Desktop\我能吞下玻璃而不傷.chinesefile But can't seem to be able to find the proper url encoded string that will make libcurl find this. ...

How to check if letter is upper or lower in PHP?

I have texts in UTF-8 with diacritic characters also, and would like to check if first letter of this text is upper case or lower case. How to do this? ...

How to convert from HTML to UTF-8 in java

Hi, I have an ASCII String, with HTML entities, like: &agrave; &uml; &ccedil; I need this String to be without those entities and convert them into UTF-8 chars. Is there any easy way, in java to do that? Where: Clazz.method("a&agrave;","UTF-8") returns "aà" or something like that? ...
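
A minimal sketch of the idea behind this question, written in Python rather than Java: decode the HTML entities into real characters first, then encode the result as UTF-8. The function name `entities_to_utf8` is hypothetical; `html.unescape` is Python's standard-library entity decoder.

```python
import html


def entities_to_utf8(text: str) -> bytes:
    """Replace HTML entities with the characters they name, then encode as UTF-8."""
    decoded = html.unescape(text)   # "a&agrave;" becomes "aà"
    return decoded.encode("utf-8")  # UTF-8 bytes of the decoded text


print(html.unescape("a&agrave;"))
print(entities_to_utf8("a&agrave;"))
```

The two steps are kept separate on purpose: entity decoding works on text, while UTF-8 only matters once the text is turned into bytes.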

How to pass UTF8 string into your PHP HTML API?

So I have my PHP API (HTML GET API for Flash Builder and C# apps). If you want to submit data to it, you use a string like http://localhost/cms/api.php?method=someMethod&string=Your_String If there are only English letters in it, it's OK. But what if I need to pass a UTF-8 string like this: Русское &...
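
A sketch of the usual client-side answer, shown in Python: percent-encode the UTF-8 bytes of each parameter before putting it in the URL. The endpoint and parameter names are copied from the example URL in the question; `build_url` is a hypothetical helper. PHP decodes percent-encoded values in `$_GET` on its own, so the server side should then see the original UTF-8 bytes.

```python
from urllib.parse import quote, unquote, urlencode

# Endpoint taken from the example URL in the question.
BASE_URL = "http://localhost/cms/api.php"


def build_url(method: str, text: str) -> str:
    """Return the API URL with both parameters percent-encoded as UTF-8."""
    # urlencode encodes str values as UTF-8 by default and escapes
    # reserved characters such as '&' and '='.
    return BASE_URL + "?" + urlencode({"method": method, "string": text})


encoded = quote("Русское")  # each UTF-8 byte becomes %XX
print(encoded)
print(build_url("someMethod", "Русское"))
```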

How to create simplest PHP Get API with UTF-8 support?

How to create the simplest *(fewer lines of code, fewer strange words) PHP GET API *(so any program made in .NET C# could call a URL like http://localhost/api.php?astring=your_utf-8_string&bstring=your_utf-8_string ) with UTF-8 support? What I need is a PHP API with one function, concatenating 2 strings, so that a simple .NET client like thi...

Tilde not recognised in XML public identifier

Hi everyone, I found an interesting bug and wanted to know what you think. Brief background: I've written a custom DTD and an example XML file (both UTF-8). I have now implemented a SAX parser in Java which I want to test. I got a SAXException complaining "An invalid XML character (Unicode: 0x7e) was found in the public identifier". Now, ...

Problem with PHP localeconv() - Maybe UTF-8

I'm having an issue with localeconv() in PHP. I'm using a Windows PC. I set my locale to France using the setlocale(LC_ALL, 'fra_fra') function. Then I assign the result of localeconv() to a variable. When I output that variable, below is what I get. Array ( [decimal_point] => , [thousands_sep] => � [int_curr_symbol] => EUR ...

How is this website fixing the encoding ??

Hi all, I am trying to turn this text: ×וויר. העתיד של רשתות חברתיות והתקשורת ×©×œ× ×• Into this text: אוויר. העתיד של רשתות חברתיות והתקשורת שלנו Somehow, this website: http://www.pixiesoft.com/flip/ Can do it, and I would like to know how I might be able to do it myself (with whatever programming...
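
One common cause of this kind of garbling is UTF-8 bytes being read as a single-byte encoding. A hedged Python sketch of reversing that, assuming the wrong encoding was Latin-1 and that no bytes were lost along the way (the name `fix_mojibake` is hypothetical):

```python
def fix_mojibake(garbled: str, wrong_encoding: str = "latin-1") -> str:
    """Undo 'UTF-8 bytes decoded with the wrong single-byte encoding'."""
    # Re-encode with the wrong encoding to recover the original bytes,
    # then decode those bytes as the UTF-8 they really were.
    return garbled.encode(wrong_encoding).decode("utf-8")


original = "אוויר"
garbled = original.encode("utf-8").decode("latin-1")  # simulate the damage
print(fix_mojibake(garbled) == original)
```

If the damaged text went through an encoding that has no character for some byte values, those bytes may already be gone, and the round trip will raise an error instead of recovering the text.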

UTF-8 xml file shows Gibberish

I have a UTF-8 encoded xml file, which was exported from a Wordpress MySQL database. While the file is saved as UTF-8, and the encoding is UTF-8, I get gibberish instead of the Hebrew text that is supposed to be in there, which looks like this: ™×•×˜×•×ª How can I find the original encoding or charset and convert the text into pro...

problems with UTF-8 encoding in PHP

Hi guys, The characters I am getting from the URL, for example www.mydomain.com/?name=john , were fine, as long as they were not in Russian. If they were in Russian, I was getting '����'. So I added $name= iconv("cp1251","utf-8" ,$name); and now it works fine for Russian and English characters, but screws up other languages. :))) ...

Dreaded python encoding errors, how to stop them?

These have been plaguing me endlessly. Why? It seems that my console can't handle the encoding. I take it that my browser and word processor can handle it. I don't have a master list of all the possible characters that it's choking on. What is the best way to relieve this without modifying my data? 'charmap' codec can't encode chara...
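
One way to avoid the crash without touching the stored data is to build a display-only copy in which anything the console cannot encode is replaced. A minimal sketch, assuming the console encoding is what `sys.stdout.encoding` reports (the helper name `safe_for_console` is hypothetical):

```python
import sys
from typing import Optional


def safe_for_console(text: str, encoding: Optional[str] = None) -> str:
    """Return a copy of text in which unencodable characters become '?'."""
    target = encoding or sys.stdout.encoding or "ascii"
    # errors="replace" substitutes '?' instead of raising UnicodeEncodeError.
    return text.encode(target, errors="replace").decode(target)


data = "café"
print(safe_for_console(data, "ascii"))  # display copy; data itself is unchanged
```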

Is there a list of language only character regions for UTF-8 somewhere?

I'm trying to analyze some UTF-8 encoded documents in a way that recognizes different language characters. For my approach to work I need to ignore non-language characters, such as control characters, mathematical symbols etc. Just trying to dissect the basic Latin section of the UTF standard has resulted in multiple regions, with charac...

Count bytes in textarea using javascript

I need to count how long in bytes a textarea is when UTF8 encoded using javascript. Any idea how I would do this? thanks! ...
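
The underlying idea, sketched in Python rather than JavaScript: encode the text as UTF-8 and count the resulting bytes instead of counting characters. The function name `utf8_byte_length` is hypothetical.

```python
def utf8_byte_length(text: str) -> int:
    """Number of bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


for sample in ["abc", "é", "€", "😀"]:
    print(sample, utf8_byte_length(sample))
```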

Reporting sanitized user input to the user via AJAX

I am writing some code to give live feedback to the user on the validation of a form using AJAX. I have got it checking length and if the field is empty. Now I want it to sanitize the user's input and if the sanitized input differs from the user's original input then tell them which characters are not allowed. The code I have written so f...

C++ iterate or split UTF-8 string into array of symbols?

Searching for a platform- and 3rd-party-library- independent way of iterating UTF-8 string or splitting it into array of UTF-8 symbols. Please post a code snippet. Solved: http://stackoverflow.com/questions/2852895/c-iterate-or-split-utf-8-string-into-array-of-symbols#2856241 ...
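
A library-independent sketch of the technique, written in Python over raw bytes so it maps directly onto a C++ loop: the lead byte of each sequence gives its length, and every following byte must be a continuation byte of the form 10xxxxxx. The name `split_utf8` is hypothetical. This splits into code points, not user-perceived characters, and it does not reject overlong encodings or surrogates.

```python
from typing import List


def split_utf8(data: bytes) -> List[bytes]:
    """Split UTF-8 bytes into one byte string per encoded code point."""
    symbols = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:               # 0xxxxxxx: single byte (ASCII)
            length = 1
        elif lead >> 5 == 0b110:      # 110xxxxx: two-byte sequence
            length = 2
        elif lead >> 4 == 0b1110:     # 1110xxxx: three-byte sequence
            length = 3
        elif lead >> 3 == 0b11110:    # 11110xxx: four-byte sequence
            length = 4
        else:
            raise ValueError("invalid lead byte at offset %d" % i)
        chunk = data[i:i + length]
        # Every byte after the lead must look like 10xxxxxx.
        if len(chunk) != length or any(b >> 6 != 0b10 for b in chunk[1:]):
            raise ValueError("truncated or malformed sequence at offset %d" % i)
        symbols.append(chunk)
        i += length
    return symbols


print([s.decode("utf-8") for s in split_utf8("aé€😀".encode("utf-8"))])
```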

What new Unicode functions are there in C++0x?

It has been mentioned in several sources that C++0x will include better language-level support for Unicode (including types and literals). If the language is going to add these new features, it's only natural to assume that the standard library will as well. However, I am currently unable to find any references to the new standard librar...

Getting XML Numbered Entities with PHP 5 DOM

Hello guys, I am new here and got a question that is tricking me all day long. I've made a PHP script, that reads a website source code through cURL, then works with DOMDocument class in order to generate a sitemap file. It is working like a charm in almost every aspect. The problem is with special characters. For compatibility reaso...

process.standardInput encoding problem

I have an issue with the encoding of Process.StandardInput. I am using some processes in my Windows Forms application, but the input should be UTF-8. Process.StandardInput.Encoding is read-only, so I can't set it to UTF-8, and it gets the Windows default encoding, which corrupts native characters that are fine in UTF-8. 2 processes are used in...

php validation function for name that allow UTF-8 characters plus - (minus) and space

Hello everybody, Please help me with a function that validates an input string to allow: 1) UTF-8 characters (ex: şţăîâ) ; 2) space ; 3) minus symbol(-) String cannot start or end with space or minus. Thanks! ...
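
A sketch of the rule in Python rather than PHP, assuming "UTF-8 characters" means Unicode letters and that the input has already been decoded to text. The name `is_valid_name` is hypothetical. The class `[^\W\d_]` matches word characters that are neither decimal digits nor underscore, which is roughly "any letter".

```python
import re

# First and last character must be a letter; in between, letters,
# spaces and hyphens are allowed in any order.
NAME_PATTERN = re.compile(r"[^\W\d_](?:(?:[^\W\d_]|[ -])*[^\W\d_])?")


def is_valid_name(name: str) -> bool:
    """True if name holds only letters, spaces and '-', with a letter at both ends."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["şţăîâ", "Ana-Maria Pop", " Ana", "Ana-", "Ana1"]:
    print(repr(candidate), is_valid_name(candidate))
```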