character-encoding

What is the set of valid first characters in an XML document?

I'm working on some code to determine the character encoding of an XML document being returned by a web server (an RSS feed in this particular case). Unfortunately, sometimes the web server lies and tells me that the document is UTF-8 when in fact it's not, or the boilerplate XML generation code on the server has <?xml encoding='UTF-8'?...
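One way to approach the first-bytes check is sketched below in Python, loosely following the autodetection idea in the XML specification's non-normative appendix: look for a byte-order mark, then for the byte pattern of `<?` in UTF-16, then for an ASCII-compatible encoding declaration. The function name `sniff_xml_encoding` and the UTF-8 default are assumptions for illustration. The result is only a hint, because a declared encoding can be wrong too.

```python
import re

# Byte-order marks, longest first so UTF-32 is not mistaken for UTF-16.
_BOMS = [
    (b"\x00\x00\xfe\xff", "utf-32-be"),
    (b"\xff\xfe\x00\x00", "utf-32-le"),
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
]

# "<?" encoded as UTF-16 without a byte-order mark.
_PREFIXES = [
    (b"\x00<\x00?", "utf-16-be"),
    (b"<\x00?\x00", "utf-16-le"),
]

# encoding="..." inside an ASCII-compatible XML declaration.
_DECL = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def sniff_xml_encoding(data, default="utf-8"):
    """Guess the encoding of an XML document from its first bytes."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    for prefix, name in _PREFIXES:
        if data.startswith(prefix):
            return name
    match = _DECL.match(data[:200])
    if match:
        return match.group(1).decode("ascii").lower()
    return default  # no BOM and no declaration: fall back to the default
```

The guess can then be verified by actually decoding the bytes with it and treating a `UnicodeDecodeError` as a sign that the server or the declaration lied.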

Do MySQL indexes translate UTF-8 to ASCII?

I've been working with UTF-8 characters in my db, and have been using the PHP iconv function to translate characters from UTF-8 to ASCII before putting them into the database. This way, I thought, I would just translate a query into ASCII before querying the database. However, now I am seeing results which lead me to believe that MySQL d...

MySQL - Insert queries inserting funny characters

Hi guys - I have a simple script which inserts values from a text file into a MySQL database. However, some accented characters aren't inserted properly. For example, say I have the word Reykjavík. I try to insert it using a simple INSERT statement, and instead this value ends up in the database???? Reykjavík How do I fix this? ====...

Struggling with special characters (html_entity_decode, iconv, and more)

I've been struggling with getting a bunch of characters translated down to core UTF-8 to store them in my database. PHP iconv fails on many characters, so I've been forced to build my own 'solution', which really isn't a solution if it doesn't work, and it fails almost completely on Windows, so developing with iconv is mostly fruitless...

PHP + SQL Server - How to set charset for connection?

I'm trying to store some data in a SQL Server database through PHP. The problem is that special characters aren't converted properly. My app's charset is iso-8859-1 and the one used by the server is windows-1252. Converting the data manually before inserting doesn't help; there seems to be some conversion going on. Running the SQL query 'set ...

Why does SQL Server require INT to be converted to NVARCHAR?

During an ordeal yesterday, I learned that you can't pass this query to EXEC(): @SQL = @SQL + 'WHERE ID = ' + @SomeID EXEC(@SQL) where @SomeID is an INT and @SQL is NVARCHAR. This will complain about not being able to convert NVARCHAR back to INT during execution. I realized you have to do it like @SQL = @SQL + 'WHERE ID = ' + CONV...

How do I set -Dfile.encoding within Ant's build.xml?

I've got Java source files with iso-8859-1 encoding. When I run ant, I get "warning: unmappable character for encoding UTF-8". I can avoid this if I run ant -Dfile.encoding=iso-8859-1 or add encoding="ISO-8859-1" to each javac statement. Is there a way to set the property globally within build.xml? <property name="file.encoding" valu...

Write special characters into an Excel table with the Python package pyExcelerator/xlwt

Task: I generate formatted Excel tables from CSV files using the Python package pyExcelerator (comparable to xlwt). I need to be able to write less-than-or-equal-to (≤) and greater-than-or-equal-to (≥) signs. So far: I can save my table as CSV files with UTF-8 encoding, so that I can view the special characters in my text editor, by...

What is the difference between utf8_general_ci and utf8_unicode_ci in MySQL?

Are there any big differences a developer should care about? ...

[PHP|HTML] - Problem with file uploads from China

Hi, I have a web app, part of which accepts user uploads of CSV files. There is a prospective client in China trialing the site. They report that when they try to upload a file the page 'hangs', i.e. the 'Please wait etc...' graphic which shows while the file is uploading stays on their page and the file doesn't get uploaded. I have...

How to convert ANSI to UTF-8 or Big5

Could someone take several characters as examples to explain the principle of converting ANSI to Big5, GB2312 to Big5, and GB2312 to UTF/Unicode? Thanks in advance! ...

Problems with special characters in PHP SOAP client

Hi! I have a problem related to this question. I have a web service (also using PHP) that returns some names. When any of them contains Swedish characters (å, ä or ö), and probably others as well, I get a SoapFault (looks like we got no XML document). I can however see the full (correct afaik) response using $soapcalo->__getLastResponse()....

C#: Is there any way to discover what charset encoding a file is using?

Is there any way to discover what charset encoding a file is using? ...

Check unicode in PHP

How can I check whether a character is a Unicode character or not with PHP? ...

LAMP stack / user input and character encoding

Is there a one-stop solution for all character encoding issues? I always seem to have issues somewhere along the line between user input, database storage and data retrieval (HTML forms). I want all my data and web pages to be encoded as UTF-8, but it seems I always end up with an invalid UTF-8 character somewhere. I don't really u...

Latin letters with acute : DjangoUnicodeDecodeError

Hi, I have a problem reading a txt file to insert into a MySQL db table. The snippet of code is below; the file contains in its first line: "aclaración" archivo = open('file.txt',"r") for line in archivo.readlines(): ....body = body + line model = MyModel(body=body) model.save() I get a DjangoUnicodeDecodeError: 'utf8' codec can't...
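A 'utf8' codec error of this kind usually means the bytes read from the file are not valid UTF-8. A minimal sketch of one workaround follows, assuming the file is either UTF-8 or a single-byte encoding such as Latin-1. The helper names `decode_text` and `read_text` are hypothetical, and the actual encoding of file.txt is not known from the excerpt.

```python
def decode_text(raw, fallback="latin-1"):
    """Decode bytes as UTF-8, falling back to an assumed legacy encoding."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: non-UTF-8 input is in the fallback encoding.
        return raw.decode(fallback)


def read_text(path, fallback="latin-1"):
    """Read a whole file in binary mode and return it as a Unicode string."""
    with open(path, "rb") as handle:
        return decode_text(handle.read(), fallback)
```

The Unicode string returned by `read_text('file.txt')` could then be assigned to the model field in place of the raw byte string.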

Character set problem in MySQL

MySQL's environment is as follows: character_set_database="big5" When I send a SQL statement which contains traditional Chinese (such as "select * from a where name = '中'") from JDBC to the MySQL database, it throws the following exception: Illegal mix of collations (big5_chinese_ci,IMPLICIT), (latin1_swedish_ci,COERCIBLE), (latin1...

Need to decode ASCII codes in an XML file into literal characters

I have an XML file that is being used on a third-party system, and I have no control over the third-party system that is using the XML file. The third-party system fails because there are ASCII codes in the XML file. For example, it fails when it sees &#44; when it wants a single quote ’ Is there a way to throw PHP code around the va...
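The general idea of turning numeric character references such as &#44; back into literal characters can be sketched as follows. This is in Python rather than PHP and is only an illustration. The function name `decode_numeric_refs` is hypothetical, and the sketch assumes it is applied to a single text value, not to a whole document. References to `<` and `&` are deliberately left alone so the value stays safe to embed in XML.

```python
import re

# Matches decimal (&#44;) and hexadecimal (&#x2C;) character references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

# Characters that must stay escaped inside XML text content.
_KEEP_ESCAPED = {"<", "&"}


def decode_numeric_refs(text):
    """Replace numeric character references with the characters they name."""
    def replace(match):
        hex_digits, dec_digits = match.group(1), match.group(2)
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            char = chr(code)
        except (ValueError, OverflowError):
            return match.group(0)  # not a valid code point: leave untouched
        return match.group(0) if char in _KEEP_ESCAPED else char
    return _NUMERIC_REF.sub(replace, text)
```

Note that &#44; decodes to a comma; the right single quote ’ is code point U+2019, i.e. &#8217;.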

How to change NetBeans charset?

I searched on Google and found this article, but my code is still being saved as ANSI. Notepad++ has a feature to change/convert the file charset; does someone know if there is any such option in NetBeans? netbeans.conf: netbeans_default_options="-J-Dorg.glassfish.v3.installRoot=\"E:\Programs\sges-v3-prelude\" -J-Dcom.sun.aas.installRo...

How are Unicode code points allocated for different languages?

This seems the most confusing issue to me. How is the beginning of a new character recognized? How are the code points allocated? Let's take Chinese characters for example. What range of code points is allocated to them, and why is it allocated that way? EDIT: Please describe it in your own words, not by citation. Or could you rec...
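For the "beginning of a new character" part, a small sketch assuming the text is encoded as UTF-8 (other encodings mark boundaries differently): continuation bytes always have the bit pattern 10xxxxxx, so any byte that does not match it starts a new character. The function name `utf8_char_starts` is made up for this example.

```python
def utf8_char_starts(data):
    """Return the byte offsets at which a UTF-8 encoded character begins."""
    # Continuation bytes look like 10xxxxxx; every other byte is a lead byte.
    return [i for i, byte in enumerate(data) if (byte & 0xC0) != 0x80]


encoded = "a中b".encode("utf-8")    # bytes: 61 e4 b8 ad 62
starts = utf8_char_starts(encoded)  # [0, 1, 4]
code_point = ord("中")              # 0x4E2D, inside the block U+4E00..U+9FFF
```

The character 中 used here sits in the CJK Unified Ideographs block, which is the main range for Chinese characters. It takes three bytes in UTF-8, which is why the next character starts at offset 4.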