charset

Facebook charset detection mechanism?

Today I looked at the HTML source of facebook.com and found something like this: <input type="hidden" value="€,´,€,´,水,Д,Є" name="charset_test"/> It's repeated twice inside the <form>...</form>. Any idea what this code might be used for - some kind of server-side detection of the client charset? As far as I know, browser charset is...

Detect and filter user input depending on charset

Hi, I am trying to filter user input depending on its charset. I am displaying keywords from user input to other users, but I do not want to display e.g. Arabic or Chinese characters, only English/Latin characters. How can I do that with PHP? Is there an easy solution for this? Thanks. ...

nodeValue from DomDocument returning weird characters in PHP

So I'm trying to parse HTML pages and look for paragraphs (<p>) using get_elements_by_tag_name('p'). The problem is that when I use $element->nodeValue, it returns weird characters. The document is first loaded into $html using curl and then loaded into a DomDocument. I'm sure it has to do with charsets. Here's an example of a ...

MySQL - Ideal Field Type For Fixed Width Binary Data

If I want to store binary data (hash values) that are always 128 bytes long, what field type should I use? BLOBs are nice, but they aren't fixed width (and result in dynamic tables). CHAR requires a charset. ...

What is the default VB6 charset?

Hi, we have an application written in Java which reads some text generated by a VB6 application. The problem is that this VB6 application generates its output using some special characters, like ç, ã, á, in a charset we don't know. So the question is: is there a default charset used by VB6? Which is it? ...

How to post a HTML form using Javascript that has both "application/x-www-form-urlencoded" and "charset=UTF-8" in the Content-Type header

Hey, I need to use Javascript to post a form whose Content-Type header contains both the enctype "application/x-www-form-urlencoded" and the charset "charset=UTF-8". Any ideas? I have an aForm object of type Form. Thanks! ...

How to use different Oracle character sets in one application

Hi guys, I'm developing a 32-bit client application with Delphi. From this application I need to connect to databases on two different servers. The first database's character set is WE8MSWIN1252; the other server uses WE8PC850. Setting the client NLS_LANG parameter to the correct value gives correct SQL query results. Unfortunately t...

Fast alternative to java.nio.charset.Charset.decode(..)/encode(..)

Does anybody know a faster way to do what java.nio.charset.Charset.decode(..)/encode(..) does? It's currently one of the bottlenecks of a technology that I'm using. [EDIT] Specifically, in my application, I changed one segment from a Java solution to a JNI solution (because there was a C++ technology that was more suitable for my needs than...

Data loss when converting UTF-8 XML to Latin-1?

If I convert a UTF-8-encoded XML document (which has an XML prolog declaring the encoding to be UTF-8) to Latin-1 using xmllint, will there be any data loss? xmllint --encode iso-8859-1 --output test-latin1.xml test-utf8.xml (the data will eventually be displayed as ISO-8859-1-encoded HTML) ...
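
As a rough illustration of the data-loss question above, one could check up front which characters of a text fall outside ISO-8859-1. This is a minimal Python sketch under stated assumptions: the function name `latin1_unsafe_chars` and the sample string are made up for illustration, and it says nothing about how xmllint itself treats such characters.

```python
def latin1_unsafe_chars(text):
    """Return the distinct characters in `text` that ISO-8859-1 cannot encode."""
    # ISO-8859-1 covers exactly the code points U+0000..U+00FF,
    # so anything above U+00FF has no single-byte representation.
    return sorted({ch for ch in text if ord(ch) > 0xFF})


# Hypothetical sample text: 'é' fits in Latin-1, the euro sign does not.
sample = "café costs 5 €"
print(latin1_unsafe_chars(sample))  # ['€']
```

An empty result means every character has a Latin-1 byte; a non-empty result lists the characters that a converter would have to escape, substitute, or drop.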

ext/mysql charset support vs ext/mysqli charset

Hi, I read some articles that promoted the use of the new ext/mysqli in PHP due to its support of character sets. I currently use ext/mysql and use SET NAMES UTF-8 to ensure all my data is stored as UTF-8. Isn't that charset support in ext/mysql, or am I missing something larger? Thanks :) ...

'charset=UTF-16' is missing in the transformation result with MSXML4.0

Hello everyone, I have some problems with "charset" in the transformation result with different versions of MSXML. The code below will transform XML to HTML with MSXML3.0 Dim xmlDoc As New MSXML2.DOMDocument xmlDoc.async = False Dim strXML As String strXML = "<Results><ElapsedTime>3000</ElapsedTime></Results>" xml...

Creating MySQL table with explicit default character set, what if I don't?

In MySQL 5.x, what's the difference if I do something like this: CREATE TABLE aTable ( id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY, aNumber bigint(20) DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARACTER SET=utf8; compared with this: CREATE TABLE aTable ( id BIGINT NOT NULL AUTO_INCREM...

Define Default Charset for htmlentities()

I was wondering if there is any way to define the default encoding for htmlentities(). I have a big project going that uses htmlentities calls all over the place, and was wondering if there is a simple way to change the default character encoding from ISO-8859-1 to UTF-8, using something simple like ini_set. Or possibly with a sep...

How to remove  character

Hello, I've got a very strange problem. I have one PHP website which is running on two servers. One is on Apache (Linux) and the second is on IIS (Windows). The Linux server is just for a demo; IIS is the actual hosting that I need to use. Even with all the same code and database, on the Linux server there's no  character. But in IIS, everywhere ...

Java Charset problem on Linux

Hi, problem: I have a string containing special characters which I convert to bytes and vice versa. The conversion works properly on Windows, but on Linux the special character is not converted properly. The default charset on Linux is UTF-8, as seen with Charset.defaultCharset().displayName(); however, if I run on Linux with option -Dfil...

MySQL Turkish Character Problem

Hi all, I'm writing a program that transfers data from SQL Server to a MySQL database. The MySQL database's default charset is Latin1. Usually the Latin5 charset is used for Turkish characters, but I can't change the MySQL table's charset because it's a very old database. Is there any way to import Turkish chars to the MySQL database corr...

Get source code with Chinese characters PHP

Well, I give up. I've been messing around with all I could think of to retrieve data from a target website that has information in traditional Chinese encoding (charset=GB2312). I've been using the simple_html_parser like always but it doesn't seem to return the Chinese characters, in fact all I get are some weird question marks embedde...

What's the difference between encoding and charset?

I am confused about text encoding and charset. For many reasons, I have to learn non-Unicode, non-UTF8 stuff in my upcoming work. I find the word "charset" in email headers, as in "ISO-2022-JP", but there's no such encoding in text editors. (I looked around the different text editors.) What's the difference between text encoding a...

Change File Encoding to utf-8 via vim in a script

Hi, I just got knocked down after our server was updated from Debian 4 to 5. We switched to a UTF-8 environment and now we have problems getting text displayed correctly in the browser, because all files are in non-UTF-8 encodings like ISO-8859-1, ASCII, etc. I tried many different scripts. The first one I tried is "iconv". That o...

For HTTP responses with Content-Types suggesting character data, which charset should be assumed by the client if none is specified?

If no charset parameter is specified in the Content-Type header, RFC 2616 section 3.7.1 seems to imply ISO-8859-1 should be assumed for media types of subtype "text": When no explicit charset parameter is provided by the sender, media subtypes of the "text" type are defined to have a default charset value of "ISO-8859-1" when r...
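
The rule quoted in the question above can be sketched in a few lines of Python. This is only an illustration, assuming the RFC 2616 wording as quoted: the function name `effective_charset` is made up, and the header is parsed with the standard library's `email.message.Message`. Note that RFC 2616 has since been obsoleted and its successor, RFC 7231, no longer defines this default, so treat the fallback as the rule under discussion, not as current guidance.

```python
from email.message import Message


def effective_charset(content_type):
    """Return the charset a client would assume for a Content-Type value.

    Follows the RFC 2616 section 3.7.1 rule quoted above: an explicit
    charset parameter wins; otherwise "text/*" falls back to ISO-8859-1.
    Returns None for non-text types that carry no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")
    if charset is not None:
        return charset
    if msg.get_content_maintype() == "text":
        return "ISO-8859-1"
    return None


print(effective_charset("text/html; charset=UTF-8"))
print(effective_charset("text/html"))
print(effective_charset("application/octet-stream"))
```

In practice, browsers also look at byte-order marks and in-document declarations such as a `<meta>` charset, which this sketch deliberately ignores.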