views:

5900

answers:

3

Hello,

I have an scenario with two MySQL databases (in UTF-8), a Java code (a Timer Service) that synchronize both databases (reading form first of them and writing/updating to second) and a Web application that lets modify data loaded in the second database.

All database access are made using IBATIS (but I detect that I have the same problem using JDBC, PreparedStatements and ResultSets)

When my java code reads data from first database, I obtain characters like 'ó' when really it must be 'ó'. This data is wroten without modifications to the second database.

Later, when I see the loaded data in my web application, I see the extrange character despite the <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />.

If I decode the data using ...

new String(data.getBytes("UTF-8"));

... I visualize correctly the character (ó). But I can not use this solution as a general rule because when I modify data using web aplication form, the data is not updated in UTF-8 in my second database (despite the database is UTF-8 and my connection string is using characterEncoding, characterSetResults and useUnicode parameters).

From my Java code I obtain the following Database settings:

character_set_client-->utf8 
character_set_connection-->utf8 
character_set_database-->utf8 
character_set_filesystem-->binary 
character_set_results-->utf8 
character_set_server-->latin1 
character_set_system-->utf8 
character_sets_dir-->/usr/local/mysql51/share/mysql/charsets/

the character_set_server setting can't be changed and I don't know what I am doing wrong!!

How can i read UTF-8 data from MySQL using JDBC connector (mysql-connector-java-5.1.5-bin.jar)?

The problem is reading data from first database or writing to second database?

Thanks in advance,

Cesar

A: 

ave Cesar

you can set the file.encoding property of your JVM to UTF-8 so all locale/encoding sensitive API will consider decoded Strings as UTF8.

For example, you can set it in your command line that launch your Java app :

java -Dfile.encoding=UTF-8 ....

You can also refer to This SO question for a complete explanation on Tomcat setup.

chburd
+1  A: 
erickson
A: 

Hello,

first of all, thanks for the answers.

I can't explain what it's happening in my program.

When my application starts, the file.encoding parameter value is "UTF-8" and the defaultCharset value is "UTF-8". I obtain those values using:

System.getProperty("file.encoding"); java.nio.charset.Charset.defaultCharset());

However, when I read a UTF-8 database value, I obtain characters like 'ó'. But I know that those characters are UTF-8 valid because when I use String constructor with "UTF-8" charset encoding, I see the right character.

If I change the file.encoding parameter value using -Dfile.encoding=ISO-8859-1, and I read the same database value, I see the right character!!!, but if execute a select like ...

SELECT 'ó' FROM DUAL

... I see an extrange value for that character.

If I modify the select query like this...

SELECT _latin1'ó' FROM DUAL

...I see the right character because I'm telling to Mysql parser that 'ó' has latin1 encoding. But if I execute System.out.println("ó"), I see a wrong value.

I can´t explain this behavior.

This is my test program


   Connection conn = null;
   PreparedStatement ps = null;
   ResultSet rs = null;

   try
   {
       String userName = "user";
       String password = "password";
       Properties prop = new Properties();
       prop.put("user", userName);
       prop.put("password", password);

       String url = "jdbc:mysql://localhost/database?characterSetResults=UTF-8&characterEncoding=UTF-8&useUnicode=yes";

       Class.forName ("com.mysql.jdbc.Driver").newInstance();
       conn = DriverManager.getConnection (url, prop);

       ps = conn.prepareStatement("show variables like '%character%'");
       rs = ps.executeQuery();

       while (rs.next()) {
           System.out.println(rs.getString(1) +"-->"+rs.getString(2));
       }
       rs.close();
       ps.close();

       System.out.println("-----------------------------");
       System.out.println("ARGS[0]:"+args[0]);
       System.out.println("-----------------------------");
       System.out.println("JVM DEFAULT CHARSET:"+java.nio.charset.Charset.defaultCharset());
       System.out.println("-----------------------------");
       System.out.println("JVM file.encoding:"+System.getProperty("file.encoding"));
       System.out.println("-----------------------------");


       //THE QUERY 
       String query = "select 'ó','ó' from dual";

       ps = conn.prepareStatement(query);

       rs = ps.executeQuery();
       rs.next();
       System.out.println("FIELD1:"+rs.getString(1));
       System.out.println("FIELD2:"+rs.getString(2));
   }
   catch (Exception e)
   {
       e.printStackTrace();
   }
   finally
   {
       if (rs != null) {
           try
           {
               rs.close ();
           }
           catch (Exception e) { /* ignore close errors */ }
       }

       if (ps != null) {
           try
           {
               ps.close ();
           }
           catch (Exception e) { /* ignore close errors */ }
       }
       if (conn != null)
       {
           try
           {
               conn.close ();
           }
           catch (Exception e) { /* ignore close errors */ }
       }
   }

}


If I execute the program passing as args[0] the 'ó' value...

this is the output with UTF-8 file.encoding:

character_set_client-->utf8
character_set_connection-->utf8
character_set_database-->utf8
character_set_filesystem-->binary
character_set_results-->utf8
character_set_server-->latin1
character_set_system-->utf8
character_sets_dir-->/usr/local/mysql51/share/mysql/charsets/
-----------------------------
ARGS[0]:ó
-----------------------------
JVM DEFAULT CHARSET:UTF-8
-----------------------------
JVM file.encoding:UTF-8
-----------------------------
FIELD1:ó
FIELD2:ó

And this is the output using ISO-8859-1

character_set_client-->utf8
character_set_connection-->utf8
character_set_database-->utf8
character_set_filesystem-->binary
character_set_results-->utf8
character_set_server-->latin1
character_set_system-->utf8
character_sets_dir-->/usr/local/mysql51/share/mysql/charsets/
-----------------------------
ARGS[1]:¦
-----------------------------
JVM DEFAULT CHARSET:ISO-8859-1
-----------------------------
JVM file.encoding:ISO-8859-1
-----------------------------
FIELD1:¦
FIELD2:ó

Thanks in advance

Hello,My problem has been solved changing the character_set_server variable to utf8. It seems like character_set_server value would prevail over driver settings.Thanks at all
That's what I suspected.
erickson