How to read Unicode text from a Java ResultSet?

+3  A: 

rs.getString() returns a Java String, which is Unicode by definition.

If you get mangled characters, you have to configure your database driver to use the right encoding for the connection to the database.
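
As a rough sketch of what that driver configuration can look like, assuming MySQL Connector/J (the question does not say which database is in use): that driver accepts a characterEncoding connection property. The class name, credentials, and URL below are placeholders, and other drivers use different property names or need no setting at all.

```java
import java.util.Properties;

class EncodingConfigSketch {
    /**
     * Builds connection properties that ask the driver to use UTF-8.
     * "characterEncoding" is a MySQL Connector/J property; this is an
     * assumption about the driver, not something stated in the question.
     */
    static Properties utf8Properties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("characterEncoding", "UTF-8");
        return props;
    }

    public static void main(String[] args) {
        Properties props = utf8Properties("app", "secret"); // placeholder credentials
        // With a driver on the classpath, the properties would be passed like this
        // (hypothetical URL):
        // Connection conn = DriverManager.getConnection("jdbc:mysql://localhost:3306/mydb", props);
        System.out.println(props.getProperty("characterEncoding"));
    }
}
```

The same setting can usually be appended to the JDBC URL as a query parameter instead of being passed as a Properties entry.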

Daniel
+2  A: 

Just read the strings. All strings in Java are Unicode already. If you're having problems, then:

  • It could be a diagnostic problem - you may be reading the right data out of the ResultSet but displaying it so it looks like you haven't read it properly
  • It could be a configuration problem - there may be something you need to do when connecting to the database so that it determines the right encoding to use
  • It could be a database problem - the database may not be configured to store full Unicode data
  • It could be a database schema problem - the particular column you're using may be configured using a column type which doesn't support full Unicode
  • It could be a problem in the data, e.g. with another program incorrectly submitting data.

I've seen all of these before now. You should use detailed logging (e.g. of the individual characters, in hex) to work out whether you've got the data correctly or not - that will tell you where to look next.
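
The hex logging suggested above might look something like the following minimal sketch. The class and method names are made up for illustration, and the sample value stands in for whatever rs.getString(...) returns.

```java
class HexDump {
    /** Returns each code point of s as U+XXXX, separated by spaces. */
    static String toCodePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // In real use this would be the value read from the ResultSet.
        String value = "caf\u00e9";
        // Logging code points avoids depending on the console's own encoding.
        System.out.println(toCodePoints(value));
    }
}
```

If a character you expected as U+00E9 shows up as U+00C3 U+00A9, the UTF-8 bytes were decoded as Latin-1 somewhere before the String was built, which points at the connection or database configuration. If the code points are correct but the output still looks wrong, the problem is in how the text is displayed.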

Jon Skeet
+1  A: 

If you are using a DataSource (e.g. com.mysql.jdbc.jdbc2.optional.MysqlDataSource), you can set the connection encoding to UTF-8 directly: ds.setEncoding("UTF-8")

Xorty