views:

248

answers:

4

So, here is the code which create the table in an Oracle 10g / UTF-8 database :

CREATE TABLE TEST_SEMANTIC
(
SEMANTIC_COLBYTE char(2 byte) ,
SEMANTIC_COLCHAR char(2 char)
);

meaning, that I use two differents types of semantic for the two columns, byte and char.

I then insert inside the database these corresponding data :

insert into test_semantic(SEMANTIC_COLBYTE,SEMANTIC_COLCHAR)
values('é','é');

So when I use the JDBC driver to query the database in a java program, and display the result, I expect an output like this :

Byte>é<
Char>é<

Whereas I get this :

Byte>é<
Char>é      <

When I query the database like this :

select dump(semantic_colbyte,16),dump(semantic_colchar,16) from test_semantic;

I get this :

Typ=96 Len=2: c3,a9 Typ=96 Len=3: c3,a9,20

Here is the java code :

public static void main(String[] args){
        Connection con = null;
        try {
            Class.forName("oracle.jdbc.driver.OracleDriver");

        } catch (java.lang.ClassNotFoundException e) {
            System.err.print("ClassNotFoundException:");
            System.err.println(e.getMessage());
        }

        try {
            Properties props = new Properties();
            props.put("user", "XXX");
            props.put("password", "XXX");

            con = DriverManager.getConnection("jdbc:oracle:thin:@xxx:1521:xxx", props);
            Statement stmt = (Statement) con.createStatement();
            stmt.execute("SELECT SEMANTIC_COLBYTE,SEMANTIC_COLCHAR FROM TEST_SEMANTIC"); 
            ResultSet result = stmt.getResultSet();
            result.next();
            String output_byte = result.getString(1);
            String output_char = result.getString(2);

        System.out.println("Byte>"+output_byte+"<");                
        System.out.println("Char>"+output_char+"<");

        } catch (SQLException ex) {
            System.err.println("SQLException: " + ex.getMessage());
        }
    }
+4  A: 

Don't forget to trim your values while using char. Or don't use char, use varchar2, until you are providing the exact sized value as the column size.

You might want to know why, so here it is.

Adeel Ansari
Thanks for your answer and the link. Actually, I know the difference between char and varchar2.And the corresponding string I store in database, is not variable, but predefined on 2 character, but in UTF-8 semantic (under Oracle).More precisely, my question is :By knowing the UTF-8 rule to convert non ascii char(http://fr.wikipedia.org/wiki/UTF-8), why does my java program don't give me a least :>é < with one blank char (the output I get by querying the database using sqlDevelopper for instance), but give me >é < 6 blanks ?
zor
Ah! I got your point. Sorry for misunderstanding. The value is from the driver actually. So, this the way driver implemented it. Can't be done much, trimming the value is your best bet.
Adeel Ansari
BTW, is your SQL client is also Java based?
Adeel Ansari
Yep, as far as I know, SqlDevelopper is java based, but I don't know which driver it uses.
zor
What jar are you using? Try ojdbc5 or ojdbc6. Or if nothing work then you might want to try OCI drivers. I suppose SQL Developer uses the same, as they are having platform specific downloads for SQL Developer, and OCI suits stand-alone apps.
Adeel Ansari
Just got to know that it uses Thin Driver by default (http://www.oracle.com/technology/obe/sqldev/plsql_debug/plsql_debug.htm). Anyways, try other latest drivers or the same driver used by SQL Developer. You might look for that into its `lib` directory.
Adeel Ansari
Good idea, I'm going to check if it can solve my problem, by using the same driver as Sql Developper
zor
Actually I use the ojdbc14.jar
zor
Exactly which version of ojdbc14.jar?
Alohci
Ok, the problem is the driver, ojdbc14.jar ==> when using this one I Get >é < instead of >é<When I use either ojdbc5.jar or ojdbc6.jar driver, it gets me the same results as in SqlDevelopper:>é <Thanks you for the answers
zor
+5  A: 

CHAR Datatype:

The CHAR datatype specifies a fixed-length character string. Oracle ensures that all values stored in a CHAR column have the length specified by size. If you insert a value that is shorter than the column length, then Oracle blank-pads the value to column length.

egorius
So why six spaces?
Alohci
+2  A: 

Have you already read the Oracle documentation on Oracle length semantics for character datatypes?

http://download.oracle.com/docs/cd/B19306%5F01/server.102/b14220/datatype.htm#sthref3787

David Aldridge
yes, of course, It seems to me that my question imply the understanding of Oracle length semantics.
zor
Just checkin', and others who find this question in search of an answer to a problem they've got have might not have read it.
David Aldridge
A: 

Which characterset is the database (and your session) actually in. Mine was in AL32UTF8 and wouldn't accept 'é' in a 2 byte CHAR field. In a 4 byte field, it went to Typ=96 Len=4: ef,bf,bd,20

A UTF-8 character can be four bytes and therefore the CHAR(2 char) can be up to eight bytes. So I could understand a string of length 8 coming out. Seven is a bit odd, almost like it was told the first character is three bytes and the second character can be up to four.

Can you play with ResultSetMetaData (eg getColumnDisplaySize, getColumnTypeName) and see what comes out.

Gary
Not seven bytes - seven characters. The é will be two bytes. The remaining 6 bytes are padding - space characters taking one byte each.Given that the dump function shows only 3 bytes, though, this does look like a problem with the JDBC driver as Vinegar has suggested.
Alohci
Don't know how you got to ef bf bd. That's U+FFFD, the replacement character for characters that are unknown or unrepresentable.
Alohci
My database is in UTF-8 charset
zor