tags:

views:

205

answers:

3

We are using Java and Oracle for development.

I have a table in an Oracle database with a CLOB column. Some XYZ application dumps a text file into this column. The text file has multiple lines.

Is it possible that, when a Java application reads this CLOB, the escape sequences (newline characters, etc.) get lost?

The reason I ask is that we are going to parse this file line by line, and if the escape sequences are lost, we would be in trouble. I would have done this analysis myself, but I am on vacation and my team needs urgent help.

Would really appreciate if you could provide any thoughts/inputs.

+1  A: 

A CLOB stores character data. Carriage returns and line feeds are valid characters, though unprintable ones. As long as your XYZ app is correctly filling your CLOBs, the contents should be just as manageable to you as if they had come from the file.

Depending on the platform and the nature of said "XYZ app," lines could be separated by \r (Mac), \r\n (DOS/Windows), or \n (Unix/Linux), and you should allow for this if necessary. This is one place where BufferedReader.readLine() is convenient, as it transparently handles all three line terminators for you.
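As a minimal sketch of that point: the hypothetical helper below (the class name `ClobLineReader` and the method `readLines` are made up for illustration) splits any `Reader` into lines with `BufferedReader.readLine()`. With JDBC, the `Reader` would typically come from `Clob.getCharacterStream()`; a `StringReader` with mixed terminators stands in for it here so the example runs without a database.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class ClobLineReader {
    // Splits character data into lines. readLine() treats \n, \r, and \r\n
    // as line terminators and strips them from the returned strings.
    static List<String> readLines(Reader source) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: in real code this Reader would come from
        // resultSet.getClob(...).getCharacterStream().
        Reader mixed = new StringReader("first\r\nsecond\nthird\rfourth");
        System.out.println(readLines(mixed)); // prints [first, second, third, fourth]
    }
}
```

Because the terminators are stripped, the parsing code does not need to know which platform produced the file.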

Carl Smotricz
A minor clarification: Mac OS 1-9 used \r; Mac OS X uses \n.
trashgod
A: 

I'm not 100% sure what you mean by escape sequences in this context. Within a Java string literal, for example, "\n" is an escape sequence representing a newline, but once that string is written somewhere (say, to a database), it's not an escape sequence any more; it's an actual newline character.

Anyhow, to answer your direct question: Java can read text from Oracle CLOBs perfectly fine. Newlines are not lost.
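A small sketch of the distinction, using a made-up class name `EscapeDemo`: the compiler turns the two-character escape `\n` in source code into one line feed character, whereas `\\n` stays as a literal backslash followed by the letter n.

```java
class EscapeDemo {
    // Counts actual line feed characters (code point 10) in the text.
    static int lineFeedCount(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\n') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String realNewline = "a\nb";   // three characters: 'a', line feed, 'b'
        String backslashN = "a\\nb";   // four characters: 'a', '\', 'n', 'b'

        System.out.println(realNewline.length());        // prints 3
        System.out.println(backslashN.length());         // prints 4
        System.out.println(lineFeedCount(realNewline));  // prints 1
        System.out.println(lineFeedCount(backslashN));   // prints 0
    }
}
```

Only the first form is a line break that a line-by-line parser will see, so it matters which of the two the writing application actually stored.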

Dan
+2  A: 

You need to ensure that you use one and the same correct character encoding throughout the whole process. I strongly recommend picking UTF-8 for that. It can encode every character in Unicode. Every step that handles character data should be instructed to use that very same encoding.

In the SQL context, ensure that the DB and table are created with a UTF-8 charset. In the JDBC context, ensure that the JDBC driver is using UTF-8; this is often configurable in the JDBC connection string. In the Java code context, ensure that you're using UTF-8 when reading/writing character data from/to streams; you can specify it as the second constructor argument of InputStreamReader and OutputStreamWriter.
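For the Java side, a rough sketch of passing the charset as the second constructor argument might look like this. The class name `Utf8RoundTrip` and its methods are hypothetical, and in-memory byte arrays stand in for whatever file or network streams are really involved.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class Utf8RoundTrip {
    // Writes text to bytes with an explicit UTF-8 encoding.
    static byte[] encode(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            writer.write(text);
        }
        return bytes.toByteArray();
    }

    // Reads bytes back into text with the same explicit UTF-8 encoding.
    static String decode(byte[] data) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            char[] buffer = new char[1024];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                text.append(buffer, 0, read);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        String original = "h\u00e9llo\r\nw\u00f6rld\n"; // non-ASCII letters plus line breaks
        String restored = decode(encode(original));
        System.out.println(original.equals(restored)); // prints true
    }
}
```

Omitting the charset argument makes both classes fall back to the platform default encoding, which is where mismatches between machines tend to creep in.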

BalusC