ansaurus

Question

Trouble using/displaying special characters from Oracle db in .Net app

Answer 1

+2 A:

Certain characters in the WE8ISO8859P1 character set have a different binary representation than the same character in UTF8.

What I suggest are 2 possible ways

1) Try using Oracle native data providers for .NET (ODP.NET). May be there is a bug/feature in Microsoft's library System.Data.OracleClient that this adapter do not automatically support converting WE8ISO8859P1 to unicode. Here is a link to ODP.NET

I hope that there will be a support for this encoding in ODP (but to say true I never checked this, it is only a suggestion)

2) Workaround: in Dataset, you should create a binary field (mapped to the original table field) and a String field (not mapped to the database). When you load data to the dataset, iterate for each row and perfrom convertion from binary array to string.

Code should be something like this

Encoding e = Encoding.GetEncoding("iso-8859-1");
foreach(DataRow row in dataset.Tables["MyTable"])
{
    if (!row.IsNull("MyByteArrayField"))
        row["MyStringField"] = e.GetString((row["MyByteArrayField"] as byte[]));
}

Bogdan_Ch 2009-08-03 14:26:26

Bogdan, thanks for your suggestions! I tried both paths, and while I don't have a solution yet, I do have more information. We are reading the strings as non-Unicode, and the special characters in question are coming through as three-byte characters. Depending on how we translate the string, the three bytes get interpreted any number of ways.The DB link to the remote database went down yesterday afternoon, and we're still working to fix that. When I have access again, I plan to try to read the string as unicode and attempt to translate the special character again.

RobLinx 2009-08-04 14:53:17

@RobLinx thanks for accepting my answer, I appreciate this :) However I need to inform other possible reader of this question that resolution was different from mine (and you pointed it out in your comments). Perhaps you can answer this question yourself and accept your own answer and earn a "Self Learner" badge if other users will vote :).

Bogdan_Ch 2009-08-05 09:47:40

Answer 2

A:

Postscript for anyone browsing this thread:

Bogdan was very helpful in getting me to the "answer" (such as it is) but as he points out, you might not have identical circumstances.

We communicated with the team responsible for using the commercial software. They had been copying/pasting from Word and Excel, which is how the special characters had been getting inserted.
The problem occurred in the translation of the character between the remote database and our database. Host database uses character set WE8ISO8859P1, where ours uses WE8MSWIN1252. Due to corporate-level concerns, modifying either character set is not feasible right now.
I used SYS.UTL_RAW.CAST_TO_RAW(fieldname) to convert the source field to search for 'BF' (the hex code for an inverted question mark in our character set). This at least let me identify the problem record and character. HOWEVER, many different special characters on the remote records would/could be translated to BF. For example, Word's hyphens are not simple "dash" characters, and also get translated to the inverted question mark.
dump(fieldname) somehow converts to decimal character codes BEFORE the translation, UNLESS I also used the SYS.UTL_RAW.CAST_TO_RAW in the same query. This caused amazing headaches. dump() by itself could be useful in identifying specific pretranslated characters from the source db.

Best solution would be to use the same character set on both dbs. Since that's not possible for us, we have manually replaced all occurrences of the special character on the source (remote) db with non-special equivalents (regular apostrophe or hyphen). However, since the commercial software doesn't correct or flag special characters, we may run into this problem in the future. So, our update application will scan for the inverted question mark and send a notification to the system owner with the ID of the bad record. This, like so many other corporate situations, will have to do. ;-)

Thanks again, Bogdan!

RobLinx 2009-08-05 15:13:22

ansaurus

tags:

views:

answers:

Trouble using/displaying special characters from Oracle db in .Net app

related questions