Hi,

I'm working on a simple Windows service that reads CSV data and transfers it to MS SQL Server. The CSV contains non-Unicode chars (ÇçŞşİıÖöÜüĞğ). ODBC doesn't transfer them with the right encoding. I also tried copying the data without ODBC, by reading the file as text, but the encoding isn't right that way either. The text encoding is right when I debug the service, though. CSV read code:

string[] csvLines = File.ReadAllLines(csvFile, Encoding.GetEncoding(new System.Globalization.CultureInfo("tr-TR").TextInfo.ANSICodePage));

The service runs as LocalSystem, and I can't change the "System Locale" for non-Unicode programs; I have to keep the system locale as English (United States).

A: 

From memory, creating a new culture in the way you're doing it will use system defaults (i.e. "en-US" in your case).

So, rather than creating a new CultureInfo use the pre-cached one:

CultureInfo.GetCultureInfo("tr-TR")

It works when you debug because the code is running as you, not LocalSystem, and I assume your locale is Turkish.


Edit: Oops, should have been GetCultureInfo instead of GetCulture.

This works on my machine in a console app:

Console.WriteLine("en-US: {0}",
        CultureInfo.GetCultureInfo("en-US").TextInfo.ANSICodePage);
Console.WriteLine("tr-TR: {0}",
        CultureInfo.GetCultureInfo("tr-TR").TextInfo.ANSICodePage);

Outputs 1252 and 1254.
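
To illustrate why the code page number matters, here is a minimal sketch that decodes the same raw bytes with both code pages. The class name CodePageDemo and the sample bytes are made up for illustration, and the RegisterProvider line is only an assumption about the runtime: it is needed on .NET Core / .NET 5+, while on .NET Framework code page 1254 is built in and the line can be dropped.

```csharp
using System;
using System.Text;

static class CodePageDemo
{
    // Decodes raw bytes with an explicit Windows code page,
    // independent of the system locale or the account the code runs as.
    public static string Decode(byte[] bytes, int codePage)
    {
        // Needed on .NET Core / .NET 5+ only; remove on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage).GetString(bytes);
    }

    public static void Main()
    {
        // The same four bytes, interpreted two ways.
        byte[] raw = { 0xDE, 0xFE, 0xD0, 0xF0 };
        Console.WriteLine(Decode(raw, 1254)); // Turkish:          ŞşĞğ
        Console.WriteLine(Decode(raw, 1252)); // Western European: ÞþÐð
    }
}
```

If the wrong characters in the database look like the second line, the bytes were probably decoded as 1252 somewhere along the way.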

devstuff
I also tried CultureInfo.GetCulture("tr-TR"), but it doesn't work either :( Do you have any other ideas?
mrt
Another thought: if the file might also be encoded using UTF-8, use StreamReader.ReadLine, using the StreamReader constructor version with the detectEncodingFromByteOrderMarks parameter set to true.
devstuff
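
A rough sketch of the StreamReader suggestion above, assuming the fallback for files without a byte order mark should be the Turkish ANSI code page 1254. The class name AnsiOrBomReader is hypothetical, and the RegisterProvider line applies only to .NET Core / .NET 5+ (drop it on .NET Framework).

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

static class AnsiOrBomReader
{
    // Reads all lines from a file. If the file starts with a byte order
    // mark (e.g. UTF-8 with BOM), that encoding is used; otherwise the
    // bytes are decoded with code page 1254.
    public static List<string> ReadLines(string path)
    {
        // Needed on .NET Core / .NET 5+ only; remove on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding fallback = Encoding.GetEncoding(1254);

        var lines = new List<string>();
        // Third argument is detectEncodingFromByteOrderMarks.
        using (var reader = new StreamReader(path, fallback, true))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }
}
```

Note that BOM detection cannot recognize an ANSI file or a UTF-8 file saved without a BOM; in those cases the fallback encoding decides the result.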
The file encoding is ANSI. detectEncodingFromByteOrderMarks doesn't work either.
mrt
A: 

The file reading code looks fine and I believe the problem is in how you call the ODBC function. Looks like the encoding information is lost somewhere in between the calls.

Can you post a code snippet showing your ODBC calls? Thanks

Dmitry O_o
string ConnStr = @"Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=D:\some_dic\;Extensions=asc,csv,tab,txt;Persist Security Info=False";
OdbcConnection source = new OdbcConnection(ConnStr);
string sql = "SELECT * FROM [data.csv]";
OdbcCommand command = new OdbcCommand(sql, source);
source.Open();
OdbcDataAdapter adp = new OdbcDataAdapter(command);
DataTable tblRecords = new DataTable();
adp.Fill(tblRecords);
mrt
Sorry, I overlooked that you use ODBC for reading from your CSV file, not for writing into MS SQL. I meant to ask for a snippet of the code that actually writes into MS SQL, since I believe this is where the problem lies. It looks like you read the CSV data fine, but on the other end, in the MS SQL table, the encoding is incorrect, right? Thanks.
Dmitry O_o
(Just saw your comment below in another answer.) If the file is ANSI and contains non-ASCII chars (I bet you meant non-ASCII, not non-Unicode, in your original post, since technically there are no non-Unicode chars), then you need to know the code page the file is encoded with; otherwise your file is just a series of bytes.
Dmitry O_o