Hi All,
I want to use BCP to load into a SQL Server 2005 table with an nvarchar field using a loader control file. As I understand it, SQL Server 2005 only supports UTF-16 (and I believe it is UTF-16 LE). The file is being output by a Java program. The way I have it currently set up is as follows:
An XML-format BCP loader file, created using the following command:
bcp test_table format nul -c -x -T -f test_table.xml -S server
A Java program using the following code to write the output:
File f = new File("from_java.txt");
String encoding = "x-UTF-16LE-BOM";
OutputStream os = new FileOutputStream(f);
OutputStreamWriter outputStreamWriter = new OutputStreamWriter(os, encoding);
String theString = "áááááLittle Endian, BOM\r\n";
outputStreamWriter.append(theString);
outputStreamWriter.flush();
outputStreamWriter.close();
Then using the following bcp command:
bcp test_table in from_java.txt -T -f test_table.xml -S server -error error.txt
What I get in the table is ÿþá and not áááááLittle Endian, BOM.
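For reference, here is a minimal sketch of how those characters could arise. It assumes bcp in -c mode reads the file as single-byte windows-1252 text, which is only a guess on my part and not something I have confirmed. The class and method names (BomCheck, decodeAsCp1252) are made up for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class BomCheck {
    // Hypothetical helper: builds the bytes of a UTF-16LE file with a leading
    // BOM (FF FE), then decodes those same bytes as windows-1252.
    static String decodeAsCp1252(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        bytes.write(0xFF); // first BOM byte
        bytes.write(0xFE); // second BOM byte
        byte[] body = text.getBytes(StandardCharsets.UTF_16LE);
        bytes.write(body, 0, body.length);
        return new String(bytes.toByteArray(), Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        // "\u00e1" is á; print the code point of each character a
        // single-byte reader would see.
        for (char c : decodeAsCp1252("\u00e1").toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

The BOM bytes come out as ÿ (U+00FF) and þ (U+00FE), and the first á as á followed by a NUL character (U+0000). That NUL might be why the value stops after ÿþá, but that part is speculation.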
I've tried a few different permutations of changing parameters:
- changing the way I generate the loader control file (using -n for native data instead of -c for character data...I think this may have something to do with it, but I didn't see any improvement in my inserted data)
- tried several different forms of the UTF-16 encoding, including big endian and little endian with no BOM, to no avail
- tried to output the BOM manually in the file, as I read somewhere that Microsoft tools really like to make use of BOM information
- looked into trying to output the file as UCS-2 (instead of UTF-16) as that is (apparently) what BCP is actually reading the file in as
- tried -w on the bcp import; this does work, but not in conjunction with a loader format file (is there a way to incorporate whatever magic tells BCP that the file is encoded in UTF-16 into the format file?)
- I can get it to work if I output the file in windows-1252 and specify that code page with the -C 1252 option to bcp when I load the file (but I don't want to do this, as I would lose information: UTF-16 can represent far more characters than 1252)
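To illustrate the information loss I am worried about with the 1252 workaround, here is a small sketch. The class and method names (Cp1252Loss, fitsInCp1252, roundTrip) are invented for this example, and U+4E2D is just an arbitrary character outside windows-1252.

```java
import java.nio.charset.Charset;

class Cp1252Loss {
    static final Charset CP1252 = Charset.forName("windows-1252");

    // True if every character of the text has a windows-1252 encoding.
    static boolean fitsInCp1252(String text) {
        return CP1252.newEncoder().canEncode(text);
    }

    // Encodes to windows-1252 and decodes again; unmappable characters
    // are replaced by the charset's replacement byte during encoding.
    static String roundTrip(String text) {
        return new String(text.getBytes(CP1252), CP1252);
    }

    public static void main(String[] args) {
        System.out.println(fitsInCp1252("\u00e1")); // á
        System.out.println(fitsInCp1252("\u4e2d")); // a CJK character
        System.out.println(roundTrip("\u4e2d"));
    }
}
```

The á survives, but the CJK character cannot be encoded and comes back as a question mark after the round trip, so anything outside 1252 would be silently lost on load.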
Has anyone managed to get bcp to load into an nvarchar field using UTF-16 data in conjunction with a loader format configuration file?
Thanks in advance,
-James