views:

719

answers:

1

Hey guys,

I am trying to import some data to SQL Server 2008 by means of Bulk Insert, but I've been getting a ton of conversion errors:

Msg 4864, Level 16, State 1, Line 1 Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 5902, column 2 (Type).

OK so first things first:

a) the data comes from a parser i built in C#. In this file, there are 4 columns, separated by tabs. The columns may or not be null. The collation is UTF-8. Here's an excerpt of it.

D00486  DBLinks PubChem 7847552
D00486  DBLinks LigandBox D00486
**D00487    Name  Pyridostigmine bromide (JP15/USP/INN)** -- WORKS
D00487  Name  Mestinon (TN)
D00487  Chemical 260.016 C9H13N2O2. Br
D00487  Target PATH:hsa00564(43)
D00487  Remark  Therapeutic category: 1239
D00487  Remark  ATC code: N07AA02
D00487  Pathway PATH: map07220 Cholinergic and anticholinergic drugs
D00487  DBLinks CAS 101-26-8
D00487  DBLinks PubChem 7847553
D00487  DBLinks DrugBank DB00545
D00487  DBLinks LigandBox D00487
**D00488    Name  Pyrimethamine (JAN/USP/INN)** -- DOES **NOT** WORK!
D00488  Name  Daraprim (TN)

The second to last row generates an error message like the one i've previously shown you. What really amazes me is that the other line i've placed on bold is quite similar yet does not generate any error whatsoever.

b) This is what i'm using to create the table:

CREATE TABLE [dbo].[KB] ([BEName] [nvarchar](1000) NOT NULL, [Type] [nvarchar](1000) NULL, [Name] [nvarchar](1000) NULL, [Value] [nvarchar](1000) NULL) ON [PRIMARY]

c) This is what i'm currently using to import the data from a file:

DECLARE @SQL nvarchar(4000)
SET @SQL = 'BULK INSERT KB FROM ''C:\KB.xml'' WITH (FirstRow = 1, FORMATFILE = ''C:\out_KB.fmt'')';
EXEC(@SQL)

I had to manually generate the fmt file because for some reason BCP could not connect to my database (still working out the basics on it...) so it may be possible that it's not as it should be. Either way, here are it's contents:

9.0
4
1 SQLCHAR 0 100 "" 1 BEName SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 500 "\t" 2 Type SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 0 500 "\t" 3 Name SQL_Latin1_General_CP1_CI_AS
4 SQLCHAR 0 500 "\n" 9 Value SQL_Latin1_General_CP1_CI_AS

As i've previously stated, i don't have much experience with BCP; initially, I was simply using

BULK INSERT KB FROM ''C:\KB.xml'' WITH (FirstRow = 1, RowTerminator = ''\n'' , FieldTerminator = ''\t'')

but it wouldn't work anyway. I'd still rather use this instead of the FMT file; I googled and tried a couple of tricks (such as swapping \n with \n\cr or introducing the ASCII code of the \n instead).

Any ideas? I can't say that i have'em, i'm exhausted right now, going to sleep until noon or something :S (it's 4 AM here now).

Cheers mates, thanks and advance Hal

PS: I think the samples are ok, but please be my guest and request what you may need at any time. PS2: Sorry for the wall of text ;)

+1  A: 

Non-XML bulk-insert files are very finicky. I looked at yours and don't see a problem, but it is easy to miss problems.

XML bulk-insert files are a lot easier to work with. Can you refactor your code to produce XML bulk insert files instead? I have also found that you must also use XML format files when using XML insert files, and BCP won't create that for you -- you have to write your XML format file yourself, but it's pretty easy.

John Dibling
Good idea, thank you.I'll have to implement it someday, right now I've managed to import the data through the task's GUI.Thanks again
Hal