views:

2189

answers:

2

Hello, I can't seem to figure out how this is happening.

Here's an example of the file that I'm attempting to bulk insert into SQL server 2005:

***A NICE HEADER HERE***
0000001234|SSNV|00013893-03JUN09
0000005678|ABCD|00013893-03JUN09
0000009112|0000|00013893-03JUN09
0000009112|0000|00013893-03JUN09

Here's my bulk insert statement:

BULK INSERT sometable
FROM 'E:\filefromabove.txt
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR= '|',
ROWTERMINATOR = '\n'
)

But, for some reason the only output I can get is:

0000005678|ABCD|00013893-03JUN09
0000009112|0000|00013893-03JUN09
0000009112|0000|00013893-03JUN09

The first record always gets skipped, unless I remove the header altogether and don't use the FIRSTROW parameter. How is this possible?

Thanks in advance!

+2  A: 

Maybe check that the header has the same line-ending as the actual data rows (as specified in ROWTERMINATOR)?

Update: from MSDN:

The FIRSTROW attribute is not intended to skip column headers. Skipping headers is not supported by the BULK INSERT statement. When skipping rows, the SQL Server Database Engine looks only at the field terminators, and does not validate the data in the fields of skipped rows.

Marc Gravell
Hi Mark, yes unfortunately each of the rows have a CRLF. Thanks for the input, though.
gibbo
+1  A: 

I don't think you can skip rows in a different format with BULK INSERT/BCP.

When I run this:

TRUNCATE TABLE so1029384

BULK INSERT so1029384
FROM 'C:\Data\test\so1029384.txt'
WITH
(
--FIRSTROW = 2,
FIELDTERMINATOR= '|',
ROWTERMINATOR = '\n'
)

SELECT * FROM so1029384

I get:

col1                                               col2                                               col3
-------------------------------------------------- -------------------------------------------------- --------------------------------------------------
***A NICE HEADER HERE***
0000001234               SSNV                                               00013893-03JUN09
0000005678                                         ABCD                                               00013893-03JUN09
0000009112                                         0000                                               00013893-03JUN09
0000009112                                         0000                                               00013893-03JUN09

It looks like it requires the '|' even in the header data, because it reads up to that into the first column - swallowing up a newline into the first column. Obviously if you include a field terminator parameter, it expects that every row MUST have one.

You could strip the row with a pre-processing step. Another possibility is to select only complete rows, then process them (exluding the header). Or use a tool which can handle this, like SSIS.

Cade Roux
You are correct! When I add '||' to the end of the header, it works fine. I think that I'm going to attempt to strip the header out of each file I'm inserting. Thanks!
gibbo