I would like to use SSIS to export database tables to flat files and then import them into various distributed SQL Server Express installs. As you may guess, SSIS is not available in the Express edition. When I do the bulk insert, it errors with:

Msg 4866, Level 16, State 8, Line 1 The bulk load failed. The column is too long in the data file for row 1, column 5. Verify that the field terminator and row terminator are specified correctly.

The generated file is ASCII (code page 1252) with char(09) (tab) between fields and char(0D) char(0A) ending each line (I think that is \r\n).

This is the format file (the 4th column is the field width):

9.0
5
1 SQLCHAR 0 12 "\t" 1 Record_ID ""
2 SQLCHAR 0 498 "\t" 2 Filename SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 0 498 "\t" 3 Path SQL_Latin1_General_CP1_CI_AS
4 SQLCHAR 0 12 "\t" 4 Width ""
5 SQLCHAR 0 12 "\r\n" 5 Height ""

The sample data starting from the top of the file:

Record_ID Filename Path Width Height
1437 BreadCakeCooknPie.eps G:\BakedGoods\BreadCakeCooknPie.eps 595 647
1438 CakeChocolateRoses.eps G:\BakedGoods\CakeChocolateRoses.eps 1200 848

I import it with the following T-SQL code:

bulk insert item_table from 'Item_Table.txt' with
( FORMATFILE='Item_Table.fmt', FIRSTROW=2)

The table fields are:

[Record_ID] [int] NULL,
[Filename] [nvarchar](249) NULL,
[Path] [nvarchar](249) NOT NULL,
[Width] [int] NULL,
[Height] [int] NULL

Any write-ups available? Any idea how to fix the error? Any idea how to grab the Format File created by SSIS? Any other ideas?

Thank you so much for considering my question.

A: 

Have you considered running your SSIS export in a pipe- or comma-delimited format? There could be issues with how the tab character is getting processed between export and import.

Nothing you have laid out here points to a problem with SSIS or BULK INSERT itself; it seems to be something in the format of the file.
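
For example, a delimiter-only import that skips the format file entirely might look like the sketch below. The pipe delimiter, the CODEPAGE option, and the file name are illustrative assumptions, not part of the original setup:

-- Hypothetical pipe-delimited import without a format file
bulk insert item_table
from 'Item_Table.txt'
with
(
    FIELDTERMINATOR = '|',
    ROWTERMINATOR = '\r\n',
    CODEPAGE = '1252',
    FIRSTROW = 2
)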

Dillie-O
I tried switching tabs to pipes and the EOL to semicolons. It complains that the data is too long for the field. I tried both Unicode and 1252 with the same result. Maybe the trouble is that the INT stored as 10 characters in the text file does not translate to the 4-byte int field?
Dr. Zim
Hmm, that is rather perplexing. I would recommend keeping the row terminator (eol) the same as before (\r\n). See if that helps.
Dillie-O
+1  A: 

Is this failing on the source portion of the data flow? Can you post the metadata relating to the source? I have found that occasionally with flat files SSIS does not estimate the sizes or data types correctly on the flat file connection. When this happens, a data flow task will fail on the source portion of the import as soon as it encounters data that exceeds the length the flat file connection expected in that position.

Is this failing on the destination portion of the data flow? If so, does it at least work if you use a data-reader as the destination instead of the actual target table? I'm guessing your problem is earlier in the pipeline, but if it is making it past the source portion of the data flow, then you may be able to at least isolate the problem as a destination issue.

Please let us know where it is actually breaking in the data flow.

Registered User
It breaks when I do the "Bulk Insert" from a query window. Nothing in SSIS shows any errors at all. Oddly, if I use BCP to dump the table to disk, including the format file, then try to bulk insert it back, it fails with the same error.
Dr. Zim
I also tried exporting all fields in the table, since SQL Server 2005 SP1 has a bug where bcp and bulk insert fail with a format file that specifies fewer fields than the target table. I also checked that none of the metadata is Unicode.
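(A common workaround for that fewer-fields limitation, sketched here with placeholder file names, is to read the file through OPENROWSET(BULK ...) with the format file and insert only the columns you need; the column names come from the format file:)

-- Hypothetical column-skipping import via OPENROWSET(BULK ...)
insert into item_table (Record_ID, [Filename], [Path], Width, Height)
select Record_ID, [Filename], [Path], Width, Height
from openrowset(bulk 'Item_Table.txt', formatfile = 'Item_Table.fmt', firstrow = 2) as src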
Dr. Zim
I also tried removing the header row.
Dr. Zim
Your answer led me to remake all the metadata, which led me to the solution listed below.
Dr. Zim
A: 

This is a partial answer. Apparently you can start with BCP, have it generate a format file and a text data dump, then use that format file and match the text export exactly in SSIS. The physical field order in the data file (even though you can reorder fields in the format file) and having no missing fields (even ones you don't need) seem to be the key.
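
The bcp commands for that approach might look roughly like this; the database name, instance name, and trusted-connection switch are assumptions for illustration:

rem Generate a non-XML format file for the table (character mode, tab fields, CRLF rows)
bcp MyDb.dbo.item_table format nul -c -t"\t" -r"\r\n" -f Item_Table.fmt -S .\SQLEXPRESS -T

rem Export the table data in the same character format
bcp MyDb.dbo.item_table out Item_Table.txt -c -t"\t" -r"\r\n" -S .\SQLEXPRESS -T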

I had to manually make the text export metadata match the format file (I hand-typed the metadata under Columns in the text file connection manager). This made the physical order of fields in the text file match the physical order in the table (before, I used the column ordering in the format file to match them, which didn't seem to work).

The content exported from SSIS is still not identical, but close enough to import. For example, when bcp exported the data, it put .000 after date fields and .0 after numeric fields (float, etc.).

My final format file that worked is below. Notice that all the fields are SQLCHAR even though some are Unicode, that all the sizes look wrong (the int is 10 digits but the file says 12, and the text columns are length 249 but the file says 498), and that I had to put all the fields in, not just the ones I wanted to import (it wasn't skipping fields correctly).

Crazy world. Any ideas why this would be the case?

9.0
29
1 SQLCHAR 0 12 "\t" 1 Record_ID ""
2 SQLCHAR 0 498 "\t" 2 Filename SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 0 498 "\t" 3 Path SQL_Latin1_General_CP1_CI_AS
4 SQLCHAR 0 498 "\t" 4 Extension_Win SQL_Latin1_General_CP1_CI_AS
5 SQLCHAR 0 498 "\t" 5 Short_Filename_Win SQL_Latin1_General_CP1_CI_AS
6 SQLCHAR 0 498 "\t" 6 Volume SQL_Latin1_General_CP1_CI_AS
7 SQLCHAR 0 24 "\t" 7 Created ""
8 SQLCHAR 0 24 "\t" 8 Last_Modified ""
9 SQLCHAR 0 24 "\t" 9 Cataloged ""
10 SQLCHAR 0 24 "\t" 10 Last_Updated ""
11 SQLCHAR 0 12 "\t" 11 File_Size ""
12 SQLCHAR 0 498 "\t" 12 Mac_File_Type SQL_Latin1_General_CP1_CI_AS
13 SQLCHAR 0 498 "\t" 13 Mac_Creator SQL_Latin1_General_CP1_CI_AS
14 SQLCHAR 0 498 "\t" 14 Mac_Zone SQL_Latin1_General_CP1_CI_AS
15 SQLCHAR 0 12 "\t" 15 Thumbnail_Size ""
16 SQLCHAR 0 12 "\t" 16 Color_Mode ""
17 SQLCHAR 0 30 "\t" 17 Horizontal_Resolution ""
18 SQLCHAR 0 30 "\t" 18 Vertical_Resolution ""
19 SQLCHAR 0 12 "\t" 19 Width ""
20 SQLCHAR 0 12 "\t" 20 Height ""
21 SQLCHAR 0 12 "\t" 21 MultipageCount ""
22 SQLCHAR 0 12 "\t" 22 PlaceHolder ""
23 SQLCHAR 0 12 "\t" 23 Watermarked ""
24 SQLCHAR 0 12 "\t" 24 FileStoreID ""
25 SQLCHAR 0 498 "\t" 25 Directory_Path SQL_Latin1_General_CP1_CI_AS
26 SQLCHAR 0 12 "\t" 26 RID ""
27 SQLCHAR 0 498 "\t" 27 Cataloged_By SQL_Latin1_General_CP1_CI_AS
28 SQLCHAR 0 498 "\t" 28 Updated_By SQL_Latin1_General_CP1_CI_AS
29 SQLCHAR 0 0 "\r\n" 29 File_Description SQL_Latin1_General_CP1_CI_AS
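
The import itself was then just the original statement pointed at this full-table format file (the format file name below is a placeholder, and FIRSTROW=2 assumes the header row is still present):

bulk insert item_table from 'Item_Table.txt' with
( FORMATFILE='Item_Table_full.fmt', FIRSTROW=2)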

Dr. Zim
Although not a good answer: with command-line bcp I export a format file and the actual data. I set up the SSIS text file columns (3rd tab) exactly like the format file. Then I set up the SSIS metadata to match the bcp-exported data. Everything imports well; some data gets truncated, like seconds on a date.
Dr. Zim
Cool! Congratulations!
Registered User