views: 5003

answers: 6

Has anyone been able to get a variable record length text file (CSV) into SQL Server via SSIS?

I have tried time and again to get a CSV file into a SQL Server table, using SSIS, where the input file has varying record lengths. For this question, the two different record lengths are 63 and 326 bytes. All record lengths will be imported into the same 326 byte width table.

There are over 1 million records to import.
I have no control of the creation of the import file.
I must use SSIS.
I have confirmed with MS that this has been reported as a bug. I have tried several workarounds, most of which involve writing custom code to intercept the record, and I can't seem to get that to work as I want.
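One common workaround is to preprocess the file so every record has the full set of columns before SSIS ever sees it. A minimal sketch of that idea in Python (the column count and file names are illustrative placeholders, not taken from the question):

```python
# Pad short CSV records so every row has the same number of columns
# before handing the file to SSIS. EXPECTED_COLUMNS is a placeholder.
EXPECTED_COLUMNS = 10

def pad_record(line, expected=EXPECTED_COLUMNS):
    """Append empty fields until the record has the expected column count."""
    fields = line.rstrip("\r\n").split(",")
    fields += [""] * (expected - len(fields))
    return ",".join(fields)

def normalize_file(src, dst, expected=EXPECTED_COLUMNS):
    """Write a copy of src where every record has a uniform column count."""
    with open(src, "r") as fin, open(dst, "w", newline="") as fout:
        for line in fin:
            fout.write(pad_record(line, expected) + "\r\n")
```

With every row padded to the same width, the Flat File Source no longer runs past the short records' CRLF terminators.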

A: 

Why can't you just import it as a text file and set the column delimiter to "," and the row delimiter to CRLF?

Charles Graham
Basically, when SSIS encounters a short record in the import file, it continues importing into the current row of the SQL Server table. It SHOULD begin importing the very next byte of the file into a brand-new row in the table. It totally mangles the data. Even the CRLF itself gets imported!
Optimal Solutions
+1  A: 

I used BULK INSERT with a little programming to clean up the files first, and it gave me the same speed as the SSIS package but more control over the import.
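The answer doesn't show the code, but the idea is: fix the file up in code, then let BULK INSERT do the heavy lifting. A hedged sketch of composing that statement (the table name, file path, and terminators below are assumptions, not the answerer's actual values):

```python
# Sketch: after pre-processing the file, hand it to SQL Server's
# BULK INSERT. Table name, path, and WITH options are placeholders.
def build_bulk_insert(table, path):
    """Compose a BULK INSERT statement for the cleaned-up file."""
    return (
        f"BULK INSERT {table} FROM '{path}' "
        "WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\\r\\n', FIRSTROW = 1)"
    )
```

The statement would then be executed over an ordinary database connection (e.g. pyodbc or sqlcmd); because the file has already been normalized, BULK INSERT's fixed terminators are enough.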

Gripsoft
+1  A: 

You should have no problem importing this file. Just make sure that when you create the Flat File connection manager, you select the Delimited format, then set the SSIS column length to the maximum file column length so it can accommodate any data.

It sounds like you are using the Fixed width format, which is not correct for CSV files (since you have variable-length columns), or maybe you've set the column delimiter incorrectly.

Michael
I had selected Delimited and set the length to the max record length, with no luck. I had to give up on it and write custom import code in C#, processing record by record.
Optimal Solutions
+4  A: 

I had a similar problem, and used custom code (Script Task), and a Script Component under the Data Flow tab.

I have a Flat File Source feeding into a Script Component. Inside it I use code to manipulate the incoming data and fix it up for the destination.

My issue was that the provider was using '000000' to mean "no date available", and another column had a padding/trim issue.
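The Script Component itself would be C# or VB.NET inside SSIS, but the per-row cleanup logic is simple enough to sketch; the field names and the '000000' sentinel handling below follow the answer, while the None mapping is an assumption about the desired output:

```python
# Sketch of the per-row cleanup a Script Component would perform.
# Field roles are assumptions based on the answer's description.
def clean_row(date_field, padded_field):
    """Map the provider's '000000' no-date sentinel to None and trim padding."""
    date = None if date_field == "000000" else date_field
    return date, padded_field.strip()
```

In SSIS this logic lives in the component's ProcessInputRow method, applied to each row as it flows from the Flat File Source to the destination.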

D.S.
+2  A: 

I think this was a problem with DTS / SQL Server 2000 as well. It pains me that they haven't corrected it. The learning curve, as it is, is steep enough for me!

+1  A: 

Same issue. In my case, the target CSV file has header and footer records with formats completely different from the body of the file; the header/footer are used to validate completeness of file processing (date/times, record counts, amount totals: a "checksum" by any other name). This is a common format for files from "mainframe" environments, and though I haven't started on it yet, I expect to have to use scripting to strip off the header/footer, save the rest as a new file, process the new file, and then do the validation. Can't exactly expect MS to have that out-of-the-box (but it sure would be nice, wouldn't it?).
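The strip-and-validate step described above can be sketched as follows; the footer layout (a trailer record carrying the body record count) is an assumption for illustration, since the actual mainframe format isn't given:

```python
# Sketch: split a mainframe-style file into header, body, and footer,
# then validate the body record count against the footer.
# The "TRL,<count>" trailer layout is an assumed example format.
def split_and_validate(lines):
    """Return (header, body, footer); raise if the record count is off."""
    header, body, footer = lines[0], lines[1:-1], lines[-1]
    expected = int(footer.split(",")[1])  # assumed trailer field
    if len(body) != expected:
        raise ValueError(f"expected {expected} records, got {len(body)}")
    return header, body, footer
```

The body would then be written out as a new file for SSIS (or BULK INSERT) to process, with the same approach extendable to amount totals and date/time checks from the header.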

It would be much simpler to write custom code (Script Task) to process the file *IF* SSIS didn't keep importing the next record as the current record, with the actual CR and LF characters being imported as well! I never checked back, as this was just not possible last year. Maybe it's been corrected, but I have moved on from this for now. If I revisit it I will post what I find above.
Optimal Solutions