I have a CSV file that uses double quotes as text qualifiers. Most of the 90,000 rows are fine, but a few rows have a text field that contains both a quote and a comma. For example, the field's value would be:

AB",AB

When delimited, this becomes

"AB"",AB"

When SQL Server 2005 attempts to import this, I get errors such as:

Messages
Error 0xc0202055: Data Flow Task: The column delimiter for column "Column 4" was not found.
 (SQL Server Import and Export Wizard)
This only seems to happen when a quote and a comma appear together in a text value. Values like AB"AB, which becomes "AB""AB", or AB,AB, which becomes "AB,AB", work fine. Here are some example rows:
"1464885","LEVER WM","","B","MP17"
"1465075",":PLT-BC   !!NOTE!!","","B",""
"1465076","BRKT-STR MTR            !NOTE!","","B",""
"1465172",":BRKT-SW MTG   !NOTE!","","B","MP16"
"1465388","BUSS BAR                !NOTE!","","B","MP10"
"1465391","PLT-BLKHD     ""NOTE""","","B","MP20"
"1465564","SPROCKET:13TEETH,74MM OD,66MM","ID W/.25"" SETSCR","B","MP6"
"S01266330002","CABLE:224"",E122/261,8 CO","","B","MP11"

The last row is an example of the problem - the "", causes the error.
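To confirm that the file itself is well-formed and that the trouble lies in how the importer handles a doubled quote followed by a comma, it can help to run a couple of the sample rows through a parser that follows the usual doubled-quote convention. The sketch below uses Python's standard `csv` module as a stand-in checker; it is only a sanity check, not part of the SSIS import.

```python
import csv
import io

# Two of the sample rows from the question, exactly as they appear in the file.
sample = (
    '"1465564","SPROCKET:13TEETH,74MM OD,66MM","ID W/.25"" SETSCR","B","MP6"\n'
    '"S01266330002","CABLE:224"",E122/261,8 CO","","B","MP11"\n'
)

# Default dialect: comma delimiter, '"' as qualifier, "" as an escaped quote.
rows = list(csv.reader(io.StringIO(sample)))

for row in rows:
    # Print the field count and the decoded fields for each row.
    print(len(row), row)
```

If both rows come back with five fields, the data is valid quoted CSV and the problem row differs from the others only in having an escaped quote immediately before an embedded comma.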

A: 

I would just do a search/replace for ", and replace it with ,

Do you have access to the original file?

orthod0ks
No, I have access to the import file only. Also, the file is full of ", or "", that are valid. Here are some example rows...
"1465564","SPROCKET:13TEETH,74MM OD,66MM","ID W/.25"" SETSCR","B","MP6"
"S01266330002","CABLE:224"",E122/261,8 CO","","B","MP11"
2nd row is
A: 

How about just:

  1. Search/replace all "", with ''; (fix all the broken fields; this also hits properly empty fields)
  2. Search/replace all ,''; with ,"", (to "unfix" properly empty fields.)
  3. Search/replace all '';''; with "","", (to "unfix" properly empty fields which follow a correct encapsulation of embedded delimiters.)

That converts your original to:

"1464885","LEVER WM","","B","MP17"
"1465075",":PLT-BC   !!NOTE!!","","B",""
"1465076","BRKT-STR MTR            !NOTE!","","B",""
"1465172",":BRKT-SW MTG   !NOTE!","","B","MP16"
"1465388","BUSS BAR                !NOTE!","","B","MP10"
"1465391","PLT-BLKHD     ""NOTE""","","B","MP20"
"1465564","SPROCKET:13TEETH,74MM OD,66MM","ID W/.25"" SETSCR","B","MP6"
"S01266330002","CABLE:224'';E122/261,8 CO","","B","MP11"

This seems to run the gauntlet fine in SSIS. You may have to apply step 3 repeatedly to account for three or more empty fields in a row ('';'';'';, etc.), but the bottom line is that when you have embedded text qualifiers, you have to either escape them or replace them. Let this be a lesson for your CSV creation processes going forward.
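For anyone who would rather script the three replacements than run them in an editor, here is a minimal Python sketch. The function name `patch_line` is made up for illustration. Step 2 is taken here to search for ,''; (a comma followed by the placeholder), which is what reproduces the converted rows shown above, and steps 2 and 3 are repeated until the line stops changing so that runs of several empty fields are restored.

```python
def patch_line(line: str) -> str:
    """Apply the three search/replace steps to one line of the CSV file."""
    # Step 1: turn every "", into ''; (this also damages legitimately empty fields).
    line = line.replace('"",', "'';")
    # Steps 2 and 3: put the empty fields back. Loop until nothing changes,
    # because one pass can leave a placeholder behind in a run of empty fields.
    while True:
        restored = line.replace(",'';", ',"",').replace("'';'';", '"","",')
        if restored == line:
            return line
        line = restored


if __name__ == "__main__":
    samples = [
        '"1464885","LEVER WM","","B","MP17"',
        '"1465391","PLT-BLKHD     ""NOTE""","","B","MP20"',
        '"S01266330002","CABLE:224"",E122/261,8 CO","","B","MP11"',
    ]
    for original in samples:
        print(patch_line(original))
```

This is a text-level heuristic, not a parser: an empty first field at the very start of a line has no preceding comma and is not restored, and data that already contains ''; would be indistinguishable from the placeholder.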

kthejoker
A: 

I've had MAJOR problems with SSIS. Things that Access, Excel and even DTS seemed to do very well, SSIS chokes on. Variable record-length data is another problem but, yes, these embedded qualifiers are a major problem, especially if you do not have access to the import files because they're on someone else's server that you pay to gain access to and might even be 4 to 5 GB in size! You can't just do a "replace all" on that for every import.

You may want to check into this at Microsoft Downloads called "UnDouble" and here is another workaround you might try.

It seems that with SSIS in SQL Server 2008, the bug is still there. I don't know why they haven't addressed this in the parser, but it's like we went back in time with SSIS in basic import functionality.
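For multi-gigabyte files where an editor-based "replace all" is not practical, one alternative is a streaming pre-processing pass that parses the quoted CSV properly and writes it back out with a different delimiter and no text qualifiers at all. The sketch below is a different technique from the UnDouble download and the workaround mentioned above. The function name `rewrite_with_delimiter` is hypothetical, and it assumes the chosen delimiter (a tab here) never occurs in the data and that the import can then be configured for a tab-delimited file without a text qualifier.

```python
import csv
import io


def rewrite_with_delimiter(src, dst, delimiter="\t"):
    """Stream quoted CSV from src and write unquoted, delimiter-separated rows to dst.

    Returns the number of rows written.
    """
    count = 0
    for fields in csv.reader(src):  # reads one record at a time, not the whole file
        for value in fields:
            # Without qualifiers, these characters would corrupt the output layout.
            if delimiter in value or "\n" in value or "\r" in value:
                raise ValueError(
                    f"row {count + 1}: field contains the delimiter or a line break: {value!r}"
                )
        dst.write(delimiter.join(fields) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # In-memory demo with the problem row; for real files, open both with newline="".
    source = io.StringIO('"S01266330002","CABLE:224"",E122/261,8 CO","","B","MP11"\n')
    target = io.StringIO()
    written = rewrite_with_delimiter(source, target)
    print(written, repr(target.getvalue()))
```

Because the reader yields one record at a time, memory use stays flat regardless of file size. The function raises an error rather than guessing if a field contains the new delimiter, so a bad choice of delimiter shows up immediately.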

Optimal Solutions