tags:

views:

2986

answers:

2

When an excel data source is used in SSIS, the data types of each individual column are derived from the data in the columns. Is it possible to override this behaviour?

Ideally we would like every column delivered from the excel source to be string data type, so that data validation can be performed on the data received from the source in a later step in the data flow.

Currently, the Error Output tab can be used to ignore conversion failures - the data in question is then null, and the package will continue to execute. However, we want to know what the original data was so that an appropriate error message can be generated for that row.

+1  A: 

Yes, you can. Just go into the output column list on the Excel source and set the type for each of the columns.

To get to the input columns list right click on the Excel source, select 'Show Advanced Editor', click the tab labeled 'Input and Output Properties'.

A potentially better solution is to use the derived column component where you can actually build "new" columns for each column in Excel. This has the benefits of

  1. You have more control over what you convert to.
  2. You can put in rules that control the change (i.e. if null give me an empty string, but if there is data then give me the data as a string)
  3. Your data source is not tied directly to the rest of the process (i.e. you can change the source and the only place you will need to do work is in the derived column)
Robert MacLean
Robert - Thanks for your response - I'm still having a problem seeing how to do this. I am using the Excel Source component which has a columns form, but the this only has the name of each column, not the datatype. The Excel Connection manager does not have a columns form as far as I can see.
Hugh Mullally
I think you are going to the wrong property editor (Excel source has two, a simple and an advanced). I have updated the answer with a bit more clarification and a step-by-step to get to the right place.
Robert MacLean
Getting there now - I have used the advanced editor and changed the input and output columns to all have a datatype of DT_WSTR. However the error output column data typs have to match the input columns data types, and it does not seem possible to change these... Thanks for your continuing help!
Hugh Mullally
+1  A: 

If your Excel file contains a number in the column in question in the first row of data, it seems that the SSIS engine will reset the type to a numeric type. It kept resetting mine. I went into my Excel file and changed the numbers to "Numbers stored as text" by placing a single quote in front of them. They are now read as text.

I also noticed that SSIS uses the first row to IGNORE what the programmer has indicated is the actual type of the data (I even told Excel to format the entire column as TEXT, but SSIS still used the data, which was a bunch of digits), and reset it. Once I fixed that by putting a single-quote in my Excel file in front of the number in the first row of data, I thought it would get it right, but no, there is additional work.

In fact, even though the SSIS External DataSource Column now has the type DT_WSTR, it will still read 43567192 as 4.35671E+007. So you have to go back into your Excel file and put single quotes in front of all the numbers.

Pretty LAME, Microsoft! But there's your solution. I have no idea what to do if the Excel file is not under your control.

Dave Scotese