views:

180

answers:

1

I'd like to know what my best option would be to import data from an excel file on a weekly or monthly basis. At first, I thought I would use SSIS, but after much struggle with seemingly simple tasks, I'm starting to rethink my plan. Would it be better/easier to just write the SQL by hand or use the services of an SSIS package? The basic process will be as follows:

  1. A separate process will download an .xls file to a local fileshare.
  2. The xls file will have a filename like: 'myfilename MON YY'.
  3. I will need to read the month and year from the the filename, reformat it to a sql date and then query a DimDate table to find the corresponding date key.
  4. For each row (after the first 2 header rows), insert the data with the date key, unless the row is a total row, then ignore.

Here are some of the issues I've been encountering with SSIS: I can parse the date string from a flat file datasource, but can't seem to do it with an excel data source. Also, once parsed, i cannot seem to convert the string to a date in order to perform the lookup for the date key. For example, I want to do something like this:

select DateKey from DimDate where ActualDate = convert(datetime, '01-' + 'JAN-10', 120)

but i don't think it is possible to use the 'convert' or 'datetime' keywords in an expression builder. I have been also unable to find where I can edit the SQL to ignore the first 2 rows of data.

I'm very skeptical of using SSIS because it seems like a Kludgy way of doing something that can probably be accomplished more efficiently writing the SQL yourself, but I may be forced to use SSIS. Thoughts?

+3  A: 

SSIS is definitely the direction to go.

To hit on your problems: (DT_DBTIMESTAMP) is the conversion you want. The syntax is a bit different. For instance to convert your example date I would use:

(DT_DBTIMESTAMP)"01/01/2010"

If you use that expression in a derived column to replace your string date (or create a new column), you could then do a lookup against datetime columns in a DB.

If you need to exclude the first two rows, you will either need to write an SQL statement to query the file (as opposed to an excel file reader source) or use a conditional split to throw them away based on any condition that can be repeated with every import.

Flat files easier to work with, and do allow you to throw away x number of initial rows.

Kenneth
I agree with Kenneth - SSIS is the tool to use.One comment on the date format - I would go with YYYY-MM-DD - it's the ISO/SQL unambiguous format.
Todd McDermid