I have been given the task of developing a small library (using C# 3.0 and .NET 3.5) to provide data import functionality for an application.

The spec is:

  • Data can be imported from CSV file (potentially other file formats in the future)
  • The CSV files can contain any schema and number of rows, with a maximum file size of 10MB.
  • It must be possible to change the datatype and column name of each column in the CSV file.
  • It must be possible to exclude columns in the CSV file from the import.
  • Importing the data will result in a table matching the schema being created in a SQL Server database, which is then populated with the rows from the CSV.
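The rename/re-type/exclude requirements above suggest a small mapping model sitting between the CSV and the database. A minimal sketch (all names here are illustrative, not from the question):

```csharp
using System;
using System.Collections.Generic;

// Each CSV column gets a mapping that can be renamed, re-typed,
// or excluded before the import runs.
public class ColumnMapping
{
    public string SourceName { get; set; }   // header in the CSV file
    public string TargetName { get; set; }   // column name to create in SQL Server
    public Type DataType { get; set; }       // CLR type, mapped to a SQL type later
    public bool Exclude { get; set; }        // drop this column from the import
}

public class ImportSchema
{
    public string TableName { get; set; }
    public List<ColumnMapping> Columns { get; private set; }

    public ImportSchema()
    {
        Columns = new List<ColumnMapping>();
    }
}
```

The schema-estimation step would populate an `ImportSchema`, the caller would edit the mappings, and the table-creation and load steps would both consume the same object.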

I've been playing around with ideas for a while now, but my current code feels like it has been hacked together.

My current implementation approach is:

  1. Open CSV and estimate the schema, store in an ImportSchema class
  2. Allow the schema to be modified.
  3. Use SMO to create the table in SQL according to the schema.
  4. Create a System.Data.DataTable instance using the schema for datatypes.
  5. Use CsvReader to read the CSV data into the DataTable.
  6. Apply column name changes and remove unwanted columns from DataTable.
  7. Use System.Data.SqlClient.SqlBulkCopy() to add the rows from the DataTable into the created database table.
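Steps 4–6 can be sketched as below. The method names are illustrative, and the naive `Split`-based parser stands in for a proper CSV reader (it ignores quoted fields and embedded commas); step 7 is shown only as a comment because it needs a live connection string:

```csharp
using System;
using System.Data;
using System.IO;

public static class CsvImporter
{
    // Steps 4–5: read CSV text into a DataTable (all columns as strings
    // here; a real version would use the ImportSchema's datatypes).
    public static DataTable LoadCsv(TextReader reader)
    {
        var table = new DataTable();
        string header = reader.ReadLine();
        if (header == null) return table;

        foreach (string name in header.Split(','))
            table.Columns.Add(name.Trim(), typeof(string));

        string line;
        while ((line = reader.ReadLine()) != null)
            table.Rows.Add(line.Split(','));

        return table;
    }

    // Step 6: apply a rename, or drop the column entirely.
    public static void ApplyMapping(DataTable table, string source,
                                    string target, bool exclude)
    {
        if (exclude)
            table.Columns.Remove(source);
        else
            table.Columns[source].ColumnName = target;
    }

    // Step 7 would then hand the table to SqlBulkCopy, e.g.:
    //   using (var bulk = new SqlBulkCopy(connectionString))
    //   {
    //       bulk.DestinationTableName = tableName;
    //       bulk.WriteToServer(table);
    //   }
}
```

Keeping the load, mapping, and bulk-copy steps as separate operations on the `DataTable` also makes each of them testable in isolation, which is most of the battle with this kind of pipeline.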

This sounds overly complex to me, and I'm facing a mental block trying to wrap it up neatly in a handful of testable/extensible objects.

Any suggestions/thoughts on ways to approach this problem, both from an implementation and a design perspective?

Many thanks for any suggestions.

A: 

As suggested in some previous SO answers, take a look at the FileHelpers library. It might at least help with importing and analyzing the CSV files.
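For reference, FileHelpers works by mapping each line onto a record class declared up front, roughly like this (sketch using the .NET 3.5-era non-generic engine; the record type and file name are made up):

```csharp
using System;
using FileHelpers;

// The class defines the expected CSV layout, which is the limitation
// TonE mentions below: the schema has to be known at compile time.
[DelimitedRecord(",")]
[IgnoreFirst(1)]            // skip the header row
public class OrderRecord
{
    public int OrderId;
    public string Customer;
    public DateTime Placed;
}

public class Example
{
    public static void Run()
    {
        FileHelperEngine engine = new FileHelperEngine(typeof(OrderRecord));
        OrderRecord[] orders = (OrderRecord[])engine.ReadFile("orders.csv");
    }
}
```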

Frank Bollack
The FileHelpers library is good, but from what I can tell you need to know the format of the data in advance so it can be mapped to an existing object.
TonE