In my program (a console application that does edit and imputation of data) I give the user the ability to provide a data dictionary in a number of different ways: tab-delimited text files, Excel workbooks or database tables. The dictionary consists of several (12-15) files/sheets/tables. I am trying to come up with a nice way to load the data from the various sources into the database.

My solution thus far has been to use a repository to isolate the various data sources and have these repositories spit out DTOs that I map onto my domain model. I use the Builder pattern to control the entire sequence of events.

Basically the sequence of events for each file/sheet/table is:

  1. Get the DTOs from the repository
  2. Validate the information in the DTOs
  3. Then:
    • If the data is good, map the DTO onto the domain entity
    • Otherwise, add the problems to a running list of errors.
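
To make the sequence above concrete, here is a minimal sketch of the loop for a single file/sheet/table. Everything in it is hypothetical: `DictionaryLoader`, `Load`, and the three delegates are stand-ins for my real repository, validation, and mapping code, and this says nothing yet about where the validation logic should live.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: all names here are illustrative, not from the real program.
public static class DictionaryLoader
{
    // Runs the three steps for one file/sheet/table.
    //   getDtos  : step 1, stands in for the repository call
    //   validate : step 2, returns error messages (none when the DTO is good)
    //   map      : step 3, maps a good DTO onto a domain entity
    //   errors   : the running list of errors shared across all sources
    public static List<TEntity> Load<TDto, TEntity>(
        Func<IEnumerable<TDto>> getDtos,
        Func<TDto, IEnumerable<string>> validate,
        Func<TDto, TEntity> map,
        List<string> errors)
    {
        var entities = new List<TEntity>();
        foreach (var dto in getDtos())
        {
            var dtoErrors = new List<string>(validate(dto));
            if (dtoErrors.Count == 0)
            {
                entities.Add(map(dto));      // data is good: map it
            }
            else
            {
                errors.AddRange(dtoErrors);  // data is bad: record why
            }
        }
        return entities;
    }
}
```

The builder would call something like this once per source and only continue if the shared error list is still empty at the end.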

My question is this: where is the best place to validate the information in the DTOs? One possible solution is to add an interface to the DTOs, like so:

public interface IValidate
{
    void Validate();
    bool HasErrors { get; }
    IEnumerable<string> GetErrorMessages();
}

Is this too heavy for a DTO? Should the validation happen somewhere else? Sorry if this is a little subjective.
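
For a sense of how much this adds to a DTO, here is a rough sketch of one implementing the interface. `VariableDto`, its `Name` and `DataType` properties, and the two rules are invented for illustration and are not my actual dictionary layout.

```csharp
using System.Collections.Generic;

// Repeated from above so the sketch compiles on its own.
public interface IValidate
{
    void Validate();
    bool HasErrors { get; }
    IEnumerable<string> GetErrorMessages();
}

// Hypothetical DTO for one row of a "variables" sheet; the fields are made up.
public class VariableDto : IValidate
{
    private readonly List<string> _errors = new List<string>();

    public string Name { get; set; }
    public string DataType { get; set; }

    // Re-checks the current property values and rebuilds the error list.
    public void Validate()
    {
        _errors.Clear();
        if (string.IsNullOrWhiteSpace(Name))
            _errors.Add("Name is required.");
        if (DataType != "numeric" && DataType != "text")
            _errors.Add("DataType must be 'numeric' or 'text'.");
    }

    public bool HasErrors
    {
        get { return _errors.Count > 0; }
    }

    public IEnumerable<string> GetErrorMessages()
    {
        return _errors;
    }
}
```

With this shape the DTO carries both the data and the rules plus some mutable error state, which is the part that feels heavy to me.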

+1  A: 

I can't answer your question definitively because, as you said, the question does seem to be subjective and any answer would ultimately be an opinion. It sounds like you are really struggling with a design decision based on the "academic" definition of a DTO as opposed to some kind of pragmatic requirement. We've all been there.

When faced with similar situations, I generally carry out the implementation in the simplest, most straightforward way I can, avoiding things like complex, tightly coupled relationships and excessive amounts of design. That way, once I have everything running, I can refactor from there with less impact.

Overall, it sounds like you are constructing some kind of ETL system. I don't know what platform you are dealing with, but if you are using SQL Server, have you looked at SQL Server Integration Services? It has a lot of constructs to handle things like Office documents, XML, and flat files as data sources.

Anyway, good luck with your struggle.

Daniel Segan
ETL = Extract, transform, and load? http://en.wikipedia.org/wiki/Extract,_transform,_load
Peter Mortensen
Thanks for the answer, Daniel, but I'm merely trying to get information (a data dictionary in this case) into my program so that I can use it. The program is a batch Edit and Imputation system and doesn't have anything to do with ETL at all.
Jeffrey Cameron