views: 174
answers: 2
In our application we're considering using dynamically generated classes to hold a lot of our data. The reason for doing this is that we have customers with tables that have different structures. So you could have a customer table called "DOG" (just making this up) that contains the columns "DOGID", "DOGNAME", "DOGTYPE", etc. Customer #2 could have the same table "DOG" with the columns "DOGID", "DOG_FIRST_NAME", "DOG_LAST_NAME", "DOG_BREED", and so on. We can't create classes for these at compile time as the customer can change the table schema at any time.

At the moment I have code that can generate a "DOG" class at run time using reflection. What I'm trying to figure out is how to populate this class from a DataTable (or some other .NET mechanism) without an extreme performance penalty. We have one table that contains ~20 columns and ~50k rows. Doing a foreach over all of the rows and columns to create the collection takes about a minute, which is a little too long.

Am I trying to come up with a solution that's too complex, or am I on the right track? Has anyone else experienced a problem like this? Creating dynamic classes was the solution that a developer at Microsoft proposed. If we can just populate this collection and use it efficiently, I think it could work.

A: 

You should use a profiler to figure out what exactly takes the time, and then optimize that. If you are using reflection when setting the data, it will be slow. Much can be saved by caching the reflected type data.
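A minimal sketch of that caching idea: do the reflection lookups once per generated type, outside the row loop, instead of calling Type.GetProperty per cell. The Populate helper and its signature are assumptions for illustration, not code from the question:

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Reflection;

static class ReflectionLoader
{
    // Assumes the generated type has a parameterless constructor and
    // property names/types matching the DataTable's columns.
    public static List<object> Populate(DataTable table, Type generatedType)
    {
        // One reflection pass up front: column ordinal -> cached PropertyInfo.
        var setters = new PropertyInfo[table.Columns.Count];
        for (int c = 0; c < table.Columns.Count; c++)
            setters[c] = generatedType.GetProperty(table.Columns[c].ColumnName);

        var result = new List<object>(table.Rows.Count);
        foreach (DataRow row in table.Rows)
        {
            object instance = Activator.CreateInstance(generatedType);
            for (int c = 0; c < setters.Length; c++)
                if (setters[c] != null && row[c] != DBNull.Value)
                    setters[c].SetValue(instance, row[c], null);
            result.Add(instance);
        }
        return result;
    }
}
```

PropertyInfo.SetValue is still reflection per cell, so this only removes the lookup cost; a profiler run would show whether that is enough.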

I am curious: how will you use these classes if the members are not known at compile time? Perhaps you would be better off with just a plain DataTable for this scenario?

driis
DataTable is too slow for what we're trying to do. We're currently using DataTable for our implementation.
Bob
+3  A: 

What you're trying to do will get fairly complicated in .NET 3.5. You're probably better off creating a type with a Dictionary<> of key/value pairs for the data in your model.
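A minimal sketch of that dictionary-backed approach: one ordinary class whose "columns" live in a Dictionary keyed by column name, so no code generation is needed (the DynamicRecord name is made up here):

```csharp
using System.Collections.Generic;

// One class serves every customer schema; values are looked up by
// column name, so a schema change needs no recompilation.
public class DynamicRecord
{
    private readonly Dictionary<string, object> _values =
        new Dictionary<string, object>();

    public object this[string column]
    {
        get
        {
            object v;
            return _values.TryGetValue(column, out v) ? v : null;
        }
        set { _values[column] = value; }
    }
}
```

The trade-off is per-record dictionary overhead and the loss of typed members, but it works on .NET 3.5 without any reflection in the hot path.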

If you can use .NET 4.0, you could look at using dynamic and ExpandoObject. The dynamic keyword lets you create references to truly dynamic objects, and ExpandoObject allows you to easily add properties and methods on the fly, all without complicated reflection or codegen logic. The benefit of dynamic types is that the DLR performs some sophisticated caching of the runtime binding information, which allows them to be more performant than regular reflection.
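For example, a sketch of loading rows into ExpandoObjects (ToExpandos is a hypothetical helper): because ExpandoObject implements IDictionary&lt;string, object&gt;, the property names can come straight from the column names at run time, which also sidesteps the "names unknown at compile time" concern:

```csharp
using System.Collections.Generic;
using System.Data;
using System.Dynamic;

static class ExpandoLoader
{
    // .NET 4.0 only: one ExpandoObject per row, with a member per column.
    public static List<dynamic> ToExpandos(DataTable table)
    {
        var rows = new List<dynamic>(table.Rows.Count);
        foreach (DataRow row in table.Rows)
        {
            // Treat the expando as a dictionary to add members by name.
            IDictionary<string, object> expando = new ExpandoObject();
            foreach (DataColumn col in table.Columns)
                expando[col.ColumnName] = row[col];
            rows.Add(expando);
        }
        return rows;
    }
}
```

Consumers can then read values either dynamically or through the dictionary interface, whichever fits the calling code.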

If all you need to do is map the data to existing types, then you should consider using something like Entity Framework or NHibernate to provide ORM behavior for your types.

To deal with the data-loading performance, I would suggest you consider bypassing the DataTable altogether and loading your records directly into the types you are generating. I suspect that most of the performance cost is in reading and casting the values from the DataTable where they originally reside.

LBushkin
It would be nice to use an ORM to do this, but since our tables can change without notice, an ORM won't work. I've looked at the dynamic features in .NET 4.0, but the problem is we don't know the names of the properties at compile time, so it doesn't seem like the dynamic features will work. Correct me if I'm wrong, though. We're also working on a solution using dictionaries, but I'm not sure if it'll work until we try it. The problem we have is how to load the data into the types we're generating. Is there a fast way to do that?
Bob
@Bob: So you need to be able to map arbitrary data from your database into dynamically code-gen'd classes. Hmmm. Dynamic types may not be helpful on their own ... if you could use an actual dynamic language (like Ruby or Python), you could do some metaprogramming to generate a mapping function from a DataTable to a custom type. It sounds like you already have code that dynamically generates your entity classes, yes? Have you tried using a DataReader directly to populate them rather than going through a DataTable? It may help to gen a Load(DbDataReader) method in your type to support efficient loading.
LBushkin
Your assumptions are all correct. I've thought about using IronPython to do this part, but I haven't gotten to that point just yet. I tried using a DataReader in a sample application I wrote to test this, and that increased performance a little, but I don't think we'll be able to use it in a real-world application, as our data is all retrieved through a service. The client application doesn't have direct access to the database. We'd have to pass dynamic objects over WCF. :-O
Bob
I've thought about maybe trying to create a method dynamically in IL to populate my dynamic objects. I wonder if it would even be feasible. Is this what you're suggesting with generating a Load(DbDataReader) method?
Bob
@Bob: Yes, you're spot on with your understanding. Have you tried using DataRow.ItemArray to retrieve the raw values from the data row? This may be faster than individual accesses to the indexer property.
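The two ideas in this thread, an emitted per-column setter and DataRow.ItemArray, could be sketched together like this. BuildSetter and RowLoader are made-up names, and the setter array is assumed to be built once per table, not per row:

```csharp
using System;
using System.Data;
using System.Reflection;
using System.Reflection.Emit;

static class RowLoader
{
    // Emit a delegate per column so the row loop never calls
    // PropertyInfo.SetValue (which is the slow reflection path).
    public static Action<object, object> BuildSetter(PropertyInfo prop)
    {
        var dm = new DynamicMethod("set_" + prop.Name, null,
            new[] { typeof(object), typeof(object) }, prop.DeclaringType, true);
        ILGenerator il = dm.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Castclass, prop.DeclaringType);
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(prop.PropertyType.IsValueType
            ? OpCodes.Unbox_Any : OpCodes.Castclass, prop.PropertyType);
        il.Emit(OpCodes.Callvirt, prop.GetSetMethod());
        il.Emit(OpCodes.Ret);
        return (Action<object, object>)dm.CreateDelegate(
            typeof(Action<object, object>));
    }

    public static object Load(DataRow row, Type type,
                              Action<object, object>[] setters)
    {
        object instance = Activator.CreateInstance(type);
        object[] values = row.ItemArray;   // one bulk read per row
        for (int c = 0; c < values.Length; c++)
            if (values[c] != DBNull.Value)
                setters[c](instance, values[c]);
        return instance;
    }
}
```

This assumes the setter array is ordered to match the column ordinals; building it is a one-time cost amortized over the ~50k rows.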
LBushkin
@Bob: Note that instead of generating IL code dynamically, you can also generate C# code and compile it, which in my case was always fast enough if you do it once at application startup or when required. I found this easier when a lot of code has to be generated, because IL is in general much more complicated than plain C#. If the methods will be only rather simple (not necessarily short), of course, emitting IL is the way to go.
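A sketch of that compile-at-startup route using CSharpCodeProvider; the Dog source string is invented for illustration, and error handling is reduced to a comment:

```csharp
using System;
using System.CodeDom.Compiler;
using Microsoft.CSharp;

// Build the entity class as a C# source string from the customer's
// schema, then compile it in memory once at startup.
string source = @"
    public class Dog
    {
        public int DOGID;
        public string DOGNAME;
    }";

var provider = new CSharpCodeProvider();
var options = new CompilerParameters { GenerateInMemory = true };
CompilerResults results = provider.CompileAssemblyFromSource(options, source);

// In real code, check results.Errors.HasErrors before proceeding.
Type dogType = results.CompiledAssembly.GetType("Dog");
object dog = Activator.CreateInstance(dogType);
```

The generated source could also include the Load(DbDataReader) method discussed above, so the mapping logic itself is ordinary compiled C# rather than hand-emitted IL.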
OregonGhost