Does an n-tier system make "sense" for large data set processing?

I recently became a part of the team of developers writing our "flagship" product. Its primarily a read intensive web app (asp.net(c#) and oracle) implemented in an N-tier system. Most of the writes in the DB are done through external services (not through the webapp). Instead of scheduling normal batch jobs in the DB for data aggregation, they're pushing everything up the tiers to the business layer (sometimes a creating a hundred million objects). While this does keep all the "business logic" in the same place, it also takes about 200 times longer than running the equivalent query in the database. This seems like a terrible idea to me. Am I wrong here and this is standard and good stuff? Does anybody have any real case studies I can point my co-workers towards (or myself if I'm in the wrong)?

I'm not debating whether n-tier is good or bad, but does it fit for data aggregation processing and the like?

You are right about the processing time (and also ressources, like memory).

Best-practices are to aggregate the closest possible to the data, ideally in the database. A hundred million object seem crazy.
However, we all know that the code is less maintainable that way. So it is more development time, and more cost in the end.

So you need to reach a correct balance. This cannot come to you from the outside,
you must weight carefully the advantages in the specific context of your project.

For example, the frequency all this happens matters a lot. The high cost is obviously acceptable if the process happens every minute, but probably not if it happens every year...

Maybe the correct balance would take a bit of both. For example, for a good ROI:

the queries to the database could do a first level of aggregation, getting rid of the tiny details, and dropping the number of objects to be created by a hundred.
the business layer could apply the rest of the rules

What makes a good candidate for a requirement being taken care in the query:

low level aggregation that drops the number of objects (or lines) that get out of the database
rules that rarely change
rules that read easily in SQL

To make your code more explicit (and reduce duplication between queries), I suggest that your code adopts a compile-time structure that makes it clear. Create explicit constants or functions that embody each business rule that you will put in a query, and use that to build (at runtime or compile-time) your queries.

ansaurus

tags:

views:

answers:

Does an n-tier system make "sense" for large data set processing?

related questions