views: 185
answers: 2
In a reporting application, is it possible to abstract the reporting logic from the database schema details?

I have a Reporting Services application with reasonably complex reporting logic, and I am trying to migrate the application to other databases (databases built for the same purpose but developed by different software houses).

Is it a wise decision to use a web services / WCF layer in the middle? What other options could be considered?

+3  A: 

It would be hard to do this sort of thing in a one-size-fits-all way, but in the general case you could try one of these:

  • Build some views over the database schema and write the reporting sprocs against those. This means that you have some flexibility in the underlying database schema and can use the views as an abstraction layer.

  • Build some sort of data warehouse platform and write an ETL process to populate it from the various data sources. This is more flexible but more effort to build and it will only work from a periodic refresh. If that degree of latency is acceptable for your application then I would suggest that the data warehouse system is the better approach.

    The key advantage of a data warehouse is that it is optimised for reporting and has a consistent interface across all the data sources - you consolidate them into a single database with one schema. The reports are developed against that schema. Adding new systems is achieved by writing an ETL process to populate the warehouse; the reports continue to work regardless of the data source.
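The view-based abstraction layer described in the first bullet can be sketched in a few lines. This is a hypothetical example using Python's built-in sqlite3 module; the table, view, and column names are made up for illustration. The point is that the report query targets the view, so porting to another vendor's schema means rewriting only the view definition:

```python
import sqlite3

# Hypothetical scenario: vendor A stores order data under its own
# naming conventions; a view presents the common shape that the
# reporting layer is written against.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Vendor A's schema
    CREATE TABLE tbl_orders (ord_id INTEGER, ord_total REAL);
    INSERT INTO tbl_orders VALUES (1, 10.0), (2, 25.0);

    -- Abstraction layer: map vendor-specific names onto the
    -- schema the reporting sprocs/queries expect.
    CREATE VIEW rpt_orders AS
        SELECT ord_id AS order_id, ord_total AS total
        FROM tbl_orders;
""")

# The report only knows about rpt_orders; supporting vendor B's
# database means writing a different CREATE VIEW, not a new report.
total = conn.execute("SELECT SUM(total) FROM rpt_orders").fetchone()[0]
print(total)  # 35.0
```

The same report query would run unchanged against any database that exposes a `rpt_orders` view with those two columns.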

WCF is a network communication framework. You will find it difficult to make this sort of architecture handle large volumes of data on a transaction-by-transaction basis - a batch-loading ETL process would be much more efficient. However, if you need a real-time feed (perhaps for a trading floor system) you might be able to do it with something like this.

If you need a low-latency feed, another approach would be to investigate a genre of tooling called Enterprise Information Integration (EII). Perhaps the most widely available tool of this kind is a Data Source View in SSIS, which gives you some flexibility in mapping arbitrary data sources to a consistent schema. It isn't as sophisticated as the best-of-breed EII tools, but you can put SSIS packages on top of it and use those as a data source for your reports if you need to transform the data further.

However, I've never built a system structured like this so I can't really vouch for how well it works in practice. I would guess that it would be quite fragile and difficult to break down into parts that can be unit tested, so development and maintenance will be quite time-consuming for a system of non-trivial complexity.

If you want to investigate other EII systems on the market, this link is a directory of various articles about EII and some other EII tooling vendors.

ConcernedOfTunbridgeWells
this is really helpful. thank you!
Onur Bıyık
+3  A: 

I agree with NXC's data warehouse suggestion:

  • If it were a one-off job, no - but since it is a "reporting application", it is highly likely that many of your reports will fit the OLAP paradigm.
  • It is obviously a doable task, as there are many successful OLAP query tools on the market, with varying degrees of complexity.
  • OLAP schemas, e.g., the standard star schema, are designed to be easy to query. They are flat in nature, with the fact table joining to one or more dimension tables using simple keys in a very straightforward way.
  • The types of operations people will want to do with the reports are predictable: filter, sort, group by, add or remove columns, export to CSV, etc.
  • You will get a good level of flexibility with the reports that get generated - the ability to explore data for various relationships is very addictive.
  • Once you write the query tool, it is portable for reuse with any other star schema - not bad!
  • "Is it possible to abstract reporting logic and the database schema details?" - yes, OLAP schemas are for exactly this - they make all data conform to a standardized schema. Of course this moves business logic into the ETL layer, but I would argue that is where it belongs.
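The star schema described above can be sketched end to end: a fact table joining to a dimension on a single surrogate key, a trivial ETL step that loads source rows into that conformed shape, and the flat group-by query that reports run against. This is a minimal illustration using Python's sqlite3 module; all table, column, and row values are invented for the example:

```python
import sqlite3

# Minimal star-schema sketch: one dimension, one fact table,
# joined on a surrogate key (names here are illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_sales (product_key INTEGER, qty INTEGER, amount REAL);
""")

# "ETL" step: load source rows into the conformed schema,
# creating dimension members on first sight.
source_rows = [("Widget", 2, 20.0), ("Widget", 1, 10.0), ("Gadget", 3, 45.0)]
for name, qty, amount in source_rows:
    row = conn.execute(
        "SELECT product_key FROM dim_product WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        cur = conn.execute("INSERT INTO dim_product (name) VALUES (?)", (name,))
        key = cur.lastrowid  # surrogate key assigned by the database
    else:
        key = row[0]
    conn.execute("INSERT INTO fact_sales VALUES (?, ?, ?)", (key, qty, amount))

# The straightforward fact-to-dimension join the answer describes.
report = conn.execute("""
    SELECT d.name, SUM(f.qty), SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product d ON d.product_key = f.product_key
    GROUP BY d.name ORDER BY d.name
""").fetchall()
print(report)  # [('Gadget', 3, 45.0), ('Widget', 3, 30.0)]
```

Adding a new source system means writing another loader into `dim_product`/`fact_sales`; the reporting query is untouched, which is the abstraction the question asks for.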

So, you are required to do ETL with this approach. One alternative is some form of ROLAP, but in practice I have found it as easy to write ETL scripts as to cajole good performance out of a ROLAP setup.

RedFilter
Plus, star schemas are pretty forgiving of the exact cardinality of the source data. They are pretty flat, so it's easy to map arbitrary relationships and hierarchies onto a set of sensibly chosen dimensions.
ConcernedOfTunbridgeWells