views:

694

answers:

2

We need to run reports at a scheduled time of the day. The application runs 24*7 and so there is not "off-peak" time as such.

Therefore, running the reports should not add undue load on the system.

The application runs on WebSphere v6.1 and the database is Oracle 10g R2.

I have the following approaches at my disposal

  1. A set of de-normalized tables aimed at reporting.
  2. Creating Materialized views and using them for reports. We can update the views once a day.
  3. We can create another schema and replicate the tables realtime using Oracle's Data Guard.

(1) is not feasible due to certain internal constraints we have.

I need to know, from a performance point of view, which is better, (2) or (3) ?

I hear from many ppl that Materialized views initially work well but as data volumes increase, the performance is very poor.

Anyone has experience with Replication of tables within the same DB server (but diff instances or schemas).

+1  A: 

The best option is to use another Data Guard instance (not schema, it have no sense) on another machine. In this case you can have an up-to-date database an no disturb the production application.

About using materialized views is good if the refresh does not produce a lot of resources usage. And if the queries against materialized views also doesn't waste too many resources.

A third option you don't talk about is to use Oracle Resource Manager which allows you to control the resources usage with a lot of possibilities.

Anyway, I prefer the first one (Data Guard) because you have a "Report dabatase" a "live- backup" on the same time.

FerranB
+1  A: 

To some degree, materialized views are denormalized tables - the denormalization is whatever you can define in a SELECT statement, e.g. joins, aggregations, and analytic functions. After the original definition of the MV, you can add whatever indexes to the underlying MV table that are necessary to gain the performance you need.

Having said that, I think your options are:

  1. Use denormalized tables in same database that are somehow maintained by code you write - This option will give you the greatest control over the loading process at the expense of having to write and maintain the code. You'll also eliminate the infrastructure overhead of a separate instance. The denormalization process and the reporting queries will add to the resource requirements of the active/transactional database and you must be sized to handle this. WIth this option you also have tied the availability of the reporting application to the availability of the transactional system.
  2. Use MV's in the same database - The above comments about infrastructure and resource overhead apply, but you gain the leverage of using Oracle's MV functionality for scheduling (implemented via DBMS_JOB) and transactional read consistency (the "old" data is still visible until the new SELECT is resolved and committed).
  3. Use MV's in another database/instance on the same host - You gain some marginal separation and potential availability with this option, but you are still affecting the overall resources of the database host. The later versions of Oracle let you control resource usage within the instance to a fine-grained level, so in my opinion there's no good reason to run a separate database on the same host.
  4. Use MV's in another database on a different host - You can set up a db link to the transactional system and perform the MV refresh across the link. You'll still have the MV refresh/"load" process affecting the resources on the source system, but all query activity will be isolated and you'll have some degree of availability of reports during down time for the source system. You'll have to buy additional Oracle licenses with this option.
  5. Use a Data Guard instance - Disadvantages: additional licenses, increased complexity to set up and administer. Advantages: lowest impact on the source system, true copy of system and if you use logical as opposed to physical replication you can create additional structures (views, indexes, etc.) in the Data Guard Database.
dpbradley