views:

3002

answers:

5

Before, I have found the "Cost" in the execution plan to be a good indicator of relative execution time. Why is this case different? Am I a fool for thinking the execution plan has relevance? What specifically can I try to improve v_test performance?

Thank you.

Using Oracle 10g I have a simple query view defined below

  create or replace view v_test as
  select distinct u.bo_id as bo_id, upper(trim(d.dept_id)) as dept_id
  from
      cust_bo_users u
  join cust_bo_roles r on u.role_name=r.role_name
  join cust_dept_roll_up_tbl d on 
                            (r.region is null or trim(r.region)=trim(d.chrgback_reg)) and 
                            (r.prod_id is null or trim(r.prod_id)=trim(d.prod_id)) and
                            (r.div_id is null or trim(r.div_id)=trim(d.div_id )) and
                            (r.clus_id is null or trim(r.clus_id )=trim( d.clus_id)) and
                            (r.prod_ln_id is null or trim(r.prod_ln_id)=trim(d.prod_ln_id)) and
                            (r.dept_id is null or trim(r.dept_id)=trim(d.dept_id))

defined to replace the following view

        create or replace view v_bo_secured_detail
  select distinct Q.BO_ID, Q.DEPT_ID
  from (select U.BO_ID BO_ID, UPPER(trim(D.DEPT_ID)) DEPT_ID
     from CUST_BO_USERS U, CUST_BO_ROLES R, CUST_DEPT_ROLL_UP_TBL D
     where U.ROLE_NAME = R.ROLE_NAME and
       R.ROLE_LEVEL = 'REGION' and
       trim(R.REGION) = UPPER(trim(D.CHRGBACK_REG))
     union all
     select U.BO_ID BO_ID, UPPER(trim(D.DEPT_ID)) DEPT_ID
     from CUST_BO_USERS U, CUST_BO_ROLES R, CUST_DEPT_ROLL_UP_TBL D
     where U.ROLE_NAME = R.ROLE_NAME and
       R.ROLE_LEVEL = 'RG_PROD' and
       trim(R.REGION) = UPPER(trim(D.CHRGBACK_REG)) and
       trim(R.PROD_ID) = UPPER(trim(D.PROD_ID))
     union all
     select U.BO_ID BO_ID, UPPER(trim(D.DEPT_ID)) DEPT_ID
     from CUST_BO_USERS U, CUST_BO_ROLES R, CUST_DEPT_ROLL_UP_TBL D
     where U.ROLE_NAME = R.ROLE_NAME and
       R.ROLE_LEVEL = 'PROD' and
       trim(R.PROD_ID) = UPPER(trim(D.PROD_ID))
     union all
     select U.BO_ID BO_ID, UPPER(trim(D.DEPT_ID)) DEPT_ID
     from CUST_BO_USERS U, CUST_BO_ROLES R, CUST_DEPT_ROLL_UP_TBL D
     where U.ROLE_NAME = R.ROLE_NAME and
       R.ROLE_LEVEL = 'DIV' and
       trim(R.DIV_ID) = UPPER(trim(D.DIV_ID))
     union all
     select U.BO_ID BO_ID, UPPER(trim(D.DEPT_ID)) DEPT_ID
     from CUST_BO_USERS U, CUST_BO_ROLES R, CUST_DEPT_ROLL_UP_TBL D
     where U.ROLE_NAME = R.ROLE_NAME and
       R.ROLE_LEVEL = 'RG_DIV' and
       trim(R.REGION) = UPPER(trim(D.CHRGBACK_REG)) and
       trim(R.DIV_ID) = UPPER(trim(D.DIV_ID))
     union all
     select U.BO_ID BO_ID, UPPER(trim(D.DEPT_ID)) DEPT_ID
     from CUST_BO_USERS U, CUST_BO_ROLES R, CUST_DEPT_ROLL_UP_TBL D
     where U.ROLE_NAME = R.ROLE_NAME and
       R.ROLE_LEVEL = 'CLUS' and
       trim(R.CLUS_ID) = UPPER(trim(D.CLUS_ID))
     union all
     select U.BO_ID BO_ID, UPPER(trim(D.DEPT_ID)) DEPT_ID
     from CUST_BO_USERS U, CUST_BO_ROLES R, CUST_DEPT_ROLL_UP_TBL D
     where U.ROLE_NAME = R.ROLE_NAME and
       R.ROLE_LEVEL = 'RG_CLUS' and
       trim(R.REGION) = UPPER(trim(D.CHRGBACK_REG)) and
       trim(R.CLUS_ID) = UPPER(trim(D.CLUS_ID))
     union all
     select U.BO_ID BO_ID, UPPER(trim(D.DEPT_ID)) DEPT_ID
     from CUST_BO_USERS U, CUST_BO_ROLES R, CUST_DEPT_ROLL_UP_TBL D
     where U.ROLE_NAME = R.ROLE_NAME and
       R.ROLE_LEVEL = 'PROD_LN' and
       trim(R.PROD_LN_ID) = UPPER(trim(D.PROD_LN_ID))
     union all
     select U.BO_ID BO_ID, UPPER(trim(R.DEPT_ID)) DEPT_ID
     from CUST_BO_USERS U, CUST_BO_ROLES R
     where U.ROLE_NAME = R.ROLE_NAME and
       R.ROLE_LEVEL = 'DEPT') Q

with the goal of removing the dependency on the ROLE_LEVEL column.

The execution plan for v_test is significantly lower than v_bo_secured_detail for simple

select * from <view> where bo_id='value'

queries. And is significantly lower when used in a real world query

  select CT_REPORT.RPT_KEY,
      CT_REPORT_ENTRY.RPE_KEY,
      CT_REPORT_ENTRY.CUSTOM16,
      Exp_Sub_Type.value,
      min(CT_REPORT_PAYMENT_CONF.PAY_DATE),
      CT_REPORT.PAID_DATE
  from CT_REPORT,
      <VIEW> SD,
      CT_REPORT_ENTRY,
      CT_LIST_ITEM_LANG Exp_Sub_Type,
      CT_REPORT_PAYMENT_CONF,
      CT_STATUS_LANG Payment_Status
  where (CT_REPORT_ENTRY.RPT_KEY = CT_REPORT.RPT_KEY) and
     (Payment_Status.STAT_KEY = CT_REPORT.PAY_KEY) and
     (Exp_Sub_Type.LI_KEY = CT_REPORT_ENTRY.CUSTOM9 and Exp_Sub_Type.LANG_CODE = 'en') and
     (CT_REPORT.RPT_KEY = CT_REPORT_PAYMENT_CONF.RPT_KEY) and
     (SD.BO_ID = 'JZHU9') and
     (SD.DEPT_ID = UPPER(CT_REPORT_ENTRY.CUSTOM5)) and
     (Payment_Status.name = 'Payment Confirmed' and (Payment_Status.LANG_CODE = 'en') and
     CT_REPORT.PAID_DATE > to_date('01/01/2008', 'mm/dd/yyyy') and Exp_Sub_Type.value != 'Korea')
  group by CT_REPORT.RPT_KEY,
      CT_REPORT_ENTRY.RPE_KEY,
      CT_REPORT_ENTRY.CUSTOM16,
      Exp_Sub_Type.value,
      CT_REPORT.PAID_DATE

The execution times are WILDLY different. The v_test view taking 15 hours, and the v_bo_secured_detail taking a few seconds.


Thank you all who responded

This is one to remember for me. The places where the theory and mathematics of the expressions meets the reality of hardware based execution. Ouch.

A: 

One aspect of low-cost -- high execution time is that when you are looking at large data-sets, it is often more efficient on the whole to do things in bulk, whereas if you want a quick results, it is more efficient to do as little work as possible to get the first record. The repetitiveness of doing the small operations that give the appearance of a quick response will not likely give a good result when working on the large sets.

Many times, when you want a quick result, the USE_NL optimizer hint will help.

Also, in your test view, it is relying on IS NULL... IS NULL cannot use an index nor can using a function such as trim on the 'table-side' parameter.

Greg Ogle
+1  A: 

An execution plan is theory, the execution time is reality.

The plan shows you how the engine goes about performing your query, but some steps might cause an inordinate amount of work to resolve the query. The use of "x is null or x = y" smells bad. If r and d are big tables you might have some sort of combinatorial explosion hitting you and the request cycles endlessly through large lists of disk blocks. I imagine you're seeing lots of I/O during the execution.

On the other hand, the unioned selects are short and sweet, and so probably reuse lots of disk blocks that are still lying around from the earlier selects, and/or you have some degree of parallelism benefitting from reads on the same disk blocks.

Also using trim() and upper() everywhere looks a bit suspicious. If your data are so unclean it might be worth running some periodic housecleaning from time to time, so that you can say "x = y" and know it works.

update: you asked for tips to improve v_test. Clean your data so that trim() and upper() are unnecessay. They may preclude indexes from being used (although that would be affecting the unioned select version as well).

If you can't get rid of "x is null or x = y" then y = nvl(x,'does-not-exist') might have better characteristics (assuming 'does-not-exist' is a "can't happen" id value).

dland
A: 

Have you gathered optimiser stats on all the underlying tables? Without them the optimiser's estimates may be wildly out of kilter with reality.

Tony Andrews
+2  A: 

As the Oracle documentation says, the cost is the estimated cost relative to a particular execution plan. When you tweak the query, the particular execution plan that costs are calculated relative to can change. Sometimes dramatically.

The problem with v_test's performance is that Oracle can think of no way to execute it other than performing a nested loop, for each cust_bo_roles, scan all of cust_dept_roll_up_tbl to find a match. If the table are of size n and m, this takes n*m time, which is slow for large tables. By contrast v_bo_secured_detail is set up so that it is a series of queries, each of which can be done through some other mechanism. (Oracle has a number it may use, including using an index, building a hash on the fly, or sorting the datasets and merging them. These operations are all O(n*log(n)) or better.) A small series of fast queries is fast.

As painful as it is, if you want this query to be fast then you need to break it out like the previous query did.

A: 

When you say the "query plan is lower", do you mean it is shorter, or that the actual cost estimates are lower? One obvious problem with your replacement view is that the join with cust_dept_roll_up_tbl uses almost exclusively unindexable criteria (the "is null" tests can be satisfied by an index, but the ones involving calling trim on each argument can't be), so the planner has to make at least one, and probably several sequential scans of the table to satisfy the query.

I'm not sure if Oracle has this limitation, but many DBs can only do a single index scan per included table, so even if you clean up your join conditions to be indexable, it may be able to satisfy only one condition with an index scan and have to use sequential scans for the remainder.

Nick Johnson