views:

106

answers:

3

I'm having a hard time wrapping my head around this query. it is taking almost 200+ seconds to execute. I've pasted the execution plan as well.

SELECT       
      user_id    ,
      ROLE_ID              ,
      effective_from_date  ,
      effective_to_date    ,
      participant_code     ,
      ACTIVE
FROM    
      CMP_USER_ROLE E
WHERE   
      ACTIVE = 0
 AND  (SYSDATE BETWEEN effective_from_date AND effective_to_date 
        OR TO_CHAR(effective_to_date,'YYYY-Q') = '2010-2')
 AND  participant_code = 'NY005'
 AND  NOT EXISTS
            ( SELECT 1 FROM CMP_USER_ROLE r 
              WHERE r.USER_ID= E.USER_ID 
              AND r.role_id = E.role_id 
              AND r.ACTIVE = 4
              AND E.effective_to_date 
                      <= ( SELECT MAX(last_update_date)
                           FROM CMP_USER_ROLE S
                           WHERE S.role_id = r.role_id
                           AND S.role_id = r.role_id
                           AND S.ACTIVE = 4 ))

Explain plan

-----------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |                  |     1 |    37 |   154   (2)| 00:00:02 |
|*  1 |  FILTER                          |                  |       |       |            |          |
|*  2 |   TABLE ACCESS BY INDEX ROWID    | USER_ROLE        |     1 |    37 |    30   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN              | N_USER_ROLE_IDX6 |    27 |       |     3   (0)| 00:00:01 |
|*  4 |   FILTER                         |                  |       |       |            |          |
|   5 |    HASH GROUP BY                 |                  |     1 |    47 |   124   (2)| 00:00:02 |
|*  6 |     TABLE ACCESS BY INDEX ROWID  | USER_ROLE        |   159 |  3339 |   119   (1)| 00:00:02 |
|   7 |      NESTED LOOPS                |                  |    11 |   517 |   123   (1)| 00:00:02 |
|*  8 |       TABLE ACCESS BY INDEX ROWID| USER_ROLE        |     1 |    26 |     4   (0)| 00:00:01 |
|*  9 |        INDEX RANGE SCAN          | N_USER_ROLE_IDX5 |     1 |       |     3   (0)| 00:00:01 |
|* 10 |       INDEX RANGE SCAN           | N_USER_ROLE_IDX2 |   957 |       |    74   (2)| 00:00:01 |
-----------------------------------------------------------------------------------------------------

Statistics:

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
    3433602  consistent gets
          0  physical reads
          0  redo size
      58149  bytes sent via SQL*Net to client
       1260  bytes received via SQL*Net from client
        148  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       2199  rows processed
A: 

First thing is to analyze your table:

EXEC dbms_stats.gather_table_stats('YOUR_SCHEMA', 'CMP_USER_ROLE');

Do you still get the same execution plan?

The Time column in your execution plan looks as if your query does not look that expensive for Oracle optimizer.

Peter Lang
in sql+? it says dbms_stats command not found..
deming
please see update. I've provided statistics as well
deming
Please see my updated answer for SQL*Plus.
Peter Lang
great. I dont have privileges. bollocks
deming
Please try if `SELECT COUNT(*) FROM cmp_user_role` returns approximately the same as `SELECT num_rows FROM user_tables WHERE table_name='CMP_USER_ROLE'`.
Peter Lang
Do you have privileges to do `ANALYZE TABLE cmp_user_role COMPUTE STATISTICS`?
Peter Lang
nope....:( that's why I am looking to just optimize the query
deming
What about the two queries above?
Peter Lang
first query = 699415. second query = 697608. cmp_user_role is actually a view which points to user_role. so in second query I had to do `table_name = 'USER_ROLE'`
deming
A: 

Try running EXPLAIN PLAN on your query and see if it's doing a table scan.

I would guess that this clause would be problematic:

OR TO_CHAR(effective_to_date,'YYYY-Q') = '2010-2')

I think calling the function in the WHERE clause forces Oracle to scan every row, because it has to evaluate that function on every column in order to see if that row is part of the result set. You invalidate your indexes that way.

A better solution would be to do the search on an indexed column value that didn't require a function call to evaluate it. I'd recommend something like "DATE_COLUMN BETWEEN x AND y", where x and y are the start and end dates of the quarter. Make sure there's an index on DATE_COLUMN.

duffymo
I thought that as well. since function is being done on each get. but what would be the way to avoid that. should I search for quarters by date rather than using a function/
deming
@deming: Replacing the `TO_CHAR` by `>=` and `<=` (or `BETWEEN`) should allow to use an index, so I would try that.
Peter Lang
`TRUNCATE(effective_to_date) BETWEEN DATE '2010-01-01' AND DATE '2010-03-31'`
Vadim K.
... sorry got the dates wrong. Better yet, `effective_to_date >= DATE '2010-04-01' AND effective_to_date < DATE '2010-07-01'`
Vadim K.
You could add a function-based index for this expression.
PenFold
+1  A: 

Hmmmm, I'm having an attack of deja vu.

Anyway, here is the thing you need to work on:

3433602  consistent gets

Three million Logical IOs is going to chew up a lot of time, so what you need to do is reduce that number.

Your query consists of three sets of accesses against the same table. Each access consists of an index read followed by a table read. From your comment to Peter's question it seems your statistics are reasonably accurate (query returns 699415 rows, NUM_ROWS = 697608).

Tuning is a complicated activity, with lots of factors to consider. We could easily spend half a day going through all the things which might be wrong with this one query.

For instance, do you gather statistics for your indexes as well as the table (in earlier versions of Oracle the default is not to gather index stats)? If you have stats what is the clustering factor of those indexes? All your index accesses are scans so the clustering factor is relevant.

What we want is a low clustering factor because that means the index has to do less work to get the rows from the table. If the clustering factor is nearer to the number of entries in the index that is bad, whereas if it is closer to the number of blocks in the table that is good. Alas, given the number of LIOs you're experiencing, my money is on poor clustering factors. So you need to get more juice from the indexes.

Looking at your query, the columns in the outermost projection are used in the WHERE clause of the query and/or the sub-queries. Despite this fact you are using three different indexes and none of them provide all the information necessary to satisfy the criteria (hence the additional table reads and subsequent filtering). One tactic which can be very effective in these situations is to build a super index which contains all the necessary columns.

create index N_USER_ROLE_IDX23 on user_role    
       ( active
         , role_id
         , user_id
         , participant_code
         , effective_from_date
         , effective_to_date
         , last_update_date )

This leads with ACTIVE and ROLE_ID because those columns are used in all three sets of criteria. ( Incidentally your third query says this:

                       WHERE S.role_id = r.role_id
                       AND S.role_id = r.role_id

Is that correct? ) Anyway, the point of this index is that it satisfies all three WHERE clauses and the final projection, so it obviates the need to touch the table at all. Consequently it could dramatically reduce the number of consistent gets.

APC