Running EXPLAIN PLAN on this query shows a full table access.

The two tables used are:

user_role:   803507 rows
cmp_role:    27 rows

Query:

SELECT 
   r.user_id, r.role_id, r.participant_code, MAX(status_id) 
  FROM 
    user_role r, 
    cmp_role c 
  WHERE 
    r.role_id = c.role_id 
    AND r.participant_code IS NOT NULL 
    AND c.group_id = 3 
    GROUP BY 
    r.user_id, r.role_id, r.participant_code 
    HAVING MAX(status_id) IN (SELECT b.status_id FROM USER_ROLE b 
                              WHERE (b.ACTIVE = 1 OR ( b.ACTIVE IN ( 0,3 )  
    AND SYSDATE BETWEEN b.effective_from_date AND b.effective_to_date 
                                    )) 
                             )

How can I rewrite this query so that it returns results in a reasonable time? The indexes are:

idx 1 = role_id
idx 2 = last_updt_user_id
idx 3 = actv_id, participant_code, effective_from_Date, effective_to_date
idx 4 = user_id, role_id, effective_from_Date, effective_to_date
idx 5 = participant_code, user_id, role_id, actv_cd

Explain plan:

Q_PLAN
--------------------------------------------------------------------------------
  SELECT STATEMENT
    FILTER
      HASH GROUP BY
        HASH JOIN
          TABLE ACCESS BY INDEX ROWID ROLE
            INDEX RANGE SCAN N_ROLE_IDX2
          TABLE ACCESS FULL USER_ROLE
      TABLE ACCESS BY INDEX ROWID USER_ROLE
        INDEX UNIQUE SCAN U_USER_ROLE_IDX1

I do not have sufficient privileges to gather statistics on the table.

I tried the following change, but it shaves off only 1 or 2 seconds:

WITH CTE AS (SELECT b.status_id FROM USER_ROLE b 
                                  WHERE (b.ACTIVE = 1 OR ( b.ACTIVE IN ( 0,3 )  
        AND SYSDATE BETWEEN b.effective_from_date AND b.effective_to_date 
                                        )) 
                                 )
    SELECT 
       r.user_id, r.role_id, r.participant_code, MAX(status_id) 
      FROM 
        user_role r, 
        cmp_role c 
      WHERE 
        r.role_id = c.role_id 
        AND r.participant_code IS NOT NULL 
        AND c.group_id = 3 
        GROUP BY 
        r.user_id, r.role_id, r.participant_code 
        HAVING MAX(status_id) IN (select * from CTE)
A: 
  1. Collect statistics for the tables.
  2. Run EXPLAIN PLAN for the query and post the results.
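
A minimal sketch of those two steps, assuming an Oracle database and that both tables live in the current schema (the use of USER as the owner is an assumption). Gathering statistics needs ownership of the tables or the ANALYZE ANY privilege, so step 1 may have to be done by a DBA.

```sql
-- Step 1: gather optimizer statistics for both tables.
-- Assumption: the tables are owned by the current schema (USER).
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'USER_ROLE', cascade => TRUE);
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'CMP_ROLE',  cascade => TRUE);
END;
/

-- Step 2: store the plan for the query from the question ...
EXPLAIN PLAN FOR
SELECT r.user_id, r.role_id, r.participant_code, MAX(status_id)
  FROM user_role r, cmp_role c
 WHERE r.role_id = c.role_id
   AND r.participant_code IS NOT NULL
   AND c.group_id = 3
 GROUP BY r.user_id, r.role_id, r.participant_code
HAVING MAX(status_id) IN (SELECT b.status_id
                            FROM user_role b
                           WHERE b.active = 1
                              OR ( b.active IN (0, 3)
                                   AND SYSDATE BETWEEN b.effective_from_date
                                                   AND b.effective_to_date ));

-- ... and print it as formatted text that can be pasted into a post.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

The DBMS_XPLAN output includes the estimated rows and cost per step as well as the predicates, which a bare operation list does not show.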
Matthew Watson
I do not have enough privileges to gather stats. I wanted to post a screenshot of the explain plan, but I do not have the 10 points required to post a picture :(
Mehur
I've provided the explain plan.
Mehur
+3  A: 

Firstly you have the subquery

SELECT b.status_id FROM USER_ROLE b 
WHERE (b.ACTIVE = 1 
        OR ( b.ACTIVE IN ( 0,3 )  
        AND SYSDATE BETWEEN b.effective_from_date AND b.effective_to_date )
      )

There is no way that you can do anything other than a full table scan to get that result. You may be missing a join, but not knowing what you expect your query to do, there's no way for us to tell.

Secondly, depending on the proportion of cmp_role records with a group_id of 3, and the proportion of user_role records that match those roles, the full scan there may be the better choice. If, say, 3 out of the 27 cmp_role records are in group 3, and 100,000 of the user_role records match those cmp_role records, then a single scan of the table can be more efficient than 100,000 index lookups.
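
The proportions mentioned above can be measured directly. A small sketch, using only the table and column names from the question; the column aliases are made up for readability:

```sql
-- How many of the 27 cmp_role rows belong to group 3?
SELECT COUNT(*) AS group3_roles
  FROM cmp_role
 WHERE group_id = 3;

-- How many user_role rows reference one of those roles?
SELECT COUNT(*) AS matching_user_roles
  FROM user_role r
 WHERE r.role_id IN (SELECT c.role_id FROM cmp_role c WHERE c.group_id = 3);

-- Total rows, for comparison with the count above.
SELECT COUNT(*) AS total_user_roles
  FROM user_role;
```

If matching_user_roles is a sizeable fraction of total_user_roles, the full scan of USER_ROLE in the plan is probably a reasonable choice rather than a problem.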

Gary
Thanks. I am also looking at the option of archiving data in the user_role table. How can I do this with a join?
Mehur
A: 

I think the following approach will work. I would have thought the subquery would be evaluated only once, since it is not correlated, but this doesn't seem to be the case. I tried a similar (simple) query against the sales table in the sh demo schema. I modified it to use a materialized CTE and it ran in 1 second as opposed to 18 seconds, which is more than 10 times faster. See below for the approach.

with cte as (
  select /*+ materialize */ max(amount_sold) from sales
)
select prod_id, sum(amount_sold)
  from sales
 group by prod_id
having max(amount_sold) in (select * from cte)
/

So in your case you would materialize the subquery as

with CTE as (
    SELECT /*+ materialize */ b.status_id FROM USER_ROLE b 
    WHERE (b.ACTIVE = 1 OR ( b.ACTIVE IN ( 0,3 )  
           AND SYSDATE BETWEEN b.effective_from_date AND b.effective_to_date ))
)

and select from CTE in the main query.
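
Put together with the main query from the question, the whole statement might look like the sketch below. Qualifying status_id with r is an assumption that the column belongs to user_role. Whether the hint helps depends on the data and the plan, so it is worth timing both variants.

```sql
WITH cte AS (
  -- Ask Oracle to evaluate the subquery once into a temporary result set.
  SELECT /*+ materialize */ b.status_id
    FROM user_role b
   WHERE b.active = 1
      OR ( b.active IN (0, 3)
           AND SYSDATE BETWEEN b.effective_from_date AND b.effective_to_date )
)
SELECT r.user_id, r.role_id, r.participant_code, MAX(r.status_id)
  FROM user_role r, cmp_role c
 WHERE r.role_id = c.role_id
   AND r.participant_code IS NOT NULL
   AND c.group_id = 3
 GROUP BY r.user_id, r.role_id, r.participant_code
HAVING MAX(r.status_id) IN (SELECT status_id FROM cte);
```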

josephj1989
This approach quadruples the time in my case.
Mehur
I don't have your table, so I can't be sure. But when I try a somewhat similar select against the sh database (918843 records), the CTE query finishes in under a second and the other query takes 18 seconds. In your case there is no way the CTE query can quadruple the time; at worst it should produce minimal degradation. Did you make a typo in the hint name or something? A query running against 900000 records (your case) should run in under 5 seconds.
josephj1989
Sorry, in the morning I must have been doing something wrong when I posted the comment. I just tried now and it is taking around 16.5 seconds. I've edited the question with new query.
Mehur
Note that you did not use the /*+ materialize */ hint, which is what gives the improvement.
josephj1989
:) now I know why it was removed in my pl/sql developer. Adding that hint makes the query literally crawl.
Mehur
I think the difference is that in your query the materialize hint is applied to a subquery with max(amount_sold), whereas the subquery where I am using it has no max in it.
Mehur
Hello, I am sure that is impossible. Adding the materialize hint can in no way make it crawl; it will only make it faster. Thanks.
josephj1989
A: 

So you have a query that currently takes 16.5 seconds and you want it to run faster. To do that, you need to know where those 16.5 seconds are being spent. The Oracle database is extremely well instrumented, so you can see in great detail what it is doing. See this thread that I wrote on the OTN Forums:

http://forums.oracle.com/forums/thread.jspa?messageID=1812597

Without knowing where your time is being spent, all efforts are just guesses ...
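
As an illustration of that instrumentation (not necessarily the method described in the linked thread), one common way to see actual rows and elapsed time per plan step is sketched below. It assumes SQL*Plus and read access to the V$ views that DBMS_XPLAN.DISPLAY_CURSOR queries, which a restricted account may lack.

```sql
-- SQL*Plus setting: keep DBMS_OUTPUT calls from becoming the "last statement".
SET SERVEROUTPUT OFF

-- Run the query once while collecting row-source statistics.
SELECT /*+ gather_plan_statistics */
       r.user_id, r.role_id, r.participant_code, MAX(status_id)
  FROM user_role r, cmp_role c
 WHERE r.role_id = c.role_id
   AND r.participant_code IS NOT NULL
   AND c.group_id = 3
 GROUP BY r.user_id, r.role_id, r.participant_code
HAVING MAX(status_id) IN (SELECT b.status_id
                            FROM user_role b
                           WHERE b.active = 1
                              OR ( b.active IN (0, 3)
                                   AND SYSDATE BETWEEN b.effective_from_date
                                                   AND b.effective_to_date ));

-- Show the plan of the last statement in this session with estimated
-- versus actual row counts and the time spent in each step.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Steps where the actual row count or time is far from the estimate are the ones to look at first.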

Regards, Rob.

Rob van Wijk
The part that takes long is the query inside the `in`; if I remove that, the whole thing runs fast.
Mehur
That's not what I mean by determining where time is being spent. Please read the link and post a proper explain plan output and a tkprof snippet.
Rob van Wijk
@Rob I don't have enough privileges for tkprof, but I've put the explain plan and query in another question: http://stackoverflow.com/questions/3149293/optimize-oracle-query Your post was informative.
Mehur