views:

1366

answers:

7

The query:

SELECT tbl1.*
   FROM tbl1 
JOIN tbl2
     ON (tbl1.t1_pk  = tbl2.t2_fk_t1_pk
AND tbl2.t2_strt_dt <= sysdate
AND tbl2.t2_end_dt  >= sysdate)
JOIN tbl3 on (tbl3.t3_pk = tbl2.t2_fk_t3_pk
AND tbl3.t3_lkup_1 = 2577304
AND tbl3.t3_lkup_2 = 1220833)
where tbl2.t2_lkup_1   = 1020000002981587;

Facts:

  • Oracle XE
  • tbl1.t1_pk is a primary key.
  • tbl2.t2_fk_t1_pk is a foreign key on that t1_pk column.
  • tbl2.t2_lkup_1 is indexed.
  • tbl3.t3_pk is a primary key.
  • tbl2.t2_fk_t3_pk is a foreign key on that t3_pk column.

Explain plan on a database with 11,000 rows in tbl1 and 3500 rows in tbl2 shows that it's doing a full table scan on tbl1. Seems to me that it should be faster if it could do a index query on tbl1.

Explain plan on a database with 11,000 rows in tbl1 and 3500 rows in tbl2 shows that it's doing a full table scan on tbl1. Seems to me that it should be faster if it could do a index query on tbl1.

Update: I tried the hint a few of you suggested, and the explain cost got much worse! Now I'm really confused.

Further Update: I finally got access to a copy of the production database, and "explain plan" showed it using indexes and with a much lower cost query. I guess having more data (over 100,000 rows in tbl1 and 50,000 rows in tbl2) were what it took to make it decide that indexes were worth it. Thanks to everybody who helped. I still think Oracle performance tuning is a black art, but I'm glad some of you understand it.

Further update: I've updated the question at the request of my former employer. They don't like their table names showing up in google queries. I should have known better.

+1  A: 

You can only tell by looking at the query plan the SQL optimizer/executor creates. It will be at least partial based on index statistics which cannot be predicted from just the definition (and can, therefore, change over time).

SQL Management studio for SQL Server 2005/2008, Query Analyzer for earlier versions.

(Can't recall the right tool names for Oracle.)

Richard
+5  A: 

The easy answer: Because the optimizer expects more rows to find then it actually does find.

Check the statistics, are they up to date? Check the expected cardinality in the explain plan do they match the actual results? If not fix the statistics relevant for that step.

Histogramms for the joined columns might help. Oracle will use those to estimate the cardinality resulting from a join.

Of course you can always force index usage with a hint

Jens Schauder
+1  A: 

Try adding an index hint.

SELECT /*+ index(tbl1 tbl1_index_name) */ .....

Sometimes Oracle just doesn't know which index to use.

Barry
It's code, it doesn't give up because <hysterical female voice> I just don't know which one to pick!</hfv> it chooses to not use the index because the cost of doing so was HIGHER than the cost of doing something else.
So... giving Oracle chocolates was not a good idea?
Barry
+3  A: 

It would be useful to see the optimizer's row count estimates, which are not in the SQL Developer output you posted.

I note that the two index lookups it is doing are RANGE SCAN not UNIQUE SCAN. So its estimates of how many rows are being returned could easily be far off (whether statistics are up to date or not).

My guess is that its estimate of the final row count from the TABLE ACCESS of TBL2 is fairly high, so it thinks that it will find a large number of matches in TBL1 and therefore decides on doing a full scan/hash join rather than a nested loop/index scan.

For some real fun, you could run the query with event 10053 enabled and get a trace showing the calculations performed by the optimizer.

Dave Costa
Dave must be a damn good tuner, he wrote everything I was going to and more. ;-) Seriously we need the cardinality column. Use DBMS_XPLAN to get us something useful.
A: 

Apparently this query gives the same plan:

SELECT tbl1.*   
FROM tbl1 
JOIN tbl2 ON (tbl1.t1_pk  = tbl2.t2_fk_t1_pk)
JOIN tbl3 on (tbl3.t3_pk = tbl2.t2_fk_t3_pk)
where tbl2.t2_lkup_1   = 1020000002981587
AND tbl2.t2_strt_dt <= sysdate
AND tbl2.t2_end_dt  >= sysdate
AND tbl3.t3_lkup_1 = 2577304
AND tbl3.t3_lkup_2 = 1220833;

What happens if you rewrite this query to:

SELECT tbl1.*    
FROM  tbl1 
,     tbl2
,     tbl3  
where tbl2.t2_lkup_1   = 1020000002981587 
AND   tbl1.t1_pk  = tbl2.t2_fk_t1_pk 
AND   tbl3.t3_pk = tbl2.t2_fk_t3_pk 
AND   tbl2.t2_strt_dt <= sysdate 
AND   tbl2.t2_end_dt  >= sysdate 
AND   tbl3.t3_lkup_1 = 2577304 
AND   tbl3.t3_lkup_2 = 1220833;
Edwin
Actually, the Explain Plan output looks exactly the same.
Paul Tomblin
Your second example is what the query looked like when I started. I changed it to do joins, but it didn't change the explain plan.
Paul Tomblin
Oracle turns these into the same intermediate internal format anyway. So unless there is a bug with the parsing, it'll come up with the same plan.
WW
+2  A: 

Oracle tries to return the result set with the least amount of I/O required (typically, which makes sense because I/o is slow). Indexes take at least 2 I/O calls. one to the index and one to the table. Usually more, depending on the size of the index and tables sizes and the number of records returns, where they are in the datafile, ...

This is where statistics come in. Lets say your query is estimated to return 10 records. The optimizer may calculate that using an index will take 10 I/O calls. Let's say your table, according to the statistics on it, resides in 6 blocks in the data file. It will be faster for Oracle to do a full scan ( 6 I/O) then read the index, read the table, read then index for the next matching key, read the table and so on.

So in your case, the table may be real small. The statistics may be off.

I use the following to gather statistics and customize it for my exact needs:

begin

 DBMS_STATS.GATHER_TABLE_STATS(ownname
=> '&owner' ,tabname => '&table_name', estimate_percent => dbms_stats.AUTO_SAMPLE_SIZE,granularity
=> 'ALL', cascade  => TRUE); 

 -- DBMS_STATS.GATHER_TABLE_STATS(ownname
=> '&owner' ,tabname => '&table_name',partname => '&partion_name',granularity => 'PARTITION', estimate_percent => dbms_stats.AUTO_SAMPLE_SIZE, cascade 
=> TRUE);

 -- DBMS_STATS.GATHER_TABLE_STATS(ownname
=> '&owner' ,tabname => '&table_name',partname => '&partion_name',granularity => 'PARTITION', estimate_percent => dbms_stats.AUTO_SAMPLE_SIZE, cascade 
=> TRUE,method_opt  => 'for all indexed columns size 254');

end;
A: 

It looks like an index for tbl1 table is not being picked up. Make sure you have an index for t2_lkup_1 column and it should not be multi-column otherwise the index is not applicable.

(in addition to what Matt's comment) From your query I believe you're joining because you want to filter out records not to do JOIN which may increase cardinality for result set from tbl1 table if there are duplicate matches from . See Jeff Atwood comment

Try this, which uses exist function and join (which is really fast on oracle)

select *
  from tbl1 
 where tbl2.t2_lkup_1 = 1020000002981587 and
       exists (
         select *
           from tbl2, tbl3 
          where tbl2.t2_fk_t1_pk = tbl1.t1_pk and
                tbl2.t2_fk_t3_pk = tbl3.t3_pk  and
                sysdate between tbl2.t2_strt_dt and tbl2.t2_end_dt and
                tbl3.t3_lkup_1 = 2577304 and
                tbl3.t3_lkup_2 = 1220833);

dt