views:

132

answers:

7

I have 2 tables which have many records (say both TableA and TableB has about 3,000,000 records).vr2_input is a varchar input parameters enter by the users and I want to get the most 200 largest "dateField" 's TableA records whose stringField like 'vr2_input' .The 2 tables are joined as the following:

select * from(   
   select * from 
        TableA join TableB on TableA.id = TableB.id
        where  TableA.stringField like 'vr2_input' || '%'
        order by  TableA.dateField desc   
) where rownum < 201

The query is slow , I goggled that and found out that it is because "like" and "order by" involves the full table scan .However , I cannot found a solution to solve the problem . How can I tune this type of SQL? I have already create an index on TableA.stringField and TableA.dateField but how can I use the index feature in the select statement? The database is oracle 10g. Thanks so much!!

Update : I use iddqd 's suggestion and only select the fields that I want and run the explain plan . It cost about 4 mins to finish the query . IX_TableA_stringField is the index name of the TableA.srv_ref field .I run again the explain plan without the hint , the explain plan still get the same result.

EXPLAIN PLAN FOR
    select * from(
         select   
                 /*+ INDEX(TableB IX_TableA_stringField)*/ 
                  TableA.id,
                    TableA.stringField,
                    TableA.dateField,
                    TableA.someField2,
                   TableA.someField3,
            TableB.someField1,
            TableB.someField2,
            TableB.someField3,
                    from TableA 
                    join TableB  on  TableA.id=TableB.id
                    WHERE TableA.stringField  like '21'||'%'  
                 order by TableA.dateField  desc
        ) where rownum < 201

PLAN_TABLE_OUTPUT                                                                                                                                                                                                                                                                                            
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 
Plan hash value: 871807846                                                                                                                                                                                                                                                                    

--------------------------------------------------------------------------------------------------------                                                                                                                                                                           
| Id  | Operation                       | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |                                                                                                                                                                      
--------------------------------------------------------------------------------------------------------                                                                                                                                                                           
|   0 | SELECT STATEMENT                |                      |   200 | 24000 |  3293   (1)| 00:00:18 |                                                                                                                                                                   
|*  1 |  COUNT STOPKEY                  |                      |       |       |            |          |                                                                                                                                                                                   
|   2 |   VIEW                          |                      |  1397 |   163K|  3293   (1)| 00:00:18 |                                                                                                                                                                             
|*  3 |    SORT ORDER BY STOPKEY        |                      |  1397 | 90805 |  3293   (1)| 00:00:18 |                                                                                                                                                              
|   4 |     NESTED LOOPS                |                      |  1397 | 90805 |  3292   (1)| 00:00:18 |                                                                                                                                                                      
|   5 |      TABLE ACCESS BY INDEX ROWID| TableA       |  1397 | 41910 |   492   (1)| 00:00:03 |                                                                                                                                                 
|*  6 |       INDEX RANGE SCAN          | IX_TableA_stringField |  1397 |       |     6   (0)| 00:00:01 |                                                                                                                                                         
|   7 |      TABLE ACCESS BY INDEX ROWID| TableB      |     1 |    35 |     2   (0)| 00:00:01 |                                                                                                                                                      
|*  8 |       INDEX UNIQUE SCAN         | PK_TableB   |     1 |       |     1   (0)| 00:00:01 |                                                                                                                                                            
--------------------------------------------------------------------------------------------------------                                                                                                                                                                           

Predicate Information (identified by operation id):                                                                                                                                                                                                                                      
---------------------------------------------------                                                                                                                                                                                                                                             

   1 - filter(ROWNUM<201)                                                                                                                                                                                                                                                                      
   3 - filter(ROWNUM<201)                                                                                                                                                                                                                                                                      
   6 - access("TableA"."stringField" LIKE '21%')                                                                                                                                                                                                                                                 
       filter("TableA"."stringField" LIKE '21%')                                                                                                                                                                                                                                                     
   8 - access(TableA"."id"="TableB"."id")       
A: 

Create indexes on the stringField and dateField columns. The SQL engine uses them automatically.

I have already create an index on TableA.stringField and TableA.dateField ,but it is very slow .
It is better to collect new statisctics after you did a ddl statement like create table, alter table or create index.
TTT
A: 

How many records does Table A has if it's the smaller table could you do the select on that table and then loop though the results retrieving the Table B records, as both the select and the sort are on TableA.

A good experiment would be to remove the join and test the speed on that also if allowed can you put the rownum < 201 as an AND clause on the main query. It's probable at the moment that the query is returning all rows to the outer query and then it's getting trimmed?

Jane T
A: 
select id from(   
   select /*+ INDEX(TableB stringField_indx)*/ TableB.id from 
        TableA join TableB on TableA.id = TableB.id
        where  TableA.stringField like 'vr2_input' || '%'
        order by  TableA.dateField desc   
) where rownum < 201

next:

SELECT * FROM TableB WHERE id iN( id from first query)

Please send stats and DDL of this tables.

iddqd
A: 

If you have enough memory you can hint the query to use hash join. Could you please attach the explain plan

Zoltan Hamori
+1  A: 

Do you need to select all columns (*)? The optimizer will be more likely to full scan if you select all columns. If you need all columns in output you may be better to select the id in your inline view and then join back to select other columns, which could be done with an index lookup. Try running an explain plan for both cases to see what the optimizer is doing.

Lord Peter
+2  A: 

You say it's taking about 4 minutes to run the query. The EXPLAIN PLAN output shows an estimate of 18 seconds. So the optimizer is probably far off on some of its estimates in this case. (It could still be choosing the best possible plan, but maybe not.)

The first step in a case like this is to get the actual execution plan and statistics. Run your query with the hint /*+ gather_plan_statistics */, then immediately afterwards execute select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST')).

This will show the actual execution plan that was run, and for each step it will show the estimated rows, actual rows, and actual time taken. Post the output here and maybe we can say something more meaningful about your issue.

Without that information, my suggestion is to try out the following rewrite of the query. I believe it is equivalent since it appears that ID is the primary key of TableB.

select TableA.id,
       TableA.stringField,
       TableA.dateField,
       TableA.someField2,
       TableA.someField3,
       TableB.someField1,
       TableB.someField2,
       TableB.someField3,
  from (select * from(
         select   
                  TableA.id,
                    TableA.stringField,
                    TableA.dateField,
                    TableA.someField2,
                    TableA.someField3,
                    from TableA 
                    WHERE TableA.stringField  like '21'||'%'  
                 order by TableA.dateField  desc
          )
          where rownum < 201
       ) TableA
       join TableB  on  TableA.id=TableB.id
Dave Costa
+1, gather_plan_statistics is so cool. I can't believe I hadn't heard of it before.
jonearles
A: 

You can create one function index on tableA. That will return 1 or 0 based on the condition TableA.stringField like 'vr2_input' || '%' is satisfied or not. That index will make query run faster. The logic of the funtion will be

if (substr(TableA.stringField, 1, 9) = 'vr2_input' THEN return 1; else return 0;

Using actual column names instead of "*" may help. At least common column names should be removed.

Vishnu Gupta