tags:

views:

183

answers:

2

Will Oracle flatten multiple between clauses if they have overlapping data? In my application, users can create dynamic search terms, so it's possible that there could be overlapping data. Will Oracle optimize the SQL for me or do I have to calculate it before I create the SQL?

i.e.

Select id from xxtable WHERE id (BETWEEN 10 AND 20) or (BETWEEN 18 AND 30)

Will it be run "as is" or converted to:

Select id from xxtable WHERE id (BETWEEN 10 AND 30)

Thanks for your time.

+2  A: 

It depends on the query as to how the optimizer will handle it. You can test it in SQLPlus using autotrace and looking at the Predicate Information (this is done on Oracle version 10.2.0.3):

SQL>set autot traceonly
SQL>
  1  select l
  2    from (SELECT l
  3            FROM (SELECT LEVEL l
  4                    FROM dual CONNECT BY LEVEL < 20
  5                 )
  6         )
  7   where l between 5 and 10
  8*     or l between 7 and 15;

11 rows selected.

Elapsed: 00:00:00.21

Execution Plan
----------------------------------------------------------
Plan hash value: 2403765415

--------------------------------------------------------------------------------------
| Id  | Operation                     | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |      |     1 |    13 |     2   (0)| 00:00:01 |
|*  1 |  VIEW                         |      |     1 |    13 |     2   (0)| 00:00:01 |
|*  2 |   CONNECT BY WITHOUT FILTERING|      |       |       |            |          |
|   3 |    FAST DUAL                  |      |     1 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("L">=5 AND "L"<=10 OR "L">=7 AND "L"<=15)
   2 - filter(LEVEL<20)


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          0  consistent gets
          0  physical reads
          0  redo size
        494  bytes sent via SQL*Net to client
        403  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         11  rows processed

No rewriting done by the optimizer in this case on the where criteria, but if we change it slightly:

SQL>
  1  select l
  2    from (SELECT l
  3            FROM (SELECT LEVEL l
  4                    FROM dual CONNECT BY LEVEL < 20
  5                 )
  6         )
  7   where l between 5 and 10
  8*     or l between 7 and 10;

6 rows selected.

Elapsed: 00:00:00.20

Execution Plan
----------------------------------------------------------
Plan hash value: 2403765415

--------------------------------------------------------------------------------------
| Id  | Operation                     | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |      |     1 |    13 |     2   (0)| 00:00:01 |
|*  1 |  VIEW                         |      |     1 |    13 |     2   (0)| 00:00:01 |
|*  2 |   CONNECT BY WITHOUT FILTERING|      |       |       |            |          |
|   3 |    FAST DUAL                  |      |     1 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("L"<=10 AND ("L">=5 OR "L">=7))
   2 - filter(LEVEL<20)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          0  consistent gets
          0  physical reads
          0  redo size
        388  bytes sent via SQL*Net to client
        392  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          6  rows processed

On this one we see that the optimizer does recognize that both where criteria have the same upper bound. So it depends on the query as to how the optimizer will rewrite it.

Dougman
What a great answer! Thanks!I guess the safe way to do it then is for me to just make sure it gets done on my end.
davidemm
+3  A: 

It's worth looking at what happens when we run a trial against a real table with an index. This example table has 69,241 rows and a non-unique index on COL_3, with statistics.

Case 1: yer basic two overlapping BETWEEN clauses

SQL> set autotrace traceonly exp
SQL>
SQL> select * from big_table
  2  where col_3 between 0.8 and 1
  3  or col_3 between 0.9 and 1.1
  4  /

Execution Plan
----------------------------------------------------------
Plan hash value: 3993303771

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           | 14737 |   805K|   176   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| BIG_TABLE | 14737 |   805K|   176   (1)| 00:00:03 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("COL_3"<=1.1 AND "COL_3">=0.9 OR "COL_3"<=1 AND
              "COL_3">=0.8)

SQL>

Upshot: The index is ignored and a full table scan ensues

Case 2: the BETWEEN clauses share an upper bound

SQL> select * from big_table
  2  where col_3 between 0.8 and 1.1
  3  or col_3 between 0.9 and 1.1
  4  /

Execution Plan
----------------------------------------------------------
Plan hash value: 1461639892

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |  7924 |   433K|   114   (0)| 00:00:02 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BIG_TABLE |  7924 |   433K|   114   (0)| 00:00:02 |
|*  2 |   INDEX RANGE SCAN          | BIG3_IDX  |  7924 |       |    17   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("COL_3"<=1.1)
       filter("COL_3">=0.8 OR "COL_3">=0.9)

SQL>

Upshot: The index is used for the upper bound and a full table scan is avoided

Case 3: the BETWEEN clauses share an lower bound

SQL> select * from big_table
  2  where col_3 between 0.8 and 1.1
  3  or col_3 between 0.8 and 1.2
  4  /

Execution Plan
----------------------------------------------------------
Plan hash value: 3993303771

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           | 15146 |   828K|   176   (1)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| BIG_TABLE | 15146 |   828K|   176   (1)| 00:00:03 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("COL_3"<=1.2 OR "COL_3"<=1.1) AND "COL_3">=0.8)

SQL>

Upshot: The index is ignored and a full table scan ensues

Case 4: the two BETWEEN ranges are merged into a single clause

SQL> select * from big_table
  2  where col_3 between 0.8 and 1.1
  3  /

Execution Plan
----------------------------------------------------------
Plan hash value: 1461639892

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |  7924 |   433K|   114   (0)| 00:00:02 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BIG_TABLE |  7924 |   433K|   114   (0)| 00:00:02 |
|*  2 |   INDEX RANGE SCAN          | BIG3_IDX  |  7924 |       |    17   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("COL_3">=0.8 AND "COL_3"<=1.1)

SQL>

Upshot: The index is used for both the upper and lower bounds

So, in summary, if the two BETWEEN clauses overlap and there is an index on the column then it could be worth the effort of merging them.

APC
+1 for highlighting the potential effect on performance
Jeffrey Kemp
I am using an index, so this was good to know. Thanks!
davidemm