I have a process that is performing badly due to full table scans on a particular table. I have computed statistics, rebuilt the existing indices, and tried adding new indices for this table, but this hasn't solved the issue.

Can an implicit type conversion stop an index from being used? What other reasons are there? The cost of the full table scan is around 1,000 greater than what the index lookup should cost.

EDIT:

SQL statement:

select unique_key 
from src_table 
where natural_key1 = :1 
and natural_key2 = :2 
and natural_key3 = :3;
  • Cardinality of natural_key1 is high, but there is a type conversion.
  • The other parts of the natural key are low cardinality, and bitmap indices are not enabled.
  • Table size is around 1,000,000 records.

Java code (not easily modifiable):

ps.setLong(1, oid);

This conflicts with the column datatype: varchar2

+3  A: 

Make your condition sargable, that is, compare the bare field itself to a constant expression.

This is bad:

SELECT  *
FROM    mytable
WHERE   TRUNC(dt) = TO_DATE('2009.07.21', 'YYYY.MM.DD')

This cannot use an index on dt: Oracle cannot reverse the TRUNC() function to derive the range bounds on the underlying column.

This is good:

SELECT  *
FROM    mytable
WHERE   dt >= TO_DATE('2009.07.21', 'YYYY.MM.DD')
        AND dt < TO_DATE('2009.07.22', 'YYYY.MM.DD')

To get rid of implicit conversion, well, use explicit conversion:

This is bad:

SELECT  *
FROM    mytable
WHERE   guid = '794AB5396AE5473DA75A9BF8C4AA1F74'

-- This uses implicit conversion. In fact this is RAWTOHEX(guid) = '794AB5396AE5473DA75A9BF8C4AA1F74'

This is good:

SELECT  *
FROM    mytable
WHERE   guid = HEXTORAW('794AB5396AE5473DA75A9BF8C4AA1F74')

Update:

This query:

SELECT  unique_key
FROM    src_table
WHERE   natural_key1 = :1
        AND natural_key2 = :2
        AND natural_key3 = :3

heavily depends on the type of your fields.

Explicitly cast your bind variables to the columns' datatypes, so that the conversion happens on the variable rather than on the indexed column.
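
For the query above, since natural_key1 is a varchar2 and :1 is bound as a number, a minimal sketch of the rewrite would be (assuming the stored strings match Oracle's default number-to-string format):

SELECT  unique_key
FROM    src_table
WHERE   natural_key1 = TO_CHAR(:1)  -- conversion moved onto the variable, not the indexed column
        AND natural_key2 = :2
        AND natural_key3 = :3

This leaves natural_key1 bare on the left-hand side, so a plain index on it stays usable.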

Quassnoi
I just learnt a new word: sargable!
Tony Andrews
One could also create a function-based index on TRUNC(dt) as an alternative, I suppose.
David Aldridge
@David: yes, but this would make your table one index heavier without any additional benefit. `TRUNC` is a continuous function, that is, any continuous range on `TRUNC` can be expressed as a continuous range on the original date. `UPPER`, for instance, is not. A continuous range on `UPPER` (like `UPPER(col1) BETWEEN 'AAA' AND 'BBB'`) cannot be expressed by rewriting the original query, which is why an index on `UPPER` is useful and an index on `TRUNC` is not.
Quassnoi
+7  A: 

Hi parkr,

An implicit conversion can prevent an index from being used by the optimizer. Consider:

SQL> CREATE TABLE a (ID VARCHAR2(10) PRIMARY KEY);

Table created

SQL> insert into a select rownum from dual connect by rownum <= 1e6;

1000000 rows inserted

This is a simple table, but the datatype is not 'right'; i.e., if you query it like this, it will full scan:

SQL> select * from a where id = 100;

ID
----------
100

This query is in fact equivalent to:

select * from a where to_number(id) = 100;

It cannot use the index since we indexed id and not to_number(id). If we want to use the index we will have to be explicit:

select * from a where id = '100';
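
You can see the difference directly in the execution plans (a sketch using the standard EXPLAIN PLAN and DBMS_XPLAN tools; exact output varies by version):

SQL> explain plan for select * from a where id = 100;
SQL> select * from table(dbms_xplan.display);
-- the plan shows a TABLE ACCESS FULL on A

SQL> explain plan for select * from a where id = '100';
SQL> select * from table(dbms_xplan.display);
-- the plan shows an INDEX UNIQUE SCAN on the primary key index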

In reply to parkr's comment: there are lots of rules concerning implicit conversions. One good place to start is the documentation. Among other things, we learn that:

During SELECT FROM operations, Oracle converts the data from the column to the type of the target variable.

It means that when implicit conversion occurs in a "WHERE column=variable" clause, Oracle will convert the datatype of the column and NOT of the variable, thereby preventing an index from being used. This is why you should always use the right datatypes or explicitly convert the variable.
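
To make the asymmetry concrete (a sketch; t, num_col and str_col are hypothetical names):

-- NUMBER column compared to a character value: the character side is converted,
-- so an index on num_col remains usable.
select * from t where num_col = '100';   -- evaluated as num_col = to_number('100')

-- VARCHAR2 column compared to a number: the column side is converted,
-- so an index on str_col is NOT usable.
select * from t where str_col = 100;     -- evaluated as to_number(str_col) = 100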

From the Oracle doc:

Oracle recommends that you specify explicit conversions, rather than rely on implicit or automatic conversions, for these reasons:

  • SQL statements are easier to understand when you use explicit datatype conversion functions.
  • Implicit datatype conversion can have a negative impact on performance, especially if the datatype of a column value is converted to that of a constant rather than the other way around.
  • Implicit conversion depends on the context in which it occurs and may not work the same way in every case. For example, implicit conversion from a datetime value to a VARCHAR2 value may return an unexpected year depending on the value of the NLS_DATE_FORMAT parameter.
  • Algorithms for implicit conversion are subject to change across software releases and among Oracle products. Behavior of explicit conversions is more predictable.
Vincent Malgrat
Does it always prevent the index being used? Or sometimes?
parkr
Very nice, the `select rownum from dual connect by rownum <= 1e6`
FerranB
+1  A: 

You could use a function-based index.

Your query is:

select
    unique_key 
from
    src_table
where
    natural_key1 = :1

In your case the index isn't being used because natural_key1 is a varchar2 and :1 is a number. Oracle is converting your query to:

select
    unique_key 
from
    src_table
where
    to_number(natural_key1) = :1

So... create an index on to_number(natural_key1):

create index ix_src_table_fnk1 on src_table(to_number(natural_key1));

Your query will now use the ix_src_table_fnk1 index. (Note that building this index assumes every natural_key1 value is a valid number; if any row isn't, the CREATE INDEX will fail with ORA-01722.)

Of course, it would be better to get your Java programmers to do it properly in the first place and bind the value as a string rather than a long.

Nick Pierpoint
This solved my problem. Without access to source code, I was able to create a function-based index that avoids the implicit type conversion.
parkr
+1  A: 

What happens to your query if you run it with an explicit conversion around the argument (e.g., to_char(:1) or to_number(:1) as appropriate)? If doing so makes your query run fast, you have your answer.

However, if your query still runs slowly with the explicit conversion, there may be another issue. You don't mention which version of Oracle you're running, but if your high-cardinality column (natural_key1) has a very skewed value distribution, you may be reusing a query plan that was generated the first time the query was run, with an unfavorable value for :1.

For example, if your table of 1 million rows had 400,000 rows with natural_key1 = 1234, and the remaining 600,000 were unique (or nearly so), the optimizer would not choose the index if your query constrained on natural_key1 = 1234. Since you're using bind variables, if that was the first time you ran the query, the optimizer would choose that plan for all subsequent runs.
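
A quick way to check for that kind of skew (a sketch that works on older Oracle versions as well):

SELECT *
FROM  (SELECT natural_key1, COUNT(*) AS cnt
       FROM   src_table
       GROUP  BY natural_key1
       ORDER  BY cnt DESC)
WHERE rownum <= 10;

If the top few values account for a large fraction of the million rows, the skew theory is plausible.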

One way to test this theory would be to run this command before running your test statement:

alter system flush shared_pool;

This will remove all query plans from the optimizer's brain, so the next statement run will be optimized fresh. Alternatively, you could run the statement as straight SQL with literals, no bind variables. If it ran well in either case, you'd know your problem was a cached plan built from an unrepresentative peeked bind value.
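
For example (the literal values here are hypothetical, standing in for your real ones):

SELECT unique_key
FROM   src_table
WHERE  natural_key1 = '12345'   -- a string literal also sidesteps the implicit conversion
AND    natural_key2 = 'A'
AND    natural_key3 = 'B';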

If that is the case, you don't want to use that alter system command in production - you'll probably ruin the rest of your system's performance if you run it regularly. You could get around it by using dynamic SQL instead of bind variables, or, if it is possible to determine ahead of time that :1 is non-selective, by using a slightly different query for the non-selective cases (such as re-ordering the conditions in the WHERE clause, which will cause the optimizer to use a different plan).

Finally, you can try adding an index hint to your query, e.g.:

  SELECT /*+ INDEX(src_table,<name of index for natural_key1>) */
         unique_key
    FROM src_table
   WHERE natural_key1 = :1
     AND natural_key2 = :2
     AND natural_key3 = :3;
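
If you don't know which indexes exist on the table, the standard data dictionary views will list them (a sketch):

SELECT index_name, column_name, column_position
FROM   user_ind_columns
WHERE  table_name = 'SRC_TABLE'
ORDER  BY index_name, column_position;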

I'm not a big fan of index hints - they're a pretty fragile way to program. If the index were renamed down the road, you'd never know it until your query started to perform poorly. You're also potentially shooting yourself in the foot if a server upgrade or a change in data distribution means the optimizer could have chosen an even better plan.

Steve Broberg