views:

2106

answers:

4

Are there any good ways to objectively measure a query's performance in Oracle 10g? There's one particular query that I've been tuning for a few days. I've gotten a version that seems to be running faster (at least based on my initial tests), but the EXPLAIN cost is roughly the same.

  1. How likely is it that the EXPLAIN cost is missing something?
  2. Are there any particular situations where the EXPLAIN cost is disproportionately different from the query's actual performance?
  3. I used the first_rows hint on this query. Does this have an impact?
+1  A: 

AFAIK, EXPLAIN is using some database statistics to calculate the cost, so it can definitely differ from the actual performance.

haggai_e
+7  A: 

How likely is it that the EXPLAIN cost is missing something?

Very unlikely. In fact, it would be a level 1 bug :)

Actually, if your statistics have changed significantly from the time you ran the EXPLAIN, the actual query plan will differ. But as soom as the query is compliled, the plan will remain the same.

Note EXPLAIN PLAN may show you things that are likely to happen but may never happen in an actual query.

Like, if you run an EXPLAIN PLAN on a hierarchical query:

SELECT  *
FROM    table
START WITH
        id = :startid
CONNECT BY
        parent = PRIOR id

with indexes on both id and parent, you will see an extra FULL TABLE SCAN which most probably will not happen in real life.

Use STORED OUTLINE's to store and reuse the plan no matter what.

Are there any particular situations where the EXPLAIN cost is disproportionately different from the query's actual performance?

Yes, it happens very very often on complicate queries.

CBO (cost based optimizer) uses calculated statistics to evaluate query time and choose optimal plan.

If you have lots of JOIN's, subqueries and these kinds on things in your query, its algorithm cannot predict exactly which plan will be faster, especially when you hit memory limits.

Here's the particular situation you asked about: HASH JOIN, for instance, will need several passes over the probe table if the hash table will not fit into pga_aggregate_table, but as of Oracle 10g, I don't remember this ever to be taken into account by CBO.

That's why I hint every query I expect to run for more than 2 seconds in a worst case.

I used the first_rows hint on this query. Does this have an impact?

This hint will make the optimizer to use a plan which has lower response time: it will return first rows as soon as possible, despite the overall query time being larger.

Practically, it almost always means using NESTED LOOP's instead of HASH JOIN's.

NESTED LOOP's have poorer overall performance on large datasets, but they return the first rows faster (since no hash table needs to be built).

As for the query from your original question, see my answer here.

Quassnoi
+1, xlnt, insightful answer, as usual. So, do you sleep with the Oracle RDBMS source code under your pillow or something? ;-)
DCookie
@DCookie: no, I just was bitten by Tom Kyte when I was 8 :)
Quassnoi
With more complex queries, will the optimizer examine all possible paths or does it take the pragmatic cut-off option, accepting that it may potentially miss a better solution? I still miss the old rule-based optimiser, myself. You knew where you were with a rule-based optimizer (mutter, mumble)...
Mike Woodhouse
@Mike: with more complex queries optimizer just sucks. I personally hint every query that is going to run for more than 1 second. As for CBO, with CBO you can do everything you could do with RBO, and much more. See here for details: http://stackoverflow.com/questions/493492/oracle-10-optimizer-from-rule-to-cost-why
Quassnoi
I started reading this answer and thought, "this is damned good, I bet this is Quassnoi." And I was right!
Mark Harrison
Sometimes you need to look at v$sql_plan based on the sql_id of the executing query to see what plan it actually used. Sometimes, the plan it says it will use when you run the explain plan is not the plan it ultimately chooses. I am working with an issue like this right now. I can see both plans, I can see the difference, but I can't convince it not to use "MERGE JOIN CARTESIAN".
Grant Johnson
+1  A: 

In my experience EXPLAIN has been accurate and beneficial. If it wasn't it might not be the useful tool it is. When was the last time you analyzed the tables? I have seen where the Explain plan was nearly the same before and after an analyze, but the analyze made a huge performance gain.

northpole
+3  A: 

Q: Are there any good ways to objectively measure a query's performance in Oracle 10g?

And enabling Oracle trace in other environments isn't that difficult.

Q: There's one particular query that I've been tuning for a few days. I've gotten a version that seems to be running faster (at least based on my initial tests), but the EXPLAIN cost is roughly the same.

  • The actual execution of the statement is what needs to be measured. EXPLAIN PLAN does a decent job of predicting the optimizer plan, but it doesn't actually measure the performance.

Q:> 1 . How likely is it that the EXPLAIN cost is missing something?

  • Not very likely, but I have seen cases where EXPLAIN PLAN comes up with a different plan than the optimizer.

Q:> 2 . Are there any particular situations where the EXPLAIN cost is disproportionately different from the query's actual performance?

  • The short answer is that I've not observed any. But then again, there's not really a direct correlation between the EXPLAIN PLAN cost and the actual observed performance. It's possible for EXPLAIN PLAN to give a really high number for cost, but to have the actual query run in less than a second. EXPLAIN PLAN does not measure the actual performance of the query, for that you need Oracle trace.

Q:> 3 . I used the first_rows hint on this query. Does this have an impact?

  • Any hint (like /*+ FIRST_ROWS */) may influence which plan is selected by the optimizer.


The "cost" returned by the EXPLAIN PLAN is relative. It's an indication of performance, but not an accurate gauge of it. You can't translate a cost number into a number of disk operations or a number of CPU seconds or number of wait events.

Normally, we find that a statement with an EXPLAIN PLAN cost shown as 1 is going to run "very quickly", and a statement with an EXPLAIN PLAN cost on the order of five or six digits is going to take more time to run. But not always.

What the optimizer is doing is comparing a lot of possible execution plans (full table scan, using an index, nested loop join, etc.) The optimizer is assigning a number to each plan, then selecting the plan with the lowest number.

I have seen cases where the optimizer plan shown by EXPLAIN PLAN does NOT match the actual plan used when the statement is executed. I saw that a decade ago with Oracle8, particularly when the statement involved bind variables, rather than literals.

To get an actual cost for statement execution, turn on tracing for your statement. The easiest way to do this is with SQLPlus AUTOTRACE.

[http://asktom.oracle.com/tkyte/article1/autotrace.html][2]

Outside the SQLPlus environment, you can turn on Oracle tracing:

    alter session set timed_statistics = true;
    alter session set tracefile_identifier = here_is_my_session;
    alter session set events '10046 trace name context forever, level 12'
    --alter session set events '10053 trace name context forever, level 1'
    select /*-- your_statement_here --*/ ...
    alter session set events '10046 trace name context off'
    --alter session set events '10053 trace name context off'

This puts a trace file into the user_dump_dest directory on the server. The tracefile produced will have the statement plan AND all of the wait events. (The assigned tracefile identifier is included in the filename, and makes it easier to find your file in the udump directory)

    select value from v$parameter where name like 'user_dump_dest'

If you don't have access to the tracefile, you're going to need to get help from the dba to get you access. (The dba can create a simple shell script that developers can run against a .trc file to run tkprof, and change the permissions on the trace file and on the tkprof output. You can also use the newer trcanlzr. There are Oracle metalink notes on both.

spencer7593