views:

102

answers:

4

I have a very simple query that's giving me unexpected results. Hints on where to troubleshoot it would be welcome.

Simplified, the query is:

SELECT Obs.obsDate, 
       Obs.obsValue, 
       ObsHead.name
  FROM ml.Obs Obs 
  JOIN ml.ObsHead ObsHead ON ObsHead.hdId = Obs.hdId
 WHERE obs.hdId IN (53, 54)

This gives me a query cost of: 963. However, if I change the query to:

SELECT Obs.obsDate, 
       Obs.obsValue, 
       ObsHead.name
  FROM ml.Obs Obs 
  JOIN ml.ObsHead ObsHead ON ObsHead.hdId = Obs.hdId
 WHERE ObsHead.name IN ('BP SYSTOLIC', 'BP DIASTOLIC')

Although it (should) return the same data, the estimated cost shoots up to 17688. Where is the problem here likely to lie? Thanks.

Edit: The query plan says that the index on ObsHead.Name is being used for a range scan, and the table access on ObsHead only costs 4. There's another index on Obs.hdId that's being used for a range scan costing 94: it's the Nested Loops join between the tables that jumps up to 17K.

+1  A: 

There is probably an index on hdId (which there is if it's the primary key, which I suspect is the case) and not on name which means that the second query will have to do a full table scan.

klausbyskov
Nope, that's not what's happening. See my edit above for details.
SarekOfVulcan
@SarekOfVulcan: Are your statistics up-to-date? What is the difference in actual execution time?
klausbyskov
The statistics for those two table were last analyzed on January 28, so they're reasonably close to up-to-date. I'm not sure what the normal schedule for analyzing them here is -- we have a VAR who usually takes care of that for us. I'm not sure what the difference in execution time is -- I've been running the above as part of a much longer query, which took in excess of 8 hours yesterday.
SarekOfVulcan
+1  A: 

Costs are only useful for comparing different plans for one query; they're not so useful for comparing different queries.

You need to look at the plans and compare them in terms of the actions they perform.

I suspect the actual performance of these queries will be similar - however it would be interesting to know whether the first query uses a hash join, which might help things if the percentage of records in obs that are matched is significant.

Jeffrey Kemp
+1  A: 

I find the costs supplied by the optimizer to be interesting but not particularly useful. The best way I've found to compare queries is to run them and see how they perform relative to one another.

Share and enjoy.

Bob Jarvis
+1  A: 

As has already been stated, the plan's cost is not intended for comparing two different queries, only for comparing different paths for the same query.

This is only a guess, but in this case, the cardinality field of the plan might be more useful to you. If the index on OBSHEAD is not unique and the statistics were gathered using an estimate, then the optimizer may not know exactly how many rows to expect when querying that table. The cardinality will tell you whether this is true or not (ideally, you'll be seeing a cardinality of 2 for OBSHEAD).

Another suggestion is to check the statistics on OBS. It seems likely that is a table that grows frequently, in which case, January 28th is not recent enough to have gathered the statistics. Assuming monitoring is turned on for this table, the queries below can tell you if the statistics are stale and need to be refreshed.

select owner, table_name, last_analyzed, stale_stats
from all_tab_statistics
where owner = 'ML' and table_name = 'OBS';

select owner, index_name, last_analyzed, stale_stats
from all_ind_statistics
where owner = 'ML' and table_name = 'OBS';
Allan