It seems that on some of our servers, the optimizer's cost estimates for hash joins, GROUP BYs, and ORDER BYs are too low compared to their actual cost. That is, execution plans with index range scans often outperform those operations, yet EXPLAIN PLAN shows the index plans as having a higher cost.

Some further notes:

  1. I have already set *optimizer_index_cost_adj* to 20 and it's still not good enough. I do NOT want to increase the cost of pure full table scans; in fact I wouldn't mind the optimizer decreasing it. (A sketch of the global settings follows this list.)
  2. I've noticed that *pga_aggregate_target* affects the CBO's cost estimates, but I definitely do NOT want to lower this parameter, as we have plenty of RAM.
  3. Rather than adding optimizer hints to individual queries, I want the settings to apply globally.
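
A minimal sketch of applying such settings globally. The values are illustrative, not recommendations, and *optimizer_index_caching* is a related parameter (not mentioned above) that also favors index access paths without touching full table scan costs:

```sql
-- Instance-wide optimizer settings (SCOPE = BOTH assumes an spfile).
-- Scale down the perceived cost of index access paths:
ALTER SYSTEM SET optimizer_index_cost_adj = 20 SCOPE = BOTH;

-- Tell the CBO what percentage of index blocks to assume are already
-- cached; this cheapens index-driven plans without raising full-scan costs:
ALTER SYSTEM SET optimizer_index_caching = 50 SCOPE = BOTH;
```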


Edit 1: I'm thinking about experimenting with dynamic sampling, but I don't have intimate enough knowledge to predict how it would affect overall performance, i.e. how frequently the execution plans could change. I would definitely prefer something very stable; in fact, for some of our largest clients we have a policy of locking all the stats (which will change with Oracle 11g SQL Plan Management). A sketch of both mechanisms follows.
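
A sketch of both mechanisms, assuming a hypothetical APP_OWNER schema:

```sql
-- Raise the dynamic sampling level instance-wide (the 10g default is 2);
-- higher levels sample more blocks at parse time:
ALTER SYSTEM SET optimizer_dynamic_sampling = 4;

-- Lock every table's statistics in a schema so plans stay stable
-- between stats-gathering runs:
BEGIN
  DBMS_STATS.LOCK_SCHEMA_STATS(ownname => 'APP_OWNER');
END;
/
```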

+2  A: 

Quite often, when execution plans with index range scans outperform those with full scans plus sorts or hash joins yet the CBO still picks the full scans, it's because the optimiser believes it will find more matching rows than it actually gets in real life.

In other words, if the optimiser thinks it's going to get 1M rows from table A and 1000 rows from table B, it may very well choose full scans plus a sort-merge or hash join; if, however, the query actually returns only 1 row from table A, an index range scan may well be better.

I'd first look at some poorly performing queries, analyse the selectivity of the predicates, and determine whether the optimiser is making reasonable estimates of the number of rows for each table. A quick way to check is sketched below.
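
For example, one quick way to compare a single predicate's estimate against reality (table and predicate names are illustrative):

```sql
-- Ask the optimiser for its plan and row estimate...
EXPLAIN PLAN FOR
SELECT * FROM orders WHERE status = 'OPEN';

-- ...the Rows column in the plan output is the estimate:
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- ...and this is what the predicate actually returns:
SELECT COUNT(*) FROM orders WHERE status = 'OPEN';
```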

EDIT: You've mentioned that the cardinality estimates are incorrect. This is the root cause of your problems; the costing of hash joins and sorts is probably quite OK. In some cases the optimiser may be producing wrong estimates because it doesn't know how strongly the data is correlated. Histograms on some columns may help (if you haven't already got them), and in some cases you can create function-based indexes and gather statistics on the resulting hidden columns to give the optimiser even better data.
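
A sketch of both techniques, with hypothetical owner, table, and column names:

```sql
-- Gather a histogram on a skewed column (254 is the maximum number
-- of buckets in 10g/11g):
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'APP_OWNER',
    tabname    => 'ORDERS',
    method_opt => 'FOR COLUMNS status SIZE 254');
END;
/

-- A function-based index adds a hidden virtual column to the table;
-- gathering stats on hidden columns describes the expression itself:
CREATE INDEX orders_cust_upper_ix ON orders (UPPER(customer_name));

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'APP_OWNER',
    tabname    => 'ORDERS',
    method_opt => 'FOR ALL HIDDEN COLUMNS SIZE AUTO');
END;
/
```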

At the end of the day, your trick of specifying the cardinalities of various tables in the queries may very well be required to get satisfactory performance.
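
For illustration, such an override is usually written with the (undocumented) CARDINALITY hint; all names here are made up:

```sql
-- Tell the optimiser to assume table_a (alias a) returns ~1 row,
-- which typically steers it towards an index-driven nested loop:
SELECT /*+ CARDINALITY(a 1) */
       a.id, b.descr
  FROM table_a a
  JOIN table_b b ON b.a_id = a.id
 WHERE a.status = 'OPEN';
```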

Jeffrey Kemp
All the tables and indexes are analyzed. Most of the queries use bind variables and are called thousands or millions of times, so they cannot be converted to use literals.
Andrew from NZSG
Just being 'analyzed' is not enough. You may need to look into which columns would benefit from histograms, and which would benefit from NOT having them. Rather than looking at the cost estimates, look at the ROWS estimates and see if they are accurate.
Gary
The row estimates are incorrect (the cardinality is usually too low), but even then I get queries which favor hash joins over index range scans. On a couple of occasions I've had to hint an artificially low cardinality in order to force index range scans.
Andrew from NZSG