The question is quite simple: What can I do to make sure that the SQL Server query optimizer has all the information it needs to choose the "best" query plan?

The background of this question is that recently we've run into more and more cases where SQL Server chooses a bad query plan, i.e., cases where adding query hints, join hints, or explicitly using temporary tables instead of "one big SQL" drastically improved performance. I'm actually quite surprised that the query optimizer produces so many bad plans, so I'm wondering whether we did something wrong. No indexes are missing (according to the query analyzer and common sense), and statistics are updated frequently by a maintenance task.

Let me emphasize that I am not talking about missing indexes here! I'm talking about the situation where there is a "good" and a "bad" query plan (given the current state of the DB), and SQL Server chooses a "bad" plan although the indexes present would allow it to use a "good" one. I'm wondering whether there is some way to improve the results of the query optimizer without having to optimize all queries manually (with query hints or USE PLAN).
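For illustration, this is the kind of manual intervention I mean; all table and column names here are made up:

```sql
-- Original "one big SQL" form, left entirely to the optimizer
-- (dbo.Orders and dbo.Customers are hypothetical tables):
SELECT o.OrderID, c.CustomerName
FROM dbo.Orders AS o
JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID
WHERE o.OrderDate >= '2010-01-01';

-- Manually forcing a join strategy -- effective when the optimizer
-- picks badly, but brittle as data volumes change:
SELECT o.OrderID, c.CustomerName
FROM dbo.Orders AS o
INNER HASH JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID
WHERE o.OrderDate >= '2010-01-01';
```

It is exactly this second style that I'd like to avoid having to write.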

+1  A: 

SQL Server's query optimizer bases its plans on statistics. If the statistics are not up to date or indexes have been 'skewed' by many inserts or deletes, then the optimizer sometimes gets it wrong.

Most of the time it does a great job. Are your statistics up to date? Do you have a regular scheduled index maintenance job?

Are you falling foul of parameter sniffing?

Is your data selective enough for an index to be chosen?

Are your indexes fragmented?

I wouldn't use query HINTS unless you really have to as a last resort. They have a nasty habit of coming back to haunt you.
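If parameter sniffing turns out to be the culprit, there are narrower mitigations than forcing a whole plan. A minimal sketch, with a hypothetical procedure and table:

```sql
-- Hypothetical procedure; dbo.Orders and its Region column are invented.
CREATE PROCEDURE dbo.GetOrdersByRegion
    @Region NVARCHAR(50)
AS
BEGIN
    -- The plan cached for the first @Region value seen may be wrong for
    -- much more (or less) common values. Two standard mitigations:

    -- 1) Recompile on every call: pay the compile cost, get a plan
    --    tailored to the actual parameter value each time.
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE Region = @Region
    OPTION (RECOMPILE);

    -- 2) Optimize for the statistical "average" value rather than the
    --    sniffed one (SQL Server 2008 and later):
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE Region = @Region
    OPTION (OPTIMIZE FOR (@Region UNKNOWN));
END;
```

Both are still hints, but they steer cardinality estimation rather than dictating the plan shape outright.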

Mitch Wheat
Yes, they are. `UPDATE STATISTICS` was the first thing we tried...
Heinzi
Well, yes, the whole point of my question was that I want to avoid HINTS and rather help the query optimizer do its work right; so I'm completely with you on that. :-) Indexes and statistics are frequently being updated/reorganized, so that should not be the issue. About the data: Well, that depends on what the user enters; that's not really something I can control...
Heinzi
+2  A: 

There is more to how the query plan is generated than just indexes and statistics.

See 13 things you should know about statistics and the query optimizer for a good explanation of these.
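It's also worth looking directly at what the optimizer knows. A quick sketch, assuming a hypothetical table dbo.Orders with an index IX_Orders_OrderDate:

```sql
-- When were the statistics on the table last updated?
SELECT name, STATS_DATE(object_id, stats_id) AS last_updated
FROM sys.stats
WHERE object_id = OBJECT_ID('dbo.Orders');

-- The histogram and density information behind the cardinality estimates:
DBCC SHOW_STATISTICS ('dbo.Orders', 'IX_Orders_OrderDate');

-- If sampled statistics are misleading on skewed data, rebuild them
-- from every row instead of a sample:
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;
```

A stale last-updated date or a histogram that doesn't reflect the real data distribution often explains a "bad" plan choice.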

Oded
Thanks for the link, that was an interesting read.
Heinzi
+1  A: 

Many times, I have found that an overly large query can be optimized by breaking it into two or more queries, with the interim results stored in temporary tables. I know of no hard-and-fast rule for when a query becomes too big; simple “sequential” inner joins of 20 tables can be optimized properly, while bizarrely arcane joins on 8 tables might get botched. When this happens, I generally try to “pull out” as much as I can into a first query that works over the “small” tables and returns as small a data set as possible into a temp table, and then use that in a second query to process the “big” tables. (“Small” and “big” are, of course, entirely relative to your situation.)

The rule of thumb I’ve come up with to explain this is that some queries can just get too large or too complex for any “generic” one-size-fits-all algorithm to be able to produce an optimal query plan within a reasonable amount of time. In short, while by and large you can write queries as one big lump, it often makes more sense to break them into manageable chunks. (I know I’ve read articles on this subject in the past, but this subject doesn’t come up too often and I can’t recall when or where I read them.)
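A sketch of this split-into-steps approach; all object names here are invented:

```sql
-- Step 1: reduce the "small" tables to as few rows as possible
-- and materialize the interim result.
SELECT c.CustomerID, c.CustomerName
INTO #ActiveCustomers
FROM dbo.Customers AS c
JOIN dbo.Regions AS r ON r.RegionID = c.RegionID
WHERE r.IsActive = 1;

-- Optional: an index on the temp table gives the second query real
-- statistics on the interim result and something to seek on.
CREATE CLUSTERED INDEX IX_ActiveCustomers ON #ActiveCustomers (CustomerID);

-- Step 2: join the small interim result against the "big" tables.
SELECT ac.CustomerName, o.OrderID
FROM #ActiveCustomers AS ac
JOIN dbo.Orders AS o ON o.CustomerID = ac.CustomerID;

DROP TABLE #ActiveCustomers;
```

A side benefit of the temp table is that the optimizer gets exact row counts (and, with the index, statistics) for the intermediate result, instead of having to estimate them mid-plan.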

Philip Kelley
Possibly because SQL Server stores only one query plan per batch.
Mitch Wheat
@Mitch: So, is there a useful alternative to splitting the queries? I've observed similar behaviors...
Heinzi