views:

120

answers:

6

I just read this blog post and, in short, it say that if an SQL server isn't doing a good enough job building query plans, then the last thing you want to do is start hard coding stuff. So, that got me thinking; how could you "hard code" stuff without hard coding stuff. (Yes, that's the way I tend to think.)

Is the following; 1) possible, 2) a good idea and 3) are there any tools to help DBAs do it?

You have a slow query and for some reason you don't like the query plan your DBMS is choosing. So you start playing around with it (on the dev sever) by forcing different things until you get a plan that's better. Then you try and decompose the DBMS's decision process to find why it didn't choose that one and then reverse engineer what it will take to get it to choose the better one.


Edit: The first answers seem to be answering a slightly different question than I was trying to ask.

The situation: you have a query that's not fast enough but you think you can make it faster (at this point in time) by hard coding parts of the query plan. However you don't want to do that in prod for any number of reason.

The (proposed) solution: Build and test hard coded query plans. When you find one that's fast enough, try and back out what the DBMS is looking at (statistics etc.) when it chooses not to use the better plan. Once you have that, use that as a guide for what to tune.

The question: Is the above a practical way to go about the task? Are their enough knobs to turn to make it work? Are their any tools that show that data was used to make query plan decisions?

+2  A: 

You can hardcode index names and optimization hints in your query.

Here is one query converted and a possible conversion

SELECT Customers.CompanyName, Sum (Orders.TotalAmount) As TotalAmount
FROM Customers
    INNER JOIN Orders
        ON Customers.ID = Orders.CustomerID
WHERE Customer.Country IN ('USA', 'Canada')

Now for the hardcoded hints

SELECT Customers.CompanyName, Sum (Orders.TotalAmount) As TotalAmount

FROM Customers WITH (NOLOCK, IdxCountry)
    INNER LOOP JOIN Orders (TABLOCK, IdxCustomerOrders)
        ON Customers.ID = Orders.CustomerID

WHERE Customer.Country IN ('USA', 'Canada')

The second query forces the execution plan to use the following

  • Customers has NOLOCK, and must use Index IDXCountry
  • Orders has TABLOCK, and must use Index IdxCustomersOrders
  • The JOIN HINT is now LOOP

Can you softcode Index and Join hints? Not as a part of a regular query. You can, however, build and execute Dynamic SQL for the purpose.

SQL Server is very good at query optimization, and the more you read about it, the better queries you will write. I think you should read up on Query Optimization and Index Tuning to understand this better

http://stackoverflow.com/questions/1332778/what-are-your-most-common-sql-optimizations

http://stackoverflow.com/questions/604879/how-can-i-learn-sql-server-index-tuning

To answer your question(s)

  1. Si Si. It is possible to hardcode without hardcoding.
  2. It is not a good idea to do this yourself unless you monitor performance and execution very regularly.
  3. More than tools for DBAs, it is a question of knowledge of SQL Server that will help. Read more on query optimization and it will automagically take you there.
Raj More
Based on your answer, I'm not sure you are answering the question I asked. see my forthcoming edit.
BCS
+3  A: 

First of all, as the commenter points out on the post you link to, there is a bit of FUD going on here -- to summarize for those that don't want to click the link: "Things can go bad... Buy my book!"

Second what you list in your question is not hard coding, it is using hard coding to figure out the best ways to work with a DB. This seems ok to me. As long as you don't leave the hard coded hints in there, things should be ok, SQL server can still change the optimization as data changes.

Hogan
Correction: "Things can go bad... Buy some other guys book!"
BCS
As to the second part: Exactly, it's: "hard coding" it without hard coding it. But that wasn't my question.
BCS
Or even "Things can go bad... buy some other guy's book so I can get some money from Amazon!"
Hogan
Of course then he says "Yes, we also cover this in our book SQL Server Internals and Troubleshooting"...
Hogan
+2  A: 

If the optimizer chooses a plan that you are not expecting, it's usually a sign that the underlying data structures are not formed properly. An index could be missing. You could be attempting a join across two columns with different data types. You could be searching or joining on a complex generated column. I've actually seen stuff like this before:

SELECT *
FROM
  t1
  INNER JOIN t2 ON LTRIM(RTRIM(t1.code)) = LTRIM(RTRIM(t2.code))

This is crying out for improvements in the underlying data structures!

Jeffrey L Whitledge
I do that all the time. But I have no control over the underlying tables
Joe Philllips
How can it be a "bad plan" if an index is missing? You may get bad performance, but that's not because of a bad plan, it's because you, the human, either deliberately or accidentally failed to build the index. Again, why is the "Plan" bad because you're joining on different datatypes? Seems more like a human error again, not the optimizer reaching a bad plan. And your code example yet again isn't a bad plan. No amount of hinting or stored outlines/profiles will fix that Select statement. Why? because it's not a "plan" problem.
Stephanie Page
@Stephanie Page - I took "bad plan" to mean "not the plan I expected". This could mean that an index that a person thought was there wasn't actually there. I didn't mean to impune the integrety of your optimizer!
Jeffrey L Whitledge
I'm an Oracle gal... so you weren't hurting my feelings. Just wanted to get us (the SO crowd, not just you and I) to the same understanding. Bad Plan means the index is there and SHOULD be used but isn't, etc. The Join order should be A->B then to -> C but is A->C then ->B.
Stephanie Page
@Stephanie Page - OK, point taken. I will correct the language of my answer. Thanks!
Jeffrey L Whitledge
To defend the original post, I could imagine some queries where the optimizer rejects a good/better plan because it requires something that in most case (but not this one) is really really slow. (OTOH that's not what I'm asking about)
BCS
A: 

Yes, you should figure out why it came up with the sub-optimal plan. This shouldn't be a hit or miss. You should manually figure out the best plan, looks for the differences and then for the reason. The number on cause of bad plans is invalid cardinality calculations. Skew will kill you.

** This is how you would do it in Oracle, you can tell me how SQL Server works with bind variables **

Say you have a Yes/No column and an index on the column because it winds up frequently in where clauses. Out of 10000 rows, one is yes, the other 99.99% of rows are no.

If you write your query with a bind variable as in: (Parameterized Query in SS speak)

SELECT * FROM table WHERE yn_col = :1

There's no way for the optimizer to cache a single plan for this query. If you ask for Yes, then an index is appropriate, a FTS wouldn't be too bad. IF you ask for No then a FTS would be appropriate but an index scan followed by table access by row-id would be a nightmare.

You have two choices, fix the plan at a FTS, that way it's always the same time. To the end user it looks consistent. But if you want to take advantage of the index, you'll need to write this as two queries without parameterizing it so that the optimizer will see this as 'Yes' and 'No' and if there's a histogram on the column values, the cardinality will be correctly calculated.

You have no other choice here, unless SQL Server has a capacity to cache bind variable sensitive plans as Oracle added in 11g.

Stephanie Page
+2  A: 

The fact that you're asking would (in most cases other than yours, I'm sure) suggest that you don't have enough experience with SQL Server to be in a position to micromanage the optimizer well. I and the SQL Server wizards I know (a couple of whom do this quite effectively - but always in edge cases, in my experience) would say that you'll end up discovering a better way to have written your query in the first place and your hints won't really end up being useful. But YMMV, as usual.

I'd spend my time rephrasing my queries and grokking query plans. And looking for the other things we all know but sometimes forget to check that can also cause surprises, like whether statistics are current, and whether we're testing against enough accurately representative data.

Hints and locking modes are the two adjustments people can start messing with too quickly when they start running out of other ideas. With the optimizer, it's especially important to know "why" what you're expecting isn't happening, and "why" the change you made caused the effect you observed.

le dorfier
OK, sounds reasonable. And that leads to a related question: would experimenting with hand tuned query plans be a good tool helping optimize a query: build he best plan you can and then back out a query that gives it?
BCS
Possibly. In my experience, coercing the query plan just demonstrates that the employment of a given index, for instance, doesn't end up reflecting the improvement in execution time or disk reads I expected. Then you're still left figuring out why, but you've narrowed it down.
le dorfier
+2  A: 

http://support.microsoft.com/default.aspx/kb/914288 This url points to how to create a stats only database. No concerns about tearing anything up but with current index statistics to be able to do some query plan analysis. It may help.

jl
That, plus a correctly selected subset of the real database, would make a powerful combo (I would think)
BCS