views:

117

answers:

4

I am not so proficient in TSql as of now (writing since last 4/5 months) but I have written many queries. Although I have given the outputs, sometimes I feel that the queries are not so optimized. I searched in google and found lot of stuffs about query optimization, and they ask to look into the query plan(actual & estimated) for the performance improvisation.

As I already said that I am very new to writing queries so it is becoming difficult for me to grasp those solutions. But I need to learn query optimization.

Can any body help me out initially how and where should I start from?

Searching in internet reveals that, SEEK is better than SCAN(May it be index or Table). How can I achieve a seek over a scan?

Then they says that ORDER BY clause i.e. sorting is more costly. Then what is the work around? How can I write effective query?

Can anybody explain me, with some examples, which kind of query is better over what and in what situation?

Edited

Dear All,

You all have answered and that will help me a lot. But what I intend to say is that, you all have practised a lot for becoming an expert. Once upon a time, I guess you all were like what I am now.So my humble request is how you all started for writing optimised query.I know that patience is needed and I will devote that. I apologise for any wrong statement of mine.

Thanks in advance

+1  A: 

Always ensure that you have indexes on your tables. Not too many and not too few.

Using sql server 2005, apply included columns in these indexes, they help for lookups.

Order by is costly, if not required, why sort a data table if it is not required.

Always filter as early as possible, if you reduce the number of joins, function calls etc, as early as possible, you reduce time taken over all

  • avoid cursors if you can
  • use temp tables/ table vars for filtering where possible
  • remote queries will cost you
  • queries with sub selects in the where clause can be hurtfull
  • table functions can be costly if not filtered

as always, there is no hard rule, and things should be taken on a per query basis.

Always create the query as understandle/readable as possible, and optimize when needed.

EDIT to comment question:

Temp tables can be used when you require to add indexes on the temp table (you cannot add indexes on var tables, except the pk). I mostly use var tables when i can, and only have the required fields in them as such

DECLARE @Table TABLE( FundID PRIMARY KEY )

i would use this to fill my fund group ids instead of having a join to tables that are less optimized.

I read a couple of articles the other day and to my surprise found that var tables are actually created in the tempdb

link text

Also, i have heard, and found that table UDFs can seems like a "black box" to the query planner. Once again, we tend to move the selects from the table functions into table vars, and then join on these var tables. But as mentioned earlier, write the code first, then optimize when you find bottle necks.

I have found that CTEs can be usefull, but also, that when the level of recursion grows, that it can be very slow...

astander
You need to be more specific than "Always ensure that you have indexes on your tables" !
Mitch Wheat
did you only read the first line, or the entire answer?
astander
@astander: the whole answer. I don't see the words 'covering index' or 'query workload' anywhere....
Mitch Wheat
ok, im not gonna get into a word fight with you. I think performance tuning is a big field, and more of an art than a science. If you can steer someone in the right direction, good, if you find my answer to be incorrect, then do as appropriate.
astander
priyanka.sarkar
Check the edited answer
astander
+2  A: 

Articles discussing Query Optimization issues are often very factual and useful, but as you found out they can be hard to follow. It is a bit like when someone is trying to learn the basics rules of baseball, and all the sports commentary he/she finds on the subject is rife with acronyms and strategic details about the benefits of sacrificing someone at bat, and other "inside baseball" trivia...

So you need to learn the basics first:

  • the structure(s) of the database storage
  • indexes' structure, the clustered and non clustered kind, the multi column indexes
  • the concept of covering a query
  • the selectivity of a particular column
  • the disadvantage of indexes when it comes to CRUD operations
  • the basic subtasks/strategies of a query: table or index scan, index seek, sorting, inner-outer merge etc.
  • the log file, the data recovery model.

The following links apply to MS SQL Server. If that is not the DBMS you are using you can try and find similar material for the system of your choice. In fact, so long as you realize that the implementation may vary, it may be useful to peruse the MS documention.
MS SQL storage structures
MS SQL pages and extents

Then as you started doing, learn the way to read query plans (even if not in fully understand at first), and all this should bring you to a level where you start to make sense of the more advanced books or articles on the topic. I do not know of tutorials for Query Plans on the Internet (though I'm quite sure they exist...), but the following methodology may be of use: Start with simple queries, review the query plan (if possible in a graphic fashion), start recognizing the most common elements: Table Scan, Index Seek, Sort, nested loops... Read the detailed properties of these instances: estimated nb of rows, cost percentage etc. When you find a new element that you do not know/understand, use this keyword to find details on the internet. Also: experiment a lot.

Finally you should remember that while the way the query is written and the set of indexes etc. provided cover a great part of optimization needs, there are other sources of optmization, for example the way hardware is put to use (a basic example is how by having the data file and the log file on separate physical disks, we can greatly improve CRUD performance).

mjv
Yes Sir, I agree with you. But do you know any links that is meant only for a novice to start with reading query plans. I know that many are available but all what I found is becoming difficult for me to fathom. Or atleast tell me how you started in your career?
priyanka.sarkar
See my edit in the "query plan" section. I do not know of tutorials, but I suggest a possible methodology to learn this. My career is a long story... I started with SQL at a time when there was less to learn; I tried to spare you from some of the newer tools and construts (ex: CTEs, spare indexes, partition...) because these are important but again should come after the "old" stuff. Another key thing was the abilty to experiment a lot.
mjv
+2  A: 

Searching in internet reveals that, SEEK is better than SCAN(May it be index or Table). How can I achieve a seek over a scan?

Add the necessary index -- if the incremental costs on INSERT and UPDATE (and extra storage) are an overall win to speed up the seeking in your queries.

Then they says that ORDER BY clause i.e. sorting is more costly. Then what is the work around? How can I write effective query?

Add the necessary index -- if the incremental costs on INSERT and UPDATE (and extra storage) are an overall win to speed up the ordering in your queries.

Can anybody explain me, with some examples, which kind of query is better over what and in what situation?

You already pointed out a couple of specific questions -- and the answers were nearly identical. What good would it do to add another six?

Run benchmark queries over representative artificial data sets (must resemble what you're planning to have in production -- if you have small toy-sized tables the query plans will not be representative nor meaningful), try with and without the index that appear to be suggested by the different query plans, measure performance; rinse, repeat.

It takes 10,000 hours of practice to be good at anything. Optimizing DB schemas, indices, queries, etc, is no exception;-).

Alex Martelli
You exaggerate. I've learned to make the perfect softboiled egg without 10,000 hours of practise.
Ben M
It's all Gladwell's fault -- guess he hasn't spent 10,000 hours writing provocative best-selling nonfiction yet;-).
Alex Martelli
+3  A: 

ORDER BY is a necessary evil - there's no way around it.

Refer to this question for solving index seek, scan and bookmark/key lookups. And this site is very good for optimization techniques...

OMG Ponies