views:

159

answers:

4
A: 

Hi Damien, unfortunately the information you provide is not enough to give you a precise answer, but I think I can give you a useful hint. SQL Server allows you to view the query plan it uses to access the data; that plan will tell you in detail what is accessed, when, in which way, and how many rows are processed in each step. It also tells you how time- and resource-consuming each step is, allowing you to find bottlenecks easily.

To show the execution plan in Query Analyzer, open the Query menu and click "Show Execution Plan". Then run your first query and check the plan; in another window, run the second query and check the plan again. This way you can see what the difference between them is, which indexes (if any) are used, and get a better understanding of SQL Server.
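If you prefer a scriptable alternative to the menu option, the same information is available through SET statements in Query Analyzer (a sketch; substitute your own query and run each batch separately):

```sql
-- Estimated plan as text, without executing the query
SET SHOWPLAN_TEXT ON
GO
SELECT * FROM t_Table WHERE iSomeField1 = 4  -- your query here
GO
SET SHOWPLAN_TEXT OFF
GO

-- Actual plan with per-operator row counts; this DOES execute the query
SET STATISTICS PROFILE ON
GO
SELECT * FROM t_Table WHERE iSomeField1 = 4  -- your query here
GO
SET STATISTICS PROFILE OFF
GO
```

The SHOWPLAN output is handy for comparing the two query shapes side by side in text form.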

One hint: don't be discouraged if at the beginning everything seems complicated; it's just a matter of taking it slowly.

Finally, a useful resource for SQL Server (other than MSDN, of course) is www.sqlservercentral.com, where you can find answers from users and experts. I hope this helps.

@diego: Thanks for the feedback. It was by using the execution plan that we figured out how to improve the performance in part - but I still want to understand *why* SQL Server can't optimise the first query itself, and why it runs slowly with very little throughput compared to the split-out queries.
Damien
+1  A: 

Hello,

I have had a look at your query and run some quick tests against it. As Diego mentioned, the execution plan in Query Analyzer is a great tool for troubleshooting this kind of thing.

First of all, a note on indexes: when SQL Server receives a query it runs through a series of steps, and one of the main tasks is deciding the most efficient way to retrieve the data. SQL Server looks at the table structure, the statistics and the indexes to determine what it believes will be the optimal path. It will typically settle on a single index per table to retrieve the data.

The index is basically a lookup to where data is stored in the table (or, in the case of the clustered index, it actually defines HOW the data is stored on disk). If your query uses a number of columns (in the select list, WHERE clause or ORDER BY clause) then those columns need to be retrieved before your results can be returned. SQL Server will look at the available indexes and the order in which it needs to read them. Ideally SQL Server will be able to "seek" directly to the data you have requested (the most common alternative is a "scan", which basically means the entire index or table is read looking for the data). These are the terms you will see in the execution plan described above. There is a lot of additional complexity beyond this (for example, in some queries you may see bookmark lookups, where SQL Server uses an index to find the row and then performs a lookup to fetch any associated columns), but that should be a good starting point.
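To make the seek/scan distinction concrete, here is a minimal sketch (the table and index names are hypothetical, not from the question):

```sql
-- Hypothetical table with a nonclustered index on iSomeField1 only
CREATE INDEX idx_SomeField1 ON dbo.t_Table (iSomeField1)

-- Index Seek: the predicate matches the index's leading column,
-- so SQL Server can jump straight to the qualifying rows
SELECT iSomeField1 FROM dbo.t_Table WHERE iSomeField1 = 4

-- Scan: iField1 is not indexed, so every row must be examined
SELECT iField1 FROM dbo.t_Table WHERE iField1 = 4
```

Running each with "Show Execution Plan" enabled should show the seek and scan operators respectively, assuming no other indexes interfere.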

As an aside, I will mention a site that I think is great for SQL performance info: www.sql-server-performance.com

Taking the above and applying it to your information, we can see that your main query is performing a clustered index "scan" - i.e. it searches the entire table to retrieve the information you requested. This means that no index was found that would allow it to seek directly to the information you required. To execute this query, SQL Server needs to filter on your fields 1 to 4 and then aggregate (the MAX) over the field and date columns.

As an extreme example, this index would allow a seek:

CREATE INDEX [idx_field1234567] ON [dbo].[t_Table]([iSomeField1], [iSomeField2], [iSomeField3], [iSomeField4], [dtDate], [iField1], [iField2]) WITH FILLFACTOR = 90, PAD_INDEX ON [PRIMARY]

However, you should note that a lot goes into creating an index; generally you should avoid adding too many columns to a single index, and you should avoid having too many indexes on a table. In SQL 2000 the Index Tuning Wizard is a good tool for providing a decent baseline of indexes.

As Diego said, the above may seem daunting but the site above was a great reference for me.

Good luck!

Chris
+1  A: 

It depends. What indexes are on the table? What volume of rows is in the table? I've just created a sample which performed well, but it may be very different from your scenario. As a rule, if the optimizer has trouble then the query needs to be simplified; what you have done may be what is required. It depends. Here's the SQL I knocked up to see if I could find anything obvious with Show Execution Plan on.

set nocount on

GO

if object_id('tempdb..#MaxMinExample') is not null drop table #MaxMinExample

GO

create table #MaxMinExample(
    [key] int identity(1,1) primary key clustered,
    iField1 int,
    iField2 int,
    dtDate datetime,
    iSomeField1 int,
    iSomeField2 int,
    iSomeField3 int,
    iSomeField4 int)

GO

--initial data set which we'll cartesian

insert into #MaxMinExample(iField1, iField2, dtDate, iSomeField1, iSomeField2, iSomeField3, iSomeField4) values (1,2,getdate(),1,2,3,4)
insert into #MaxMinExample(iField1, iField2, dtDate, iSomeField1, iSomeField2, iSomeField3, iSomeField4) values (2,3,getdate()+1,4,5,6,7)
insert into #MaxMinExample(iField1, iField2, dtDate, iSomeField1, iSomeField2, iSomeField3, iSomeField4) values (3,4,getdate()+2,5,6,7,8)
insert into #MaxMinExample(iField1, iField2, dtDate, iSomeField1, iSomeField2, iSomeField3, iSomeField4) values (5,6,getdate()+3,6,7,8,9)
insert into #MaxMinExample(iField1, iField2, dtDate, iSomeField1, iSomeField2, iSomeField3, iSomeField4) values (6,7,getdate()+4,7,8,9,10)

GO

--create loads of data

declare @count int
set @count = 1
while (select count(*) from #MaxMinExample) < 865830
begin
    insert into #MaxMinExample(iField1, iField2, dtDate, iSomeField1, iSomeField2, iSomeField3, iSomeField4)
    select a.iField1+@count, a.iField2+@count, a.dtDate+@count, a.iSomeField1+@count, a.iSomeField2+@count, a.iSomeField3+@count, a.iSomeField4+@count
    from #MaxMinExample a cross join #MaxMinExample b
    set @count = @count + 1
end

GO

--create the indexes

create index MaxMinExample_iSomeField1 on #MaxMinExample(iSomeField1)
create index MaxMinExample_iSomeField2 on #MaxMinExample(iSomeField2)
create index MaxMinExample_iSomeField3 on #MaxMinExample(iSomeField3)
create index MaxMinExample_iSomeField4 on #MaxMinExample(iSomeField4)
create index MaxMinExample_dtDate on #MaxMinExample(dtDate)

GO

declare @maxval1 int, @maxval2 int, @dtDateMin datetime, @dtDateMax datetime, @param1 int, @param2 int, @param3 int, @param4 int

select @param1=4,@param2=5,@param3=6,@param4=7

select @maxval1 = max(iField1), @maxval2 = max(iField2), @dtDateMin = min(dtDate), @dtDateMax = max(dtDate)
from #MaxMinExample
where iSomeField1=@param1 and iSomeField2=@param2 and iSomeField3=@param3 and iSomeField4=@param4

select top 1 @maxval1 = iField1
from #MaxMinExample
where iSomeField1=@param1 and iSomeField2=@param2 and iSomeField3=@param3 and iSomeField4=@param4
order by iField1 DESC

select top 1 @maxval2 = iField2
from #MaxMinExample
where iSomeField1=@param1 and iSomeField2=@param2 and iSomeField3=@param3 and iSomeField4=@param4
order by iField2 DESC

select top 1 @dtDateMin = dtDate
from #MaxMinExample
where iSomeField1=@param1 and iSomeField2=@param2 and iSomeField3=@param3 and iSomeField4=@param4
order by dtDate ASC

select top 1 @dtDateMax = dtDate
from #MaxMinExample
where iSomeField1=@param1 and iSomeField2=@param2 and iSomeField3=@param3 and iSomeField4=@param4
order by dtDate DESC
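For what it's worth, since all of these queries filter on the same four columns, a composite index over those columns would let the optimizer seek to the qualifying rows rather than scan (a sketch against the sample table above, untested against the real schema):

```sql
-- Composite index matching the WHERE clause of every query above;
-- the four equality predicates line up with the index key order
create index MaxMinExample_composite
    on #MaxMinExample(iSomeField1, iSomeField2, iSomeField3, iSomeField4)
```

Whether this helps on the live table depends on its actual data distribution and existing indexes, so check the execution plan before and after.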

Mark Baekdal - www.dbghost.com - SQL Developer

you might want to fix up the formatting....
Mitch Wheat
@Mark - thanks for the feedback. I've looked at your query on our SQL 2000 server and it performs as I would expect, i.e. the first query is the fastest and most efficient on the server. But on our live table I get the opposite (granted, the index is different). Will look further into this...
Damien
A: 

In the end, the project moved to a new DWH and BI stack based on MicroStrategy, so this became a non-issue; the new DWH has a different schema and was done in SQL 2008 (but I never got to the bottom of it :/)

Damien