views:

125

answers:

10

I have a query that I have been tuning for some time but I can't seem to get the execution time down much. In the execution plan everything looks like it is doing what it is supposed to, no large costs associated with any particular part of the query, everything is using index seek where it is supposed to. When I run the same query against a different client it runs fairly quickly but only returns 150k records. When I run it for my biggest client it returns 600k records and takes over ten minutes.

Could my issue be that with the number of records I'm returning it will be hard to get good performance or does what I described above seem within the ability of SQL Server?

A: 

Have you run the query through the index tuning wizard? Even if you're already using indexes it's possible that other indexes, particularly compound ones, could still improve performance.

It's also possible that breaking the query into parts with temp tables can improve things.

We run queries in our reporting system that routinely return 300,000 records joining 20 tables with many co-related subqueries and get sub-second response time.

Sam
Unfortunately this problem only happens on my live environment. I'm working on getting the test environment data restored with live data. When I tried running DTA on the live environment it started placing a lot of locks on tables and I had to kill it. Is there a way around this?
Abe Miessler
It's really not helpful to test performance without a realistic data set. You must do backups from the live environment on a regular basis, so restore one of those backups to a development server.
Sam
I'm testing it against realistic data, the live environment. I'm just not running DTA against that environment for performance reasons.
Abe Miessler
+1  A: 

You need to first determine how long the actual query is taking and then you can determine how long it is taking to return the entire 600K rows to the client (which you probably shouldn't be doing). Assuming each row of data is say 100 bytes, you are returning 60MB to the client. That is going to be painful.

Randy Minder
+3  A: 

Besides the number of records, what else is different on between the two systems:

  • RAM available for SQL
  • CPU cores
  • IO configuration (number of spindles in RAID, type of RAID, configuration of the LUN)
  • IO path (layout of logical and physical disks, location of database mdf/ndf/ldf files)
  • index fragmentation
  • load on the SQL Server
  • load on the host machine

When compared the two locations for SET STATISTICS IO ON:

  • does the 600k do about x4 times IO compared with the 150k location? Then the time difference can be entirely attributed to the differences in hardware.
  • Is the logical IO count on the lines of x4 times but the physical IO count differ wildly? Then you have a RAM issues (not enough to cache the database in memory).
  • Is the number of physical IOs close to the expected x4 times, but the time is very different? Then you are probably dealing with fragmentation.
  • Does the number of IOs differ significantly from the expected x4 times? Then you have a different plan probably driven by very different cardinality and estimates.

These are, of course, wild shots in the dark without proper data to back them up. Consider them a guess, not an authorithative solution.

Remus Rusanu
There is only one system. The database contains data for both clients, i'm just changing the parameters of my query to pull one clients info vs the others.
Abe Miessler
I assume you already compared the two plans. How about the statistics io comparison?
Remus Rusanu
I'm a bit new to reading IO statistics but from what I understand the main thing I want to avoid is Physical Reads, correct? While there are more logical reads in the long running query the Physical Reads look to be the same. I posted it below if you would like to take a look.
Abe Miessler
@Abe: Because it's on the same system it's entirely possible for the database with 600k records to be memory constrained where as the "smaller" one isn't. How much ram do you have versus how big is the larger database? If the bigger one doesn't fit into ram then it probably has to perform a lot of disk reads which is tremendously slower.
Chris Lively
+6  A: 

It might be the rows. But more likely it's that the client that generated 4x more data also has 4x more activity on the database in other areas. That means 4x more memory usage, 4x more disk io, and 4x more locks. Make sure the hardware is allocated appropriately.

But moving on, I'm trying to imagine what useful work you could do with a query that returns 600K records, or even just 150K. I'm assuming these are never shown to the end user, because even with paging that's way too many to be useful. And if not, we need consider how this data is used.

If this is for a batch process, perhaps 10 minutes is perfectly fine and there's no justification to spend your valuable time working on it further. If it's something that will be used in combination with another query, perhaps you need to bring in elements from that query sooner, to keep your result sets smaller.

Joel Coehoorn
Data analysis systems can often take hundreds of thousands of records as input data and prevent something useful to the user that is not raw data. It's also very often not performant to do this work in SQL server.
Sam
You make a very good point about the 600k records. That is something i'm looking into as we speak.
Abe Miessler
@Sam yeah, not saying there's nothing useful you can do with 600K records. Just very little that's useful _and_ also time sensitive _and_ that belongs in your db.
Joel Coehoorn
+1  A: 

At how many returned records should I begin expecting performance problems?

At one.

This query:

SELECT  COUNT(*)
FROM    myreallybigtable 

will return exactly one record, but you may need to wait for hours for it to complete.

The client-server I/O (which is probably the only thing that depends on and only on the number and size of the records returned) is usually one of the least important factors.

What really matters is the query plan which specifies how and in which order the records from the underlying tables are accesses, transformed and returned back.

So, as was suggested by the others, just post your query here and we will probably be able to tell you how to optimize it.

Quassnoi
A: 

Alright brace yourself.... This is the original stored proc that I inherited. Let me know if you'd like to see the changes I have made.

The section with the "--HERE" comment is where real performance problems begin.

ALTER PROCEDURE [dbo].MyProc
@LEAESIID int,
@AcademicYear varchar(10),
@ReportType varchar(16),
@SchoolESIID varchar(max),
-- 
@Grade varchar(8000),
@Gender varchar(8000),
@Race varchar(8000),
--
@AsOfDate datetime,
--
@UserID int
AS

create table #TempSchool
(
    SchoolESIID int
)
Create table #TempRace
(
    Race varchar(60)
)

declare @CALPADSSnpshtKey int
select @CALPADSSnpshtKey = 
    CALPADSSnpshtKey 
    from vwCertSnpshtRptngSnpsht cs2
    Where 
    -- Filter to correct snapshot
        rtrim(ltrim(cs2.AcdmcYearCode)) = rtrim(ltrim(@AcademicYear)) AND
        rtrim(ltrim(cs2.LEARptngEsiId)) in (select rtrim(ltrim(ParsedValue)) from dbo.tfnParseStringIntoTable(@LEAESIID, ',')) AND
        rtrim(ltrim(cs2.RptngTypeCode)) = rtrim(ltrim(@ReportType)) 

Insert into #TempSchool
select ParsedValue from dbo.tfnParseStringIntoTable(@SchoolESIID, ',')  
Insert into #TempRace
select ParsedValue from dbo.tfnParseStringIntoTable(@Race, ',') 

/*
Select query to pull back data from the reporting views.
Since this report is based around enrollment information, the primary table will be StuEnrlmt.
*/
declare 
    @UserLevel nVarchar(10),
    @ESILEAList nvarchar(max),
    @ESISchoolList nvarchar(max)

-- added by jackson chan 090109
-- program
    create table #tmpProgramSet (
        StuKey int
        , CALPADSSnpshtKey int
        , TitleIPartCMigrantFlag char(1)
        , SocioEconomicallyDisadvantagedFlag char(1)
        , SpecialEducationFlag char(1)
        , GiftedAndTalentedFlag char(1)
    )

    insert into #tmpProgramSet
    select se.StuKey,
        se.CALPADSSnpshtKey ,
        max(isnull(case sp.EduPgmCode
            when '135' then 'Y'
            else 'N'
        end, 'N')) as TitleIPartCMigrantFlag,
        max(case
            when 
                isnull(sp.EduPgmCode, 000) = 175 OR
                isnull(s.HighstEduLvlCode, 0) = 14
            then 'Y' 
            else 'N'
        end ) as SocioEconomicallyDisadvantagedFlag,
        max(isnull(case sp.EduPgmCode
            when '144' then 'Y'
            else 'N'
        end, 'N')) as SpecialEducationFlag,
        max(isnull(case sp.EduPgmCode
            when '127' then 'Y'
            else 'N'
        end, 'N')) as GiftedAndTalentedFlag
    from    dbo.vwStuEnrlmtRptngSnpsht se inner join 
            dbo.vwStuRptngSnpsht s on 
                se.StuKey = s.StuKey and
                se.CALPADSSnpshtKey = s.CALPADSSnpshtKey 
                inner join #TempSchool schl2 on 
                se.SchlAtndncEsiID = schl2.SchoolESIID   and
                se.LEARptngESIID = @LEAESIID left outer join
            dbo.vwStuPgmRptngSnpsht sp on 
                se.StuKey = sp.StuKey and
                se.CALPADSSnpshtKey = sp.CALPADSSnpshtKey
        group by se.StuKey, se.CALPADSSnpshtKey 
        order by se.StuKey;

--HERE
    Select distinct
        se.LEARptngEsiID,
        se.SchlAtndncEsiID  as SchlAtndncEsiID,
        se.SchlAtndncCDSCode as SchlAtndncCode,
        se.SchlAtndncName as SchlAtndncName,
        se.SchlAtndncType as SchlAtndncType,
        se.StuKey,
        s.StuIDStwdCal,
        isnull(s.StuLastOrSrnmLgl,'') + ', ' + isnull(s.StuFstNameLgl,'') + ' ' + isnull(s.StuMdlNameLgl,'') as StudentName,
        se.StuIDLcl,
        s.GndrCode,
        isnull(case
                when s.StuHspncEnctyIndctr = 'Y' then 'Hispanic'
                when s.StuEnctyMsngIndctr = 'Y' or s.StuRaceMsngIndctr = 'Y' then 'Missing'
                when s.EthnicityRaceCode2 is not null then 'Multiple' -- if a second race is populated,then Multiple
                else s.EthnicityRaceCode1
        end, 'Missing') as RaceEnthnicity,
        se.GrdLvlCode,
        isnull(
            case s.EngLangAcqstnStatStCode
                when 'EL' then 'Y'
                else 'N'
            end, 'N') as EnglishLearner,
        isnull(
            case
            when 
                -- if a value is null, set it to any value that will evaluate to false in the expression
                -- only students with valid information should be counted as Title III Eligible Immigrants
            -- disabled by jackson chan 12/08/09. Per defect 1605, Student Birth Country Special Circumstance Indicator is not a required field anymore
                --isnull(s.StuIneligSnorImgrntIndctr, 'Y') = 'N' AND
                isnull(s.StuEnrldUSSchlLessThanThreCumltvYrsIndctr, 'N') = 'Y' AND
                isnull(s.CntryCode, 'US') != 'US' AND
                isnull(se.EnrlmtStatCode, '0') = '10' AND
                -- Calculate age from birth date
                isnull(case 
                when datepart(month, s.StuBirDate) < datepart(month, getdate())
                    then datediff(year, s.StuBirDate, getdate())
                when datepart(month, s.StuBirDate) = datepart(month, getdate()) and datepart(day, s.StuBirDate) <= datepart(day, getdate()) 
                    then datediff(year, s.StuBirDate, getdate())
                else datediff(year, s.StuBirDate, getdate()) -1
                end , 0) between 3 and 21 AND
                isnull(se.GrdLvlCode, 'AD') != 'AD'
            then 'Y'
            else 'N'
            end, 'N') as TitleIIIEligibleImmigrantFlag,
        sp.SocioEconomicallyDisadvantagedFlag,
        isnull(case when s.EngLangAcqstnStatStCode in ('EL', 'RFEP')  AND s.EngLangArtsTestProfcyCode = 'N' then 'Y'
            else 'N'
        end, 'N') as LimitedEnglishProficientFlag,
        sp.TitleIPartCMigrantFlag,
        sp.SpecialEducationFlag,
        sp.GiftedAndTalentedFlag
    From 
        dbo.vwStuEnrlmtRptngSnpsht se
        inner join dbo.vwStuRptngSnpsht s on 
            se.StuKey = s.StuKey and
            se.CALPADSSnpshtKey = s.CALPADSSnpshtKey
        left join #tmpProgramSet sp on 
            se.StuKey = sp.StuKey and
            se.CALPADSSnpshtKey = sp.CALPADSSnpshtKey
        inner join #TempSchool schl on 
            se.SchlAtndncEsiID = schl.SchoolESIID   and
            se.LEARptngESIID = @LEAESIID
        inner join #TempRace r on 
            isnull(case
                when s.StuHspncEnctyIndctr = 'Y' then 'Hispanic'
                when s.StuEnctyMsngIndctr = 'Y' or s.StuRaceMsngIndctr = 'Y' then 'Missing'
                when s.EthnicityRaceCode2 is not null then 'Multiple' -- if a second race is populated,then Multiple
                else s.EthnicityRaceCode1
            end, 'Missing') = r.Race 
    Where
        -- Enrollments
        se.StuEsiRltnspExpctdSchlStartDate <= @AsOfDate AND (se.WithdrlDate is null OR se.WithdrlDate >= @AsOfDate) AND
        se.EnrlmtStatCode = '10' AND
        se.StuExitCatgCode != 'N470' AND -- no shows are not considered in active enrollment numbers
        -- Filter to correct snapshot
        se.CALPADSSnpshtKey = @CALPADSSnpshtKey AND
        -- User selection filters
        rtrim(ltrim(se.GrdLvlCode))                       in (select rtrim(ltrim(ParsedValue)) from dbo.tfnParseStringIntoTable(@Grade, ',')) AND
        rtrim(ltrim(s.GndrCode))                              in (select rtrim(ltrim(ParsedValue)) from dbo.tfnParseStringIntoTable(@Gender, ',')) 
Abe Miessler
A: 

Try putting indexes on your temp tables and see whether that helps.

HLGEM
I honestly didn't know you could do that. I've give that a shot. At the very least you taught me something today :)
Abe Miessler
A: 

Here are the IO statistics for the query that finishes in a reasonable amount of time:

Table '#tmpProgramSet_0000000017E2'. Scan count 11405, logical reads 36450, physical reads 0, read-ahead reads 61, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'RefStuExitCatg'. Scan count 1, logical reads 22810, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'RefEngLangArtsTestProfcy'. Scan count 1, logical reads 22810, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'RefEngLangAcqstnStatSt'. Scan count 1, logical reads 22810, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table '#TempRace_0000000017E1'. Scan count 1, logical reads 11405, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table '#TempSchool_0000000017E0'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'stu'. Scan count 10939, logical reads 47465, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table '#68CACE20'. Scan count 1, logical reads 13814, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'RefEductlSrvcInstn'. Scan count 1, logical reads 448, physical reads 0, read-ahead reads 372, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'RefEnrlmtStat'. Scan count 1, logical reads 27628, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'StuEnrlmt'. Scan count 2, logical reads 141994, physical reads 60, read-ahead reads 200, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'RefGrdLvl'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table '#4D22B3AB'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'RefGndr'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'RefCntryCode'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'RefRace'. Scan count 2, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
Table 'RefFedEnctyRaceCatg'. Scan count 2, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

and from my long running query

Table '#tmpProgramSet_0000000017F5'. Scan count 757266, logical reads 2418742, physical reads 0, read-ahead reads 158, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'RefStuExitCatg'. Scan count 1, logical reads 1514532, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'RefEngLangArtsTestProfcy'. Scan count 1, logical reads 1514532, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'RefEngLangAcqstnStatSt'. Scan count 1, logical reads 1514532, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#TempRace__0000000017F4'. Scan count 1, logical reads 757266, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#TempSchool__0000000017F3'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'stu'. Scan count 586229, logical reads 2711554, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#065B3107'. Scan count 1, logical reads 637919, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'RefEductlSrvcInstn'. Scan count 1, logical reads 448, physical reads 0, read-ahead reads 332, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'RefEnrlmtStat'. Scan count 1, logical reads 1276828, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'StuEnrlmt'. Scan count 2, logical reads 2692331, physical reads 1386, read-ahead reads 97737, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'RefGrdLvl'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#780D11B0'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'RefGndr'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'RefCntryCode'. Scan count 1, logical reads 4, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'RefRace'. Scan count 2, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'RefFedEnctyRaceCatg'. Scan count 2, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Abe Miessler
+1  A: 

Looking at your IO stats:

Table 'RefStuExitCatg'. Scan count 1, logical reads 22810 ...
Table 'RefEngLangArtsTestProfcy'. Scan count 1, logical reads 22810 ... 
Table 'RefEngLangAcqstnStatSt'. Scan count 1, logical reads 22810 ...

vs.

Table 'RefStuExitCatg'. Scan count 1, logical reads 1514532, ...  
Table 'RefEngLangArtsTestProfcy'. Scan count 1, logical reads 1514532,...
Table 'RefEngLangAcqstnStatSt'. Scan count 1, logical reads 1514532, ...

the fast query has to read 22810 pages on all those 'Ref...' tables. By comparison the slow query has to read 1514532 pages. That is 1.5M vs. 22k which is 66 times more. So your slow database has way, way, way bigger data size difference than the 150k vs. 600k rows you are aware of. I would say this is a pretty good explanation of the difference.

Remus Rusanu
Very interesting. Are the number of pages being read based on the base table or on the set of joined tables? Is it possible that when I introduce joins for the slow client the number of pages required skyrockets while the other does not? Just to clarify there is only 1 DB. I am running the same query on the same DB with different params based on the client. The 150k and 600k i referred to are the number of records returned by the queries, not the total records in the DB.
Abe Miessler
The 'Ref' tables were scanned once and the scan had to read 22.8k and respectively 1.5M pages. That means that whatever filtering conditions apply to one client vs. other, the on one case you have about 66 times more records satisfying the filtering in the case of the 'slow' customer. Perhaps you can look into these three tables and the indexes and how they are used by the query (I know u posted the query, but is ... rather gargantuan of a query)
Remus Rusanu
A: 

Have you considered the physical design of the database at both sites? It is possible, that the client with large DB stores 'older' data on slower disks and it could be the reason of slowly running query..

toper