I am developing a system which periodically (4-5 times daily) runs a select statement, that normally takes less than 10 seconds but periodically has taken up to 40 minutes.
The database is on Windows Server 2008 + SQL Server 2008 R2; both 64bit.
There is a service on the machine running the database which polls the database and generates values for records which require it. These records are then periodically queried using a multi table join select from a service on a second machine written in C++ (VS 2010) using the MFC CRecordset class to extract the data. An example of the the query causing the problem is shown below.
SELECT DISTINCT "JobKeysFrom"."Key" AS "KeyFrom","KeysFrom"."ID" AS "IDFrom",
"KeysFrom"."X" AS "XFrom","KeysFrom"."Y" AS "YFrom","JobKeysTo"."Key" AS "KeyTo",
"KeysTo"."ID" AS "IDTo","KeysTo"."X" AS "XTo","KeysTo"."Y" AS "YTo",
"Matrix"."TimeInSeconds","Matrix"."DistanceInMetres","Matrix"."Calculated"
FROM "JobKeys" AS "JobKeysFrom"
INNER JOIN "JobKeys" AS "JobKeysTo" ON
("JobKeysFrom"."Key"<>"JobKeysTo"."Key") AND
("JobKeysFrom"."JobID"=531) AND
("JobKeysTo"."JobID"=531)
INNER JOIN "Keys" AS "KeysFrom" ON
("JobKeysFrom"."Key"="KeysFrom"."Key") AND ("JobKeysFrom"."Status"=4)
INNER JOIN "Keys" AS "KeysTo" ON
("JobKeysTo"."Key"="KeysTo"."Key") AND ("JobKeysTo"."Status"=4)
INNER JOIN "Matrix" AS "Matrix" ON
("Matrix"."IDFrom"="KeysFrom"."ID") AND ("Matrix"."IDTo"="KeysTo"."ID")
ORDER BY "JobKeysFrom"."Key","JobKeysTo"."Key"
I have tried the following
- checked the indexes and all seem correct and they are active and are being used according to the query
- the design advisor comes back with no suggestions
- I have tried defragging the indexes and data
- rebuilt the database from scratch by exporting the data and reimporting it in a new database.
- ran the profiler on it and found that when it goes wrong it seems to do many millions (up to 100 million) of reads rather than a few hundred thousand.
- ran the database on a different server
During the time it is running the query, I can run exactly the same query in the management studio window and it will be back to running in 10 seconds. The problem does not seem to be lock, deadlock, CPU, disk or memory related as it has done it when the machine running the database was only running this one query. The server has 4 processors and 16 gb of memory to run it in. I have also tried upgrading the disks to much faster ones and this had no effect.
It seems to me that it is almost as though the database receives the query, starts to process it and then goes to sleep for 40 minutes or runs the query without using the indexes.
When it takes a long time it will eventually finish and send the query results (normally about 70-100000 records) back to the calling application.
Any help or suggestions would be gratefully received, many thanks