ansaurus

Question

Answer 1

+1 A:

Providing an "answer" in order to try to work this through as this is something I'm particularly interested in.

I came across this MSDN article on how to see what is in the SQL Server cache. There is a query there that will show you how many data pages are cached by object - I've tweaked it just to include the index name as below:

-- Count the buffer-pool pages cached per object and index in the current database
SELECT COUNT(*) AS cached_pages_count, obj.name, obj.index_id, i.name AS IndexName
FROM sys.dm_os_buffer_descriptors AS bd 
    INNER JOIN 
    (
        -- in-row data (type 1) and row-overflow data (type 3)
        SELECT object_id, object_name(object_id) AS name 
            ,index_id ,allocation_unit_id
        FROM sys.allocation_units AS au
            INNER JOIN sys.partitions AS p 
                ON au.container_id = p.hobt_id 
                    AND (au.type = 1 OR au.type = 3)
        UNION ALL
        -- LOB data (type 2)
        SELECT object_id, object_name(object_id) AS name   
            ,index_id, allocation_unit_id
        FROM sys.allocation_units AS au
            INNER JOIN sys.partitions AS p 
                ON au.container_id = p.partition_id 
                    AND au.type = 2
    ) AS obj 
        ON bd.allocation_unit_id = obj.allocation_unit_id
    -- sys.indexes replaces the deprecated sysindexes compatibility view
    LEFT JOIN sys.indexes AS i ON obj.object_id = i.object_id AND obj.index_id = i.index_id
WHERE bd.database_id = DB_ID()
GROUP BY obj.name, obj.index_id, i.name
ORDER BY cached_pages_count DESC;

If you try the following steps, you should be able to see what is going on with regard to caching. Do these within your database (as opposed to e.g. master):

1) run CHECKPOINT, then clear the cache down (DBCC DROPCLEANBUFFERS)
2) run the above query and you should probably get 1 record returned (for sysobjvalues), but nothing for MyTable
3) now run the SELECT TOP 1 '1' FROM MyTable statement
4) rerun the above query and see what appears in the results now - you'll probably see record(s) for MyTable showing cached pages - make a note of that number
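
The steps above can be sketched as a single script. This is only a rough sketch: it swaps the per-object query for a simple total page count to keep it short, MyTable is the placeholder table name used in the steps, and it assumes a test server, since DBCC DROPCLEANBUFFERS empties the cache for the whole instance and needs sysadmin rights.

```sql
-- 1) Write dirty pages to disk, then drop the clean pages from the buffer pool.
--    Assumption: a test server; this affects every database on the instance.
CHECKPOINT;
DBCC DROPCLEANBUFFERS;

-- 2) Baseline: pages currently cached for this database (each page is 8 KB).
SELECT COUNT(*) AS cached_pages_before
FROM sys.dm_os_buffer_descriptors
WHERE database_id = DB_ID();

-- 3) The statement under test (MyTable is a placeholder name).
SELECT TOP 1 '1' FROM MyTable;

-- 4) Pages cached afterwards; the difference is what step 3 pulled in.
SELECT COUNT(*) AS cached_pages_after
FROM sys.dm_os_buffer_descriptors
WHERE database_id = DB_ID();
```

To see which table and index the new pages belong to, run the full per-object query in place of the two counts.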

This should give you an indication of how much data caching happens for that initial SELECT. Then repeat the process, but execute your sproc instead of the SELECT TOP statement, and see how much ends up in the cache. Comparing the two results should indicate how much caching the SELECT TOP 1 does relative to the sproc call, and that relative amount could indicate the performance improvement.
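
One way to make that comparison is to record the cached page count after each scenario in a temp table. This is a sketch under assumptions: dbo.MySproc stands in for your stored procedure, #cache_snapshots is a hypothetical temp table name, and it should only be run on a test server.

```sql
-- Hypothetical temp table holding one row per scenario.
CREATE TABLE #cache_snapshots (run_label varchar(50), cached_pages int);

-- Scenario 1: cold cache, then the SELECT TOP 1 only.
CHECKPOINT;
DBCC DROPCLEANBUFFERS;
SELECT TOP 1 '1' FROM MyTable;
INSERT INTO #cache_snapshots (run_label, cached_pages)
SELECT 'SELECT TOP 1', COUNT(*)
FROM sys.dm_os_buffer_descriptors
WHERE database_id = DB_ID();

-- Scenario 2: cold cache, then the sproc only (dbo.MySproc is a placeholder).
CHECKPOINT;
DBCC DROPCLEANBUFFERS;
EXEC dbo.MySproc;
INSERT INTO #cache_snapshots (run_label, cached_pages)
SELECT 'sproc', COUNT(*)
FROM sys.dm_os_buffer_descriptors
WHERE database_id = DB_ID();

-- Compare: pages are 8 KB, so pages * 8 / 1024 gives megabytes.
SELECT run_label, cached_pages, cached_pages * 8 / 1024.0 AS cached_mb
FROM #cache_snapshots;

DROP TABLE #cache_snapshots;
```

If the SELECT TOP 1 row shows only a handful of pages while the sproc row shows far more, the SELECT is unlikely to be what primes the cache for the sproc.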

This is very much "thinking out loud" stuff. I wouldn't have thought the TOP 1 would have really primed the cache significantly for the sproc call, but that's why I'm interested in this question!

I would have initially thought it was more to do with other factors (e.g. server/disk load). You could alternate between the 2 scenarios for 3 or 4 iterations, one after the other, to double check whether the SELECT TOP approach is in fact consistently better (this helps minimise the risk of it being a one-off blip).
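
A rough sketch of that alternating test is below. dbo.MySproc and #timings are placeholder names, the cache is cleared before every run so each scenario starts cold, and GETDATE() is only accurate to a few milliseconds, so treat small differences as noise.

```sql
DECLARE @i int, @start datetime;
CREATE TABLE #timings (iteration int, scenario varchar(30), elapsed_ms int);

SET @i = 1;
WHILE @i <= 4
BEGIN
    -- Scenario A: cold cache, SELECT TOP 1 first, then time the sproc.
    CHECKPOINT;
    DBCC DROPCLEANBUFFERS;
    SELECT TOP 1 '1' FROM MyTable;
    SET @start = GETDATE();
    EXEC dbo.MySproc;   -- placeholder for your stored procedure
    INSERT INTO #timings (iteration, scenario, elapsed_ms)
    VALUES (@i, 'with SELECT TOP 1', DATEDIFF(ms, @start, GETDATE()));

    -- Scenario B: cold cache, time the sproc on its own.
    CHECKPOINT;
    DBCC DROPCLEANBUFFERS;
    SET @start = GETDATE();
    EXEC dbo.MySproc;
    INSERT INTO #timings (iteration, scenario, elapsed_ms)
    VALUES (@i, 'sproc only', DATEDIFF(ms, @start, GETDATE()));

    SET @i = @i + 1;
END;

-- Summarise per scenario; a consistent gap across iterations is what matters.
SELECT scenario, AVG(elapsed_ms) AS avg_ms, MIN(elapsed_ms) AS min_ms, MAX(elapsed_ms) AS max_ms
FROM #timings
GROUP BY scenario;

DROP TABLE #timings;
```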

Hope this helps/gets the ball rolling.

Update:
Now that you know it's not the SELECT TOP that's priming the cache, a good way to prime the cache is as AdrianBanks said. At least now you can explain the performance difference that was unexpected/confusing! Keep the above script in your library, as it is useful for checking the state of the cache.

AdaTheDev 2010-02-04 10:18:47

Looks like I may have been wrong in assuming that my performance gains were due to the SELECT. Take a look at the update to my post. Great post though. Very informative!

Abe Miessler 2010-02-04 22:59:59

Answer 2

+1 A:

The update to your question tallies with what I would expect to happen. I can't see how running the SELECT TOP 1 ... query could have any real performance benefit on the subsequent query.

As I understand it, SQL Server loads data pages (containing either table data or index data) into memory as it needs them when running queries. These are kept in memory unless they are explicitly cleared (using DBCC DROPCLEANBUFFERS, i.e. removing any buffers (cached pages) in memory that have not been altered since being loaded), or there is memory pressure (either low free memory on the machine or a maximum memory limit set on SQL Server). Because of this behaviour, it can be beneficial to warm up a SQL Server database before use. When you subsequently run a query, the data needed to produce the query results may already be in memory. If it is, the query will execute faster as it will incur less IO.
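
To see this behaviour for yourself, something like the following sketch may help. It assumes a table called MyTable (a placeholder) and a freshly cleared cache on a test server. The first query shows the configured memory cap, and the STATISTICS IO output in the Messages tab reports physical reads (from disk) separately from logical reads (from the buffer pool).

```sql
-- The memory cap that can cause pages to be evicted under pressure.
SELECT name, value_in_use
FROM sys.configurations
WHERE name = 'max server memory (MB)';

SET STATISTICS IO ON;

-- First run on a cold cache: expect physical (or read-ahead) reads.
SELECT COUNT(*) FROM MyTable;

-- Second run: the same pages should now be served from memory,
-- so expect logical reads with few or no physical reads.
SELECT COUNT(*) FROM MyTable;

SET STATISTICS IO OFF;
```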

The problem comes, however, in knowing what to pre-cache and therefore what queries to run. You could run a SQL trace on typical activity and then replay it to pre-cache data that gets used frequently. Without letting SQL Server hold a massive amount of allocated memory though, you are always going to have to read some things from disk (unless you have a small database). As you will never know what is cached and what isn't, relying on this behaviour for performance feels wrong.
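
As a very simple illustration of a warm-up, a script run after a restart could scan the tables you expect to be hot. This is only a sketch: MyTable is a placeholder, and choosing which tables to touch is exactly the hard part described above.

```sql
-- Warm-up sketch: read the whole table so its pages land in the buffer pool.
-- INDEX(0) asks for a scan of the heap or clustered index rather than a
-- narrower nonclustered index, so the base table's data pages are read.
SELECT COUNT_BIG(*) FROM MyTable WITH (INDEX(0));

-- Check how much of the current database is now cached (8 KB per page).
SELECT COUNT(*) AS cached_pages, COUNT(*) * 8 / 1024 AS cached_mb
FROM sys.dm_os_buffer_descriptors
WHERE database_id = DB_ID();
```

Anything loaded this way can still be evicted later under memory pressure, so it gives no guarantee about what is cached when the real queries arrive.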

I would concentrate my efforts on making the queries more efficient, by reading less data or using indices where possible. That will give you general benefits as well as better performance from cold starts.

adrianbanks 2010-02-05 00:03:26

Priming or warming cache in SQL Server
