I'm working on an interactive contact search page (contacts are returned via ajax as you type or select criteria). I want this page to be very responsive.

There is a complex set of rules to determine which contact records a given contact can see; these rules are rolled up into a user-defined function, DirectoryContactsByContact(@ContactID). I've optimized this function considerably but it's still a little expensive (1-2 seconds to execute), so to improve performance I'm thinking about something like this:

  • When the page loads, cache DirectoryContactsByContact for this user as a SQL table, e.g. cache_DirectoryContactsByContact_1
  • Perform the search against the cached table (checking each time to make sure it exists)
  • After a little while (say 30 minutes) kill the cache

It's OK if the data gets stale during this period, so I'm not concerned with invalidation.

Temporary tables don't last between requests, so it seems like I'd need to create the cache table as a permanent table; but then I'd need to be responsible for cleaning up old caches myself, which looks non-trivial at first glance.
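
In rough T-SQL terms, steps 1 and 3 would amount to something like this (just a sketch; the column list is a placeholder):

    -- 1. On page load: cache this user's results in a permanent table
    --    (column list is a placeholder)
    Select ContactID, LastName, FirstName
    Into cache_DirectoryContactsByContact_1
    From dbo.DirectoryContactsByContact(1)

    -- 2. Searches then hit cache_DirectoryContactsByContact_1 instead of the function

    -- 3. After ~30 minutes, kill the cache
    Drop Table cache_DirectoryContactsByContact_1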

Are there any mechanisms within SQL Server that would make this easier? Any advice on alternative approaches?

+2  A: 

How about this: whenever the page loads, insert the results of your function into a permanent table, say SearchResults. This table would have fields like:

  • SearchingContactID
  • DirectoryContactID
  • CreateDate

You would search against this table. Then, daily or whenever, you would have a process that goes through this table and deletes anything more than a day or so old.
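
A sketch of what that could look like, assuming your function returns a ContactID column and @ContactID is the searching user's ID:

    -- Sketch: permanent results table
    Create Table SearchResults (
        SearchingContactID int Not Null,
        DirectoryContactID int Not Null,
        CreateDate datetime Not Null Default GetDate()
    )

    -- On page load, populate it from the function for the current user
    Insert Into SearchResults (SearchingContactID, DirectoryContactID)
    Select @ContactID, ContactID
    From dbo.DirectoryContactsByContact(@ContactID)

    -- Daily (or whenever), purge anything older than a day or so
    Delete From SearchResults
    Where CreateDate < DateAdd(day, -1, GetDate())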

Sylvia
This isn't a bad idea. If I'm going to go to the trouble of rolling my own caching infrastructure, though, it makes sense (in my situation) to cache all of the data being queried rather than just the ContactIDs, so that I don't have to do any joining or any work at all while the user is typing.
Herb Caudill
+1  A: 

"I don't want to cache the data in memory in .NET because (a) there's a lot of data, and (b) the search involves full-text indexes and joins and other stuff that SQL does well."

Does that mean that the searched data is 'a lot', or that the search result is 'a lot'? How big is the output of DirectoryContactsByContact(@ContactID)? My presumption is that it is a small result set, small enough to be useful on the ASP side. If that's true, then you should cache the search result in ASP for a particular @ContactID, and reuse that cached result for the same repeated @ContactID until it expires from the cache, then re-create it.

I'm not a big fan of caching results as tables in SQL. This approach turns reads into writes, thus slowing down the first hit even more. It serves stale data and it requires cleanup. But most importantly, in my experience it always circumvents the real problem: inefficient queries due to an improperly designed schema.

How confident are you that the DirectoryContactsByContact(@ContactID) response time cannot be further reduced? Where is the bottleneck? How did you measure it? Have you considered what schema changes could be done to serve this result faster?

Remus Rusanu
`DirectoryContactsByContact(@ContactID)` could contain over 200K rows. But (b) is a larger issue - once it's been narrowed down to what this contact is allowed to see, there's still some complex filtering that needs to be done that SQL is better suited to.
Herb Caudill
The business rules that `DirectoryContactsByContact(@ContactID)` enforces are too complicated to go into in this space; suffice it to say that there's a lot going on. Anyway, there's about a 40x difference between querying a cached table and querying the function from a cold start (15 ms vs. 600 ms), and I doubt I can squeeze that kind of performance out by tweaking the query.
Herb Caudill
I see. I was reading that between the lines in your original post, but I had to ask to make sure. Your best bet is something like what Sylvia suggests.
Remus Rusanu
A: 

I've ended up creating a rudimentary general-purpose framework for caching the results of a SQL function or view to a table.

    ' Creates (if needed) a permanent cache table holding the results of a SQL function or view.
    ' TableExists, ExecuteNonQuery, and cs are existing data-access helpers/fields (not shown here).
    Public Sub CreateCacheTable(ByVal SourceView As String, ByVal FieldList As String)
        Dim CacheTable As String = GetCacheTableName(SourceView)
        If Not TableExists(CacheTable) Then
            Dim Sql As String = " Select ~FieldList~ Into ~CacheTable~ From ~SourceView~ ". _
                Replace("~CacheTable~", CacheTable). _
                Replace("~FieldList~", FieldList). _
                Replace("~SourceView~", SourceView)
            ExecuteNonQuery(cs, CommandType.Text, Sql)
        End If
    End Sub

    ' Derives a legal table name from the source view/function name,
    ' e.g. DirectoryContactsByContact(123) -> _c_DirectoryContactsByContact_123
    Public Function GetCacheTableName(ByVal SourceView As String) As String
        Dim Result As String = "_c_~SourceView~". _
            Replace("~SourceView~", SourceView). _
            Replace(".", "_"). _
            Replace(",", "_"). _
            Replace("(", "_"). _
            Replace(")", ""). _
            Replace("[", ""). _
            Replace("]", "")
        Return Result
    End Function

    ' Drops any cache tables older than 30 minutes (see the stored procedure below).
    Public Sub CleanupCacheTables()
        ExecuteNonQuery(cs, CommandType.StoredProcedure, "CleanupCacheTables")
    End Sub

When the page loads I do this:

        CleanupCacheTables()
        CreateCacheTable(SourceView, FieldList)

For example, if SourceView is DirectoryContactsByContact(123), this creates a table named _c_DirectoryContactsByContact_123.
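
TableExists and ExecuteNonQuery are helpers that aren't shown here; for TableExists, an OBJECT_ID check along these lines is enough (sketch):

    -- OBJECT_ID returns Null when no user table ('U') by that name exists
    If Object_Id('_c_DirectoryContactsByContact_123', 'U') Is Null
        Print 'cache table missing - CreateCacheTable will run the Select Into'
    Else
        Print 'cache table already exists - CreateCacheTable does nothing'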

Here's the SQL for CleanupCacheTables:

Create Procedure CleanupCacheTables as
    /* Finds all tables starting with _c_ that were created more than 30 minutes ago and drops them */
    Declare @TableName nvarchar(255)
    Declare CacheTableCursor Cursor Local Fast_Forward For
        Select 
            TableName=name
        From SYS.OBJECTS
        Where Type_Desc = 'USER_TABLE'
        And Left(name,3)=  '_c_'
        And DateDiff(minute, create_date, GetDate())>30
    Open CacheTableCursor
    Fetch Next from CacheTableCursor into @TableName
    While @@FETCH_STATUS = 0 Begin
        Exec ('Drop Table ' + QuoteName(@TableName))
        Fetch Next from CacheTableCursor into @TableName
    End -- While
    Close CacheTableCursor
    Deallocate CacheTableCursor
Go

This is crude: There's no invalidation, and it probably won't scale to a lot of concurrent users and/or very large datasets. Nevertheless, in my case it's resulted in near-instantaneous results as the user types or selects search criteria, with very little overhead.

Herb Caudill