views:

40

answers:

1

You can read the backstory if you want, after changing my nonclustered index to a clustered index things started running a lot faster. The problem with the initial query plan taking 2-3 seconds still remain though. However, the keep plan query hint improved that quite a bit.

BACKSTORY

I have an inverted index where I store things I wish to lookup in a simple search implementation (though there is nothing simple about search).

When I enter a query like this "ma br" it will create this SQL

SELECT EntityType, EntityID FROM InvertedIndex WHERE Term LIKE @p0
INTERSECT
SELECT EntityType, EntityID FROM InvertedIndex WHERE Term LIKE @p1
--@p0 String --'ma%'
--@p1 String --'br%'

For every search term, there's another INTERSECT and SELECT

And for the most part this works very well. I've indexed the Term column properly and checked execution plans for potential bottlenecks. Right now the index contains about 150,000 rows and searches happen instantaneously, as expected.

What's a bit irritating though, is that typically the first query of a certain rank takes a lot longer to execute. I bet this is because the query optimizer is reevaluating the execution plan. But how should I deal with this? The more queries I run against the server, the less it stalls, but every other query takes about 2-3 seconds longer. This is not a big deal, but sometimes it's a lot longer and I just fail to see where this is coming from or how to deal with it. It should be lightning fast.

EDIT

Schema looks like this:

CREATE TABLE InvertedIndex (
    Term varchar(255) NOT NULL,
    Ordinal tinyint NOT NULL,
    EntityType tinyint NOT NULL,
    EntityID int NOT NULL
)

The two indexes are:

CREATE NONCLUSTERED INDEX IX_InvertedIndex ON InvertedIndex (Term) 
INCLUDE (Ordinal, EntityType, EntityID)

CREATE NONCLUSTERED INDEX IX_InvertedIndex_Reverse ON InvertedIndex (EntityType, EntityID) 

This stuff stays, what happens are insert and delete operations when index (InvertedIndex) needs to be updated and evertually a complete rebuild, will this affect the use of a QUERY PLAN?

Here's an example of the complete query it's really slow atm, 3-5 seconds and I can't work out why... The ORDER BY clause is meant to give words that match a certain position higher sort order (occur first in the result set) this however has become exponentially slower for each search term.

WITH Search AS (
    SELECT EntityType, EntityID FROM InvertedIndex WHERE Term LIKE @p0
    INTERSECT
    SELECT EntityType, EntityID FROM InvertedIndex WHERE Term LIKE @p1
    INTERSECT
    SELECT EntityType, EntityID FROM InvertedIndex WHERE Term LIKE @p2
)
SELECT p.PersonID
, p.FullName
, p.Email
, p.MobilePhone
, p.HomeAddress
, p.HomeCity
FROM Search AS s
INNER JOIN Person AS p ON p.PersonID = s.EntityID AND s.EntityType = @pPersonEntityType
ORDER BY (CASE WHEN @p3 IN (SELECT Ordinal FROM InvertedIndex WHERE Term LIKE @p0 AND EntityID = s.EntityID AND EntityType = s.EntityType) THEN 0 ELSE 1 END) + (CASE WHEN @p4 IN (SELECT Ordinal FROM InvertedIndex WHERE Term LIKE @p1 AND EntityID = s.EntityID AND EntityType = s.EntityType) THEN 0 ELSE 1 END) + (CASE WHEN @p5 IN (SELECT Ordinal FROM InvertedIndex WHERE Term LIKE @p2 AND EntityID = s.EntityID AND EntityType = s.EntityType) THEN 0 ELSE 1 END)
@p0 String --'ma%'
@p1 String --'br%'
@p2 String --'mi%'
@p3 Int32 --1
@p4 Int32 --2
@p5 Int32 --3

The point of the above query is to look up all terms in the InvertedIndex, then, for each search term there's an intersect, that's the logical conjuction that I wish to use to constrain the search. Ordinal represents the original position of the word when it was indexed. Each entry in the InvertedIndex represents a tuple and if the search term matches some item of this N-tuple it's considered a better match. That's why I need to do this funky order by with subqueries. But it's really slow.

ANSWER

If I change IX_InvertedIndex to a clustered index it improves the query speed by an order of magnitude (I don't know why though):

CREATE CLUSTERED INDEX IX_InvertedIndex ON InvertedIndex (Term) 
+2  A: 

If you are appending INTERSECT clauses then each query will be different. I suspect (based on what you've said) that eventually you have a cached plan for each number of INTERSECT clauses. Once you have the cached plan, it runs OK.

You could try plan guides, one for each number of INTERSECT clauses.

Otherwise, you could always have a single select and spool results in a temp table and self join. Not sure how it would run though or whether it's good idea.

Your index should also be on Term with INCLUDE for EntityType, EntityID to be covering too

Edit, after comment.

You could try KEEP PLAN or plan forcing too given the relative simplicity to help avoid what sounds like recompiles.

If it was SQL Server 2008, I'd suggest OPTIMISE FOR UNKNOWN

Finally, another thought: do datatypes match across the board?

Edit: You should change the index to (Term, EntityType, Ordinal, EntityID) with no INCLUDES. You are using all columns in JOINs or filters

You also need a primary key (Term? Ordinal) which should be unqiue and clustered too,. There is no advantage in not having one apart from poor performance and fragmented data

And change the query to this:

WITH Search AS
(
    SELECT Ordinal, EntityType, EntityID FROM InvertedIndex WHERE Term LIKE @p0 AND EntityType = @pPersonEntityType
    INTERSECT
    SELECT Ordinal, EntityType, EntityID FROM InvertedIndex WHERE Term LIKE @p1 AND EntityType = @pPersonEntityType
    INTERSECT
    SELECT Ordinal, EntityType, EntityID FROM InvertedIndex WHERE Term LIKE @p2 AND EntityType = @pPersonEntityType
)
SELECT
    p.PersonID, p.FullName, p.Email, p.MobilePhone, p.HomeAddress, p.HomeCity
FROM
    Search AS s
    INNER JOIN
    Person AS p ON p.PersonID = s.EntityID
    LEFT JOIN
    (SELECT 0 AS Ranking, @p3 AS RankOrdinal) O3
    LEFT JOIN
    (SELECT 0 AS Ranking, @p4 AS RankOrdinal) O4
    LEFT JOIN
    (SELECT 0 AS Ranking, @p5 AS RankOrdinal) O5
ORDER BY    --although, I can't see why you are doing + 
    ISNULL(O3.Ranking, 1) +
    ISNULL(O4.Ranking, 1) +
    ISNULL(O5.Ranking, 1)
gbn
Index is including EntityType, EntityID. But thanks for pointing that out. I didn't know about that last month. I'm gonna investigate plan guides. Though I wanna add that the execution plan seems to change more often than that. Just changing the search values, still the same number, but different values, can cause this abrupt halt.
John Leidegren
@John Leidegren: I meant plan forcing not plan guides, sorry. Plan guides are for when you can't change the SQL. Forcing is something that adds query hint.
gbn
I did try the OPTIMISE FOR UNKNOWN but ran into a wall when it was a SQL 2008 feature. I'll try KEEP PLAN next. I should avoid plan guides then? Haven't tried that yet...
John Leidegren
The only possible primary key would be (Term, Ordinal, EntityType, EntityID) i.e. all columns. I can't use INTERSECT with the Ordinal column becuase it's only meant to be used for relevance, the search is global, the entity type isn't always Person, in this example it is, so I can't filter the Search CTE on entity type. The addition was just an atempt to avoid having more than 1 sort column. But that didn't change the performance of the query one bit.
John Leidegren