views:

122

answers:

2

I have a query with about 6-7 joined tables and a FREETEXT() predicate on 6 columns of the base table in the where.

Now, this query worked fine (in under 2 seconds) for the last year and practically remained unchanged (i tried old versions and the problem persists)

So today, all of a sudden, the same query takes around 1-1.5 minutes.

After checking the Execution Plan in SQL Server 2005, rebuilding the FULLTEXT Index of that table, reorganising the FULLTEXT index, creating the index from scratch, restarting the SQL Server Service, restarting the whole server I don't know what else to try.

I temporarily switched the query to use LIKE instead until i figure this out (which takes about 6 seconds now).

When I look at the query in the query performance analyser, when I compare the ´FREETEXT´query with the ´LIKE´ query, the former has 350 times as many reads (4921261 vs. 13943) and 20 times (38937 vs. 1938) the CPU usage of the latter.

So it really is the ´FREETEXT´predicate that causes it to be so slow.

Has anyone got any ideas on what the reason might be? Or further tests I could do?

[Edit]

Well, I just ran the query again to get the execution plan and now it takes 2-5 seconds again, without any changes made to it, though the problem still existed yesterday. And it wasn't due to any external factors, as I'd stopped all applications accessing the database when I first tested the issue last thursday, so it wasn't due to any other loads.

Well, I'll still include the execution plan, though it might not help a lot now that everything is working again... And beware, it's a huge query to a legacy database that I can't change (i.e. normalize data or get rid of some unneccessary intermediate tables)

Query plan

ok here's the full query

I might have to explain what exactly it does. basically it gets search results for job ads, where there's two types of ads, premium ones and normal ones. the results are paginated to 25 results per page, 10 premium ones up top and 15 normal ones after that, if there are enough.

so there's the two inner queries that select as many premium/normal ones as needed (e.g. on page 10 it fetches the top 100 premium ones and top 150 normal ones), then those two queries are interleaved with a row_number() command and some math. then the combination is ordered by rownumber and the query is returned. well it's used at another place to just get the 25 ads needed for the current page.

Oh and this whole query is constructed in a HUGE legacy Coldfusion file and as it's been working fine, I haven't dared thouching/changing large portions so far... never touch a running system and so on ;) Just small stuff like changing bits of the central where clause.

The file also generates other queries which do basically the same, but without the premium/non premium distinction and a lot of other variations of this query, so I'm never quite sure how a change to one of them might change the others...

Ok as the problem hasn't surfaced again, I gave Martin the bounty as he's been the most helpful so far and I didn't want the bounty to expire needlessly. Thanks to everyone else for their efforts, I'll try your suggestions if it happens again :)

+1  A: 

This issue might arise due to a poor cardinality estimate of the number of results that will be returned by the full text query leading to a poor strategy for the JOIN operations.

How do you find performance if you break it into 2 steps?

One new step that populates a temporary table or table variable with the results of the Full Text query and the second one changing your existing query to refer to the temp table instead.

(NB: You might want to try this JOIN with and without OPTION(RECOMPILE) whilst looking at query plans for (A) a free text search term that returns many results (B) One that returns only a handful of results.)

Edit It's difficult to clarify exactly in the absence of the offending query but what I mean is instead of doing

SELECT <col-list>
FROM --Some 6 table Join
WHERE FREETEXT(...);

How does this perform?

DECLARE @Table TABLE
(
<pk-col-list>
)
INSERT INTO @Table
SELECT PK
FROM YourTable
WHERE FREETEXT(...)

SELECT <col-list>
FROM --Some 6 table Join including onto @Table
OPTION(RECOMPILE)
Martin Smith
What do you mean by 'How do you find performance if you break it into 2 steps?' ? That doesn'tquite make sense to me in the current context.I'll post the execution plan tomorrow at work, but it's a huge query (pagination with rownumber over two joined subqueries and a lot of joins), so there's definitely a lot of possible performance tuning to be made there. But as I said, I'm wondering why it takes roughly 50 times longer to execute all of a sudden when the query was last changed over a month ago (added an additional where clause) and worked fine up until 4 days ago...
Zenon
Yep would need to see the query plan to take a guess on that.
Martin Smith
ok i tried it with the ´DECLARE @Table´ and ´OPTION(RECOMPILE)´ and it takes 3 seconds, as does the original query.But I'll have to test it when the problem arises again.By the way, when I remove the constraints to search only active (status=3) rows from the query, so searching all rows (roughly a million) instead of the 19'000 rows, the query still takes 3 seconds, so I doubt it's because some database entries have been set to inactive over the last 3 days.
Zenon
A: 

Usually when we have this issue, it is because of table fragmentation and stale statistics on the indexes in question.

Next time, try to EXEC sp_updatestats after a rebuild/reindex.

See Using Statistics to Improve Query Performance for more info.

GalacticJello
Yes that's a possibility as well though it wouldn't explain why it works for a LIKE query running off the same statistics.
Martin Smith
FREETEXT and LIKE are using different statistics, I believe (FEETEXT is treated like a remote source, vs. actual index statistics for the LIKE).
GalacticJello
@GalacticJello Correct. FREETEXT in SQL2005 doesn't have any statistics available to make any cardinality estimates at all and will be unaffected by sp_updatestats (always assuming 1 row will be returned) hence the suggestion in my answer to split it out. So the point I was making is that it would be odd (though possible) that a query that relied more heavily on the statistics didn't encounter the same problem.
Martin Smith