views:

124

answers:

5

The below MSSQL2005 query is very slow. I feel like their ought to be a way to speed it up, but am unsure how. Note that I editted the inner join to use select statements to make it more obvious (to people reading this question) what is going on, though this has no impact on speed (probably the Execution plan is the same either way). Interestingly, I never actually use keywordvaluegroups for anything more than a count, but I'm not sure if there is a way to capitalize on this.

select top 1 cde.processPath as 'keywordValue', count(*) as 'total'
from dbo.ClientDefinitionEntry AS cde INNER JOIN dbo.KeywordValueGroups  AS kvg
ON cde.keywordGroupId = kvg.keywordValueGrpId
where kvg.[name] = @definitionName
group by cde.processPath
order by total desc

Edit: Apparently, people keep complaining about my use of subqueries. In fact, it makes no difference. I added them right before posting this question to make it easier to see what is going on. But they only made things more confusing, so I changed it not to use them.

Edit: Indexes in use:

ClientDefinitionEntry:
IX_ClientDefinitionEntry   |nonclustered located on PRIMARY|clientId, keywordGroupId

KeyWordValueGroups
IX_KeywordValueGroups      |nonclustered located on PRIMARY|keywordValueGrpId
IX_KeywordValueGroups_2    |nonclustered located on PRIMARY|version
IX_KeywordValueGroups_Name |nonclustered located on PRIMARY|name
+3  A: 

How does the execution plan looks like ? By having a look at it, you'll learn which part of the query takes the most time / resources.

Have you indexes on the columns where you filter on ? Have you indexes on the columns that you use for joining ? Have you indexes on the columns that you use for sorting ?

once you've taken a look at this, and the query is still slow, you can take a look at how your database / table is fragmented (dbcc showcontig), and see if it is necessary to rebuild the indexes.
It might be helpfull to have a maintenance plan which rebuilds your indexes on a regular basis.

Frederik Gheysels
+3  A: 

Run the query with this option on:

SET SHOWPLAN_TEXT ON

And add the result to the question.

Also check if your statistics are up to date:

SELECT 
    object_name = Object_Name(ind.object_id),
    IndexName = ind.name,
    StatisticsDate = STATS_DATE(ind.object_id, ind.index_id)
FROM SYS.INDEXES ind
order by STATS_DATE(ind.object_id, ind.index_id) desc

And information about indexes, table definitions and foreign keys would be helpful.

Andomar
+2  A: 

There really isn't enough information to know for sure. If you are having performance problems in that query, then the tables must have a non trivial amount of data and you must be missing important indexes.

Which indexes will definitely help depends deeply on how large the tables are, and to a lesser extent on the distribution of values in the KeywordGroupId and KeywordValueGrpId fields.

Lacking any other information, I would say that you want to make sure that dbo.KeywordValueGroups.[name] is indexed, as well as dbo.ClientDefinitionEntry.[keywordGroupId].

Because of the way the query is written, an index on dbo.KeywordValueGroups.[keywordValueGrpId] alone cannot help, but a composite index on [name], [keywordValueGrpId] probably will. If you have that index, you don't need a dedicated index on [name].

Based on gut-feeling alone, I might hazard that the index on [name] is a must, and that cde.keywordGroupId is likely important. Whether the composite index on [name], [keywordValueGrpId] would help, it depends on how many records are there with the same [name].

The only way to know for sure is to add the indexes and see what happens.

You also need to think about how often this query runs (so, how important is it to make it fast), and how often the underlying data changes. Depending on your particular circumstances, the increase in speed might not justify the added cost of maintaining the indexes.

Euro Micelli
+2  A: 

I'd make sure you had the following indexes.

The ID on KeywordValueGroups.

The Name on KeywordValueGroups.

The ID on ClientDefinitionEntry with an INCLUDE for the processPath.

CREATE INDEX [IX_ClientDefinitionEntry_Id_ProcessPath] ON [dbo].[ClientDefinitionEntry] ( [keywordGroupId] ASC ) INCLUDE ( [processPath]) ON [PRIMARY]
CREATE INDEX [IX_KeywordValueGroups_Id] ON [dbo].[KeywordValueGroups] ( [keywordValueGrpId] ASC )
CREATE INDEX [IX_KeywordValueGroups_Name] ON [dbo].[KeywordValueGroups] ( [name] ASC )

I'd also change the query to the following.

select top 1
 cde.processPath as 'keywordValue',
 count(*) as 'total'
from
 dbo.ClientDefinitionEntry AS cde
INNER JOIN
 dbo.KeywordValueGroups  AS kvg
ON
 cde.keywordGroupId = kvg.keywordValueGrpId
where
 kvg.[name] = @definitionName
group by
 processPath
order by
 total desc
Robin Day
Using "CREATE INDEX [IX_ClientDefinitionEntry_Id_ProcessPath] ON [dbo].[ClientDefinitionEntry] ( [keywordGroupId] ASC ) INCLUDE ( [processPath]) ON [PRIMARY]" made things go much faster. I really don't care how this impacts everything else in terms of speed, since this particular item was why it was slow.
Brian
The reason is that although you probably had an index on the id, the query plan would've shown that it then had to do a lookup on the actual table to find the value of processPath. If you do an INCLUDE query, it effectively adds that value to the index. The physical size of the index increases (your database gets bigger) but it means all the required data is retrieved from the index without ever having to hit the table itself.
Robin Day
Yeah, this is bringing back memories from my DB class.
Brian
What's the difference of 'including' the processPath field with the index, as opposed to making it a real part of the index ? Ok, I can understand that if you're creating a UNIQUE index , you might not want to do that, and indeed 'include' any additional fields, but what if it is not about a unique index ?
Frederik Gheysels
The processPath value has no relevence in finding that record in the table. The Id is the only item required. If you were to index on both columns then SQL would checking both values to find the record in question.
Robin Day
A: 

Not sure how many records we are talking about but this: order by total desc is on a calculated column meaning every calculation on every row will have to be done before the ordering can be done. Likely this is one of the things slowing it down, but I see no way out of that particular problem. Not a problem if you only have a few records after the join, but could be if there are lots of them.

I'd concentrate on indexing first. We often forget that when we create foriegn keys they are not automatically indexed. Check to see if both parts of the join are indexed.

Since you are passing a value in a parameter, you might also have a parameter sniffing problem. Google this for techniques to fix that.

HLGEM