views:

312

answers:

6

I have a stored procedure that takes a comma-delimited string of IDs. I split them and put them into a temporary table and pull out records from another table using where id IN [table]

Is it ok to use this same procedure when only one id is passed in for the param? I could write a second stored procedure that would do exactly the samething but instead do where id = @id.

I have MANY stored procedures where multiple IDs or just one could be passed in. Do I try to reuse the existing procedures or write new ones? Is the performance hit significant?

+1  A: 

You might like to try a JOIN instead of WHERE id IN - although I think you will get the same query plan.

So I assume you are doing

SELECT COl1, Col2, ... FROM MyTable WHERE id IN (SELECT id FROM @MyTempTable)

in which case the equivalent JOIN syntax would be

SELECT COl1, Col2, ... FROM MyTable AS T1 JOIN @MyTempTable AS T2 ON T2.id = T1.id

and in the second case whether there is 1, or many rows, it will be very effective provided [id] is indexed (I am assuming its the PK on your table, and using a Clustered Index).

(Beware that if you have DUP IDs in @MyTempTable you will wind up getting dups from MyTable as well :( )

For best performance it would be worth explicitly declaring [id] as the PK on your temporary table (but given it only holds a few rows it probably won't make much odds)

DECLARE @TempTable TABLE
(
    id int NOT NULL,
    PRIMARY KEY
    (
        id
    )
)
Kristen
Kristen, JOIN and the query with IN are not equivalent - if your list of IDs has duplicates, you will have diuplicates in your JOIN, but not in the query with IN
AlexKuznetsov
Yes, I did mention that viz: "Beware that if you have DUP IDs in @MyTempTable you will wind up getting dups from MyTable as well"
Kristen
A: 

You could use the same procedure, but use a conditional statement to determine whether use use the IN clause.

There is a performance hit with IN; the execution plan should detail this for you.

Ryan Emerle
I thought about this. But, to do it I would still need to split the string of IDs and create a temp table. Isnt doing all that string manipulation really the worst performing part of the sproc?
Vyrotek
A: 

As rde6173 says, perform a COUNT on the temporary table to determine which SELECT query to use.

Lazlow
My instinct is that the COUNT(*) will take longer than the hit than might be present in the case of an IN on a single row table!!
Kristen
+1  A: 

I wouldn't worry about the performance hit of in with only one item until I had observed a performance problem with it. The query optimizer is smart and may very well deal with the one item in, but even if it doesn't, your routines will probably be slowest elsewhere.

I would look at the performance of the string parsing, temp table creation and insertion into the temp table. Making those as fast a possible, will have a bigger effect on overall performance than wether you use in or = for the one item case.

automatic
add to that that maintaining several two almost identical procedures could easily become a maintenance nightmare, in your calling code as in the stored procedures themselve.
Lieven
A: 

Since you've specified that it's a comma-delimited list, you can do something like this in your sproc:

IF (CHARINDEX(',', @id) = 0)
BEGIN
    -- the @id parameter contains a single value
    SELECT *
    FROM your_table
    WHERE id = @id  -- maybe need to cast @id if the column isn't a string
END
ELSE
BEGIN
    -- the @id parameter contains a comma-delimited list
    -- only perform the expensive splitting logic at this point
    -- eg, SET @yourTempTable = dbo.SplitCommaDelimitedIDsIntoTable(@id)
    SELECT *
    FROM your_table
    WHERE id IN (SELECT id FROM @yourTempTable)
END
LukeH
Assuming Stored Procedure in MSSQL : is the optimiser likely to cache a query plan based on the first-use - which might not be representative, nor a good-plan for the other use? (that's fixable by EXEC'ing separate Sprocs for each case, but its all getting a bit "heavy" at that point)
Kristen
@Kristen, That's a good point, but I don't have access to SQL Server right now to check the generated plans. I'll look into it when I get a chance.
LukeH
I like your idea BTW ... likely to adopt it at some point :)
Kristen
A: 

When you create a temporary table (not a table variable), it has statistics. As such, the optimizer will determine the best plan, and the best plan for one ID might be the same as for 10 IDs, but for 50K IDs it may choose a different plan. So, I would not try to optimize it further unless you have performance concerns.

AlexKuznetsov