views:

1986

answers:

9

I have a stored procedure that has an optional parameter, @UserID VARCHAR(50). The thing is, there are two ways to work with it:

  1. Give it a default value of NULL, then have an IF...ELSE clause that performs two different SELECT queries: one with 'WHERE UserID = @UserID' and one without the WHERE clause.
  2. Give it a default value of '%' and have the WHERE clause use 'WHERE UserID LIKE @UserID'. In the calling code, the '%' won't be passed, so only exact matches will be found. (Both options are sketched below.)

The question is: Which option is faster? Which option provides better performance as the table grows? Be aware that the UserID column is a foreign key and is not indexed.

EDIT: Something I want to add, based on some answers: The @UserID parameter is not (necessarily) the only optional parameter being passed. In some cases there are as many as 4 or 5 optional parameters.
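
To make the two options concrete, here is a minimal sketch of each; the table name dbo.Orders (and the SELECT list) is a hypothetical stand-in, since the real schema isn't shown:

-- Option 1: NULL default plus IF...ELSE (dbo.Orders is a made-up table name)
CREATE PROCEDURE dbo.GetOrders_IfElse
    @UserID VARCHAR(50) = NULL
AS
BEGIN
    IF @UserID IS NULL
        SELECT * FROM dbo.Orders;
    ELSE
        SELECT * FROM dbo.Orders WHERE UserID = @UserID;
END
GO

-- Option 2: '%' default plus LIKE, a single statement
CREATE PROCEDURE dbo.GetOrders_Like
    @UserID VARCHAR(50) = '%'
AS
    SELECT * FROM dbo.Orders WHERE UserID LIKE @UserID;
GO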

+5  A: 

What I typically do is something like

WHERE ( @UserID IS NULL OR UserID = @UserID )

And why isn't it indexed? It's generally good form to index FKs, since you often join on them...

If you're worried about query plan storage, simply do: CREATE PROCEDURE ... WITH RECOMPILE
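
As a hedged sketch of what that might look like (procedure and table names are invented for illustration, since the original isn't shown):

CREATE PROCEDURE dbo.GetOrders
    @UserID VARCHAR(50) = NULL
WITH RECOMPILE  -- a fresh plan is compiled on every execution
AS
    SELECT *
    FROM dbo.Orders  -- hypothetical table
    WHERE ( @UserID IS NULL OR UserID = @UserID );
GO

The trade-off is that WITH RECOMPILE gives each call a plan suited to its actual parameter values, at the cost of recompiling on every execution.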

Matt Rogish
+2  A: 

The only way to tell for sure is to implement both and measure. For reference there is a third way to implement this, which is what I tend to use:

WHERE (@UserID IS NULL OR UserId = @UserId)
Greg Beech
+1  A: 

Why not use:

WHERE @UserID IS NULL OR UserID = @UserID

A plus on both maintainability and performance.

Ovidiu Pacurar
Are you forcing a hard parse each and every time?
No - the optimizer should factor that out.
Joel Coehoorn
+1  A: 

I'd definitely go with the first because, although it's less 'clever', it's easier to understand what's going on and will hence be easier to maintain.

The use of a default value with special meaning is likely to trip you up later with some unintended side effect (documentation as to why you're using that default, and how it works, is likely to be missed by any maintainer).

As to efficiency - unless you're looking at 1,000 users or more, it's unlikely to be enough of an issue to override maintainability.

Cruachan
I could kiss you. But don't give in to bad habits because the data may be small... if he does it where it won't matter, he'll continue to do it where it does matter and be hosed.
Aye, I'll avoid the kissing but actually I meant the other way around - always go for clear obvious code unless performance issues *force* you otherwise AND always remember that machine cycles are far cheaper than coder cycles.
Cruachan
+1  A: 

First, you should create an index for UserID if you use it as a search criteria in this way.

Second, comparing UserID LIKE @UserID cannot use an index, because the optimizer doesn't know if you will give a @UserID parameter value that begins with a wildcard. Such a value cannot use the index, so the optimizer must assume it cannot create an execution plan using that index.

So I recommend:

  1. Create an index on UserID
  2. Use the first option, WHERE UserID = @UserID, which should be optimized to use the index.
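
For example (reusing the hypothetical dbo.Orders table from the sketches above):

CREATE NONCLUSTERED INDEX IX_Orders_UserID
    ON dbo.Orders (UserID);  -- lets WHERE UserID = @UserID seek instead of scan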

edit: Mark Brady reminds me I forgot to address the NULL case. I agree with Mark's answer, do the IF and execute one of two queries. I give Mark's answer +1.

Bill Karwin
And when there is no User_ID, when he wants them all? You haven't fully answered the question. You handled the half that's simple.
WHERE UserId LIKE @UserId WILL use an index, if one is available, as long as the value of @UserID does not include a wildcard at the BEGINNING of the string value... as in WHERE LastName LIKE '%higgins'. If you had WHERE LastName LIKE 'Higgins' or WHERE LastName LIKE 'Higgins%', then it's OK.
Charles Bretana
@Charles: as I understand it, that would require recompiling the execution plan with respect to the given value of @UserID. Otherwise the optimizer may not be able to assume it can use the index.
Bill Karwin
No more so than when using WHERE UserId = @UserId for a new, different value of @UserID... The optimizer has to look at the statistics to see how many page IOs would be required to read those pages from disk using the index, as compared to doing a full table scan... Using LIKE without leading wildcards,
Charles Bretana
it can traverse the index in exactly the same way as it would if you were using WHERE UserId = ....
Charles Bretana
Very good to know. I admit I don't use MS SQL much, as I do other database brands. Thanks!
Bill Karwin
A: 

I would sort-of go with option 1, but actually have two stored procedures. One would get all the users and one would get a specific user. I think this is clearer than passing in a NULL. This is a scenario where you do want two different SQL statements because you're asking for different things (all rows vs one row).
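
A minimal sketch of that split (procedure and table names are made up for illustration):

CREATE PROCEDURE dbo.GetOrders_All
AS
    SELECT * FROM dbo.Orders;  -- hypothetical table
GO

CREATE PROCEDURE dbo.GetOrders_ByUser
    @UserID VARCHAR(50)  -- required here: no NULL-means-everything convention
AS
    SELECT * FROM dbo.Orders WHERE UserID = @UserID;
GO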

WW
I could do that, although: 1. The WHERE clause may still return more than one row, 2. There are additional optional parameters, and 3. It's easier on the logic in the code to have one SP and only pass the necessary parameters.
Schmuli
I've used the NULL-means-everything approach before and been burnt when buggy code passes in NULL for everything. Then the database goes off and fetches a million rows.
WW
+2  A: 

The issue with having only one stored procedure is, as mentioned quite well above, that SQL Server stores a compiled plan for the procedure, and a plan for NULL is quite different from one for a specific value.

However, putting an IF statement in the stored procedure will lead to the stored procedure being recompiled at run time. This may also add to the performance issues.

As mentioned elsewhere, this is suited to a test-and-see approach, comparing the IF statement, the '@UserID IS NULL' check, and two separate procedures.

Unfortunately, the speed of these approaches is going to vary greatly based on the amount of data and on how often the parameter is null versus how often it has a value. Again, the number of parameters is also going to affect the efficacy of an approach that requires re-writing the procedures.

If you are using SQL 2005, you may get some mileage from the query plan hint option.
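
For example, the OPTION (RECOMPILE) query hint (available from SQL Server 2005 on) compiles the statement against the actual parameter values on each run; the table is the same hypothetical one as in the sketches above:

SELECT *
FROM dbo.Orders
WHERE ( @UserID IS NULL OR UserID = @UserID )
OPTION ( RECOMPILE );  -- plan built for the current @UserID value each time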

Correction: SQL 2005 and since has "statement-level recompilation", which stores separate plans in cache for each statement in a procedure... So the old pre-2005 policy of not putting multiple logic branch statements into a single stored procedure is no longer true... – Charles Bretana (I figure this was important enough to elevate from a comment)

Nat
SQL 2005 and since has "statement-level recompilation", which stores separate plans in cache for each statement in a procedure... So the old pre-2005 policy of not putting multiple logic branch statements into a single stored procedure is no longer true...
Charles Bretana
+3  A: 

SQL Server 2005 and subsequent have something called "statement-level recompilation". Check out http://www.microsoft.com/technet/prodtechnol/sql/2005/recomp.mspx

Basically, each individual statement executed by the query processor gets its own optimized plan, which is then stored in the "plan cache" (this is why they changed the name from "procedure cache").

So branching your T-SQL into separate statements is better...
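
For example, in a branched procedure each statement gets its own cached plan (hypothetical table again):

IF @UserID IS NULL
    SELECT * FROM dbo.Orders;  -- planned and cached as an unfiltered scan
ELSE
    SELECT * FROM dbo.Orders
    WHERE UserID = @UserID;    -- planned and cached separately; can use an index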

Charles Bretana
A: 

Replace the single stored procedure with two. There's way too much room for the query optimizer to start whacking you with unintended consequences on this one. Change the client code to detect which one to call.

I bet if you had done it that way, we wouldn't need to be having this dialog.

And put an index on userid. Indexes are in there for a reason, and this is it.

le dorfier