views:

294

answers:

4

The question from quite a long time boiling in my head, that out of the following two stored procedures which one would perform better.

Proc 1

CREATE PROCEDURE GetEmployeeDetails @EmployeeId uniqueidentifier,
@IncludeDepartmentInfo bit

AS
BEGIN

    SELECT * FROM Employees 
    WHERE Employees.EmployeeId = @EmployeeId

    IF (@IncludeDepartmentInfo = 1)
    BEGIN
     SELECT Departments.* FROM Departments, Employees
     WHERE Departments.DepartmentId = Employees.DepartmentId 
        AND Employees.EmployeeId = @EmployeeId
    END
END

Proc 2

CREATE PROCEDURE GetEmployeeDetails @EmployeeId uniqueidentifier,
 @IncludeDepartmentInfo bit
AS
BEGIN

    SELECT * FROM Employees 
    WHERE Employees.EmployeeId = @EmployeeId

    SELECT Departments.* FROM Departments, Employees
    WHERE Departments.DepartmentId = Employees.DepartmentId 
    AND Employees.EmployeeId = @EmployeeId 
    AND @IncludeDepartmentInfo = 1

END

the only difference between the two is use of 'if statment'.

if proc 1/proc 2 are called with alternating values of @IncludeDepartmentInfo then from my understanding proc 2 would perform better, because it will retain the same query plan irrespective of the value of @IncludeDepartmentInfo, whereas proc1 will change query plan in each call

answers are really appericated

PS: this is just a scenario, please don't go to the explicit query results but the essence of example. I am really particular about the query optimizer result (in both cases of 'if and where' and their difference), there are many aspects which I know could affect the performance which I want to avoid in this question.

A: 

When you use the 'if' statement, you may run only one query instead of two. I would think that one query would almost always be faster than two. Your point about query plans may be valid if the first query were complex and took a long time to run, and the second was trivial. However, the first query looks like it retrieves a single row based on a primary key - probably pretty fast every time. So, I would keep the 'if' - but I would test to verify.

Ray
in essence first query may return multiple records not in current example, and the second query is kind of optional as well in proc 2 where SQL query optimizer itself would first match the constant criteria and would not go for more criteria matching if it does not match. Again these are just my assumptions about SQL query optimizer
scorpio
The only way to really know is to test against your real, live data and see what happens. Andomar's point is a good one - in most cases, the difference will be too small to worry about.
Ray
A: 

The performance difference would be too small for anyone to notice.

Premature optimization is the root of all evil. Stop worrying about performance and start implementing features that make your customers smile.

Andomar
I understand what you're saying, but it seems like that saying could be used as an excuse, similar to "My code is self-documenting, so I don't need to create documentation."
Mike C.
:) being smart does not mean you can not ask basic questions, no offence
scorpio
I agree with you Mike
scorpio
+2  A: 

From a consistency perspective, number 2 will always return 2 datasets. Overloading aside, you wouldn't have a client code method that may be returns a result, maybe not.

If you reuse this code, the other calling client will have to know this flag too.

If the code does 2 different things, then why not 2 different stored procs?

Finally, it's far better practice to use modern JOIN syntax and separate joining from filtering. In this case, personally I'd use EXISTS too.

SELECT
    D.*
FROM
    Departments D
    JOIN
    Employees E ON D.DepartmentId = E.DepartmentId
WHERE  
    E.EmployeeId = @EmployeeId 
    AND
    @IncludeDepartmentInfo = 1
gbn
I knew we can use two stored procs, and there is an elegant join statement (replace old fashioned query with join ), but my simple question is which will perform better out of two?
scorpio
@scorpio: It depends: small or wide table? large or small results? The overhead of the plan uncertainty will outweigh any advantage of not sending a small set of data.
gbn
OFF Topic: Better to use two stored procedures is to use two table functions.
Nitai Bezerra
gbn thanks for the reply.See from client code as well I think proc 2 is going to return two result set consistently, it depends the second result set will be empty or not.And take this scenario when the tables which contain more than 100k records and returning again 1000s of records which proc would be better?I have a practical scenario where database tables have huge number of records with more than 10 columns, I just asked the question by trying to give a simple example.
scorpio
@scorpio: I'd have 2 stored procs. As per Remus, the 2nd query may still be evaluated if in the WHERE. The IF changes the signature
gbn
A: 
SELECT Departments.* FROM Departments, Employees
    WHERE Departments.DepartmentId = Employees.DepartmentId 
    AND Employees.EmployeeId = @EmployeeId 
    AND @IncludeDepartmentInfo = 1

When SQL compiles a query like this it must be compiled for any value of @IncludeDepartmentInfo. The resulted plan can well be one that scans the tables and performs the join and after that checks the variable, resulting in unnecessary I/O. The optimizer may be smart and move the check for the variable ahead of the actual I/O operations in the execution plan, but this is never guaranteed. This is why I always recommend to use explicit IFs in the T-SQL for queries that need to perform very differently based on a variable value (the typical example being OR conditions).

gbn's observation is also an important one: from an API design point of view is better to have a consistent return type (ie. always return the same shaped and number of result sets).

Remus Rusanu