views:

71

answers:

5

Hello, I have a question addressed to sql guru.

There are two tables with almost identical structure.

Based on parameter passed into the stored procedure I need to collect data from one or another table.

How to do that in a best way?

Please do not suggest to combine those tables into single one - that is not appropriate.

I did it following (MS SQL Server 2008):

Select *
FROM
    String s
    JOIN (
        SELECT id
            ,TypeCode
            ,ProdNo         
                FROM Table1
        WHERE @param = 1 AND TypeCode= 'INV'

        UNION

        SELECT id
            ,TypeCode
            ,ProdNo
        FROM Table2
        WHERE @param = 2 AND TypeCode= 'INV'

    ) m ON m.Id = s.Id
WHERE s.Id = 256

but when I looked at execution plan I was surprised because it got data from both tables in parallel threads and only after that filtered by @param value.

I thought that filtering will be made on the first stage and data collected from single table.

Is there a way to make select only from one table without splitting query into two queries and using IF operator?

Thanks

+4  A: 

Could you just use a simple IF statement?

IF @Param = 1 
  BEGIN
    EXEC SQL
  END
ELSE IF @Param = 2
  BEGIN
    EXEC SQL
  END 
ELSE
  RAISERROR('Invalid Parameter', 16, 1)

Or alternatively you could build the query dynamically and execute it using the sp_executesql stored procedure.

DECLARE @Sql NVARCHAR(100)

SET @Sql = N'SELECT * FROM ('

IF @Param = 1
  SET @Sql = @Sql + N'SELECT 1 a, 2 b, 3 c'
ELSE IF @param = 2
  SET @Sql = @Sql + N'SELECT 4 a, 5 b, 6 c'
ELSE
  RAISERROR('Invalid Parameter', 16, 1)

SET @Sql = @Sql + ') tbl'
EXEC sp_executesql @sql
Chris Diver
Yes, of course it is possible and I pointed it out at the end of my post but 1) that will be impossible to extract into view2) I'm already have two parts of identical code. If I split it again than there will be 4 huge selects - it is not fun to supporting such codeSo I'm going to use it only if there is no other way to make it within single query.Anyway thank you
Cheburek
I've added an alternative way of doing it so you don't have the same code twice. HTH
Chris Diver
dynamic sql... I try to not use it where it is possible. I think you know why ;) IMHO it is better to have two static blocks of codeBut of course it's an alternative way. Thank you for it!
Cheburek
I don't think I've ever used it in production before. Not to say that I wouldn't should the need arise. It can be dangerous in the wrong hands but as long as you understand the perils and pitfalls you can safely write dynamic SQL. Used as above is perfectly safe, however if you decided to supply the table name as a parameter the code would be more readable and open to abuse.
Chris Diver
A: 

SQL Server's not that clever - when writing queries you should only ensure that you send the least amount of SQL to get the data you want (without sending superfluous statements), but also provide the most amount of information (via filters) where possible to give the query optimiser as many hints as possible about the data. As you've seen, it will execute all the SQL you send it.

So it sounds like you need to use dynamic-SQL from what I'm reading. This also gives you the benefit of being able to merge common parts of the SQL cutting down on the amount of duplication. For example, you could have (just taking your inner code -- you can wrap the rest of your stuff around it):

DECLARE @sql NVARCHAR(1000)

SET @sql = 'SELECT id, TypeCode, ProdCode'
IF @param = 1
   SET @sql = @sql + ' FROM table1'
IF @param = 2
   SET @sql = @sql + ' FROM table2'
SET @sql = @sql + ' WHERE TypeCode = ''INV'''

EXECUTE sp_ExecuteSQL @sql

Just be aware, if you're going to wind this into something more complicated, about little Bobby Tables: it's possible to abuse sp_ExecuteSQL and open gaping holes, but used correctly - with parameterised dynamic SQL - it's as good as a stored procedure.

Chris J
+1  A: 

First thing I'd suggest is putting the ID filter inside the union as well. I've also changed the UNION to UNION ALL. This avoids evaluating DISTINCT rows

Select * 
FROM 
    String s 
    JOIN ( 
        SELECT id 
            ,TypeCode 
            ,ProdNo          
                FROM Table1 
        WHERE @param = 1 AND TypeCode= 'INV' AND id = 256

        UNION ALL

        SELECT id 
            ,TypeCode 
            ,ProdNo 
        FROM Table2 
        WHERE @param = 2 AND TypeCode= 'INV' AND id = 256

    ) m ON m.Id = s.Id 
WHERE s.Id = 256 
Chris Bednarski
I agree with you about UNION ALL but disagree about id inside. As shown on execution plan on the first step sql server is enough smart to add id restriction to subqueries automatically if it was set in outer select explicitly.
Cheburek
Yes, UNION ALL faster than UNION even one of selects is always empty, THANKS, that was very useful for me
Cheburek
A: 

if you can create a tDUMMY table (with a single dummy row) give this a shot.

Select 
    * 
FROM 
    String s 
    JOIN ( 
        SELECT 
             id, TypeCode, ProdNo
        FROM 
             tDUMMY INNER JOIN Table1 ON TypeCode= 'INV'
        WHERE
             @param = 1

 UNION ALL

        SELECT 
             id, TypeCode, ProdNo
        FROM 
             tDUMMY INNER JOIN Table2 ON TypeCode= 'INV'
        WHERE
             @param = 2
     ) m ON m.Id = s.Id 
WHERE s.Id = 256

theoretically the query optimizer should first filter the tDUMMY table and then attempt the join. So if @param = 1 the second query should get out much faster (it will check 1 row of tDUMMY again, but it shouldn't check table2)

Note - i also made it UNION ALL (but it wouldn't have much of an impact) because one side will always return no rows anyway.

potatopeelings
just a sidenote - i'll definitely be trying this out once i get to a SQL server I have enough rights to. currently i'm stuck at a place where the only thing i have access to is notepad :-). meanwhile feel free to shoot this down if it doesn't sound right for some reason.
potatopeelings
I've created table and tested that query. Interesting idea but results are not so good. During execution selection from both tables (tDummy and Table1 or Table2) executes at the same time and after that tDummy filters with @param then results joined. So there is no any benefit. This query takes 60% while query without tDummy table takes only 40% (both executed in one window)But idea is really very interesting, thank you...
Cheburek
thanks for checking! the optimizer probably took the @param clause as a residual predicate :-(. the only other thing i can think of is to try making the tDUMMY a subquery, and forcing a hash join - SELECT id, TypeCode, ProdNo FROM (SELECT * from tDUMMY WHERE @param = 1) AS A INNER JOIN Table1 ON TypeCode= 'INV' OPTION (HASH JOIN). but i'm just guessing here (unless it works when i'll say i knew it all along :-))
potatopeelings
+3  A: 

you really need to read this Dynamic Search Conditions in T-SQL by Erland Sommarskog. You shouldn't worry about repeating code, this is not some homework assignment. Just worry about making the execution plan use an index. When making SQL code "pretty" the only thing to consider is indenting & case, any other changes can cause the query plan to be slower. I've seen trivial changes to a super fast query result in a super slow query. GO FOR SPEED (index usage) and duplicate code as necessary. also see: The Curse and Blessings of Dynamic SQL

You tagged the question sql-server-2008, so if you're running SQL 2008 SP1 CU5 (10.0.2746) and SQL 2008 R2 CU1 (10.50.1702) and later, there is a new behavior (as explained in the "Dynamic Search Conditions" article linked above) of OPTION(RECOMPILE) that does not appear in all versions of SQL 2008 or in 2005. This behavior basically evaluates the @Local_Variables values at runtime and recompiles the query accordingly. In your case, this should cause one half of your UNION to be eliminated when compiling them.

KM
That's really "must read" article. Very helpful.
Cheburek