views:

341

answers:

10

We use SQL Server 2005. All our data access is done through stored procedures. Our selection stored procedures always return multiple result sets.

For instance:

CREATE PROCEDURE hd_invoice_select(@id INT) AS
 SELECT * FROM Invoice WHERE InvoiceID = @id
 SELECT * FROM InvoiceItem WHERE InvoiceID = @id
 SELECT * FROM InvoiceComments WHERE InvoiceID = @id
 RETURN

Our application's data access layer builds an object graph based on the results (O/R Mapper style).

The problem I have is that we have many different invoice selection stored procs. They all return the same structure, only for different selection criteria. For instance, I also have:

CREATE PROCEDURE hd_invoice_selectAllForCustomer(@customerID INT) AS
 SELECT * FROM Invoice WHERE CustomerID = @customerID
 SELECT * FROM InvoiceItem WHERE InvoiceID IN 
  (SELECT InvoiceID FROM Invoice WHERE CustomerID = @customerID)
 SELECT * FROM InvoiceComments WHERE InvoiceID = @id
  (SELECT InvoiceID FROM Invoice WHERE CustomerID = @customerID)
 RETURN

and I have many others including:

hd_invoice_selectActive()
hd_invoice_selectOverdue()
hd_invoice_selectForMonth(@year INT, @month INT)

and I have the same pattern for a lot of concepts (Customers, Employees, etc)

We end up copying a lot of code and maintenance is really hard. When the "structure" of a concept changes, we have to go and fix all procs and it's very error prone.

So my question is: What is the best way to reuse the code in the scenario?

We came up with a solution that uses temp tables. But it's not very elegant. I'll let you share your ideas and if necessary I will post the detail of my solution in an upcoming post to get your comments on that approach.

Thanks

A: 

This is one of the main problems with stored procedures and why people don't like them.

I've never found or seen a way around it.

Garry Shutler
A: 

I have started to use stored procedures generated by Code Generators for my basic CRUD. I use stored procs for reports or complex SQL work.

I have a suggestion unrelated to your question as well - instead of using the IN clause, use the EXISTS clause in your SQL statements.

Raj More
Can you explain why do you think that EXISTS is better than IN in this case?
AlexKuznetsov
Using IN instead of EXISTS can cause a table scan of the sub-queried data. EXISTS can make better use of indexes. IN can also cause the query to search the list of items in the sub-query on every pass through the primary query.
Chris Porter
+2  A: 

The "best" way for this specific scenario would be to use some sort of code generation. Come up with some sort of convention and plug it into a code generator.

rball
A: 

I sometimes do it in two steps:

I come up with a list of InvoiceID. Then I call my stored procedure with this list as a parameter.

On 2005 we don't have table valued parameters, so I pack my list of IDs in a binary BLOB and submit it to SQL Server, as described here: Arrays and Lists in SQL Server 2005

You can also submit a list of IDs as a comma-separated list (somewhat slow) or as a concatenation of fixed width string representations (much faster).

AlexKuznetsov
Using this technique, how would you make it so the proc hd_invoice_selectForMonth(@year INT, @month INT) can find the matching IDs and then pack them and send them to hd_invoice_selectFromIDs(@ids blob)?
Sly
A: 

I've inherited an application that used the temp table approach before and I agree that it's very messy.

On that project we were able to remove a lot of the temp tables by replacing them with Views that contained the desired 'objects' we needed then we updated our stored procedures to query off of those views.

Perhaps that may work in your situation as well.

Andrew Hanson
+1  A: 

Have you tried putting more than 1 query parameter type in the list of parameters for your main proc? I only wrote the proc to cover the Invoice table, you will need to extend it for your additional tables.

CREATE PROCEDURE hd_invoice_select
(
    @id INT = NULL
    , @customerId INT = NULL
) AS
BEGIN
    SELECT * 
        FROM Invoice 
        WHERE 
            (
                @id IS NULL
                OR InvoiceID = @id
            )
            AND (
                @customerId IS NULL
                OR CustomerID = @customerId
            )
    RETURN
END

This proc can be called wide open by sending @id and @customerId as NULLs, for a specific InvoiceID based on @id with @customerId as NULL (or just leave it off all together), or for a specific customer based on @customerId leaving @id as NULL or exclude it from the query.

You also should look at views and Table-Valued User-Defined Functions. You can put these in your procs to wrap up some of the logic away from the procs so they can be shared and maintained in a single place. Having some of the logic in views/functions also allows you to deal with the data in a query window as if it were a table.

Chris Porter
Chris, this is basically the same solution we use, but with one slight change. In the WHERE clause we would use: CustomerID = ISNULL(@customerId, CustomerID)The reason why is this gives us the side effect of returning ALL items if the value is null. That way if we call EXEC hd_invoice_select() without parameters, we can get a full listing.This may not be something you want, but we find it very useful.Cheers!-Erick
Erick
Both of our approaches should result in the same record set. I've played some with using ISNULL instead of the OR version and can't remember why I ended up on the OR side. Theoretically they should optimize (almost) identically. I'll try on some of our bigger procs and see if I can find a noticable performance difference in either approach.
Chris Porter
Meant to include this on the last comment: I definitely appreciate the brevity of your approach.
Chris Porter
You may find this interesting - http://www.sommarskog.se/dyn-search-2005.html
Russ Cam
Thanks for you answer. I think this approach can work fine if you have a very small number of parameters (2 or 3). But it will be hard to use and understand such procedure has 15 optional parameters. Also, by giving a single proc, you hide the meaning and intent of the procedure.
Sly
And, if you have a lot of data, performance will be terrible.
erikkallen
@erikkallen, I've used this structure for very large tables without significant performance issues. I will argue that it won't perform as well as individual procs that are tuned for each type of data pull , but if you are looking to combine procs it can work very well.
Chris Porter
A: 

In some cases I use VIEWS to reuse "code". In cases as filters, active items, outdated things, and so on...

FerranB
A: 

Maybe you should learn to use joins. You could put the basic join of the three tables in a view and just query that with the sp handing the different parameters. Also, you should not in general use select * ever in production code. Only return the few columns you actually need in the circumstances and your whole system will be better performing. Plus you won't have unintended results when people change the structure on you.

HLGEM
@HLGEM: How will you return multiple resultsets using joins? I need the invoice, all the items of the invoice and all the comments for the invoice returned as 3 resultsets.
Sly
+1  A: 

I'm the person who asked this question in the first place. I'm answering my own question here to let you know the code reuse solution I use and to get your comments on that approach. If this answer gets a lot of up votes, I will select it as the final answer.

This approach works and is simple to use. I don’t know if it has a performance impact because it relies heavily on temporary tables.

For each concept in my application, I have one storec proc like this:

CREATE PROCEDURE hd_invoice_selectFromTempTable AS

 /* Get the IDs from an existing #TempInvoiceIDs temporary table */

 SELECT * FROM Invoice WHERE InvoiceID IN
     (SELECT ID FROM #TempInvoiceIDs)

 SELECT * FROM InvoiceItem WHERE InvoiceID IN 
     (SELECT ID FROM #TempInvoiceIDs)

 SELECT * FROM InvoiceComments WHERE InvoiceID IN
     (SELECT ID FROM #TempInvoiceIDs)

 RETURN

Then I create as many selection stored proc as I need:

CREATE PROCEDURE hd_invoice_select(@id INT) AS

 /* Fill #TempInvoiceIDs with matching IDs */
 SELECT id AS ID INTO #TempInvoiceIDs

 EXEC hd_invoice_selectFromTempTable
 RETURN

CREATE PROCEDURE hd_invoice_selectAllForCustomer(@customerID INT) AS

 /* Fill #TempInvoiceIDs with matching IDs */
 SELECT invoiceID as ID
 INTO #TempInvoiceIDs
 FROM Invoice WHERE CustomerID = @customerID

 EXEC hd_invoice_selectFromTempTable
 RETURN

CREATE PROCEDURE hd_invoice_selectAllActive AS

 /* Fill #TempInvoiceIDs with matching IDs */
 SELECT invoiceID as ID
 INTO #TempInvoiceIDs
 FROM Invoice WHERE Status = 10002

 EXEC hd_invoice_selectFromTempTable
 RETURN

What do you think of this approach? It is somewhat similar to AlexKuznetsov's answer but I use temp tables instead of a BLOB parameter.

Sly
when you pr0cedures use temp tables, they (procedures) recompile all the time, using up a lot of CPU - sometimes we need it, usually we don't. You can also submit a list of IDs as a comma-separated list (somewhat slow) or as a concatenation of fixed width string representations (much faster) - that does not require recompiles.
AlexKuznetsov
Very useful comment, thank you. Is the time to recompile the proc significant versus the total execution time? If a proc takes 2ms to recompile and 120ms to execute, then the recompilation time is not really significant. Right?
Sly
I like this approach, because it is straight forward using objects that are natural to SQL Server.
Shannon Severance
SQL Server 2008 now supports table-valued parameters for stored procedures. I haven't used them yet but it might help you to avoid temp tables: http://msdn.microsoft.com/en-us/library/bb510489.aspx
Chris Porter
I posted a variance of this answer using table-valued parameters.
Chris Porter
+1  A: 

Posting this as a second answer because it is a different approach. If you are using SQL Server 2008:

CREATE TYPE InvoiceListTableType AS TABLE 
(
    InvoiceId INT
);
GO

CREATE PROCEDURE hd_invoice_selectFromTempTable
(
    @InvoiceList InvoiceListTableType READONLY
)
AS
BEGIN
    SELECT * FROM Invoice WHERE InvoiceID IN
        (SELECT InvoiceId FROM @InvoiceList)

    SELECT * FROM InvoiceItem WHERE InvoiceID IN 
        (SELECT InvoiceId FROM @InvoiceList)

    SELECT * FROM InvoiceComments WHERE InvoiceID IN
        (SELECT InvoiceId FROM @InvoiceList)

    RETURN
END
GO

CREATE PROCEDURE hd_invoice_select(@id INT) AS
BEGIN
    DECLARE @InvoiceList AS InvoiceListTableType;

    SELECT id AS ID 
        INTO @InvoiceList

    EXEC hd_invoice_selectFromTempTable(@InvoiceList)
    RETURN
END
GO

CREATE PROCEDURE hd_invoice_selectAllForCustomer(@customerID INT) AS
BEGIN
    DECLARE @InvoiceList AS InvoiceListTableType;

    SELECT invoiceID as ID
        INTO @InvoiceList
        FROM Invoice WHERE CustomerID = @customerID

    EXEC hd_invoice_selectFromTempTable(@InvoiceList)
    RETURN
END
GO

CREATE PROCEDURE hd_invoice_selectAllActive AS
BEGIN
    DECLARE @InvoiceList AS InvoiceListTableType;

    SELECT invoiceID as ID
        INTO @InvoiceList
        FROM Invoice WHERE Status = 10002

    EXEC hd_invoice_selectFromTempTable(@InvoiceList)
    RETURN
END
GO
Chris Porter
That is very cool. Do you have any idea how this performs? Is the query optimizer doing a good job? Internally, is it using temp tables or is it really passing the set as a simple value.
Sly
I haven't had a chance to use them yet but have read up on them a little. I'm very anti-temp tables when possible so it seems like a more elegant solution as long as it performs well.
Chris Porter
I will refactor to use this approach as soon as we move to SQL 2008.
Sly