views:

868

answers:

6

One of the "best practice" is accessing data via stored procedures. I understand why is this scenario good. My motivation is split database and application logic ( the tables can me changed, if the behaviour of stored procedures are same ), defence for SQL injection ( users can not execute "select * from some_tables", they can only call stored procedures ), and security ( in stored procedure can be "anything" which secure, that user can not select/insert/update/delete data, which is not for them ).

What I don't know is how to access data with dynamic filters.

I'm using MSSQL 2005.

If I have table:

CREATE TABLE tblProduct (
   ProductID uniqueidentifier -- PK
   , IDProductType uniqueidentifier -- FK to another table
   , ProductName nvarchar(255) -- name of product
   , ProductCode nvarchar(50) -- code of product for quick search
   , Weight decimal(18,4)
   , Volume decimal(18,4)
)

then I should create 4 stored procedures ( create / read / update / delete ).

The stored procedure for "create" is easy.

CREATE PROC Insert_Product ( @ProductID uniqueidentifier, @IDProductType uniqueidentifier, ... etc ... ) AS BEGIN
   INSERT INTO tblProduct ( ProductID, IDProductType, ... etc .. ) VALUES ( @ProductID, @IDProductType, ... etc ... )
END

The stored procedure for "delete" is easy too.

CREATE PROC Delete_Product ( @ProductID uniqueidentifier, @IDProductType uniqueidentifier, ... etc ... ) AS BEGIN
    DELETE tblProduct WHERE ProductID = @ProductID AND IDProductType = @IDProductType AND ... etc ...
END

The stored procedure for "update" is similar as for "delete", but I'm not sure this is the right way, how to do it. I think that updating all columns is not efficient.

CREATE PROC Update_Product( @ProductID uniqueidentifier, @Original_ProductID uniqueidentifier, @IDProductType uniqueidentifier, @Original_IDProductType uniqueidentifier, ... etc ... ) AS BEGIN
   UPDATE tblProduct SET ProductID = @ProductID, IDProductType = @IDProductType, ... etc ...
      WHERE ProductID = @Original_ProductID AND IDProductType = @Original_IDProductType AND ... etc ...
END

And the last - stored procedure for "read" is littlebit mystery for me. How pass filter values for complex conditions? I have a few suggestion:

Using XML parameter for passing where condition:

CREATE PROC Read_Product ( @WhereCondition XML ) AS BEGIN
    DECLARE @SELECT nvarchar(4000)
    SET @SELECT = 'SELECT ProductID, IDProductType, ProductName, ProductCode, Weight, Volume FROM tblProduct'

    DECLARE @WHERE nvarchar(4000)
    SET @WHERE = dbo.CreateSqlWherecondition( @WhereCondition ) --dbo.CreateSqlWherecondition is some function which returns text with WHERE condition from passed XML

    DECLARE @LEN_SELECT int
    SET @LEN_SELECT = LEN( @SELECT )
    DECLARE @LEN_WHERE int
    SET @LEN_WHERE = LEN( @WHERE )
    DECLARE @LEN_TOTAL int
    SET @LEN_TOTAL = @LEN_SELECT + @LEN_WHERE
    IF @LEN_TOTAL > 4000 BEGIN
        -- RAISE SOME CONCRETE ERROR, BECAUSE DYNAMIC SQL ACCEPTS MAX 4000 chars
    END

    DECLARE @SQL nvarchar(4000)
    SET @SQL = @SELECT + @WHERE

    EXEC sp_execsql @SQL
END

But, I think the limitation of "4000" characters for one query is ugly.

The next suggestion is using filter tables for every column. Insert filter values into the filter table and then call stored procedure with ID of filters:

CREATE TABLE tblFilter (
   PKID uniqueidentifier -- PK
   , IDFilter uniqueidentifier -- identification of filter
   , FilterType tinyint -- 0 = ignore, 1 = equals, 2 = not equals, 3 = greater than, etc ...
   , BitValue bit , TinyIntValue tinyint , SmallIntValue smallint, IntValue int
   , BigIntValue bigint, DecimalValue decimal(19,4), NVarCharValue nvarchar(4000)
   , GuidValue uniqueidentifier, etc ... )

CREATE TABLE Read_Product ( @Filter_ProductID uniqueidentifier, @Filter_IDProductType uniqueidentifier, @Filter_ProductName uniqueidentifier, ... etc ... ) AS BEGIN
   SELECT ProductID, IDProductType, ProductName, ProductCode, Weight, Volume
   FROM tblProduct
   WHERE ( @Filter_ProductID IS NULL
            OR ( ( ProductID IN ( SELECT GuidValue FROM tblFilter WHERE IDFilter = @Filter_ProductID AND FilterType = 1 ) AND NOT ( ProductID IN ( SELECT GuidValue FROM tblFilter WHERE IDFilter = @Filter_ProductID AND FilterType = 2 ) )
      AND ( @Filter_IDProductType IS NULL
            OR ( ( IDProductType IN ( SELECT GuidValue FROM tblFilter WHERE IDFilter = @Filter_IDProductType AND FilterType = 1 ) AND NOT ( IDProductType IN ( SELECT GuidValue FROM tblFilter WHERE IDFilter = @Filter_IDProductType AND FilterType = 2 ) )
      AND ( @Filter_ProductName IS NULL OR ( ... etc ... ) ) 
END

But this suggestion is littlebit complicated I think.

Is there some "best practice" to do this type of stored procedures?

+6  A: 

For reading data, you do not need a stored procedure for security or to separate out logic, you can use views.

Just grant only select on the view.

You can limit the records shown, change field names, join many tables into one logical "table", etc.

Peter
Should I use views or functions?The functions are much more flexible I think. Is some reason why not use functions instead of views or isn't it important?
TcKs
Often the DBMS optimizer will understand views better than functions.
Joseph Bui
then again, it may not - in a large production database we had very significant performance improvements when we switched some larger complex queries from views to functions...
Steven A. Lowe
Agreed: testing and profiling is paramount. Personally, though, I think all functions should not depend on data in tables -- only because I like functional programming.
Joseph Bui
Using views alone will not protect against SQL injection.
Nack
"Using views alone will not protect against SQL injection." - How do you do SQL injection on a view that only has SELECT rights?
Peter
+5  A: 

First: for your delete routine, your where clause should only include the primary key.

Second: for your update routine, do not try to optimize before you have working code. In fact, do not try to optimize until you can profile your application and see where the bottlenecks are. I can tell you for sure that updating one column of one row and updating all columns of one row are nearly identical in speed. What takes time in a DBMS is (1) finding the disk block where you will write the data and (2) locking out other writers so that your write will be consistent. Finally, writing the code necessary to update only the columns that need to change will generally be harder to do and harder to maintain. If you really wanted to get picky, you'd have to compare the speed of figuring out which columns changed compared with just updating every column. If you update them all, you don't have to read any of them.

Third: I tend to write one stored procedure for each retrieval path. In your example, I'd make one by primary key, one by each foreign key and then I'd add one for each new access path as I needed them in the application. Be agile; don't write code you don't need. I also agree with using views instead of stored procedures, however, you can use a stored procedure to return multiple result sets (in some version of MSSQL) or to change rows into columns, which can be useful.

If you need to get, for example, 7 rows by primary key, you have some options. You can call the stored procedure that gets one row by primary key seven times. This may be fast enough if you keep the connection opened between all the calls. If you know you never need more than a certain number (say 10) of IDs at a time, you can write a stored procedure that includes a where clause like "and ID in (arg1, arg2, arg3...)" and make sure that unused arguments are set to NULL. If you decide you need to generate dynamic SQL, I wouldn't bother with a stored procedure because TSQL is just as easy to make a mistake as any other language. Also, you gain no benefit from using the database to do string manipulation -- it's almost always your bottleneck, so there is no point in giving the DB any more work than necessary.

Joseph Bui
First: You have propably truth. The check for "another user changed data after you got data and before you saved data" is worthless for deletion.Second: I think about updating a large columns. Like "ntext". In some cases, there can be a RTF text with images.
TcKs
Third: When I want to get 7 rows from tblProduct. Is good way to call "Read_Product" 7 times with various "@ProductID" value or should I do something another?
TcKs
For ntext, the data is often stored in a different block from the rest of the row, so yes, in that case you might want to have one stored procedure where you do not pass in the ntext column. Still, comparing one ntext value to a passed in value will be slower than just overwriting.
Joseph Bui
This requires testing and profiling. If you know you will never have more than a certain number (say 10), you can write a stored procedure that accepts 10 ID values, and uses NULL if you only need 7 this time. The select statement will include something like: "and ID in (arg1, arg2, arg3...)"
Joseph Bui
Sometimes, though, if you keep the connection open between all 7 calls, it is faster to just call the stored procedure with 1 ID 7 times. There is no point in making the stored procedure write dynamic SQL. You are just as likely to have errors in TSQL as in any other language.
Joseph Bui
A: 

In SQL 2005, it supports nvarchar(max), which has a limit of 2G, but virtually accepting all string operations upon normal nvarchar. You may want to test if this can fit into what you need in the first approach.

TimeSpace Traveller
+2  A: 

My suggestion is that you don't try to create a stored procedure that does everything that you might now or ever need to do. If you need to retrieve a row based on the table's primary key then write a stored procedure to do that. If you need to search for all rows meeting a set of criteria then find out what that criteria might be and write a stored procedure to do that.

If you try to write software that solves every possible problem rather than a specific set of problems you will usually fail at providing anything useful.

Tom H.
+2  A: 

your select stored procedure can be done as follows to require only one stored proc but any number of different items in the where clause. Pass in any one or combination of the parameters and you will get ALL items which match - so you only need one stored proc.

Create sp_ProductSelect
(
 @ProductID int = null,
 @IDProductType int = null,
 @ProductName varchar(50) = null,
 @ProductCode varchar(10) = null,
 ...
 @Volume int = null
)
AS
SELECT ProductID, IDProductType, ProductName, ProductCode, Weight, Volume FROM tblProduct'  
Where
  ((@ProductID is null) or (ProductID = @ProductID)) AND
  ((@ProductName is null) or (ProductName = @ProductName)) AND
  ...
  ((@Volume is null) or (Volume= @Volume))
Mauro
But with this way can I filter only for equality.
TcKs
I'd use this to getr a subset of data then allow dynamic filtering using the dataview within your code. This will provide you the most flexible solution without introducing potential security risks.
Mauro
+2  A: 

I disagree that create Insert/Update/Select stored procedures are a "best practice". Unless your entire application is written in SPs, use a database layer in your application to handle these CRUD activities. Better yet, use an ORM technology to handle them for you.

Terry Lorber
Agree - SPs are best for more complex operations, stuff that's more efficient/safe/easier to do directly on the database. There's almost nothing to gain with SPs for simple CRUD statements, better use a middle tier for that.
Joe Pineda