views:

507

answers:

3

I need to filter result sets from sql server based on selections from a multi-select list box. I've been through the idea of doing an instring to determine if the row value exists in the selected filter values, but that's prone to partial matches (e.g. Car matches Carpet).

I also went through splitting the string into a table and joining/matching based on that, but I have reservations about how that is going to perform.

Seeing as this is a seemingly common task, I'm looking to the Stack Overflow community for some feedback and maybe a couple suggestions on the most commonly utilized approach to solving this problem.

+2  A: 

I solved this one by writing a table-valued function (we're using 2005) which takes a delimited string and returns a table. You can then join to that or use WHERE EXISTS or WHERE x IN. We haven't done full stress testing yet, but with limited use and reasonably small sets of items I think that performance should be ok.

Below is one of the functions as a starting point for you. I also have one written to specifically accept a delimited list of INTs for ID values in lookup tables, etc.

Another possibility is to use LIKE with the delimiters to make sure that partial matches are ignore, but you can't use indexes with that, so performance will be poor for any large table. For example:

SELECT
     my_column
FROM
     My_Table
WHERE
     @my_string LIKE '%|' + my_column + '|%'

.

/*
    Name:  GetTableFromStringList
    Description: Returns a table of values extracted from a delimited list
    Parameters:
      @StringList - A delimited list of strings
      @Delimiter - The delimiter used in the delimited list

    History:
    Date  Name   Comments
    ---------- ------------- ----------------------------------------------------
    2008-12-03 T. Hummel Initial Creation
*/
CREATE FUNCTION dbo.GetTableFromStringList
(
    @StringList VARCHAR(1000),
    @Delimiter CHAR(1) = ','
)
RETURNS @Results TABLE
(
    String VARCHAR(1000) NOT NULL
)
AS
BEGIN
    DECLARE
     @string  VARCHAR(1000),
     @position SMALLINT

    SET @StringList = LTRIM(RTRIM(@StringList)) + @Delimiter
    SET @position = CHARINDEX(@Delimiter, @StringList)

    WHILE (@position > 0)
    BEGIN
     SET @string = LTRIM(RTRIM(LEFT(@StringList, @position - 1)))

     IF (@string <> '')
     BEGIN
      INSERT INTO @Results (String) VALUES (@string)
     END

     SET @StringList = RIGHT(@StringList, LEN(@StringList) - @position)
     SET @position = CHARINDEX(@Delimiter, @StringList, 1)
    END

    RETURN
END
Tom H.
Thanks for this. You might be interested in http://www.sommarskog.se/arrays-in-sql-2005.html in relation to performance related to numerous string splitting functions. There's some really good performance metrics in the performance appendix.
Brian Hasden
Thanks for the pointer. I've read a lot of Erland's stuff, but somehow missed this one. The use of the fixed-length strings with a numbers table is interesting.
Tom H.
Yeah, I still don't fully understand the more complex queries, but it's really interesting. I'm going to have to re-read it a few more times and hope it all clicks. I still have to believe there's a better way to do this outside of table variables, etc. Everything I read says there's not though.
Brian Hasden
A: 

I've been through the idea of doing an instring to determine if the row value exists in the selected filter values, but that's prone to partial matches (e.g. Car matches Carpet)

It sounds to me like you aren't including a unique ID, or possibly the primary key as part of values in your list box. Ideally each option will have a unique identifier that matches a column in the table you are searching on. If your listbox was like below then you would be able to filter for specifically for cars because you would get the unique value 3.

<option value="3">Car</option>
<option value="4">Carpret</option>

Then you just build a where clause that will allow you to find the values you need.


Updated, to answer comment.

How would I do the related join considering that the user can select and arbitrary number of options from the list box? SELECT * FROM tblTable JOIN tblOptions ON tblTable.FK = ? The problem here is that I need to join on multiple values.

I answered a similar question here.

One method would be to build a temporary table and add each selected option as a row to the temporary table. Then you would simply do a join to your temporary table.

If you want to simply create your sql dynamically you can do something like this.

SELECT * FROM tblTable WHERE option IN (selected_option_1, selected_option_2, selected_option_n)
Zoredache
How would I do the related join considering that the user can select and arbitrary number of options from the list box?SELECT * FROM tblTable JOIN tblOptions ON tblTable.FK = ?The problem here is that I need to join on multiple values.Maybe I'm missing something though. Please elaborate.
Brian Hasden
A: 

I've found that a CLR table-valued function which takes your delimited string and calls Split on the string (returning the array as the IEnumerable) is more performant than anything written in T-SQL (it starts to break down when you have around one million items in the delimited list, but that's much further out than the T-SQL solution).

And then, you could join on the table or check with EXISTS.

casperOne
That's what I am reading. It's a little concerning because I've had bad experiences with CLR stored procs. Not only were they a pain to get through the DBAs, but the CPU utilization was very high with a moderate number of transactions. Replaced the CLR sp with a CTE and everything was back to normal
Brian Hasden
CLR SPs should be reserved for interacting with the OS (dropping messages in a queue, etc). They are way too much maintenance hassle to use them unless it's absolutely necessary. The TSQL string splitting functions are blindingly fast compared to real world queries anyway.
Eric Z Beard
@Brian: Depending on what you are doing, CPU utilization is going to be high. I don't know what the stored proc was doing, so I can't say whether or not it was normal, or it could have been improved to not utilize so much CPU.
casperOne
@Eric: I completely disagree. T-SQL is more optimized for set operations, not for procedural, general purpose code. I can't see any way that the above answer is easier than a call to Split on the string class. I'll write a blog post on the performance aspects as well.
casperOne
I understand that a call to split string would be ideal, that TSQL is intended for set operations, and that academically a CLR function is the intended solution. Unfortunately, unless something drastic has changed, deploying and maintaining CLR code soley for this purpose isn't worth it.
Brian Hasden
@casperOne, TSQL is a perfectly capable procedural language. I split my time evenly between C# and TSQL, using whichever is most appropriate for the task. CLR stored procedures are not worth the hassle, unless you absolutely can't implement the solution any other way.
Eric Z Beard