I have a products table. Each row in that table corresponds to a single product and it's identified by a unique Id. Now each product can have multiple "codes" associated with that product. For example:

Id     |    Code
----------------------
0001   |   IN,ON,ME,OH
0002   |   ON,VI,AC,ZO
0003   |   QA,PS,OO,ME

What I'm trying to do is create a stored procedure so that I can pass in codes like "ON,ME" and have it return every product that contains the "ON" or "ME" code. Since the codes are comma-separated, I don't know how to split them and search for them. Is this possible using only T-SQL?

Edit: It's a mission critical table. I don't have the authority to change it.

+4  A: 

The way you are storing data breaks normalization rules. Only a single atomic value should be stored in each field. You should store each item in a single row.

Mehrdad Afshari
Unfortunately, I didn't design it. It already exists, so I just have to deal with it.
If you are working on the system now, don't you have the authority to change it? Even if you do get a solution to your problem, it won't perform well for much more than small data sets.
JohnFx
I don't have the authority to change it. It's a mission critical table.
I think you can create a user-defined function to do so, but I don't have SQL Server on this machine to test. Someone will probably come up with the solution.
Mehrdad Afshari
The UDF is already written; check the MSDN link in my response below.
Charles Bretana
A: 

This might not be possible if you're stuck with that database design, but it would be a lot easier to put the codes into separate records in another table:

ProductCode
-----------
ProductID (FK to Product.ID)
Code (varchar)

The table might look like this:

ProductID    Code
-----------------
0001         IN
0001         ON
0001         ME
...

The query would look something like this (you'd have to pass in the Codes somehow - either as separate variables, or maybe a comma-separated string that you split in the proc):

select ProductID
from ProductCode
where Code in ('ON', 'ME')
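To make that concrete, here is a small runnable sketch. It uses Python's built-in sqlite3 module purely so it can be tried without a server; SQLite stands in for SQL Server, and the helper name products_with_any and the index name are my own inventions, not anything from the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProductCode (
        ProductID TEXT NOT NULL,
        Code      TEXT NOT NULL,
        PRIMARY KEY (ProductID, Code)
    );
    -- An index on Code is what makes the lookup cheap.
    CREATE INDEX IX_ProductCode_Code ON ProductCode (Code);
""")

# The sample data from the question, one row per (product, code).
legacy_rows = {
    "0001": "IN,ON,ME,OH",
    "0002": "ON,VI,AC,ZO",
    "0003": "QA,PS,OO,ME",
}
conn.executemany(
    "INSERT INTO ProductCode (ProductID, Code) VALUES (?, ?)",
    [(pid, code) for pid, codes in legacy_rows.items() for code in codes.split(",")],
)


def products_with_any(conn, wanted):
    """Return the IDs of products that carry at least one of the wanted codes.

    `wanted` is a comma-separated string such as "ON,ME".
    """
    codes = [c.strip() for c in wanted.split(",") if c.strip()]
    if not codes:
        return []
    placeholders = ",".join("?" * len(codes))  # e.g. "?,?"
    sql = (
        "SELECT DISTINCT ProductID FROM ProductCode "
        f"WHERE Code IN ({placeholders}) ORDER BY ProductID"
    )
    return [row[0] for row in conn.execute(sql, codes)]


print(products_with_any(conn, "ON,ME"))
```

The DISTINCT is there because a product carrying both codes (0001 in the sample data) would otherwise come back once per matching code.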
Andy White
+7  A: 

You should be storing the codes in a separate table, since you have a many-to-many relationship. If you separate them, you will easily be able to check.

It would be possible with the design you have now, but it would require text searching of the column, with multiple searches per row, which will cause huge performance problems as your data grows.

If you try to go down your current path: you will have to break apart your input string, because nothing guarantees that the codes on each record are in the same order as (or contiguous with) the input parameter. Then you would have to do a

Code LIKE '%IN%'
AND Code Like '%QA%'

query with an additional statement for every code you are checking for. Very inefficient.

The UDF idea below is also a good idea. However, depending on the size of your data and the frequency of queries and updates, you may have issues there as well.

Would it be possible to create an additional, normalized table that is synchronized on a schedule (or by a trigger) for you to query against?
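A rough sketch of that side-table idea, in case it helps. It uses Python's built-in sqlite3 module only so the example can be run as-is; SQLite stands in for SQL Server here, and the names ProductCodeLookup and refresh_lookup are made up for illustration. On a real system the refresh would be a scheduled job or a trigger rather than a Python function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The legacy table, left untouched.
    CREATE TABLE Products (Id TEXT PRIMARY KEY, Code TEXT);
    INSERT INTO Products VALUES ('0001', 'IN,ON,ME,OH');
    INSERT INTO Products VALUES ('0002', 'ON,VI,AC,ZO');
    INSERT INTO Products VALUES ('0003', 'QA,PS,OO,ME');

    -- Hypothetical normalized copy: one row per (product, code).
    CREATE TABLE ProductCodeLookup (
        ProductID TEXT NOT NULL,
        Code      TEXT NOT NULL,
        PRIMARY KEY (ProductID, Code)
    );
    CREATE INDEX IX_ProductCodeLookup_Code ON ProductCodeLookup (Code);
""")


def refresh_lookup(conn):
    """Rebuild the lookup table from the comma-separated column.

    Returns the number of (product, code) pairs found in the legacy table.
    """
    pairs = [
        (product_id, code.strip())
        for product_id, codes in conn.execute("SELECT Id, Code FROM Products")
        for code in (codes or "").split(",")
        if code.strip()
    ]
    with conn:  # one transaction: readers never see a half-built table
        conn.execute("DELETE FROM ProductCodeLookup")
        conn.executemany(
            "INSERT OR IGNORE INTO ProductCodeLookup (ProductID, Code) "
            "VALUES (?, ?)",
            pairs,
        )
    return len(pairs)


refresh_lookup(conn)
me_products = [
    row[0]
    for row in conn.execute(
        "SELECT ProductID FROM ProductCodeLookup WHERE Code = ? "
        "ORDER BY ProductID",
        ("ME",),
    )
]
print(me_products)
```

The lookup can go stale between refreshes, so this only fits if slightly out-of-date results are acceptable for your queries.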

Jason Coyne
+3  A: 

Although all the previous posters are correct about the normalization of your DB schema, you can do what you want using a "Table-Valued UDF" that takes a delimited string and returns a table, with one row per value in the string. You can use this table as you would any other table in your stored proc, joining to it, etc. This will solve your immediate issue.

Here's a link to such a UDF: FN_Split UDF

Although the article talks about using it to pass a delimited list of data values into a stored proc, you can use the same UDF to operate on a delimited string stored in a column of an existing table.

Charles Bretana
+4  A: 

Everybody else seems very eager to tell you that you should not do this, although I don't see any explicit explanation for why not.

Apart from breaking the normalization rules, the reason is that you'll do a table-scan through all rows, since you can't have an index on the individual "values" in that column.

Simply put, there's no way for the database engine to keep some kind of quick-list of which rows contain the code 'AC', unless you either break it up into a separate table, or put it in a column by itself.

Now, if you have other criteria in your SELECT statements that will limit the number of rows down to some manageable number, then perhaps this will be ok, but otherwise I would, if you can, try to avoid this solution and do what others have already told you, split it up into a separate table.

Now, if you're stuck with this design, you can do a search using the following type of query:

...
WHERE ',' + Code + ',' LIKE '%,AC,%'

This will:

  • Match 'ON,VI,AC,ZO'
  • Not match 'ON,VI,TAC,ZO'

I don't know whether that last case can occur in your data. If you only have 2-letter codes, then you can use just this:

...
WHERE Code LIKE '%AC%'

But again, this will perform horribly unless you limit the number of rows using other criteria.
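For the "ON,ME" case from the question, the same delimiter-wrapping trick can be repeated once per code and joined with OR. Below is a sketch you can run with Python's built-in sqlite3 module; SQLite is only a stand-in here (it concatenates with || where SQL Server uses +), the helper name find_products is hypothetical, and row 0004 is an extra row added just to show the 'TAC' case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (Id TEXT PRIMARY KEY, Code TEXT)")
conn.executemany(
    "INSERT INTO Products (Id, Code) VALUES (?, ?)",
    [
        ("0001", "IN,ON,ME,OH"),
        ("0002", "ON,VI,AC,ZO"),
        ("0003", "QA,PS,OO,ME"),
        ("0004", "ON,VI,TAC,ZO"),  # extra row: must not match 'AC'
    ],
)


def find_products(conn, wanted):
    """Return IDs of products whose Code list contains any wanted code.

    `wanted` is a comma-separated string such as "ON,ME". Each code becomes
    one LIKE predicate against the column wrapped in commas, so only whole
    codes match.
    """
    codes = [c.strip() for c in wanted.split(",") if c.strip()]
    if not codes:
        return []
    predicate = " OR ".join(["(',' || Code || ',') LIKE ?"] * len(codes))
    params = [f"%,{code},%" for code in codes]
    sql = f"SELECT Id FROM Products WHERE {predicate} ORDER BY Id"
    return [row[0] for row in conn.execute(sql, params)]


print(find_products(conn, "AC"))
print(find_products(conn, "ON,ME"))
```

This still scans every row, and a code containing % or _ would be treated as a wildcard, so it is only reasonable when the codes are plain letters and the table is small or already filtered by other criteria.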

Lasse V. Karlsen
Not if that's all of my answer that you read, no, that is entirely correct. Please re-read my SQL and consider why I wrote it the way I did.
Lasse V. Karlsen
Spoiler for inobservant readers: he adds commas to the start and end of the `Code` field, *then* he uses the LIKE predicate on it.
Bill Karwin
And it will perform horribly, but not because I add commas to either side of the LIKE clause.
Lasse V. Karlsen
Agreed. I call this pattern "Jaywalking" because it's trying to avoid the intersection! :-)
Bill Karwin
@lassevk: You're right. My fault. I deleted my comment to prevent further confusion.
Mehrdad Afshari
+2  A: 
Irawan Soetomo
of all the responses to this question, yours is the only *answer*
thinkhard
of all the postings i made, you are the 1st to comment. thx! :)
Irawan Soetomo
A: 

I agree with other posters here that you should look carefully into schema normalization, but I also know that shortcuts are part of life.

Here's a sample function written in Sybase dialect that does what you want:

ALTER FUNCTION "DBA"."f_IsInStringList"( IN @thisItem char(2), IN @thisList varchar(4000) )
RETURNS INTEGER
DETERMINISTIC
BEGIN


DECLARE is_member bit;
DECLARE LOCAL TEMPORARY TABLE tmp (thisItem  char(2)) ;
DECLARE @tempstring varchar(10);
DECLARE @count integer;

IF LENGTH(TRIM(@thisList)) > 0 THEN

    WHILE LENGTH(TRIM(@thisList)) > 0  LOOP
       -- loop over comma-separated list and stuff members into temp table
       IF LOCATE ( @thisList, ',' , 1) > 0 THEN

           SET @count = LOCATE ( @thisList, ',' , 1);
           SET @tempstring = SUBSTRING ( @thisList, 1,@count-1 );

           INSERT INTO tmp ( thisItem  ) VALUES (  @tempstring );
           -- drop the item just read, together with its trailing comma
           SET @thisList = STUFF ( @thisList, 1, @count, '' );

       ELSE

           -- last item: no comma left, so take the rest and end the loop
           INSERT INTO tmp ( thisItem  ) VALUES ( @thisList );
           SET @thisList = NULL;

       END IF;

    END LOOP ;

END IF;

IF EXISTS (SELECT * FROM tmp WHERE thisItem   = @thisItem ) THEN
    SET is_member = 1;
ELSE
    SET is_member = 0 ;
END IF ;

    RETURN is_member;
END

You can then build a simple query to check whether a value occurs in your comma-separated string:

select * from some_table t 
         WHERE f_IsInStringList('OR', t.your_comma_separated_column) = 1 OR
               f_IsInStringList('ME', t.your_comma_separated_column) = 1
Vincent Buck
A: 

This question is more than a year old, but I thought this might still be useful. You can use the FIND_IN_SET function of MySQL. I am not sure whether other DBMSs support it.

You can use this function as follows:

SELECT * FROM `table_name` WHERE FIND_IN_SET('AC', `Code`) > 0
kumar