I have 2 columns in my table, called TaskSet and SkillsSelected.

The sample data is as follows:

TaskSet                        | SkillsSelected
--------------------------------------------------
SK000001, SK000004, SK000002   | SK000001, SK000002, SK000003
SK000002                       | SK000002, SK000003, SK000004

As you can see, the values in each column are comma-separated. I want a query that returns, for each row, the values in SkillsSelected that do not exist in TaskSet. In this case it should return:

SK000003 
SK000003, SK000004
+3  A: 

This is not the best way to store the data. Why is it all in one record? It should be more normalized.

I agree on this one. But the data comes from a flat file, which I can do nothing about.
dewacorp.alliances
But you're not reading a flat file; you are taking the flat file and inserting it into a SQL table. At that point, you should normalize.
Why has this answer received so many votes when it doesn't answer the question raised? A data model change is not trivial. No disrespect to JFreedman, but this should have been a comment like the one `David Hall` made. I realize that the account didn't have enough rep to leave comments.
OMG Ponies
A: 

I'd send this specification back to whoever designed the table... truly awful!

I'd do this using a CLR stored procedure. It's possible with a cursor, but the code would be awful.

Paul Creasey
+6  A: 

The best way to deal with comma separated lists in SQL Server is to create a UDF that returns a table type. See this link for details. MS documentation claims that a CLR UDF is faster, but here's an actual comparison of the two options in use.
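
As a rough illustration of the table-valued UDF approach, here is a minimal T-SQL sketch. It is not the function from the linked article: the name dbo.split and the output column value are chosen only to match the query in this answer, and the VARCHAR sizes and the trimming of spaces after commas are assumptions based on the sample data.

```sql
-- Hypothetical helper: splits a comma-separated list into one row per item.
CREATE FUNCTION dbo.split (@list VARCHAR(8000))
RETURNS @items TABLE (value VARCHAR(100))
AS
BEGIN
    DECLARE @pos INT;
    DECLARE @item VARCHAR(100);

    -- Append a trailing delimiter so the last item is handled like the others.
    SET @list = @list + ',';
    SET @pos = CHARINDEX(',', @list);

    WHILE @pos > 0
    BEGIN
        -- Trim the spaces that follow each comma in the sample data.
        SET @item = LTRIM(RTRIM(LEFT(@list, @pos - 1)));

        -- Skip empty items such as those produced by ',,' or an empty list.
        IF @item <> ''
            INSERT INTO @items (value) VALUES (@item);

        -- Drop the item just consumed, together with its delimiter.
        SET @list = SUBSTRING(@list, @pos + 1, 8000);
        SET @pos = CHARINDEX(',', @list);
    END;

    RETURN;
END;
GO

-- Usage: one row per skill code.
SELECT value FROM dbo.split('SK000001, SK000004, SK000002');
```

A NULL input yields no rows, because the loop condition is never true. A loop-based function like this is easy to read but is likely to be slower than a set-based or CLR splitter on large tables.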

Once that is in place, you can use:

SELECT t.*
  FROM yourTable t  -- placeholder table name (TABLE itself is a reserved word)
 WHERE EXISTS(SELECT value 
                FROM dbo.split(t.taskset)
              INTERSECT
              SELECT value 
                FROM dbo.split(t.skillsselected))

OMG Ponies
Is there a particular reason for using a CLR function rather than a plain T-SQL UDF? Also, the link you gave is just plain SQL. This link details creating CLR UDFs: http://davidhayden.com/blog/dave/archive/2006/04/18/2917.aspx
David Hall
Thanks I will take this approach
dewacorp.alliances
@OMG Ponies Thanks - I figured it was for performance reasons, just wanted to check. I've always done the CSV split using a plain UDF, might have to switch to using CLR in the future though.
David Hall
@David: Found a comparison: http://blogs.lessthandot.com/index.php/DataMgmt/DBProgramming/split-string-in-sql-server-2005-clr-vs-t
OMG Ponies
@OMG Ponies - Cheers for the comparison, good reading. (and there is a +1 on the way as soon as I have more upvotes)
David Hall
@OMG Ponies: Thanks for the code snippet. It helped me understand a few points, so an upvote from me. But I still don't get the required result. It gives the original table back (which looks correct based on the code logic, but is not the required result). Since INTERSECT is used and column 2 in the original table has a value from column 1, it will always return all the rows. Am I missing something here?
stackoverflowuser
@stackoverflowuser: I forgot the correlation between the intersected result and the table from which the values are retrieved based on that intersected list.
OMG Ponies
+3  A: 

First, implement the CLR Split() function that you get in the Samples on the SQL 2005 installation media.

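-- Every value listed in SkillsSelected, one row per value...
SELECT t.*, s.value
FROM yourTable t
CROSS APPLY
dbo.Split(t.SkillsSelected) s

EXCEPT

-- ...minus every value that also appears in TaskSet for the same row.
SELECT t.*, s.value
FROM yourTable t
CROSS APPLY
dbo.Split(t.TaskSet) s
;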

Alternatively, you could do it using a table of numbers, which would work sufficiently well, and doesn't force you to turn on CLR.
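
A minimal sketch of the numbers-table alternative, assuming a hypothetical dbo.Numbers table holding 1 to 8000 and a hypothetical function name dbo.SplitWithNumbers. Neither name comes from this answer, and trimming the spaces after each comma is an assumption based on the sample data.

```sql
-- Hypothetical numbers table: one row per integer from 1 to 8000.
CREATE TABLE dbo.Numbers (n INT NOT NULL PRIMARY KEY);

INSERT INTO dbo.Numbers (n)
SELECT TOP (8000) ROW_NUMBER() OVER (ORDER BY a.object_id)
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;
GO

-- Hypothetical inline splitter built on the numbers table (no CLR needed).
CREATE FUNCTION dbo.SplitWithNumbers (@list VARCHAR(8000))
RETURNS TABLE
AS
RETURN
    -- Take the text from position n up to the next comma, then trim it.
    SELECT LTRIM(RTRIM(SUBSTRING(@list, n.n,
               CHARINDEX(',', @list + ',', n.n) - n.n))) AS value
    FROM dbo.Numbers AS n
    WHERE n.n <= LEN(@list)
      -- An item starts at position n when the preceding character is a comma
      -- (a comma is prepended so that position 1 also qualifies).
      AND SUBSTRING(',' + @list, n.n, 1) = ',';
GO

-- Usage: same shape as the CLR-based query, one row per listed skill.
SELECT t.*, s.value
FROM yourTable AS t
CROSS APPLY dbo.SplitWithNumbers(t.SkillsSelected) AS s;
```

Because the function is a single SELECT, the optimizer can inline it into the calling query, which is usually why this approach performs acceptably without CLR.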

[Edit] Edited to include t.* in the list. This should really be changed to use the PK of t, or some other identifying feature.

[Edit] Sorry - had them the wrong way around.

And now that the question has changed, let's address the concatenation too:

SELECT t.*, 
STUFF(
(
    -- Concatenate the missing values as ', SK...' fragments.
    SELECT ', ' + value
    FROM 
    (
    SELECT s.value
    FROM
    dbo.Split(t.SkillsSelected) s
    EXCEPT
    SELECT s.value
    FROM
    dbo.Split(t.TaskSet) s
    ) v
    FOR XML PATH('')
),1,2,'') AS MissingSkills  -- STUFF strips the leading ', '
FROM yourTable AS t;
Rob Farley
That works well, but what if the table has more than one row? I have a function that turns the result from the table back into a comma-delimited list, but how do I return one row at a time, if you know what I mean?
dewacorp.alliances
It works!!! Thanks. I am glad I am not using a cursor :)
dewacorp.alliances