
I'm running SQL Server 2008, and trying to run some queries on some poorly stored data that I'd rather not go through and try to clean up (millions of rows to deal with). What I have is a log table with some FK data as a comma-separated string in one table, and an integer PK in another table. I'd like to be able to join the FK list of integers to the proper PK ints in the primary table. My query is something like this:

SELECT
    L.*
FROM MyLogs L (NOLOCK)
    INNER JOIN Details D (NOLOCK) ON L.SearchValue = D.ID

In the above query, L.SearchValue contains "123,456,7890", while D.ID is an integer primary key. Is there any way to join these together in any sort of efficient manner, or should I just accept my fate and start working on a script to break all of these comma separated values into separate rows?

A: 

Very, very bad design, and performance will be very poor, but you can use the CharIndex function:

    Select L.*
    FROM MyLogs L (NOLOCK)
        Join Details D (NOLOCK)
            On CharIndex(Cast(D.ID as varChar(6)), L.SearchValue) > 0

And this will produce some false joins, as 12 will match inside '34, 312, 455'. To eliminate these false matches you have to make the query even more complex by including the delimiter:

    Select L.*
    FROM MyLogs L (NOLOCK)
        Join Details D (NOLOCK)
            On CharIndex(',' + Cast(D.ID as varChar(6)) + ',',
                         ',' + L.SearchValue + ',') > 0

... But I'd recommend that you start working on a script to break all of these comma-separated values into separate rows.
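As a rough one-time clean-up sketch of that idea (LogID and the target table name are assumptions, not from the question), you can use SQL Server's XML nodes() method to expand each list into one row per ID:

    -- Sketch only: LogID stands in for whatever the log table's key column is,
    -- and MyLogSearchValues is a new table to hold the normalized rows.
    -- The REPLACE/XML trick is safe here because the list items are plain digits.
    SELECT  L.LogID,
            CAST(x.value('.', 'varchar(20)') AS int) AS SearchValueID
    INTO    MyLogSearchValues
    FROM    MyLogs L
            CROSS APPLY (SELECT CAST('<i>' + REPLACE(L.SearchValue, ',', '</i><i>') + '</i>' AS xml)) AS v(Doc)
            CROSS APPLY v.Doc.nodes('/i') AS n(x)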

Charles Bretana
Not really. Say D.ID is 1, and L.SearchValue is "34199"?
Stu
@Stu, you caught me between Edits...
Charles Bretana
A: 

The script to break up this horribly-stored data will ultimately be much more efficient than writing a hacky SQL query that attempts to join on those values. I'm not sure that a join is even the way to go; probably some form of subquery would work better as plan A.
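One possible shape for that plan-A subquery, sketched here as an EXISTS test (it still has to scan the strings, and it returns each log row once rather than once per matching ID):

    SELECT L.*
    FROM MyLogs L (NOLOCK)
    WHERE EXISTS (SELECT 1
                  FROM Details D (NOLOCK)
                  WHERE CHARINDEX(',' + CAST(D.ID AS varchar(10)) + ',',
                                  ',' + L.SearchValue + ',') > 0)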

Paul Sasik
A: 

Can the number of commas vary in each row?

If not, you could create a temp table, split the data up in there, and then use that temp table in your query.
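A sketch of that idea, assuming every SearchValue holds exactly three IDs (adjust the position list if the real count differs; PARSENAME only handles up to four parts):

    -- Sketch: assumes exactly three comma-separated values per row.
    SELECT  L.*,
            CAST(PARSENAME(REPLACE(L.SearchValue, ',', '.'), N.Pos) AS int) AS SingleID
    INTO    #SplitLogs
    FROM    MyLogs L (NOLOCK)
            CROSS JOIN (SELECT 1 AS Pos UNION ALL SELECT 2 UNION ALL SELECT 3) N

    SELECT  S.*
    FROM    #SplitLogs S
        INNER JOIN Details D (NOLOCK) ON S.SingleID = D.ID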

kevchadders
A: 

You need one of these (modify to do cast to ints)

http://stackoverflow.com/questions/314824/t-sql-opposite-to-string-concatenation-how-to-split-string-into-multiple-recor
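If you don't already have one, here's a minimal sketch of such a function with the cast to int built in (the name dbo.fn_SplitToInts and its shape are illustrative, not the exact code from that question):

    CREATE FUNCTION dbo.fn_SplitToInts (@List varchar(8000), @Delim char(1))
    RETURNS TABLE
    AS
    RETURN
    (
        -- Walks the delimiter positions with a recursive CTE;
        -- lists with more than ~100 items need OPTION (MAXRECURSION 0) on the calling query.
        WITH Pieces (Pos, Start, Stop) AS
        (
            SELECT 1, 1, CHARINDEX(@Delim, @List)
            UNION ALL
            SELECT Pos + 1, Stop + 1, CHARINDEX(@Delim, @List, Stop + 1)
            FROM Pieces
            WHERE Stop > 0
        )
        SELECT CAST(SUBSTRING(@List, Start,
                    CASE WHEN Stop > 0 THEN Stop - Start ELSE LEN(@List) END) AS int) AS Value
        FROM Pieces
    )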

Then, you're down to

SELECT L.*
FROM MyLogs L (NOLOCK),
     Details D (NOLOCK)
WHERE D.ID IN (SELECT Value
               FROM dbo.fn_WhatEverYouCallYourSplitFunction(L.SearchValue, ','))

And no, that is not efficient. At all.

Stu
A: 

Try this:

SELECT L.*
FROM MyLogs L (NOLOCK)
    INNER JOIN Details D (NOLOCK)
        ON CHARINDEX(',' + Convert(varchar, D.ID) + ',',
                     ',' + L.SearchValue + ',') > 0

It will be horribly inefficient, but unlike Daniel's answer it won't break if, let's say, SearchValue contains (123,234,345) and D.ID is 23 (which would be true for 2 rows).

Adding the commas fixes that.

TJMonk15