views:

395

answers:

5

I've been beating my head on the desk trying to figure this one out. I have a table that stores job information, and reasons for a job not being completed. The reasons are numeric,01,02,03,etc. You can have two reasons for a pending job. If you select two reasons, they are stored in the same column, separated by a comma. This is an example from the JOBID table:

Job_Number     User_Assigned     PendingInfo

1              user1             01,02

There is another table named Pending, that stores what those values actually represent. 01=Not enough info, 02=Not enough time, 03=Waiting Review. Example:

Pending_Num    PendingWord

01             Not Enough Info
02             Not Enough Time

What I'm trying to do is query the database to give me all the job numbers, users, pendinginfo, and pending reason. I can break out the first value, but can't figure out how to do the second. What my limited skills have so far:

select Job_number,user_assigned,SUBSTRING(pendinginfo,0,3),pendingword
from jobid,pending
where
    SUBSTRING(pendinginfo,0,3)=pending.pending_num and
    pendinginfo!='00,00' and
    pendinginfo!='NULL'

What I would like to see for this example would be:

Job_Number  User_Assigned   PendingInfo   PendingWord       PendingInfo  PendingWord

1           User1           01            Not Enough Info   02           Not Enough Time

Thanks in advance

+1  A: 

If changing the schema is an option (which it probably should be) shouldn't you implement a many-to-many relationship here so that you have a bridging table between the two items? That way, you would store the number and its wording in one table, jobs in another, and "failure reasons for jobs" in the bridging table...

Dwaine Bailey
+3  A: 

You really shouldn't store multiple items in one column if your SQL is ever going to want to process them individually. The "SQL gymnastics" you have to perform in those cases are both ugly hacks and performance degraders.

The ideal solution is to split the individual items into separate columns and, for 3NF, move those columns to a separate table as rows if you really want to do it properly (but baby steps are probably okay if you're sure there will never be more than two reasons in the short-medium term).

Then your queries will be both simpler and faster.


However, if that's not an option, you can use the afore-mentioned SQL gymnastics to do something like:

where find ( ',' |fld| ',', ',02,' ) > 0

assuming your SQL dialect has a string search function (find in this case, but I think charindex for SQLServer).

This will ensure all sub-columns begin and start with a comma (comma plus field plus comma) and look for a specific desired value (with the commas on either side to ensure it's a full sub-column match).


If you can't control what the application puts in that column, I would opt for the DBA solution - DBA solutions are defined as those a DBA has to do to work around the inadequacies of their users :-).

Create two new columns in that table and make an insert/update trigger which will populate them with the two reasons that a user puts into the original column.

Then query those two new columns for specific values rather than trying to split apart the old column.

This means that the cost of splitting is only on row insert/update, not on _every single select`, amortising that cost efficiently.


Still, my answer is to re-do the schema. That will be the best way in the long term in terms of speed, readable queries and maintainability.

paxdiablo
Yes, recreating would be the best way, but that is not an option at this point. This is the situation I have to work with. Sorry, forgot to mention that this is MS SQL 2008
lp1
Then tell your bosses it's a bad idea then try the `charindex` option. And, if the performance sucks, you can then tell your bosses you were right and they were idiots, hopefully with more tact than I'm usually capable of :-)
paxdiablo
I will give that charindex a try.
lp1
+1  A: 

I hope you are just maintaining the code and it's not a brand new implementation.
Please consider to use a different approach using a support table like this:

JOBS TABLE
jobID | userID
--------------
1     | user13
2     | user32
3     | user44
--------------

PENDING TABLE
pendingID | pendingText
---------------------------
01        | Not Enough Info
02        | Not Enough Time
---------------------------

JOB_PENDING TABLE
jobID | pendingID
-----------------
1     | 01
1     | 02
2     | 01
3     | 03
3     | 01
-----------------

You can easily query this tables using JOIN of subqueries.
If you need retro-compatibility on your software you can add a view to reach this goal.

Cesar
A: 

Have a look at a similar question I answered here

;WITH Numbers AS 
( 
    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT 0)) AS N
    FROM JobId
), 
Split AS 
( 
    SELECT JOB_NUMBER, USER_ASSIGNED, SUBSTRING(PENDING_INFO, Numbers.N, CHARINDEX(',', PENDING_INFO + ',', Numbers.N) - Numbers.N) AS PENDING_NUM
    FROM JobId
    JOIN Numbers ON Numbers.N <= DATALENGTH(PENDING_INFO) + 1 
    AND SUBSTRING(',' + PENDING_INFO, Numbers.N, 1) = ','
) 
SELECT *
FROM Split JOIN Pending ON Split.PENDING_NUM = Pending.PENDING_NUM

The basic idea is that you have to multiply each row as many times as there are PENDING_NUMs. Then, extract the appropriate part of the string

Chris Bednarski
A: 

Chris, that did the trick. Thanks again

Can you accept my answer? Thanks. Click on the tick next to it.
Chris Bednarski
Chris,Sorry, been away. I wasn't trying to be rude, but I don't see a place to accept the answer.