views:

73

answers:

2

I have inherited a piece of SQL that I am working on for a reporting system. The system is centered around Purchase Orders, and when they are Created, Transmitted, Acknowledged, and Confirmed. These states are a progressive state. The piece of SQL that I am reviewing is shown below and is from the WHERE clause:

OR (CreateJob.endtime is NULL and DATEDIFF(hour, buy_date, getdate()) > Vendor.expected_create_hours)
OR (TransmitJob.endtime is NULL and DATEDIFF(hour, CreateJob.endtime, getdate()) > Vendor.expected_transmit_hours)
OR (AcknowledgeJob.endtime is NULL and DATEDIFF(hour, TransmitJob.endtime, getdate()) > Vendor.expected_acknowledge_hours)
OR (ConfirmJob.endtime is NULL and DATEDIFF(hour, AcknowledgeJob.endtime, getdate()) > Vendor.expected_confirm_hours)

What I have found is that it is possible for a CreateJob to not have an end time because the job failed. However a job can execute more then one Purchase Order, so there are times when a given PO actually was created, but the job does not receive an end time because a sibling later failed. This creates a scenario where the states still progress for a PO, but it still shows up on this problem report because the CreateJob.endtime is NULL.

So first there are some obvious bugs with the system and architecture above, but that is a separate issue from what I am tackling at the moment.

It seems that I could fix the report though, by basically adding the progressive checks as AND statements. So the first check for the CreateJob changes too:

(     (CreateJob.endtime is NULL and DATEDIFF(hour, buy_date, getdate()) > Vendor.expected_create_hours) 
  AND (TransmitJob.endtime is NULL and DATEDIFF(hour, CreateJob.endtime, getdate()) > Vendor.expected_transmit_hours) 
  AND (AcknowledgeJob.endtime is NULL and DATEDIFF(hour, TransmitJob.endtime, getdate()) > Vendor.expected_acknowledge_hours) 
  AND (ConfirmJob.endtime is NULL and DATEDIFF(hour, AcknowledgeJob.endtime, getdate()) > Vendor.expected_confirm_hours))

Which is ugly and begins to obfuscate things into oblivion. Is there some way I can do the equivalent of the following in SQL? Basically some kind of macro or #define system?

CreateFailed = (CreateJob.endtime is NULL and DATEDIFF(hour, buy_date, getdate()) > Vendor.expected_create_hours)
TransmitFailed = (TransmitJob.endtime is NULL and DATEDIFF(hour, CreateJob.endtime, getdate()) > Vendor.expected_transmit_hours)
AcknowledgeFailed = (AcknowledgeJob.endtime is NULL and DATEDIFF(hour, TransmitJob.endtime, getdate()) > Vendor.expected_acknowledge_hours)
ConfirmFailed = (ConfirmJob.endtime is NULL and DATEDIFF(hour, AcknowledgeJob.endtime, getdate()) > Vendor.expected_confirm_hours)


OR (CreateFailed AND TransmitFailed AND AcknowledgeFailed AND ConfirmFailed)
OR (TransmitFailed AND AcknowledgeFailed AND ConfirmFailed)
OR (AcknowledgeFailed AND ConfirmFailed)
OR (ConfirmFailed)

If there is a common term or name for what I am trying to do maybe someone could add it to the title?

+2  A: 

A side point - Is the code snippet you've provided an extract from your production code? If so, you can replace the whole block with just the ConfirmFailed condition, since it's common to all the WHERE conditions you've shown and can satisfy it on its own.

Whilst it doesn't directly answer the question you asked, an approach to simplifying your code would be to calculate the values for each Failed indicator once in a pre-processing step - depending on your environment this might be a CTE, a temporary table or a view. That way, all the logic is in one place - this is particularly useful if the value is needed more than once in the query - in the SELECT and WHERE clauses for example.

Ed Harper
Thanks for the post Ed. This is an extract from the production code. I don't think I can quite replace them all with the ConfirmFailed due to the DatDiff portion of the logic, but I may be able to simplify it down.I like the idea of calculating the values once and storing them. That would make the query itself much more clean.
Andrew Dunaway
+1  A: 

[is] there is a common term or name for what I am trying to do...?

Abstraction?

You could hide each of your CreateFailed = ..., TransmitFailed = ..., etc logic into its own VIEW or CTE, then your main query could merely see if a Job exists in one of these tables e.g. (pseudo SQL):

SELECT Jobs.job_ID, ...
  FROM Jobs
 WHERE EXISTS (
               SELECT * 
                 FROM CreatedFailedJobs AS T1
                WHERE Jobs.job_ID = T1.job_ID
               )
       OR EXISTS (
                  SELECT * 
                    FROM TransmitFailedJobs AS T1
                   WHERE Jobs.job_ID = T1.job_ID
                  )
       OR EXISTS (
                  SELECT * 
                    FROM AcknowledgeFailedJobs AS T1
                   WHERE Jobs.job_ID = T1.job_ID
                  )
       OR EXISTS (
                  SELECT * 
                    FROM ConfirmFailedJobs AS T1
                   WHERE Jobs.job_ID = T1.job_ID
                  );

Here's the CTE idea expanded (pseudo SQL):

WITH CreateFailedJobs (job_ID, ...)
AS
(
 SELECT Jobs.job_ID, ...
   FROM ...
  WHERE CreateJob.endtime is NULL 
        AND DATEDIFF(hour, .SomeTable.buy_date, getdate()) 
               > Vendor.expected_create_hours
), 
TransmitFailedJobs (job_ID, ...)
AS
(
 SELECT Jobs.job_ID, ...
   FROM ...
  WHERE TransmitJob.endtime is NULL 
        AND DATEDIFF(hour, CreateJob.endtime, getdate()) 
               > Vendor.expected_transmit_hours
), 
AcknowledgeFailedJobs (job_ID, ...)
AS
(
 SELECT Jobs.job_ID, ...
   FROM ...
  WHERE AcknowledgeJob.endtime is NULL 
        AND DATEDIFF(hour, TransmitJob.endtime, getdate()) 
               > Vendor.expected_acknowledge_hours
), 
ConfirmFailedJobs (job_ID, ...)
AS
(
 SELECT Jobs.job_ID, ...
   FROM ...
  WHERE ConfirmJob.endtime is NULL 
        AND DATEDIFF(hour, AcknowledgeJob.endtime, getdate()) 
               > Vendor.expected_confirm_hours)
)
SELECT Jobs.job_ID, ...
  FROM Jobs
 WHERE EXISTS (
               SELECT * 
                 FROM CreatedFailedJobs AS T1
                WHERE Jobs.job_ID = T1.job_ID
               )
       OR EXISTS (
                  SELECT * 
                    FROM TransmitFailedJobs AS T1
                   WHERE Jobs.job_ID = T1.job_ID
                  )
       OR EXISTS (
                  SELECT * 
                    FROM AcknowledgeFailedJobs AS T1
                   WHERE Jobs.job_ID = T1.job_ID
                  )
       OR EXISTS (
                  SELECT * 
                    FROM ConfirmFailedJobs AS T1
                   WHERE Jobs.job_ID = T1.job_ID
                  );

I'm not sure that this would be good for performance but readability is the aim and anyhow this report can be run offline.

onedaywhen