views:

31

answers:

2

My company needs to write a query that determines when data is being entered inaccurately. We have a table in our db called "ProductChanges". It looks (something) like this -

Product    User     ChangeDate         OldValue     NewValue
2344       John     24/01/10 10:00         2344         4324
6435       Greg     28/01/10 13:30          543         6435
6532       Tony     29/01/10 09:45       <NULL>          234

When OldValue is <NULL>, it means that it is a new product record. We expect there will be a number of changes within the first hour of a record creation, as all the data is updated. But any changes after that are considered to mean the initial record was inaccurately recorded.

SO...what I am trying to come up with is a query that returns all product records created within the last month, that have any subsequent changes recorded an hour after the initial record was created. For output, we want the initial record plus all subsequent changes. Eg. -

Product    User     ChangeDate         OldValue     NewValue
6532       Tony     29/01/10 09:45       <NULL>          234
6532       Fred     01/02/10 11:37          234         4324

Thanks in advance!

Update: We are running this database on SQL Server 2000

A: 
select * from pc where product in (select distinct a.product from pc a, pc b
where a.product = b.product and a.changedate > dateadd(hh, 1, b.changedate))
order by product, changedate
+1  A: 

Looks pretty straightforward, if I'm understanding correctly - just get the entry dates and join with any changes that happened more than a hour later.

SELECT pc.Product, pc.User, pc.ChangeDate, pc.OldValue, pc.NewValue
FROM
(
    SELECT Product, ChangeDate AS EntryDate
    FROM ProductChanges
    WHERE OldValue IS NULL
) e
INNER JOIN ProductChanges pc
    ON pc.Product = e.Product
WHERE EXISTS
(
    SELECT 1
    FROM ProductChanges
    WHERE Product = e.Product
    AND ChangeDate > DATEADD(HOUR, 1, e.EntryDate)
)
AND e.EntryDate >= @BeginDate
AND e.EntryDate <= @EndDate

We only need the second join because you mention you want the entire history for all of the "problem" sets; if you just want the IDs, you can make this query more efficient by eliminating the middle join and just selecting e.Product.

Aaronaught
Ah should have specified - I'm using SQL Server 2000, so can't use with...
CraigS
Ah, no problem, I'll make it a subquery for you.
Aaronaught
Cheers mate, excellent
CraigS
Sensational - worked a treat
CraigS