views:

152

answers:

4

Earlier this week I ask a question about filtering out duplicate values in sequence at run time. Had some good answers but the amount of data I was going over was to slow and not feasible.

Currently in our database, event values are not filtered. Resulting in duplicate data values (with varying timestamps). We need to process that data at run time and at the database level it’s to time costly ( and cannot pull it into code because it’s used a lot in stored procs) resulting in high query times. We need a data structure that we can query that has this data store filtered out so that no additional filtering is needed at runtime.

Currently in our DB

  • 'F07331E4-26EC-41B6-BEC5-002AACA58337', '1', '2008-05-08 04:03:47.000'
  • 'F07331E4-26EC-41B6-BEC5-002AACA58337', '0', '2008-05-08 10:02:08.000'
  • 'F07331E4-26EC-41B6-BEC5-002AACA58337', '0', '2008-05-09 10:03:24.000’ (Need to delete this) **
  • 'F07331E4-26EC-41B6-BEC5-002AACA58337', '1', '2008-05-10 04:05:05.000'

What we need

  • 'F07331E4-26EC-41B6-BEC5-002AACA58337', '1', '2008-05-08 04:03:47.000'
  • 'F07331E4-26EC-41B6-BEC5-002AACA58337', '0', '2008-05-08 10:02:08.000'
  • 'F07331E4-26EC-41B6-BEC5-002AACA58337', '1', '2008-05-10 04:51:05.000'

This seems trivial, but our issue is that we get this data from wireless devices, resulting in out of sequence packets and our gateway is multithreaded so we cannot guarantee the values we get are in order. Something may come in like a '1' for 4 seconds ago and a '0' for 2 seconds ago, but we process the '1' already because it was first in. we have been spinning our heads on how to implement this. We cannot compare data to the latest value in the database because the latest may actually not have come in yet, so to throw that data out we'd be screwed and our sequence may be completely off. So currently we store every value that comes in and the database shuffles itself around based off of time.. but the units can send 1,1,1,0 and its valid because the event is still active, but we only want to store the on and off state ( first occurrence of the on state 1,0,1,0,1,0).. we thought about a trigger, but we'd have to shuffle the data around every time a new value came in because it might be earlier then the last message and it can change the entire sequence (inserts would be slow).

Any Ideas?

Ask if you need any further information.

[EDIT] PK Wont work - the issue is that our units actually send in different timestamps. so the PK wouldn't work because 1,1,1 are the same.. but there have different time stamps. Its like event went on at time1, event still on at time2, it sends us back both.. same value different time.

+1  A: 

If I understand correctly, what you want to do is simply prevent the dupes from even getting in the database. If that is the case, why not have a PK (or Unique Index) defined on the first two columns and have the database do the heavy lifting for you. Dupe inserts would fail based on the PK or AK you've defined. You're code (or stored proc) would then just have to gracefully handle that exception.

mikeymo
A: 

Thanks for the response. A PK Won't work - the issue is that our units actually send in different timestamps. So the PK wouldn't work because 1,1,1 are the same.. but they have different time stamps. It's like event went on at time1, event still on at time2, it sends us back both.. same value different time.

Bobby
A: 

Here's an update solution. Performance will vary depending on indexes.

DECLARE @MyTable TABLE
(
  DeviceName varchar(100),
  EventTime DateTime,
  OnOff int,
  GoodForRead int
)

INSERT INTO @MyTable(DeviceName, OnOff, EventTime)
SELECT 'F07331E4-26EC-41B6-BEC5-002AACA58337', 1, '2008-05-08 04:03:47.000' 
INSERT INTO @MyTable(DeviceName, OnOff, EventTime)
SELECT 'F07331E4-26EC-41B6-BEC5-002AACA58337', 0, '2008-05-08 10:02:08.000' 
INSERT INTO @MyTable(DeviceName, OnOff, EventTime)
SELECT 'F07331E4-26EC-41B6-BEC5-002AACA58337', 0, '2008-05-09 10:03:24.000'
INSERT INTO @MyTable(DeviceName, OnOff, EventTime)
SELECT 'F07331E4-26EC-41B6-BEC5-002AACA58337', 1, '2008-05-10 04:05:05.000' 

UPDATE mt
SET GoodForRead = 
CASE
  (SELECT top 1 OnOff
   FROM @MyTable mt2
   WHERE mt2.DeviceName = mt.DeviceName
     and mt2.EventTime < mt.EventTime
   ORDER BY mt2.EventTime desc
  )
  WHEN null THEN 1
  WHEN mt.OnOff THEN 0
  ELSE 1
END
FROM @MyTable mt
    -- Limit the update to recent data
--WHERE EventTime >= DateAdd(dd, -1, GetDate())

SELECT *
FROM @MyTable

It isn't hard to imagine a filtering solution based on this. It just depends on how often you want to look up the previous record for each record (every query or once in a while).

David B
A: 

Thanks David.. Soultion seems to be working well and fast enough to continue development..

Bobby