ansaurus

Question

[SQL] Retrieve/update rows with a minimal deviation in a certain column value

Answer 1

A:

For all rows:

UPDATE yourtable SET date_added = date_added - INTERVAL 1 SECOND;

For a specific row, add a WHERE clause.

halocursed 2009-10-08 10:58:49

Well, the UPDATE statement isn't the hard part; the hard part is the WHERE clause that selects the specific rows.

pbean 2009-10-08 11:27:05

Do you only want to update the last row, all the rows containing some specific value, or all the rows added before you fixed the insert delay?

halocursed 2009-10-08 13:06:48

Answer 2

A:

The best way I can think of is writing an external script to do that. It's tricky to determine which rows are correct and which should be updated without having more control over the grouping. Pseudo-code:

all_rows = SELECT * FROM table ORDER BY date
last_date = NULL
rows_to_update = []
for row in all_rows:
    if last_date is NULL or row.date - last_date > X seconds:
        # a new group starts: flush the previous group first
        set date to last_date for all rows from rows_to_update
        last_date = row.date
        rows_to_update = []
    else if row.date != last_date:
        rows_to_update += row
# flush the final group after the loop ends
set date to last_date for all rows from rows_to_update
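
The loop above could be sketched in runnable Python roughly as follows. This is only an illustration under assumptions: the function name `plan_updates`, the `(row_id, date)` tuple layout, and the 3-second gap are hypothetical, and the sample timestamps are made up. Instead of buffering rows and flushing them when a group ends, the sketch records each correction immediately, which should give the same result because every late row is snapped to the first date of its group.

```python
from datetime import datetime, timedelta

def plan_updates(rows, max_gap):
    """Return {row_id: new_date} for rows that drifted from their group's first date.

    rows    -- iterable of (row_id, date) tuples, already sorted by date
    max_gap -- timedelta; a row further than this from the group's first
               date starts a new group (the "X seconds" in the pseudo-code)
    """
    updates = {}
    group_start = None
    for row_id, date in rows:
        if group_start is None or date - group_start > max_gap:
            group_start = date             # this row opens a new group
        elif date != group_start:
            updates[row_id] = group_start  # late row: snap to the group's first date
    return updates

# Hypothetical sample data: two groups, each with one row that is 1 second late.
rows = [
    (1, datetime(2009, 10, 8, 12, 23, 1)),
    (2, datetime(2009, 10, 8, 12, 23, 1)),
    (3, datetime(2009, 10, 8, 12, 23, 2)),
    (4, datetime(2009, 10, 8, 12, 23, 5)),
    (5, datetime(2009, 10, 8, 12, 23, 5)),
    (6, datetime(2009, 10, 8, 12, 23, 6)),
]
updates = plan_updates(rows, timedelta(seconds=3))
for row_id, new_date in sorted(updates.items()):
    print(row_id, new_date)
```

The returned mapping can then be turned into one UPDATE per row id, which keeps the grouping logic outside the database where it is easier to inspect before anything is written.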

Alternatively, something like this could work, but you might need more than one run if you want to handle cases where all three dates are different and you want to normalize two of them to the first one.

UPDATE
   tbl t,
   (SELECT
        t.date,
        (SELECT min(date)
         FROM tbl
         WHERE timestampdiff(SECOND,date,t.date) BETWEEN 1 AND 3) AS new_date
    FROM tbl t) t2
SET t.date=t2.new_date
WHERE t.date=t2.date AND t2.new_date IS NOT NULL
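
To see why more than one run may be needed, here is a rough sketch that repeats the same idea until nothing changes. It uses Python's built-in sqlite3 module rather than MySQL, so several things are assumptions and not part of the query above: SQLite has no timestampdiff, so the difference is computed with strftime('%s', ...); the `id` column, the `normalize` helper, and the sample dates are hypothetical. Each pass reads the full old-to-new mapping first and only then writes it, keyed by id, so a pass never reads rows it has already rewritten.

```python
import sqlite3

# SQLite stand-in (an assumption) for MySQL's timestampdiff(SECOND, o.date, t.date).
DIFF = ("(CAST(strftime('%s', t.date) AS INTEGER)"
        " - CAST(strftime('%s', o.date) AS INTEGER))")

# For every row, find the earliest date that lies 1 to 3 seconds before it.
FIND = f"""
    SELECT t.id,
           (SELECT min(o.date) FROM tbl o WHERE {DIFF} BETWEEN 1 AND 3) AS new_date
    FROM tbl t
"""

def normalize(conn):
    """Repeat the date-snapping pass until no row changes; return the pass count."""
    passes = 0
    while True:
        # Read the whole mapping before writing anything.
        mapping = [(new_date, row_id)
                   for row_id, new_date in conn.execute(FIND)
                   if new_date is not None]
        if not mapping:
            return passes
        conn.executemany("UPDATE tbl SET date = ? WHERE id = ?", mapping)
        passes += 1

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, date TEXT)")
conn.executemany(
    "INSERT INTO tbl (date) VALUES (?)",
    [("2009-10-08 12:23:01",), ("2009-10-08 12:23:03",), ("2009-10-08 12:23:05",)],
)
passes = normalize(conn)
dates = [d for (d,) in conn.execute("SELECT date FROM tbl ORDER BY id")]
print(passes, dates)
```

With these three sample dates the first pass moves 12:23:05 only as far as 12:23:03, because 12:23:01 is more than 3 seconds away from it, and a second pass is needed to bring it down to 12:23:01.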

Lukáš Lalinský 2009-10-08 11:15:09

I ran the SELECT part of your proposed UPDATE query against the table and it seems like it lists all the right (new) dates for all values, so it's exactly what I wanted. I already tried a similar query but it returned a couple of thousand rows too many, while this one returns exactly the right amount.

pbean 2009-10-08 12:00:22

Answer 3

A:

"due to lag in insertion"

Why don't you get the date for insert before inserting/updating the first row and use that for all the other rows?
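
A minimal sketch of that idea, assuming a Python client and using sqlite3 purely for illustration: the timestamp is taken once, before the first insert, and reused for every row in the batch. The table name `items`, the `value` column, and the `insert_batch` helper are hypothetical; only `date_added` is borrowed from the thread.

```python
import sqlite3
from datetime import datetime

def insert_batch(conn, values, now=None):
    """Insert all values with one shared timestamp, taken once up front."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
    conn.executemany(
        "INSERT INTO items (value, date_added) VALUES (?, ?)",
        [(v, stamp) for v in values],
    )
    return stamp

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (value TEXT, date_added TEXT)")
stamp = insert_batch(conn, ["a", "b", "c"])

# Every row of the batch carries the same date, however long the inserts took.
total, distinct = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT date_added) FROM items"
).fetchone()
print(total, distinct)
```

Because the date is computed in the client rather than by the database at insert time, a slow insert can no longer push later rows into the next second.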

Johannes Rudolph 2009-10-08 12:05:25

I do that now (that was the fix) but there are still old rows in the database with the dates mixed up. In fact, a full fix would introduce a new database layout but at this moment we can't do that.

pbean 2009-10-08 13:08:50

Answer 4

A:

Hi, try this:

Assuming you have this structure:

create table tbl(id int identity, dt datetime)
insert into tbl (dt) values('2009-10-08 12:23:01')
insert into tbl (dt) values('2009-10-08 12:23:01')
insert into tbl (dt) values('2009-10-08 12:23:02')
insert into tbl (dt) values('2009-10-08 12:23:05')
insert into tbl (dt) values('2009-10-08 12:23:05')
insert into tbl (dt) values('2009-10-08 12:23:06')

This query will only show the last item of each set that's 1 second late:

select distinct A.* from tbl A
join (select * from tbl) AS T on datediff(ss, T.dt, A.dt) = 1

Using that in conjunction with an UPDATE statement, you get this:

update tbl set dt = (select top 1 dt from tbl where tbl.id < A.id order by tbl.id desc)
from tbl A
join (select * from tbl) AS T on datediff(ss, T.dt, A.dt) = 1

And that updates the last record of each set to the date above it, giving the results:

1           2009-10-08 12:23:01.000
2           2009-10-08 12:23:01.000
3           2009-10-08 12:23:01.000
4           2009-10-08 12:23:05.000
5           2009-10-08 12:23:05.000
6           2009-10-08 12:23:05.000

It's quick and dirty and unoptimized, but for a once-off data-scrub it should work.

Remember to back up!

Wez 2009-10-08 13:18:39
