ansaurus

Question

How can I delete pairs of matching rows?

Answer 1

+1 A:

depending on how much data there is, I would just do it the brute force way.

select all the negative total rows into a temp table

use a cursor to go through each row, then query the database for a single match (using maybe max() on a timestamp, order number, or whatever primary key you might have. Delete that one "matching" row.

then delete all the negative rows

No doubt you can use a subquery and do it in one statement, but by the time I figured it out and tested it, I would have the job done using the above :)

MikeW 2009-11-12 19:11:26

Answer 2

A:

What is the shared identifier that links the 2 rows? Without this, you can't because you have nothing to link rows

Anyway, it would be something like

DELETE MyTable
WHERE EXISTS (
    SELECT * FROM MyTable M2
    GROUP BY M2.LinkID
    HAVING SUM(M2.ValueCol) < 0 AND MyTable.KeyCol = M2.KeyCol
    )

gbn 2009-11-12 19:11:36

Answer 3

A:

I'd run the inner SELECT, without the wrapping DELETE once, to see that the data looks ok, before executing, but I'm pretty sure this'd be fine

DELETE FROM
   orders
WHERE
   orderID IN (
       SELECT
          orderID
       FROM (
          SELECT 
             MIN(orderID) orderID, total
          FROM
             orders
          WHERE
             total IN (
                SELECT
                   total * -1
                FROM
                   orders
                WHERE
                   total < 0
             )
          GROUP BY
             total
      )derived
   )

DELETE FROM
    orders
WHERE
    total < 0

David Hedlund 2009-11-12 19:20:47

Answer 4

A:

Data cleanup tasks are painful no matter what. From what you've described, there is not enough information to fully automate this task. This is typical for data cleanup.

First you need to have a talk with your immediate manager and let him know the magnitude of the problem. It's not your fault the data is all screwed up, and it will take time to fix it without losing any valid information and without interrupting the sales operations.

The most important tip about data cleanup is that it's more trouble than it's worth to try to automate fully. Your strategy should be to reduce the problem by taking care of the easy cases, until you can do the remainder manually. There will always be complex edge cases, and trying to handle them all with clever SQL is an exercise in diminishing returns.

Take care of the low-hanging fruit, where the negative "correction" has a valid order number, so you can make a strong correlation to the order it is intended to cancel.
Create a correlation between the remaining negatives and the most recent single order rows with the same quantity. Use other columns to correlate them if you can, for instance if the correction is entered by the same salesperson who entered the original order.
The next stage would be to delete negatives where the order number is valid, but it maps to multiple rows that sum up to the total value.
Then start on matching negatives without order numbers to multiple rows that sum up to the value in the correction. This can be tricky to automate, but by this time the number of negatives might be few enough that you can do it manually, by eyeballing them one by one.

The other tip is that SQL Anywhere appears to have a multi-table DELETE syntax. I don't use SQL Anywhere, but I found this in the online docs:

Syntax

DELETE [ row-limitation ] 
  [ FROM ] [ owner.]table-expression
  [ FROM table-list [,...] ]
  [ WHERE search-condition ]
  [ ORDER BY { expression | integer } [ ASC | DESC ], ... ]
  [ OPTION( query-hint, ... ) ]

It looks like the first FROM clause lists the table you want to delete rows in. The second FROM clause allows you to do joins for purposes of restricting the rows. Since you're likely to be doing self-joins, remember that you need to give an alias (aka correlation name) in the first FROM to avoid ambiguity.

Bill Karwin 2009-11-12 19:29:19

ansaurus

tags:

views:

answers:

How can I delete pairs of matching rows?

related questions