views:

95

answers:

4

I have a table containing some names and their associated ID, along with a snapshot:

snapshot, systemid, name[, some, other, columns]

I need to identify all the unique names that a systemid has had across all snapshots, but only where there has been at least once change.

For example, with the data:

'DR1', 0, 'MOUSE_SPEED'
'DR1', 1, 'MOUSE_POS'
'DV8', 0, 'MOUSE_BUTTONS'
'DV8', 1, 'MOUSE_POS'
'DR6', 0, 'MOUSE_BUTTONS'
'DR6', 1, 'MOUSE_POS'
'PP2', 0, 'MOUSE_SPEED'
'PP2', 1, 'MOUSE_POS'

...I'd like a query that will return (in any order):

0, 'MOUSE_SPEED'
0, 'MOUSE_BUTTONS'

Additionally, it would be useful to have the inverse - a list of systemids that have remained stable across all snapshots (that is, where the name has never changed).

I am using PostgreSQL v8.4.2.

EDIT: Updated to reflect comments (sorry for the original less-than-perfect post, I am new here!).

A: 

PostgreSQL has the EXCEPT operator, which I recall is pretty much the same as MINUS (such as in Oracle), so maybe something like would work?

select id, name
from some_table
where snapshot = '1' and id in ('1', '2', '0')
except
select id, name
from some_table
where snapshot = '2' and id in ('1', '2', '0')

If you have multiple shapshots, you could try concatenating them all into one long sequence of EXCEPTs, or you could write a procedure to handle them iteratively, such as (pseudocode):

for i = 1 to maX(snapshot)-1 loop
    results := diff_query(i, i+1)  //the query above, but inside a procedure or something
    forall records in results loop
        /* do your processing  here */
    end loop
end loop

This really seems like the sort of thing to use set operators for.

FrustratedWithFormsDesigner
A: 

Following is with SQL Server but it does not use any SQL Server specific constructs. It should be portable to postgresql.

SQL Statement

SELECT  DISTINCT t1.id, t1.name
FROM    @Table t1
        INNER JOIN (
          SELECT  t.id 
          FROM    (
                    SELECT  DISTINCT id, name
                    FROM    @Table
                  ) t
          GROUP BY t.id 
          HAVING COUNT(*) > 1
        ) t2 ON t2.id = t1.id

Test data

DECLARE @Table TABLE (snapshot INTEGER, id INTEGER, name VARCHAR(32))

INSERT INTO @TABLE
SELECT 1, 0, 'MOUSE_SPEED'
UNION ALL SELECT 1, 1, 'MOUSE_POS'
UNION ALL SELECT 1, 2, 'KEYBOARD_STATE'
UNION ALL SELECT 2, 0, 'MOUSE_BUTTONS'
UNION ALL SELECT 2, 1, 'MOUSE_POS'
UNION ALL SELECT 2, 2, 'KEYBOARD_STATE'
UNION ALL SELECT 3, 0, 'MOUSE_SPEED'
UNION ALL SELECT 3, 1, 'MOUSE_POS'
UNION ALL SELECT 3, 2, 'KEYBOARD_STATE'
Lieven
Sorry, the snapshot field is a string, such as 'PP16', 'PP16B', 'DS21'. There isn't a guaranteed ordering of them -- I don't mind which order changes were made, only that they were made and the snapshots where they were changed.
Noah
@Noah, if you can not establish the order of the snapshots from its value then you can not establish in which it changed. For example if PP16 is MOUSE_POS, PP16B is MOUSE_POS and DS21 is MOUSE_BUTTONS then you can not conclude in which one it changed. I can claim that it changed from MOUSE_POS to MOUSE_BUTTONS back to MOUSE_POS, but equally I can claim that it started as MOUSE_POS, remained that and changed to MOUSE_BUTTONS (other combinations exist).
Unreason
@Noah: I have updated the query to match you latest edits.
Lieven
Thank you - that is perfect. Sorry that it took a while for me to explain it properly.
Noah
@Noah, don't worry, you'll get better at it with each question you ask ;)
Lieven
A: 
select distinct s1.snapshot, s1.id, s1.name from snapshot s1, snapshot s2   
where s1.snapshot != s2.snapshot   
and s1.id = s2.id   
and s1.name != s2.name
geff_chang
Updated SQL statement.
geff_chang
Sorry, there a few a few other columns that I didn't list in the example (I don't care about them in this context) so DISTINCT didn't filter them out.
Noah
A: 

For the changed ones:

SELECT t1.snapshot, t1.systemid
FROM table t1
GROUP BY t1.snapshot, t1.systemid
HAVING min(t1.name) <> max(t1.name)

would give you the snapshot and id of the ones that have changed

For the ones that remained the same

SELECT t1.snapshot, t1.systemid
FROM table t1
GROUP BY t1.snapshot, t1.systemid
HAVING min(t1.name) = max(t1.name)

Joining the values back to the first or last query can be done with joined subquery or correlated subquery

Joined (example with names that changed)

SELECT t2.snapshot, t2.systemid, t2.name
FROM table t2
     JOIN (
           SELECT snapshot, systemid
           FROM table 
           GROUP BY snapshot, systemid
           HAVING min(name) <> max(name) ) t1
     ON t2.snapshot = t1.snapshot AND t2.systemid = t1.systemid

Correlated (example with names that remained the same)

SELECT t2.snapshot, t2.systemid, t2.name
FROM table t2
WHERE t2.name IN (
           SELECT t1.name
           FROM table t1
           WHERE t2.snapshot = t1.snapshot AND t2.systemid = t1.systemid
           GROUP BY t1.name
           HAVING COUNT(DISTINCT t1.name) = 1 ) 

If you don't need the snapshot for the inverse query then

SELECT DISTINCT t2.systemid, t2.name

and rest the same.

Queries are not validated, but I hope the approaches are clear

Unreason