tags:

views:

372

answers:

4

I have a database where I store objects. I have the following (simplified) schema

CREATE TABLE MyObjects
(
  UniqueIdentifier Id;
  BigInt           GenerationId;
  BigInt           Value;
  Bit              DeleteAction;
)

Each object has a unique identifier ("Id"), and a (set of) property ("Value"). Each time the value of the property for an object is changed, I enter a new row into this table with a new generation id ("GenerationId", which is monotonically increasing). If an object is deleted, then I record this fact by setting the "DeleteAction" bit to true.

At any point in time (generation), I would like to retrieve the state of all of my active objects!

Here's an example:

Id    GenerationId Value  DeleteAction
1        1          99       false
2        1          88       false
1        2          77       false
2        3          88       true

Objects in generations are:

  1: 1 {99}, 2 {88}
  2: 1 {77}, 2 {88}
  3: 1 {77}

The key is: how can I find out the row for each unique object who's generation id is closest (but not exceeding) to a given generation id? I can then do a post-filter step to remove all rows where the DeleteAction field is true.

A: 

Not sure whether that's standard SQL, but in Postgres, you can use the LIMIT flag:

 select GenerationId,Value,DeleteAction from MyObjects 
    where Id=1 and GenerationId < 3 
    order by GenerationId
    limit 1;
Martin v. Löwis
The point of the question was not having to specify Id=1, but rather return all Ids that match the generation requirement.
Philipp Schmid
Unfortunately, that wasn't clear from the question.
Martin v. Löwis
+4  A: 

This works in MS SQL

SELECT id,value
FROM Myobjects
INNER JOIN ( 
     SELECT id, max(GenerationID) as LastGen 
     FROM MyObjects
     WHERE GenerationID <= @Wantedgeneration
     Group by ID)
    On GenerationID = LastGen
WHERE DelectedAction = false
Johan Buret
Yup, that should do the trick. Atleast it's a good starting point. +1 ;)
sindre j
+2  A: 

My version uses a joint of the table MyObjects against a subset of itself, created by a subquery, and containing only the last generation for each object:

SELECT O.id,generation,value FROM 
     MyObjects O, 
     (SELECT id,max(generation) AS max_generation FROM MyObjects 
     WHERE generation <= $GENERATION_ID GROUP BY id) AS TheMax WHERE 
            TheMax.max_generation = generation AND O.deleted is False
     ORDER BY generation DESC;

In the above query, the GENERATION_ID is hardwired. A way to parametrize it is to write a function:

CREATE OR REPLACE FUNCTION generation_objects(INTEGER) RETURNS SETOF MyObjects AS
  'SELECT O.id,generation,value,deleted FROM 
       MyObjects O, 
       (SELECT id,max(generation) AS max_generation FROM MyObjects 
       WHERE generation <= $1 GROUP BY id) AS TheMax WHERE 
              TheMax.max_generation = generation AND O.deleted is False;'
  LANGUAGE SQL;

Now, it works. With this table:

> SELECT * FROM MyObjects;          
 id | generation | value | deleted 
----+------------+-------+---------
  1 |          1 |    99 | f
  2 |          2 |    88 | f
  1 |          3 |    77 | f
  2 |          4 |    88 | t
  3 |          5 |    33 | f
  4 |          6 |    22 | f
  3 |          7 |    11 | f
  2 |          8 |    11 | f

I get:

> SELECT * FROM generation_objects(1) ORDER by generation DESC;
 id | generation | value | deleted 
----+------------+-------+---------
  1 |          1 |    99 | f

> SELECT * FROM generation_objects(2) ORDER by generation DESC;
 id | generation | value | deleted 
----+------------+-------+---------
  2 |          2 |    88 | f
  1 |          1 |    99 | f

> SELECT * FROM generation_objects(3) ORDER by generation DESC;
 id | generation | value | deleted 
----+------------+-------+---------
  1 |          3 |    77 | f
  2 |          2 |    88 | f

And then, at the following generation, object 2 is deleted:

> SELECT * FROM generation_objects(4) ORDER by generation DESC;
 id | generation | value | deleted 
----+------------+-------+---------
  1 |          3 |    77 | f
bortzmeyer
+1  A: 

Here's the working version:

SELECT MyObjects.Id,Value
FROM Myobjects
INNER JOIN 
(      
  SELECT Id, max(GenerationId) as LastGen
  FROM MyObjects
  WHERE GenerationId <= @TargetGeneration
  Group by Id
) T1
ON MyObjects.Id = T1.Id AND MyObjects.GenerationId = LastGen
WHERE DeleteAction = 'False'
Philipp Schmid