tags:

views:

77

answers:

3

I have two tables :

T_STOCK: primary key is id, seller, and some others fields let say a and b. T_FLOW: primary key is (id + startdate), and some others fields, for example c and d.

I want a query that returns all the columns for each records from T_STOCK regarding a specific seller, but completed with the columns (startDate, c and d) from the T_FLOW table.

The relation between T_STOCK and T_FLOW is based on the id attribute. Everytime a record with a specific ID exists in T_STOCK, at least one record exist in T_FLOW for this ID.

However, it may happen that more than one record exist in T_FLOW. In this case, I must consider only the most recent one (i.e. the one with max(startDate)).

In others words, if we have the following tables content:

+---------------------+
|       T_STOCK       |
+----+--------+---+---+
| ID | SELLER | a | b |
+----+--------+---+---+
| 01 | foobar | 1 | 2 |
+----+--------+---+---+
| 02 | foobar | 3 | 4 |
+----+--------+---+---+
| 03 | foobar | 5 | 6 |
+----+--------+---+---+

+---------------------------+
|           T_FLOW          |
+----+------------+----+----+
| ID |  StartDate |  c |  d |
+----+------------+----+----+
| 01 | 01/01/2010 |  7 |  8 |
+----+------------+----+----+
| 02 | 01/01/2010 |  9 | 10 |
+----+------------+----+----+
| 02 | 07/01/2010 | 11 | 12 |
+----+------------+----+----+
| 03 | 03/01/2010 | 13 | 14 |
+----+------------+----+----+
| 03 | 05/01/2010 | 15 | 16 |
+----+------------+----+----+

The result of the query must be :

+----+--------+---+---+------------+----+----+
| ID | SELLER | a | b |  startDate |  c |  d |
+----+--------+---+---+------------+----+----+
| 01 | foobar | 1 | 2 | 01/01/2010 |  7 |  8 |
+----+--------+---+---+------------+----+----+
| 02 | foobar | 3 | 4 | 03/01/2010 | 11 | 12 |
+----+--------+---+---+------------+----+----+
| 03 | foobar | 5 | 6 | 01/01/2010 | 15 | 16 |
+----+--------+---+---+------------+----+----+

How do I write my query then?

+3  A: 
SELECT  *
FROM    t_stock s
JOIN    (
        SELECT  f.*, ROW_NUMBER() OVER (PARTITION BY id ORDER BY startDate DESC) AS rn
        FROM    t_flow f
        ) f
ON      f.id = s.id
        AND f.rn = 1

Here's a solution that does not use the analytic functions:

SELECT  *
FROM    t_stock s
JOIN    t_flow f
ON      (f.id, f.startDate) =
        (
        SELECT  id, MAX(startDate)
        FROM    t_flow fi
        WHERE   fi.id = s.id
        GROUP BY
                id
        )
Quassnoi
Is there any solution to write this query without using `row_number() over (partition ...` ?
romaintaz
A: 
SELECT  DISTINCT
        s.*
       ,FIRST_VALUE(f.startdate)
        OVER (PARTITION BY f.id ORDER BY f.startdate DESC) startdate
       ,FIRST_VALUE(f.c)
        OVER (PARTITION BY f.id ORDER BY f.startdate DESC) c
       ,FIRST_VALUE(f.d)
        OVER (PARTITION BY f.id ORDER BY f.startdate DESC) d
FROM    t_stock s, t_flow f
WHERE   f.id = s.id
Jeffrey Kemp
+1  A: 

You get the most recent T_FLOW record using analytics, as shown by Quassnoi, or using:

select id, max(startdate) last_start_date from t_flow group by id;

You can then join this with your T_STOCK table - something like:

select
    s.*,
    f.*
from
    t_stock s
        inner join t_flow f on
                f.id = s.id
            and (f.id, f.startdate) in
                (
                select
                    id,
                    max(startdate) laststartdate
                from
                    t_flow
                group by
                    id
                )
Nick Pierpoint
You need one more join to get the other columns from `t_flow`.
Quassnoi
I as mid-edit :) - and still doesn't work
Nick Pierpoint
This should be `(f.id, f.startdate) IN …`
Quassnoi