views:

386

answers:

2

I'd like to join on a subquery / derived table that contains a WITH clause (the WITH clause is necessary to filter on ROW_NUMBER() = 1). In Teradata something similar would work fine, but Teradata uses QUALIFY ROW_NUMBER() = 1 instead of a WITH clause.

Here is my attempt at this join:

-- want to join row with max StartDate on JobModelID
INNER JOIN (
    WITH AllRuns AS (
        SELECT *,
            ROW_NUMBER() OVER (PARTITION BY JobModelID ORDER BY StartDate DESC) AS RowNumber
        FROM Runs
    )
    SELECT * FROM AllRuns WHERE RowNumber = 1
) Runs
ON JobModels.JobModelID = Runs.JobModelID

What am I doing wrong?

A: 

You could use multiple WITH clauses. Something like

;WITH AllRuns AS ( 
        SELECT  *, 
                ROW_NUMBER() OVER (PARTITION BY JobModelID ORDER BY StartDate DESC) AS RowNumber 
        FROM    Runs 
),
Runs AS(
        SELECT  * 
        FROM    AllRuns 
        WHERE   RowNumber = 1
)

SELECT  *
FROM    ... INNER JOIN ( 
        Runs ON JobModels.JobModelID = Runs.JobModelID 

For more detail on the usages/structure/rules see WITH common_table_expression (Transact-SQL)

astander
Ugly, but it seems to work. What's the fundamental issue here - lack of support for WITH clauses in derived tables? Also, I was able to eliminate the second WITH clause by joining on `(SELECT * FROM AllRuns WHERE RowNumber = 1)`.
jnylen
I would disaggree on the fact that multiple sattements in a with clause is UGLY. It is slightly different sintac to multiple layered sub selects, **which in most cases makes it extremely hard to read.**
astander
I mean ugly in the sense that the SQL for doing the subquery's task (filtering the `Runs` table to get one row per `JobModelID`) is now split between the beginning of the query and the subquery.
jnylen
A: 

Adding a join condition is probably less efficient, but usually works fine for me.

INNER JOIN (
    SELECT *,
           ROW_NUMBER() OVER 
           (PARTITION BY JobModelID 
           ORDER BY StartDate DESC) AS RowNumber
      FROM Runs
    ) Runs
ON JobModels.JobModelID = Runs.JobModelID 
AND Runs.RowNumber = 1
Adam Bernier
Any idea how much less efficient this would be?
jnylen
@jnylen, Have you made any attempt to profile this query? Execution plans, Profiler, Client Statistics?
astander
@jnylen: I agree with astander. My rule of thumb: if the query comes back in a reasonable amount of time, it's ok. If not, I start looking for hotspots and try to refactor. I've never had to refactor code like what I've shown above.
Adam Bernier
I don't have enough data where it would matter yet (still in development phase). The amount of data will probably stay pretty small anyway, so I was just looking for something subjective.
jnylen