When you perform a left join in T-SQL (MS SQL Server), is there any guarantee about which row is returned by the query if there are multiple matching rows on the right?

I'm trying to use this to exploit an ordering on the right table.

For example:

Select ColA, ColB, ColC 
from T
Left Outer Join 
   (Select ColA, ColB, ColC 
   from T -- CLARIFIED: this is a self join.
   Order by TopColumn Desc) AS OrderedT(ColA, ColB, ColC) 
   On T.ColA = OrderedT.ColA

I would expect to retrieve every ColA in the table and, for each one, the first row of the matching set from my left join, based on my ordering.

Is there any guarantee made on this by the language or server?

A: 

It's not that simple. A LEFT JOIN returns all matching right-hand rows, so the question of a guarantee doesn't really arise. You would need a subquery that uses TOP 1 to get the single row you want.
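
One way the TOP 1 subquery idea could be sketched in T-SQL is with OUTER APPLY (available from SQL Server 2005). This assumes the table T and the columns ColA, ColB, ColC and TopColumn from the question; the aliases T2, Latest, LatestColB and LatestColC are made up for illustration.

```sql
-- Sketch only: T, ColA, ColB, ColC and TopColumn come from the question;
-- the aliases T2, Latest, LatestColB and LatestColC are hypothetical.
-- For each row of T, fetch the one row with the same ColA
-- and the highest TopColumn.
SELECT T.ColA, T.ColB, T.ColC,
       Latest.ColB AS LatestColB,
       Latest.ColC AS LatestColC
FROM T
OUTER APPLY
   (SELECT TOP 1 T2.ColB, T2.ColC
    FROM T AS T2
    WHERE T2.ColA = T.ColA
    ORDER BY T2.TopColumn DESC) AS Latest;
```

If several rows share the highest TopColumn for a ColA, which of them TOP 1 picks is not defined unless the ORDER BY is extended with a tie-breaking column.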

David M
But I can't use TOP 1 as an aggregate. The whole point is to fetch the top row FOR EACH ColA, and I am trying to avoid using a cursor for that.
Spence
@David M: "The LEFT JOIN returns all matching right-hand rows" - isn't that the wrong way around?
Mitch Wheat
No. A LEFT JOIN returns all rows from the left, and **all matching rows** from the right.
David M
A left outer join will return one row for every row in the table on the left, with the joined columns set to null or a value. Regardless of the number of rows on the right-hand side, it will only return a row for every row on the left. AFAIK.
Spence
Sorry, not correct. If there are multiple matching rows on the right, multiple rows will be returned.
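
A minimal sketch that shows this behaviour, using two hypothetical table variables (@L, @R and the Val column are invented for the illustration and are not part of the question's schema):

```sql
-- Hypothetical table variables, for illustration only.
DECLARE @L TABLE (ColA int);
DECLARE @R TABLE (ColA int, Val varchar(10));

INSERT INTO @L (ColA) VALUES (1);
INSERT INTO @L (ColA) VALUES (2);
INSERT INTO @R (ColA, Val) VALUES (1, 'first');
INSERT INTO @R (ColA, Val) VALUES (1, 'second');

-- ColA = 1 matches two right-hand rows, so it appears twice.
-- ColA = 2 matches none, so it appears once with Val = NULL.
SELECT L.ColA, R.Val
FROM @L AS L
LEFT OUTER JOIN @R AS R
   ON L.ColA = R.ColA;
```

The result has three rows for two left-hand rows.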
David M
+1  A: 

A LEFT JOIN returns all left-hand rows that satisfy any WHERE criteria, regardless of whether there is a matching row on the right (on the join key(s)). Columns from the right table are returned as NULL where there is no match on the join key.

Mitch Wheat
Understood. I've clarified in my question that it is a self join, so the query will not have nulls on the right-hand side. Very valid point though.
Spence
+3  A: 

I believe you need this...

select T.ColA, T.ColB, T.ColC 
from T
inner join
   (select ColA, max(TopColumn) MaxTopColumn
   from T
   group by ColA) OrderedTable
   on T.ColA = OrderedTable.ColA and T.TopColumn = OrderedTable.MaxTopColumn

This is a fairly common query for versioned tables; it requires an inner join to a max query.

The table name "Table" doesn't help matters, so I've renamed it T.
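
One caveat with joining to the max query: if two rows share the same maximum TopColumn for a ColA, both are returned. As an alternative sketch (not the query this answer proposes), ROW_NUMBER(), available from SQL Server 2005, numbers the rows within each ColA so that exactly one can be kept. The aliases Ranked and rn are made up for illustration.

```sql
-- Sketch only: T, ColA, ColB, ColC and TopColumn come from the question;
-- Ranked and rn are hypothetical aliases.
-- Number the rows within each ColA from highest TopColumn down,
-- then keep only the first row of each group.
SELECT ColA, ColB, ColC
FROM (SELECT ColA, ColB, ColC,
             ROW_NUMBER() OVER (PARTITION BY ColA
                                ORDER BY TopColumn DESC) AS rn
      FROM T) AS Ranked
WHERE rn = 1;
```

When there are ties on TopColumn, the row that gets rn = 1 is arbitrary unless the ORDER BY includes a tie-breaking column.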

polyglot
Won't that complain that TopColumn isn't included in the GROUP BY? Or is it OK because it's in aggregate form?
Spence
Is the second part of the join necessary as well? Wouldn't the join on ColA = ColA be sufficient?
Spence
Only the non-aggregated columns in a query need to appear in the GROUP BY. I don't have a SQL Server to hand, but I have written this query dozens of times on MS SQL Server and Sybase, and this is the correct structure. The downside is that the aggregate query is a table scan (or full index scan) and there is no way around it. The performance isn't perfect, but for what you are asking this is the query.
polyglot
What you are trying to do is return the rows from T and restrict them on the basis of MaxTopColumn. Unless MaxTopColumn is specified as a join column, T will not be restricted.
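
To illustrate, here is the answer's query with the second predicate left out (a sketch, using the same names as the answer). Every ColA in T has exactly one row in OrderedTable, so each row of T finds a match and nothing is filtered:

```sql
-- Same query as in the answer, minus "T.TopColumn = OrderedTable.MaxTopColumn".
-- This returns every row of T, each paired with the maximum for its ColA,
-- rather than only the rows that hold that maximum.
select T.ColA, T.ColB, T.ColC, OrderedTable.MaxTopColumn
from T
inner join
   (select ColA, max(TopColumn) MaxTopColumn
    from T
    group by ColA) OrderedTable
   on T.ColA = OrderedTable.ColA
```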
polyglot
The table scan makes sense, but I'm working with an existing schema, so cheers for your help :).
Spence
No, I'm trying to self join T with the topmost row for its value. The OrderedTable query will return one row for every distinct ColA (key) value, so the second join predicate is unnecessary. Unless I've missed something.
Spence
Spence, I thought you wanted an entire row of the table, given a column A and a maximum value B (let it be the incrementing version number of A, for instance). So if you want all the A_id's, with the maximum version of each A (from column B), and the value of C that is stored only against that maximum version, then that's the query. But maybe my interpretation was ambiguous :)!
polyglot