ansaurus

Question

LEFT OUTER JOIN (gives extra rows) problem

Answer 1

+2 A:

select distinct * from @tb1 left outer join @tb2 ON c1 = c2

Gennady Shumakher 2009-11-11 09:08:49

This pushed the execution time from 100 ms up to 83 seconds.

Theofanis Pantelides 2009-11-11 09:20:29

You need to give us more detail about what you're trying to achieve.

Murph 2009-11-11 09:29:50

How big is the table that took 83 seconds? Are you using the LIKE operator in your real query?

Wez 2009-11-11 09:31:38

See my comment above.

Theofanis Pantelides 2009-11-11 09:35:05

13,166,165 rows, with 200,000+ per day.

Theofanis Pantelides 2009-11-11 09:36:07

Answer 2

+2 A:

Try using:

select DISTINCT * from @tb1 left outer join @tb2 ON c1 = c2

astander 2009-11-11 09:09:08

That would work in this simplified case, but I'm pretty sure that in reality the right-hand table has more columns, and then DISTINCT won't help.

Tor Haugen 2009-11-11 09:18:46

I agree. In that case he needs to decide what makes a row "distinct", or else display multiple rows with the same id from table 1.
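
To illustrate the point about extra columns, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for the table variables in the question; the `note` column, the sample rows, and the "smallest note wins" rule are all made up for the example. It shows that DISTINCT stops helping once the right-hand table has a second column, and that picking an explicit rule for which row to keep gets back to one row per id.

```python
import sqlite3

# In-memory database; tb1/tb2 mimic @tb1/@tb2, with a hypothetical extra column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tb1 (c1 INTEGER);
    CREATE TABLE tb2 (c2 INTEGER, note TEXT);
    INSERT INTO tb1 VALUES (1), (2);
    INSERT INTO tb2 VALUES (1, 'a'), (1, 'b'), (2, 'c');
""")

# DISTINCT cannot collapse the two rows for c1 = 1, because 'a' != 'b'.
with_distinct = conn.execute("""
    SELECT DISTINCT c1, note
    FROM tb1 LEFT OUTER JOIN tb2 ON c1 = c2
    ORDER BY c1, note
""").fetchall()

# Deciding that the smallest note wins gives exactly one row per c1.
one_per_key = conn.execute("""
    SELECT c1, x.note
    FROM tb1
    LEFT OUTER JOIN (
        SELECT c2, MIN(note) AS note FROM tb2 GROUP BY c2
    ) AS x ON c1 = x.c2
    ORDER BY c1
""").fetchall()

print(with_distinct)
print(one_per_key)
```

Any other rule (latest date, highest value, and so on) would work the same way, as long as it reduces the right-hand side to one row per join key before the join.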

astander 2009-11-11 09:20:54

How about SELECT DISTINCT ON (col, col)?

xenoterracide 2009-11-11 09:27:30

Answer 3

+1 A:

If you want to keep just single rows on the left hand side, you'll need to decide what you want to show on the right, for each unique value on the left. If you want to show a count, for example, you could do this:

select b1.c1, x.c from @tb1 b1 
left outer join 
(
  select c2, count(*) as c 
  from @tb2
  group by c2
) as x 
ON b1.c1 = x.c2

or if you just want one occurrence of each value from c2:

select b1.c1, x.c2 from @tb1 b1 
left outer join 
(
  select c2
  from @tb2
  group by c2
) as x 
ON b1.c1 = x.c2

davek 2009-11-11 09:11:15

The tables above are only an example, which is why each has just one column.

Theofanis Pantelides 2009-11-11 09:17:24

@Davek - thanks for this answer. I'm joining 5 tables with > 250K records and this answer was exactly what I needed.

TMG 2010-05-23 18:49:23

Answer 4

+2 A:

Hmm, the query is doing what it's supposed to, since there are duplicate records (or at least duplicate identifiers) in the right-hand table.

To get the effect you want, try something like:

SELECT * FROM @tb1 t1 LEFT OUTER JOIN (SELECT DISTINCT c2 FROM @tb2) t2 ON t1.c1 = t2.c2

If that isn't sufficient you'll need to explain the requirement in a bit more detail.

Murph 2009-11-11 09:11:39

Answer 5

+5 A:

Sorry, but your thinking is skewed.

Think about it this way: if you only want one single row from tb2 for each row in tb1, which one should the server choose? The fact is that from the definition of a join, every row in the right-hand-side table that matches the left-hand-side row is a match and must be included.

You'll have to ensure @tb2 has distinct values for c2 before the join. Murph's suggestion might do it, provided your SQL variant supports DISTINCT [column] (not all do).
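
A minimal sketch of the behaviour described above, assuming SQLite (through Python's sqlite3 module) in place of the table variables from the question; the sample rows are invented. The plain join returns one row per matching pair, so the duplicate value in tb2 doubles the row for c1 = 1, while deduplicating tb2 in a derived table first restores one row per left-hand value.

```python
import sqlite3

# In-memory database; tb1/tb2 mimic @tb1/@tb2 with made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tb1 (c1 INTEGER);
    CREATE TABLE tb2 (c2 INTEGER);
    INSERT INTO tb1 VALUES (1), (2), (3);
    INSERT INTO tb2 VALUES (1), (1), (2);
""")

# Plain left join: every matching right-hand row produces an output row.
plain = conn.execute(
    "SELECT c1, c2 FROM tb1 LEFT OUTER JOIN tb2 ON c1 = c2 ORDER BY c1"
).fetchall()

# Deduplicate the right-hand side before joining.
deduped = conn.execute("""
    SELECT c1, t2.c2
    FROM tb1
    LEFT OUTER JOIN (SELECT DISTINCT c2 FROM tb2) AS t2 ON c1 = t2.c2
    ORDER BY c1
""").fetchall()

print(plain)    # [(1, 1), (1, 1), (2, 2), (3, None)]
print(deduped)  # [(1, 1), (2, 2), (3, None)]
```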

Tor Haugen 2009-11-11 09:12:13

Yeah, I was afraid of that.

Theofanis Pantelides 2009-11-11 09:15:45

This was absolutely right. I have changed my thinking accordingly and altered my ON clause so that it matches uniquely, which fixed the problem.

Theofanis Pantelides 2009-11-11 10:01:32
