Hi, I need to select only the first row per name from a query that joins tables A and B. Table B contains multiple records with the same name. Neither table has an identifier column, and I can't change the schema because I don't own the DB.

TABLE A
NAME


TABLE B
NAME
DATA1
DATA2

Select Distinct A.NAME,B.DATA1,B.DATA2 
From A 
Inner Join B on A.NAME = B.NAME

This gives me:

  NAME       DATA1   DATA2
  sameName   1       2
  sameName   1       3
  otherName  5       7
  otherName  8       9

But I need to retrieve only one row per name:

  NAME       DATA1   DATA2
  sameName   1       2
  otherName  5       7

I was able to do this by inserting the result into a temp table with an identity column and then selecting the min ID per name.

The problem is that I need to do this in a single statement.

This is a DB2 database. Thanks.

A: 

Not sure if this will solve your problem or not, but you could try using the GROUP BY clause and group by one of the name columns.

DB2 Group by tutorial

Brett Bender
+5  A: 

Using a GROUP BY may get you part way there, but beware. If you do something like this:

Select A.NAME, min(B.DATA1), min(B.DATA2) 
From A Inner Join B on A.NAME = B.NAME 
Group by A.NAME;

You will get the result you are looking for:

  NAME       DATA1   DATA2
  sameName   1       2
  otherName  5       7

But only because of the data you are testing with. If you change the data, so that instead of:

otherName  8        9

you had:

otherName  8        4

It would return:

  NAME       DATA1   DATA2
  sameName   1       2
  otherName  5       4

Note that otherName does not return DATA1 and DATA2 from the same record!
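
To see the mixed-row effect for yourself, here is a minimal sketch using Python's sqlite3 module. SQLite is only a stand-in for DB2 here (an assumption made so the example runs anywhere), and the column types are guesses since the question does not give them. It loads the changed data from above and checks whether the grouped result is an actual row of B.

```python
import sqlite3

# In-memory SQLite database standing in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (NAME TEXT);
    CREATE TABLE B (NAME TEXT, DATA1 INTEGER, DATA2 INTEGER);
    INSERT INTO A VALUES ('sameName'), ('otherName');
    INSERT INTO B VALUES
        ('sameName', 1, 2), ('sameName', 1, 3),
        ('otherName', 5, 7), ('otherName', 8, 4);
""")

# Each MIN() is computed independently over the group.
rows = conn.execute("""
    SELECT A.NAME, MIN(B.DATA1), MIN(B.DATA2)
    FROM A INNER JOIN B ON A.NAME = B.NAME
    GROUP BY A.NAME
""").fetchall()
grouped = {name: (d1, d2) for name, d1, d2 in rows}

# The real (DATA1, DATA2) pairs stored for otherName.
b_rows = set(conn.execute(
    "SELECT DATA1, DATA2 FROM B WHERE NAME = 'otherName'"
).fetchall())

# True when the grouped pair does not exist as a row in B.
is_mixed = grouped["otherName"] not in b_rows
print(grouped, is_mixed)
```

The pair reported for otherName takes DATA1 from one record and DATA2 from another, so it matches no stored row.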

Update: A self-join with a comparison on one of the data values may help you. The following keeps only the B row with the smallest DATA2 for each name, because it is the only row with no b2 match:

SELECT a.*, b.* FROM a
   INNER JOIN b ON a.name = b.name
   LEFT JOIN b b2 ON b.name = b2.name AND b2.data2 < b.data2
   WHERE b2.name IS NULL;

However, this will only work if the values in DATA2 are unique per NAME.
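
The self-join idea can be tried out with a small sketch in Python's sqlite3 module (SQLite stands in for DB2 here, which is an assumption; the aliases ta, b1 and b2 are mine). It uses the anti-join form: a B row is kept only when no other row with the same name has a smaller DATA2.

```python
import sqlite3

# In-memory SQLite database standing in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (NAME TEXT);
    CREATE TABLE B (NAME TEXT, DATA1 INTEGER, DATA2 INTEGER);
    INSERT INTO A VALUES ('sameName'), ('otherName');
    INSERT INTO B VALUES
        ('sameName', 1, 2), ('sameName', 1, 3),
        ('otherName', 5, 7), ('otherName', 8, 9);
""")

# b2 matches any row of the same name with a smaller DATA2;
# rows without such a match (b2.NAME IS NULL) hold the minimum.
anti_join_rows = conn.execute("""
    SELECT ta.NAME, b1.DATA1, b1.DATA2
    FROM A ta
    INNER JOIN B b1 ON ta.NAME = b1.NAME
    LEFT JOIN B b2 ON b1.NAME = b2.NAME AND b2.DATA2 < b1.DATA2
    WHERE b2.NAME IS NULL
    ORDER BY ta.NAME
""").fetchall()
print(anti_join_rows)
```

If two rows of the same name shared the smallest DATA2, both would survive, which is the uniqueness caveat mentioned above.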

Chris Thornhill
There's a small typo on the last statement. It should be "IS NOT NULL" instead of "IN NOT NULL".
mbp
A: 

EDIT: This is completely wrong. Leaving it so that if anyone else thinks the same, they may learn something too :D

Select Distinct A.NAME,B.DATA1,B.DATA2 
From A 
Inner Join B on A.NAME = B.NAME

does not give you

  NAME       DATA1   DATA2
  sameName   1       2
  sameName   1       3
  otherName  5       7
  otherName  8       9

Distinct means it cannot return the same name twice...

Edit: After a quick look around at DB2, this sounds like a bug; you will need to GROUP BY, I think. Sounds like a pretty huge bug to have, though. :s

jim
Yeah, confusing! Always thought that distinct will return only unique data.
Galilyou
distinct will return unique rows, not fields - combination of NAME/DATA1/DATA2 will be unique. No bug here.
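
A quick way to check this is a sketch with Python's sqlite3 module (SQLite standing in for DB2 is an assumption, and the extra duplicate row ('sameName', 1, 2) is added purely for illustration). DISTINCT collapses the fully identical row but leaves every distinct NAME/DATA1/DATA2 combination.

```python
import sqlite3

# In-memory SQLite database standing in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (NAME TEXT);
    CREATE TABLE B (NAME TEXT, DATA1 INTEGER, DATA2 INTEGER);
    INSERT INTO A VALUES ('sameName'), ('otherName');
    -- The second ('sameName', 1, 2) is a hypothetical exact duplicate.
    INSERT INTO B VALUES
        ('sameName', 1, 2), ('sameName', 1, 2), ('sameName', 1, 3),
        ('otherName', 5, 7), ('otherName', 8, 9);
""")

join_sql = "FROM A INNER JOIN B ON A.NAME = B.NAME"
all_rows = conn.execute(
    "SELECT A.NAME, B.DATA1, B.DATA2 " + join_sql
).fetchall()
distinct_rows = conn.execute(
    "SELECT DISTINCT A.NAME, B.DATA1, B.DATA2 " + join_sql
).fetchall()

# Names still repeat, because uniqueness is judged on the whole row.
distinct_names = {row[0] for row in distinct_rows}
print(len(all_rows), len(distinct_rows), len(distinct_names))
```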
Arvo
A: 

If you can add to a temp table and then query from that, you can do it in one go.

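WITH T AS (
    SELECT A.NAME, B.DATA1, B.DATA2, ROW_NUMBER() OVER () AS ROW_NUM
    FROM A INNER JOIN B ON A.NAME = B.NAME
), RN AS (
    SELECT MIN(ROW_NUM) AS ROW_NUM FROM T GROUP BY NAME
)
SELECT T.NAME, T.DATA1, T.DATA2
FROM T INNER JOIN RN ON T.ROW_NUM = RN.ROW_NUM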

There are many other ways to write this, but that's how I've been doing similar things.

Tanktalus
A: 

Try to dedupe B like this (it keeps the row with the greatest DATA1, then DATA2, for each name):

SELECT  A.NAME, bb.DATA1, bb.DATA2
FROM    A
JOIN    B bb
ON      A.NAME = bb.NAME
WHERE   NOT EXISTS (SELECT  *
                    FROM    B b2
                    WHERE   b2.NAME = bb.NAME
                            AND (b2.DATA1 > bb.DATA1
                                OR (b2.DATA1 = bb.DATA1 AND b2.DATA2 > bb.DATA2)))

Add more OR clauses if more DATAx columns exist.

If A contains duplicates too, simply use DISTINCT as in the OP.
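
For anyone who wants to try the NOT EXISTS approach without a DB2 instance, here is a small sketch using Python's sqlite3 module (SQLite as a stand-in is an assumption; the alias b2 is mine). With the question's data it keeps the greatest (DATA1, DATA2) pair per name.

```python
import sqlite3

# In-memory SQLite database standing in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (NAME TEXT);
    CREATE TABLE B (NAME TEXT, DATA1 INTEGER, DATA2 INTEGER);
    INSERT INTO A VALUES ('sameName'), ('otherName');
    INSERT INTO B VALUES
        ('sameName', 1, 2), ('sameName', 1, 3),
        ('otherName', 5, 7), ('otherName', 8, 9);
""")

# A row survives only if no row of the same name sorts after it
# when comparing DATA1 first and DATA2 second.
not_exists_rows = conn.execute("""
    SELECT A.NAME, bb.DATA1, bb.DATA2
    FROM A
    JOIN B bb ON A.NAME = bb.NAME
    WHERE NOT EXISTS (
        SELECT 1
        FROM B b2
        WHERE b2.NAME = bb.NAME
          AND (b2.DATA1 > bb.DATA1
               OR (b2.DATA1 = bb.DATA1 AND b2.DATA2 > bb.DATA2))
    )
    ORDER BY A.NAME
""").fetchall()
print(not_exists_rows)
```

Flipping both > comparisons to < would keep the smallest pair instead, which is the row shown in the question's desired output.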

wqw
+2  A: 

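This will work. Because the ORDER BY column is the same as the partition column, which row is kept for each name is arbitrary: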

with temp as (
    select A.NAME, B.DATA1, B.DATA2, 
     row_number() over (partition by A.NAME order by A.NAME) as rownum
    from TABLEA A inner join TABLEB B
    on A.NAME = B.NAME
)
select NAME, DATA1, DATA2 from temp where rownum = 1

If you want to select the least value of data1 and within it data2, then use this variation:

with temp as (
    select A.NAME, B.DATA1, B.DATA2, 
     row_number() over (partition by A.NAME order by B.DATA1, B.DATA2) as rownum
    from TABLEA A inner join TABLEB B
    on A.NAME = B.NAME
)
select NAME, DATA1, DATA2 from temp where rownum = 1

Both queries will give one row per name.
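
The second variation can be checked with a short sketch in Python's sqlite3 module. This assumes SQLite 3.25 or later (needed for window functions) as a stand-in for DB2, uses the table names A and B from the question, and renames the CTE and row-number column to ranked and rn; all of that is my choice, not part of the answer above.

```python
import sqlite3

# In-memory SQLite database standing in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (NAME TEXT);
    CREATE TABLE B (NAME TEXT, DATA1 INTEGER, DATA2 INTEGER);
    INSERT INTO A VALUES ('sameName'), ('otherName');
    INSERT INTO B VALUES
        ('sameName', 1, 2), ('sameName', 1, 3),
        ('otherName', 5, 7), ('otherName', 8, 9);
""")

# Number the rows within each name by DATA1, then DATA2,
# and keep only the first row of every partition.
first_rows = conn.execute("""
    WITH ranked AS (
        SELECT A.NAME, B.DATA1, B.DATA2,
               ROW_NUMBER() OVER (
                   PARTITION BY A.NAME
                   ORDER BY B.DATA1, B.DATA2
               ) AS rn
        FROM A INNER JOIN B ON A.NAME = B.NAME
    )
    SELECT NAME, DATA1, DATA2
    FROM ranked
    WHERE rn = 1
    ORDER BY NAME
""").fetchall()
print(first_rows)
```

Unlike MIN(DATA1), MIN(DATA2) with GROUP BY, every returned pair comes from a single stored row.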

Rashmi Pandit