ansaurus

Question

Joining two tables on a column that has duplicate values (master and lookup tables)

Answer 1

+1 A:

SELECT DISTINCT X.id, X.cosub, Y.div
FROM X
LEFT OUTER JOIN Y ON Y.cosub = X.cosub
-- WHERE  xxxx here for optional where condition
-- ORDER BY xxxx  here for optional ordering clause

I'm not 100% sure you need DISTINCT (it would be good to avoid it); it depends on whether the small table has duplicates. The text of the question seems to imply there are none, but the example given shows dups...

Also beware: if table Y has multiple div values for a given cosub (i.e. several records with conflicting div values for the same cosub), the above query will return several rows for each matching X record, one per distinct div value, each repeating the data from table X.
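To make that row multiplication concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table contents are hypothetical sample data (the question's real data is not shown here), and SQLite merely stands in for whatever database is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE X (id INTEGER PRIMARY KEY, cosub TEXT);
    CREATE TABLE Y (cosub TEXT, div TEXT);
    -- Hypothetical data: cosub 'A' has two conflicting div values,
    -- cosub 'B' has the same row stored twice (an exact duplicate).
    INSERT INTO X VALUES (1, 'A'), (2, 'B');
    INSERT INTO Y VALUES ('A', 'north'), ('A', 'south'),
                         ('B', 'east'),  ('B', 'east');
""")

rows = conn.execute("""
    SELECT DISTINCT X.id, X.cosub, Y.div
    FROM X
    LEFT OUTER JOIN Y ON Y.cosub = X.cosub
    ORDER BY X.id, Y.div
""").fetchall()

for row in rows:
    print(row)
```

In this sketch DISTINCT collapses the exact duplicate for 'B', but id 1 still comes back twice, once per conflicting div value. DISTINCT alone cannot decide which div is the "right" one.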

Finally, the snippet uses a LEFT OUTER JOIN, which lets the result include records with only the data from table X (and NULL in place of the values that would normally come from Y) when a given X record has a cosub value not found in Y. The alternative is a plain JOIN, which EXCLUDES any such record from the results (i.e. the result then only includes records from X whose cosub exists in Y).
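A small side-by-side sketch of the two join types, again with Python's sqlite3 and made-up sample rows (cosub 'C' is deliberately missing from Y; none of this data comes from the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE X (id INTEGER PRIMARY KEY, cosub TEXT);
    CREATE TABLE Y (cosub TEXT, div TEXT);
    -- Hypothetical data: X row 2 has a cosub with no match in Y.
    INSERT INTO X VALUES (1, 'A'), (2, 'C');
    INSERT INTO Y VALUES ('A', 'north');
""")

# LEFT OUTER JOIN keeps every X row; unmatched Y columns come back as NULL.
left_rows = conn.execute("""
    SELECT X.id, X.cosub, Y.div
    FROM X
    LEFT OUTER JOIN Y ON Y.cosub = X.cosub
    ORDER BY X.id
""").fetchall()

# Plain (inner) JOIN drops X rows that have no match in Y.
inner_rows = conn.execute("""
    SELECT X.id, X.cosub, Y.div
    FROM X
    JOIN Y ON Y.cosub = X.cosub
    ORDER BY X.id
""").fetchall()

print(left_rows)
print(inner_rows)
```

Here the left join returns both X rows, with None (NULL) as the div for id 2, while the inner join returns only id 1.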

mjv 2009-12-14 03:55:42

We don't need DISTINCT for table X, as id is a unique primary key; only cosub has repeated values. Table Y has duplicates in cosub, so I will have to use DISTINCT for it.

2009-12-14 04:01:48

@nazer555 I suggest you remove these duplicates in Y before proceeding, allowing you to avoid the "DISTINCT" in the query which could significantly slow things down.

mjv 2009-12-14 04:05:04

All the values found in cosub of X have a matching value in the cosub column of table Y. Every instance of a cosub value has a different div value, and that makes it very difficult to join.

2009-12-14 04:15:14

What criteria would you like to join on, then? Would you like as many rows as there are distinct div values for a given cosub? Would you like to take the smallest value of div [for a given cosub]? The biggest one? Or maybe their average? Would you like the first n possible values of div? All of this is possible; you just need to define what the requirement should be. The snippet given would produce the output shown in the example, but maybe this example fails to show cases of cosubs with multiple div values, and the desired output for them.
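As an illustration of the "smallest / biggest / average" options, one possible shape is to collapse Y to a single row per cosub in a derived table and join to that. This is only a sketch under assumptions: div is taken to be numeric here, the sample rows are invented, and the alias names (agg, min_div, max_div, avg_div) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE X (id INTEGER PRIMARY KEY, cosub TEXT);
    CREATE TABLE Y (cosub TEXT, div INTEGER);  -- numeric div is an assumption
    -- Hypothetical data: cosub 'A' has two different div values.
    INSERT INTO X VALUES (1, 'A'), (2, 'A'), (3, 'B');
    INSERT INTO Y VALUES ('A', 10), ('A', 30), ('B', 20);
""")

rows = conn.execute("""
    SELECT X.id, X.cosub, agg.min_div, agg.max_div, agg.avg_div
    FROM X
    LEFT OUTER JOIN (
        -- One row per cosub, so the join cannot multiply X rows.
        SELECT cosub,
               MIN(div) AS min_div,
               MAX(div) AS max_div,
               AVG(div) AS avg_div
        FROM Y
        GROUP BY cosub
    ) AS agg ON agg.cosub = X.cosub
    ORDER BY X.id
""").fetchall()

for row in rows:
    print(row)
```

Because the derived table has exactly one row per cosub, each X record appears once and no DISTINCT is needed on the outer query. Which aggregate to keep is exactly the requirement that still has to be clarified.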

mjv 2009-12-14 04:21:12

I'll be getting back to the client to clarify this tomorrow.

2009-12-14 04:46:55

Answer 2

+1 A:

Use:

   SELECT DISTINCT
          x.id,  
          x.cosub,
          y.div
     FROM TABLE_x x
LEFT JOIN (SELECT t.cosub,
                  t.div  -- add other columns here as needed
             FROM TABLE_Y t) y ON y.cosub = x.cosub

From the comments I've read, you need to pre-process the records in TABLE_Y to get the correct div/etc values before returning a resultset.
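One way such pre-processing might look, if the duplicates in TABLE_Y are exact repeats of the same (cosub, div) pair, is a DISTINCT inside the derived table. The sketch below runs that variant through Python's sqlite3 with invented sample rows; it is an assumption about the data, not a confirmed description of the asker's tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE_X (id INTEGER PRIMARY KEY, cosub TEXT);
    CREATE TABLE TABLE_Y (cosub TEXT, div TEXT);
    -- Hypothetical data: ('A', 'north') is stored twice in TABLE_Y.
    INSERT INTO TABLE_X VALUES (1, 'A'), (2, 'A'), (3, 'B');
    INSERT INTO TABLE_Y VALUES ('A', 'north'), ('A', 'north'), ('B', 'east');
""")

# Joining the raw table repeats X rows once per duplicate Y row.
naive_count = conn.execute("""
    SELECT COUNT(*)
    FROM TABLE_X x
    LEFT JOIN TABLE_Y y ON y.cosub = x.cosub
""").fetchone()[0]

# De-duplicating inside the derived table keeps one row per X record.
rows = conn.execute("""
    SELECT x.id, x.cosub, y.div
    FROM TABLE_X x
    LEFT JOIN (SELECT DISTINCT t.cosub, t.div
               FROM TABLE_Y t) y ON y.cosub = x.cosub
    ORDER BY x.id
""").fetchall()

print(naive_count)
print(rows)
```

With this sample data the raw join yields 5 rows, while the de-duplicated join yields one row per X record. If a cosub carried genuinely different div values, the inner DISTINCT would not be enough and some rule for picking a value would still be needed.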

OMG Ponies 2009-12-14 05:20:20

This is almost right, but you need to add a 'distinct' to the second inner select.

Jon Wilson 2009-12-18 18:36:13

Updated to add `DISTINCT`

OMG Ponies 2009-12-18 18:59:24
