I just came across a query that does an inner join on a non-distinct field. I've never seen this before and I'm a little confused about this usage. Something like:

SELECT DISTINCT col_a, col_b, col_c
FROM myTable
INNER JOIN myOtherTable
   ON myTable.nonDistinctField = myOtherTable.nonDistinctField
-- WHERE some filters here...

I'm not quite sure what my question is, how to phrase it, or why exactly this confuses me, but could anyone explain why someone would need to do an inner join on a non-distinct field and then select only distinct values? Is there ever a legitimate use of an inner join on a non-distinct field? What would be the purpose? And if there is a legitimate reason for such a query, can you give examples of where it would be used?

A: 

I think one should join tables only based on relations between them (with distinct columns). Probably one should have a second look at the database design when one is writing a query like that.

ydobonmai
A primary key isn't the only kind of unique (or, in his nomenclature, distinct) field.
Adam Robinson
OK, I was just asking a question. I didn't mean only that. More than that, I didn't mean to answer that. I should have put it as a comment on the question rather than as an answer. But you would have been in a hurry to downvote then as well anyway. :-) Thanks.
ydobonmai
+2  A: 

I can't seem to come up with any valid reasons where it would make sense to do what you're asking. At least not without extreme constraints on the system.

Joe Philllips
+1  A: 

I can think of a reason--the non-distinct field is a foreign key.

The field is distinct in the foreign table, but not in the first table.

For example, let's say you're cleaning up an old crufty duplicate-ridden mailing list and you have already fixed the country field so that instead of storing country name, you store country ID. You join on countryid to country to get the country data, and now you can store additional data in the country table.

You now get the normalization benefits of the country table, while the SELECT DISTINCT still collapses the duplicate rows coming from the address table.

Slightly contrived, but it will work.
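
To make the mailing-list case concrete, here is a minimal sketch using SQLite through Python's sqlite3 module. The table names (address, country), their columns, and the sample rows are all made up for illustration; the only point is that country_id repeats in address but is unique in country, so the join is many-to-one and DISTINCT removes the duplicate address rows.

```python
import sqlite3

# Hypothetical schema: country_id is unique in country, repeated in address.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (country_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE address (
    address_id INTEGER PRIMARY KEY,
    email      TEXT,
    country_id INTEGER REFERENCES country(country_id)
);
INSERT INTO country VALUES (1, 'Canada'), (2, 'France');
-- Rows 1 and 2 are duplicates apart from their surrogate key.
INSERT INTO address VALUES
    (1, 'a@example.com', 1),
    (2, 'a@example.com', 1),
    (3, 'b@example.com', 2),
    (4, 'c@example.com', 1);
""")

join_sql = """
FROM address
INNER JOIN country ON address.country_id = country.country_id
ORDER BY address.email, country.name
"""

# Plain join: one output row per address row, duplicates included.
joined = conn.execute("SELECT address.email, country.name " + join_sql).fetchall()

# Same join with DISTINCT: the duplicate address collapses to one row.
distinct_rows = conn.execute(
    "SELECT DISTINCT address.email, country.name " + join_sql
).fetchall()

print(len(joined))     # 4
print(distinct_rows)   # three unique (email, country) pairs
```

The join itself never multiplies rows here, because each address matches exactly one country; the DISTINCT is only cleaning up duplicates that already existed in the address data.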

Broam
+1  A: 

One plausible scenario would be a date-based reporting system that is queried for a particular time period, say the first 3 months of the year. Related data from another table could then be joined as you've described, where nonDistinctField is the year and month (yyyy.mm). The raw join result might still not make a lot of sense on its own, but you could be applying an aggregate function (SUM, AVG, etc.) grouped by month on top of it.

I suppose there should be lots of examples where aggregate queries could benefit from this type of join.

That's not to say it's a good idea, but you might be limited to using denormalized data or some really bad data model.
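
As a rough sketch of the year.month case, again using SQLite through Python's sqlite3 module: the sales and visits tables, the ym column, and the numbers are invented for illustration. Because ym repeats on both sides, the raw join multiplies rows, so DISTINCT (or aggregating each side before joining) is what makes the result meaningful.

```python
import sqlite3

# Hypothetical tables: ym (yyyy.mm) is non-distinct in both of them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales  (ym TEXT, amount INTEGER);
CREATE TABLE visits (ym TEXT, hits INTEGER);
INSERT INTO sales  VALUES ('2010.01', 100), ('2010.01', 50), ('2010.02', 70);
INSERT INTO visits VALUES ('2010.01', 10), ('2010.01', 20), ('2010.03', 5);
""")

# Raw join: every sales row pairs with every visits row of the same month,
# so 2 x 2 rows come back for 2010.01 and nothing for the other months.
raw_count = conn.execute(
    "SELECT COUNT(*) FROM sales INNER JOIN visits ON sales.ym = visits.ym"
).fetchone()[0]

# DISTINCT on the join key answers "which months appear in both tables?"
months_in_both = conn.execute(
    "SELECT DISTINCT sales.ym FROM sales "
    "INNER JOIN visits ON sales.ym = visits.ym"
).fetchall()

# Aggregating each side per month first avoids inflated sums from the
# row multiplication above.
monthly = conn.execute("""
SELECT s.ym, s.total_amount, v.total_hits
FROM (SELECT ym, SUM(amount) AS total_amount FROM sales GROUP BY ym) AS s
INNER JOIN (SELECT ym, SUM(hits) AS total_hits FROM visits GROUP BY ym) AS v
   ON s.ym = v.ym
""").fetchall()

print(raw_count)        # 4
print(months_in_both)   # [('2010.01',)]
print(monthly)          # [('2010.01', 150, 30)]
```

Summing amount directly over the raw join would count each sale once per matching visits row, which is one reason this kind of query often hints at a data model worth a second look.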

bic