views:

1191

answers:

8

If you have for example > 5 left joins in a query is that a code smell that there is ...

  • something wrong with your design?
  • you're doing too much in one query?
  • you're database is too normalized?
+26  A: 

It's a perfectly legitimate solution for some designs.

Say you have a hierarchy of one-to-many relations like Customer - Order - Basket - Item - Price, etc., which can be unfilled on any level: a Customer may have no Orders, an Order can have no Baskets, etc.

In this case you issue something like:

SELECT  *
FROM    Customer c
LEFT OUTER JOIN
        Order o
ON      o.CustomerID = c.ID
LEFT OUTER JOIN
        Basket b
ON      b.OrderID = c.ID
…

Note that it may be inefficient in some cases, and may be replaced with EXISTS or NOT EXISTS (if you only want to figure out that the corresponding records exist or do not exist in other tables).

See this article in my blog for performance details:

Quassnoi
The not exists vs join technique (which I'm a big fan of) is an excellent point, but again the join may be correct.
annakata
Sure. That's why I wrote "may be replaced" instead of "should be replaced at any cost and never to be used again" :)
Quassnoi
+12  A: 

In the sense that it's something you could/should investigate I'd say yes. It's likely you can get better utility and maintenance by factoring some views out of that.

In the sense that it's "bad code" no, this could quite easily be reasonable especially for larger DBs and modern databases will likely optimise any inefficiencies out.

annakata
+7  A: 

Nope it's perfectly fine to do, though if you find yourself writing the same queries/procedures over and over again using the same joins to the same tables, it maybe a candidate for creating a View just to simplify you're queries in future, and to reduce the number of touch points you'd need to change if you're schema changes

Eoin Campbell
+6  A: 

A lot of times you can alleviate the visual smell by creating helper views, I do not think there is a hard and fast rule of how many left joins are considered bad.

Unlike procedural coding, breaking down SQL into little bits and pieces can result in inefficient queries.

Sam Saffron
+2  A: 

It is pretty much impossible for someone to answer a question as general as this and to attempt to create such an arbitrary rule would pointless.

Left joins are a perfectly acceptable type of join which map onto a very common need: get me all x's, if they have associated y's then get those too.

Jack Ryan
+1  A: 

Your Results My Vary

Anything out of the ordinary could be a code-smell for anything. Like Quassnoi said it could be perfectly legitimate. It's not uncommon for really in-depth reports to require a crazy amount of joins to piece together the information correctly. That doesn't mean that the developer should looking at denormalizing their database.

TheTXI
+2  A: 

No, not at all. It's perfectly legitimate to construct a database design that uses a significant number of left joins on some queries.

Having said that I would generally prefer to construct the database so that the number of cases where outer joins are required is limited as experience tends to suggest that (complex) queries that use them are more error prone and likely to give rise to maintenance problems.

As an interesting historical aside, the early versions of IBM's DB2, when it ran on mainframes only, did not support outer joins (Oracle and Ingress both did at the time which was a major selling point). This lead to some interesting issues in database design as it was necessary to ensure that all expected data access requirements for the database could be solved using just inner joins.

Cruachan
A: 

I would contend that having to use many joins (e.g. deal with normalized data) is not a code smell, but rather an indication you might not be working in the right place. In my experience, those that are concerned about the number of joins in queries are developing too much in the database and not enough in the applications and reports that expose the data. The data structures must be flexible enough to support a myriad of uses and this is why normalization, to one degree or another, is important.

In building today's enterprise applications, developers can leverage yesterday's accomplishments to work at abstract levels high above technologies like SQL, and even XML, in order to deliver more value with less work. There are tools, i.e. Report Writers, Code Generators, ORMs, entity frameworks, etc., that abstract away the low level work of constructing SQL manually, and will perform the joins for you. Most are aware of the SQL dialect being used (for example, Oracle 9 vs MySQL 3) and can generate the SQL syntax that is most efficient for that dialect; meaning they can probably create the joins better than you can.

However, these tools work very poorly, or not at all, in a relational environment without sufficient normalization. For me, this is where a "development" smell manifests itself; if an established data access tool can't understand the relations under which I've structured my data, I probably need to look for a more normalized way to create those relationships and therein reap benefits far exceeding just the use of the tool. Typically, somewhere between 2nd and 3rd normal form is the sweet-spot; although invariably there tend to be small areas of relational data where a higher degree of normalization makes sense and adds value.

Cheers, Travis

Travis Heseman