If you have for example > 5 left joins in a query is that a code smell that there is ...
- something wrong with your design?
- you're doing too much in one query?
- you're database is too normalized?
If you have for example > 5 left joins in a query is that a code smell that there is ...
It's a perfectly legitimate solution for some designs.
Say you have a hierarchy of one-to-many relations like Customer
- Order
- Basket
- Item
- Price
, etc., which can be unfilled on any level: a Customer
may have no Orders
, an Order
can have no Baskets
, etc.
In this case you issue something like:
SELECT *
FROM Customer c
LEFT OUTER JOIN
Order o
ON o.CustomerID = c.ID
LEFT OUTER JOIN
Basket b
ON b.OrderID = c.ID
…
Note that it may be inefficient in some cases, and may be replaced with EXISTS
or NOT EXISTS
(if you only want to figure out that the corresponding records exist or do not exist in other tables).
See this article in my blog for performance details:
LEFT JOIN
's with NOT EXISTS
In the sense that it's something you could/should investigate I'd say yes. It's likely you can get better utility and maintenance by factoring some views out of that.
In the sense that it's "bad code" no, this could quite easily be reasonable especially for larger DBs and modern databases will likely optimise any inefficiencies out.
Nope it's perfectly fine to do, though if you find yourself writing the same queries/procedures over and over again using the same joins to the same tables, it maybe a candidate for creating a View just to simplify you're queries in future, and to reduce the number of touch points you'd need to change if you're schema changes
A lot of times you can alleviate the visual smell by creating helper views, I do not think there is a hard and fast rule of how many left joins are considered bad.
Unlike procedural coding, breaking down SQL into little bits and pieces can result in inefficient queries.
It is pretty much impossible for someone to answer a question as general as this and to attempt to create such an arbitrary rule would pointless.
Left joins are a perfectly acceptable type of join which map onto a very common need: get me all x's, if they have associated y's then get those too.
Your Results My Vary
Anything out of the ordinary could be a code-smell for anything. Like Quassnoi said it could be perfectly legitimate. It's not uncommon for really in-depth reports to require a crazy amount of joins to piece together the information correctly. That doesn't mean that the developer should looking at denormalizing their database.
No, not at all. It's perfectly legitimate to construct a database design that uses a significant number of left joins on some queries.
Having said that I would generally prefer to construct the database so that the number of cases where outer joins are required is limited as experience tends to suggest that (complex) queries that use them are more error prone and likely to give rise to maintenance problems.
As an interesting historical aside, the early versions of IBM's DB2, when it ran on mainframes only, did not support outer joins (Oracle and Ingress both did at the time which was a major selling point). This lead to some interesting issues in database design as it was necessary to ensure that all expected data access requirements for the database could be solved using just inner joins.
I would contend that having to use many joins (e.g. deal with normalized data) is not a code smell, but rather an indication you might not be working in the right place. In my experience, those that are concerned about the number of joins in queries are developing too much in the database and not enough in the applications and reports that expose the data. The data structures must be flexible enough to support a myriad of uses and this is why normalization, to one degree or another, is important.
In building today's enterprise applications, developers can leverage yesterday's accomplishments to work at abstract levels high above technologies like SQL, and even XML, in order to deliver more value with less work. There are tools, i.e. Report Writers, Code Generators, ORMs, entity frameworks, etc., that abstract away the low level work of constructing SQL manually, and will perform the joins for you. Most are aware of the SQL dialect being used (for example, Oracle 9 vs MySQL 3) and can generate the SQL syntax that is most efficient for that dialect; meaning they can probably create the joins better than you can.
However, these tools work very poorly, or not at all, in a relational environment without sufficient normalization. For me, this is where a "development" smell manifests itself; if an established data access tool can't understand the relations under which I've structured my data, I probably need to look for a more normalized way to create those relationships and therein reap benefits far exceeding just the use of the tool. Typically, somewhere between 2nd and 3rd normal form is the sweet-spot; although invariably there tend to be small areas of relational data where a higher degree of normalization makes sense and adds value.
Cheers, Travis