I have an SQL question, related to this and this question (but different). Basically I want to know how I can avoid a nested query.

Let's say I have a huge table of jobs (jobs) executed by a company over its history. Each job is characterized by year, month, location and the code of the tool used for the job. Additionally I have a table of tools (tools), translating tool codes to tool descriptions and further data about the tool. Now they want a website where they can select year, month, location and tool using dropdown boxes, after which the matching jobs will be displayed. I want to fill the last dropdown with only the relevant tools matching the preceding selection of year, month and location, so I wrote the following nested query:

SELECT c.tool_code, t.tool_description
FROM (
    SELECT DISTINCT j.tool_code
    FROM jobs AS j
    WHERE j.year = ....
      AND j.month = ....
      AND j.location = ....
) AS c
LEFT JOIN tools AS t
    ON c.tool_code = t.tool_code
ORDER BY c.tool_code ASC

I resorted to this nested query because it was much faster than performing a JOIN on the complete tables and selecting from that. It got my query time down a lot. But as I have recently read that MySQL nested queries should be avoided at all costs, I am wondering whether this approach is wrong. Should I rewrite my query differently? And how?

+1  A: 

No, you shouldn't; your query is fine.

Just create an index on jobs (year, month, location, tool_code) and tools (tool_code) so that the INDEX FOR GROUP-BY can be used.
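As a rough sketch, the two indexes could be created like this. The index names are made up for illustration, and if tools.tool_code is already the primary key of tools, the second statement is unnecessary:

```sql
-- Composite index: the three filter columns first, then the column being de-duplicated,
-- so the DISTINCT over tool_code can be resolved from the index alone.
CREATE INDEX ix_jobs_year_month_location_tool
    ON jobs (year, month, location, tool_code);

-- Lookup index for the join to tools (skip it if tool_code is already the primary key).
CREATE INDEX ix_tools_tool_code
    ON tools (tool_code);
```

The column order matters here: the equality-filtered columns come first so that the matching tool_code values sit together in the index.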

The article you linked describes subquery predicates (IN (SELECT ...)), not nested queries (SELECT FROM (SELECT ...)).

Even for subqueries, the article is wrong: while MySQL is not able to optimize all subqueries, it deals with IN (SELECT …) predicates just fine.

I don't know why the author chose to put DISTINCT here:

SELECT  id, name, price
FROM    widgets
WHERE   id IN
        (
        SELECT  DISTINCT widgetId
        FROM    widgetOrders
        )

or why they think it would improve performance. Given that widgetId is indexed, MySQL will just transform this query:

SELECT  id, name, price
FROM    widgets
WHERE   id IN
        (
        SELECT  widgetId
        FROM    widgetOrders
        )

into an index_subquery access.

Essentially, this is just like an EXISTS clause: the inner subquery is executed once per widgets row, with an additional predicate added:

SELECT  NULL
FROM    widgetOrders
WHERE   widgetId = widgets.id

and stops at the first match in widgetOrders.
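Written out by hand, the equivalent EXISTS form would look roughly like the following. This is only an illustration of the semantics, not the literal rewrite the optimizer performs, and the aliases w and o are mine:

```sql
-- For each widgets row, probe widgetOrders by widgetId;
-- EXISTS is satisfied as soon as one matching order is found.
SELECT  w.id, w.name, w.price
FROM    widgets w
WHERE   EXISTS
        (
        SELECT  NULL
        FROM    widgetOrders o
        WHERE   o.widgetId = w.id
        )
```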

This query:

SELECT  DISTINCT w.id, w.name, w.price
FROM    widgets w
INNER JOIN
        widgetOrders o
ON      w.id = o.widgetId

will have to use a temporary table to get rid of the duplicates and will be much slower.
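If you want to check this on your own data, comparing the two plans with EXPLAIN is one way to do it. I am not showing expected output, because the exact plan depends on the MySQL version, the indexes and the table statistics; newer versions may also pick a different strategy for the IN predicate.

```sql
-- Plan for the IN (SELECT ...) form: look at the "type" column of the subquery row.
EXPLAIN
SELECT  id, name, price
FROM    widgets
WHERE   id IN
        (
        SELECT  widgetId
        FROM    widgetOrders
        );

-- Plan for the JOIN + DISTINCT form: look for "Using temporary" in the "Extra" column.
EXPLAIN
SELECT  DISTINCT w.id, w.name, w.price
FROM    widgets w
INNER JOIN
        widgetOrders o
ON      w.id = o.widgetId;
```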

Quassnoi
+2  A: 

You could avoid the subquery by using GROUP BY, but if the subquery performs better, keep it.
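A minimal sketch of what the GROUP BY version might look like, using the tables from the question. The ? markers are prepared-statement placeholders standing in for the values elided in the original query:

```sql
-- Join first, then collapse duplicate tool codes with GROUP BY instead of a derived table.
SELECT j.tool_code, t.tool_description
FROM jobs AS j
LEFT JOIN tools AS t
    ON t.tool_code = j.tool_code
WHERE j.year = ?
  AND j.month = ?
  AND j.location = ?
GROUP BY j.tool_code, t.tool_description
ORDER BY j.tool_code ASC;
```

Whether this beats the nested query depends on the data and the indexes, so it is worth timing both.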

Why do you use a LEFT JOIN instead of a JOIN to join tools?

Peter Lang
Because I haven't yet set up foreign key relationships between the tables. After I do that, I can switch to a JOIN, but at the moment it is still possible for jobs with as-yet-unknown tool codes to be added to the database. I need these jobs to still show up in the select.
littlegreen