I want to optimize a query by using a UNION as a subquery, but I'm not really sure how to construct it. I'm using MySQL 5.

Here is the original query:

SELECT  Parts.id 
FROM    Parts_Category, Parts 
    LEFT JOIN Image ON Parts.image_id = Image.id 
WHERE 
( 
    (
     Parts_Category.category_id = '508' OR 
     Parts_Category.main_category_id ='508'
    ) AND 
    Parts.id = Parts_Category.Parts_id 
) AND 
Parts.status = 'A' 
GROUP BY 
    Parts.id

What I want to do is replace the (Parts_Category.category_id = '508' OR Parts_Category.main_category_id = '508') part with the union below. That way I can drop the GROUP BY clause and use plain single-column indexes, which should improve performance. The Parts and Parts_Category tables contain half a million records each, so any gain would be great.

(
    SELECT * FROM
    (
     (SELECT Parts_id FROM Parts_Category WHERE category_id = '508') 
     UNION 
     (SELECT Parts_id FROM Parts_Category WHERE main_category_id = '508')
    )
    as Parts_id
)

Can anybody give me a clue on how to rewrite it? I've tried for hours but can't get it to work, as I'm fairly new to MySQL.

A: 
SELECT  Parts.id
FROM    (
        SELECT  parts_id
        FROM    Parts_Category
        WHERE   Parts_Category.category_id = '508'
        UNION
        SELECT  parts_id
        FROM    Parts_Category
        WHERE   Parts_Category.main_category_id = '508'
        ) pc
JOIN    Parts
ON      Parts.id = pc.parts_id
        AND Parts.status = 'A'
LEFT JOIN
        Image
ON      Image.id = Parts.image_id

Note that MySQL can use Index Merge, so you can also rewrite your query like this:

SELECT  Parts.id
FROM    (
        SELECT  DISTINCT parts_id
        FROM    Parts_Category
        WHERE   Parts_Category.category_id = '508'
                OR Parts_Category.main_category_id = '508'
        ) pc
JOIN    Parts
ON      Parts.id = pc.parts_id
        AND Parts.status = 'A'
LEFT JOIN
        Image
ON      Image.id = Parts.image_id

, which will be more efficient if you have the following indexes:

Parts_Category (category_id, parts_id)
Parts_Category (main_category_id, parts_id)
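
As a rough sketch, the two indexes above could be created and the inner query's plan checked as follows. The index names are made up for illustration; only the table and column names come from the question. Whether the optimizer actually chooses an index merge depends on the MySQL version and the table statistics, so treat the EXPLAIN output as something to inspect rather than a guaranteed result.

```sql
-- Hypothetical index names; the column lists are the ones suggested above.
CREATE INDEX idx_pc_category_part
    ON Parts_Category (category_id, parts_id);
CREATE INDEX idx_pc_main_category_part
    ON Parts_Category (main_category_id, parts_id);

-- Check the plan of the inner query: look for index_merge in the
-- "type" column and for "Using temporary" / "Using filesort" in "Extra".
EXPLAIN
SELECT  DISTINCT parts_id
FROM    Parts_Category
WHERE   category_id = '508'
        OR main_category_id = '508';
```

Running the same EXPLAIN against the UNION version makes it easier to compare the two plans side by side.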
Quassnoi
Thanks ever so much, Quassnoi. FYI, on a 62,000-row category the query returns around 1 second faster!
John
I think in your second example I would have to add a GROUP BY Parts.id, as I don't want the duplicates where a part may be listed in other categories. That was the idea behind using UNION. I will compare the performance anyway. Thanks again.
John
GROUP BY was needed in the second example, and once it is added I get the same problem: EXPLAIN shows a filesort and a temporary table.
John
@John: a `UNION` still needs a filesort. But you're right, I need to add a `DISTINCT` to the second query.
Quassnoi
Thanks Quassnoi, I just added DISTINCT on parts_id in place of the GROUP BY on Parts.id, as you said. Interestingly, EXPLAIN now shows a temporary table in Extra and 519,330 in the rows column. That wasn't the case when using single indexes on each column in the first example. I re-checked the indexes and they are correct.
John
@John: did you create the indexes I wrote about? How is the performance on the query?
Quassnoi
Still testing both. It seems they give different results depending on how many rows are involved.
John