Hi everyone,

I have a UNION query consisting of two fast queries.

    (
      SELECT DISTINCT
        ( SELECT strStatus
          FROM User_User_XR uuxr
          WHERE ( uuxr.intUserId1 = '1' AND uuxr.intUserId2 = u.intUserId )
        ) AS strFriendStatus1,
        uuxro.strStatus AS strFriendStatus2,
        uuxr.intUserId2 AS intUserId,
        u.strUserName,
        u.strGender,
        IF( u.dtmBirth != '0000-00-00', FLOOR(DATEDIFF(CURDATE(), u.dtmBirth) / 365.25), '?' ) AS intAge,
        u.strCountry AS strCountryCode,
        c.strCountry AS strCountry,
        u.strAvatar,
        u.fltPoints,
        IF( o.intUserId IS NULL, 'offline', 'online' ) AS strOnline,
        IF( u.strAvatar != '', CONCAT( 'avatars/60/', u.strAvatar ), CONCAT( 'images/avatar_', u.strGender, 'small.png' ) ) AS strAvatar,
        IF( u.strAvatar != '', CONCAT( 'avatars/150/', u.strAvatar ), CONCAT( 'images/avatar', u.strGender, '.png' ) ) AS strLargeAvatar,
        u.dtmLastLogin,
        u.dtmRegistered
      FROM User_User_XR uuxr, User u
      LEFT JOIN User_User_XR uuxro ON uuxro.intUserId2 = '1' AND uuxro.intUserId1 = u.intUserId
      LEFT JOIN Online o ON o.intUserId = u.intUserId
      LEFT JOIN Country c ON c.strCountryCode = u.strCountry
      WHERE u.intUserId = uuxr.intUserId2
        AND ( uuxr.strStatus = 'confirmed' )
        AND uuxr.intUserId1 = '1'
    )

    UNION

    (
      SELECT DISTINCT
        ( SELECT strStatus
          FROM User_User_XR uuxr
          WHERE ( uuxr.intUserId1 = '1' AND uuxr.intUserId2 = u.intUserId )
        ) AS strFriendStatus1,
        uuxro.strStatus AS strFriendStatus2,
        uuxr.intUserId1 AS intUserId,
        u.strUserName,
        u.strGender,
        IF( u.dtmBirth != '0000-00-00', FLOOR(DATEDIFF(CURDATE(), u.dtmBirth) / 365.25), '?' ) AS intAge,
        u.strCountry AS strCountryCode,
        c.strCountry AS strCountry,
        u.strAvatar,
        u.fltPoints,
        IF( o.intUserId IS NULL, 'offline', 'online' ) AS strOnline,
        IF( u.strAvatar != '', CONCAT( 'avatars/60/', u.strAvatar ), CONCAT( 'images/avatar_', u.strGender, 'small.png' ) ) AS strAvatar,
        IF( u.strAvatar != '', CONCAT( 'avatars/150/', u.strAvatar ), CONCAT( 'images/avatar', u.strGender, '.png' ) ) AS strLargeAvatar,
        u.dtmLastLogin,
        u.dtmRegistered
      FROM User_User_XR uuxr, User u
      LEFT JOIN User_User_XR uuxro ON uuxro.intUserId2 = '1' AND uuxro.intUserId1 = u.intUserId
      LEFT JOIN Online o ON o.intUserId = u.intUserId
      LEFT JOIN Country c ON c.strCountryCode = u.strCountry
      WHERE u.intUserId = uuxr.intUserId1
        AND ( uuxr.strStatus = 'confirmed' )
        AND uuxr.intUserId2 = '1'
    )

The first query runs in 0.0047s and the second in 0.0043s.

However, with the UNION, the combined query takes 0.27s. Why is this? There is no ORDER BY after the UNION, so why wouldn't MySQL simply take the two fast results and concatenate them?

A: 

Try using UNION ALL.

UNION on its own will remove any duplicate records, which implies a behind-the-scenes sort.
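
As a minimal illustration of the difference in semantics (using literal one-row selects rather than the tables from the question, purely as an assumption for the sake of a short example):

```sql
-- UNION (same as UNION DISTINCT) removes duplicate rows from the combined result.
SELECT 1 AS n UNION SELECT 1;      -- returns one row

-- UNION ALL keeps every row from both sides, so no duplicate check is needed.
SELECT 1 AS n UNION ALL SELECT 1;  -- returns two rows
```

If the two halves of the query can never return the same row, UNION ALL gives the same result while skipping the de-duplication work.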

RedFilter
UNION ALL reduced it to 0.17s. It did something, but it's still way too slow :( Thanks for the answer though :)
Armin
A: 

A UNION causes a temporary table to be created, even for a UNION ALL.

When a UNION DISTINCT (which is the same as UNION) is performed, the temporary table is created with an index so that duplicates can be removed. With UNION ALL, the temporary table is created, but without the index.

This explains the slight performance improvement when using UNION ALL, and also accounts for the performance drop when using UNION instead of two separate queries.

For more information on this, see the following entry on the MySQL performance blog:

UNION vs UNION ALL Performance

The How MySQL Uses Internal Temporary Tables page from the MySQL docs states that a temporary table is created when:

... any column larger than 512 bytes in the SELECT list, if UNION or UNION ALL is used
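
One way to check this on a given server is to compare the session's temporary-table counters before and after running the UNION. The sketch below uses a cut-down version of the friend query (the column list is trimmed for illustration, which is a simplification and not the original query); the exact counts may differ between MySQL versions.

```sql
-- Baseline: note the current values of Created_tmp_tables and Created_tmp_disk_tables.
SHOW SESSION STATUS LIKE 'Created_tmp%';

-- Simplified stand-in for the friend query (hypothetical trimmed column list).
(SELECT intUserId2 AS intUserId
   FROM User_User_XR
  WHERE intUserId1 = 1 AND strStatus = 'confirmed')
UNION ALL
(SELECT intUserId1
   FROM User_User_XR
  WHERE intUserId2 = 1 AND strStatus = 'confirmed');

-- Compare against the baseline to see whether the UNION created a temporary table,
-- and whether it went to disk (Created_tmp_disk_tables).
SHOW SESSION STATUS LIKE 'Created_tmp%';
```

Running `EXPLAIN` on the same statement should also show a separate `UNION RESULT` row for the combined result.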

Mike
Thanks for the explanation :). Very helpful knowledge-wise, but it doesn't get me any further in solving my problem. I'm basically trying to get all my queries optimized. To me, any query running slower than 0.1s is a problem. I have a large site with a lot of visitors and can't quite afford to have a query like this running at 0.17s. So far, my best bet is to just run two queries and merge the resulting arrays in PHP, which is quite ugly modularity-wise.
Armin
@Armin: I agree, it doesn't help much. I notice that both your queries are very similar. Could they be combined into a single `SELECT` statement? They differ only in the `WHERE` condition, and it looks as though this could be reduced by using a couple of `OR` operators. The only other problem I see (and I've only had a quick look) is that you return `uuxr.intUserId1 AS intUserId` in one query and `uuxr.intUserId2 AS intUserId` in the other. Perhaps you could use an `IF` to select the correct column to return from a single statement.
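
A rough sketch of what such a single statement might look like, trimmed to two output columns for brevity (the trimming and the use of an explicit `JOIN` are assumptions for illustration, not the original query):

```sql
-- Pick "the other user" of each confirmed relationship involving user 1.
SELECT IF(uuxr.intUserId1 = 1, uuxr.intUserId2, uuxr.intUserId1) AS intUserId,
       u.strUserName
FROM User_User_XR uuxr
JOIN User u
  ON u.intUserId = IF(uuxr.intUserId1 = 1, uuxr.intUserId2, uuxr.intUserId1)
WHERE (uuxr.intUserId1 = 1 OR uuxr.intUserId2 = 1)
  AND uuxr.strStatus = 'confirmed';
```

Whether the optimizer handles this form as well as the two separate queries is not guaranteed, so it is worth checking with `EXPLAIN` before relying on it.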
Mike
I had this before. The query would be extremely slow. Here is an example. I set a limit of only 5 records here and it runs for 0.47s. `SELECT u.intUserId, u.strUserName FROM User_User_XR uuxr, User u WHERE ( uuxr.intUserId1 = '1' AND uuxr.intUserId2 = u.intUserId ) OR ( uuxr.intUserId2 = '1' AND uuxr.intUserId1 = u.intUserId ) AND uuxr.strStatus = 'confirmed' LIMIT 5`
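
One detail worth checking in a query of this shape: in MySQL, `AND` binds more tightly than `OR`, so without extra parentheses the `strStatus = 'confirmed'` filter applies only to the second branch. A sketch with explicit grouping is below; it changes which rows match, and is not claimed to fix the speed problem.

```sql
SELECT u.intUserId, u.strUserName
FROM User_User_XR uuxr, User u
WHERE ( ( uuxr.intUserId1 = '1' AND uuxr.intUserId2 = u.intUserId )
     OR ( uuxr.intUserId2 = '1' AND uuxr.intUserId1 = u.intUserId ) )
  AND uuxr.strStatus = 'confirmed'   -- now applies to both branches
LIMIT 5;
```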
Armin
@Armin: Are all the columns that you use in your conditions indexed? Approximately how many records are you dealing with? Can you post the output of an `EXPLAIN SELECT` on the combined `SELECT` query?
Mike
Hi Mike, yes, all columns within the conditions are indexed. Out of close to 200 queries, I only have about 2-4 that run slower than 0.1s. I'm not a beginner at this, but it looks like I'm not an expert either, hehe. The User table has around 68,000 records and the User_User_XR table has around 2,800,000 records. That's 2.8 million. Basically, User_User_XR stores friend relationships, 1 row per relationship. I'm trying to select the friends of a user. The best I can do so far is the UNION query, which runs in 0.16s even though EACH of the queries in the UNION runs in 0.0044s. I'll post the EXPLAIN now.
Armin
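| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|----|-------------|-------|------|---------------|-----|---------|-----|------|-------|
| 1 | SIMPLE | uuxr | index_merge | PRIMARY,intUserId2Index,intUserId1Index,strStatusI... | intUserId1Index,intUserId2Index | 4,4 | NULL | 1418 | Using union(intUserId1Index,intUserId2Index); Usin... |
| 1 | SIMPLE | u | ALL | PRIMARY | NULL | NULL | NULL | 64494 | Using where |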
Armin