views:

149

answers:

5

The following query works, but its very slow for 10 records (2 seconds). The profiling says its creating a tmp table, but I'm not sure why.

Basically, I'm joining the current user, to the acl groups, to get all groups they are in, then joining the groups to the companies, to get all companies they are in, then joining the companies to the orders, to get all the orders..

If I remove this line

ORDER BY orders.created_on DESC 

then the query executes in 0.06 secs (more then acceptable)..

Help, any ideas on how to optimize? Many thanks :)

SELECT
    orders.uuid,
    companies.name as company_name
FROM
    users u
JOIN    
    users_acl_groups g on   u.uuid = g.user_uuid
JOIN
    users_acl acl on (acl.user_uuid = u.uuid or acl.group_uuid = g.group_uuid) 
JOIN 
    companies on acl.item_uuid = companies.uuid
JOIN
    orders on companies.uuid = orders.company_uuid
WHERE
    u.uuid = 'DDEC8073-5056-C000-01ED583A51CBCA32' and orders.status <> ''
ORDER BY orders.created_on DESC 

limit 0, 10;

UPDATE, the explain of the query..

1 SIMPLE orders ALL 9403 Using temporary; Using filesort

1 SIMPLE acl ALL 1859 Using where; Using join buffer

1 SIMPLE g ALL 2005 Using where; Using join buffer

1 SIMPLE companies eq_ref PRIMARY PRIMARY 52 table.orders.company_uuid 1

1 SIMPLE u ALL 33595 Using where; Distinct; Using join buffer

+2  A: 

Have you considered making a fact table style design, as a denormalization step?

Basically it's a kind of many-to-many intersection table, for example:

CREATE TABLE user_order_fact (
  user_uuid ...
  order_uuid ...
  order_created_on ...
  order_status ...
  company_name ...,
  primary key (user_uuid, order_uuid),
  key (user_uuid, order_status, order_created_on, order_uuid, company_name)
);

... fill with data ...

SELECT
    order_uuid,
    company_name
FROM
    user_order_fact
WHERE
    user_uuid = 'DDEC8073-5056-C000-01ED583A51CBCA32' and order_status <> ''
ORDER BY order_created_on DESC 

limit 0, 10;

I'm guessing on the compound index. You'll have to experiment until you get it right. Basically you're trying to get the optimizer plan to report that it's Using index.

Of course this is storing data redundantly and in denormalized form, so you need to set up some triggers to keep this in sync with the normalized tables.

Bill Karwin
Hmm maybe the PK is just order_uuid. I'm not guaranteeing this is the best design, just trying to give you a feel of what I mean.
Bill Karwin
A: 

make sure "orders.created_on" has an index... If it does, then Bill's approach at the top will be the best however will require a bit of work.

Nick
I think it does? KEY `created_on` (`created_on`)
Brett
A: 

Hard to answer without knowing much about the existing indexes, or the volume of each tables.

Besides, without much information about the model... does the query return all the results?

Do all users belong to a group? It seems that not... and the query won't return users outside of a group.

Can a group belong to a group, calling for a recursive query?

pascal
I was trying to work that recursive query thing out but I wasn't having much luck :( the table is quiet small, 10,000 records all up. There are no indexes at the moment..
Brett
@Brett: Indexes if present in this case will not work, as NOT operator and Like Operator don't uses Indexes
Shantanu Gupta
A: 

I am not sure what might be the exact reason for it to take 2 sec. Which is not possible for this query fetching 10 records but what is see here are

  1. acl.user_uuid = u.uuid or acl.group_uuid = g.group_uuid

    UID based join, may be you are also using it as a primary key as answered above.

  2. ORDER BY orders.created_on. Using Order by on date would not be as optimal than using PK or any integer value is more appropriate.

  3. orders.status <> '' If using any indexes on tables then no index can be used in this query because NOT Operator and Like Operator does not uses indexes when used in any query.

  4. Volume of records present in a table might be another reason but only due to above factors. Otherwise it could have handled large volume as well.

Major Contributing factor that I think is UID being used in Joins So all the three avoiding conditions can be seen in your query that might make your query lazy

Shantanu Gupta
Hi, thank you for your responses.. What does minimized in no.2 mean?
Brett
A: 

Few ideas:

You actually not selecting orders.created_on in Your query. So there is no point of sorting on that column. Maybe, selecting it (SELECT orders.created_on ...) would help performance (Just wild guess -- I have no idea what I am talking here).

You can always sort in Your application -- if there is not enormous amount of records returned by Your query.

Sometimes it is more performance wise to use N small queries instead of 1 big sql query. Pseudo code:

user_id = get_one("SELECT uuid FROM users WHERE ...");
group_ids = get_many("SELECT uuid FROM groups WHERE user_uuid = " + user_id);
comps_ids = get_many("SELECT DISTINCT item_uuid FROM acls WHERE user_uuid = " + user_id + " OR group_uuid IN " + groups_ids.to_q());
orders = get_many("SELECT * FROM orders WHERE company_uuid IN " + comps_ids.as_q() + " WHERE status <> '' ORDER BY created_on");
petraszd