views:

130

answers:

4

I am currently working on a website which needs some optimisations ... since the front page takes about 15-20 seconds to be loaded I thought that some optimisation would be nice.

Here is one query that appeared on the MySQL slow query log:

SELECT a.user,a.id
FROM `profil_perso` pp
INNER JOIN `acces` a ON pp.parrain = a.id
INNER JOIN `acces` ap ON ap.id = pp.id
WHERE pp.parrain_visibilite = '1'
  AND a.actif = 1
GROUP BY a.id
ORDER BY ap.depuis DESC LIMIT 15;

On profil_perso (~207K lines -- contains emails and profiles) there is perso_id that is the primary key, there is also id (foreign key) + parrain(referer) + parrain_visibilite(referer is showed) that are indexes.

On acces there is id that is the primary key, there is also depuis (registration date) that is indexed

The benchmark shows this actually:

First time : 1.94532990456
Last time : 1.94532990456
Average time : 0.0389438009262

I tried to put it this way :

SELECT DISTINCT a.id, a.user
FROM `profil_perso` pp
LEFT JOIN `acces` a ON pp.parrain = a.id
WHERE pp.parrain_visibilite = 1
  AND a.actif = 1
  AND pp.id != 0
ORDER BY pp.id DESC LIMIT 15;

Still the benchmark show this:

First time: 1.96376991272
Last time: 1.96376991272
Average time: 0.0393264245987

Any hint to lower the query time ?

Here's the full indexes:

acces :

id (primary)
derniere_visite -- last visit
pays_id -- country_id
depuis -- registration time
perso_id -- foreign key to profil_perso primary key
actif -- account status
compte_premium -- if account is premium

profil_perso :

perso_id (primary)
id -- foreign key to acces primary key
genre -- gender
parrain_visibilite -- visibility of referer
parrain -- referer
parrain_contexte
telephone
orientation
naissance -- birthdate
photo -- if it has a picture
+3  A: 

Run EXPLAIN SELECT DISTINCT a.id .....;

This will help show you where you might be missing indexes etc.

Greg
I agree running the EXPLAIN command before your query allows you to view the query for optimization. Helpful linkshttp://dev.mysql.com/doc/refman/5.0/en/using-explain.htmlhttp://dev.mysql.com/doc/refman/5.0/en/explain.html
Phill Pafford
Until you know how MySQL runs your query you could be doing more damage that good to optimize the runtime. Using EXPLAIN creates the breakdown of how MySQL is running your query.
Phill Pafford
+1Run an EXPLAIN
James C
A: 

Why do you have two JOIN's here?

Create a composite index on acces (actif, depuis):

CREATE INDEX ix_acces_actif_depuis ON acces (actif, depuis)

, create a composite index on profil_perso (parrain, parrain_visibilite):

CREATE INDEX ix_profilperso_parrain_parrainvisibilite ON profil_perso (parrain, parrain_visibilite)

and try this:

SELECT  a.user, a.id
FROM    acces a
JOIN    profil_perso p
ON      pp.parrain = a.id
        AND pp.parrain_visibilite = 1
WHERE   a.actif = 1
ORDER BY
        a.actif DESC, a.depuis DESC
LIMIT 15

This query will use the index on actif to avoid sorting, and the index on profil_perso to find and filter out the non-visible parrain's.

Since you have a LIMIT 15 here, this query should be instant.

It would also help knowing how selective is your actif field is.

To figure this out, please run:

SELECT COUNT(DISTINCT actif) / COUNT(*)
FROM   acces
Quassnoi
We are near here. The query is a lot better but it has a problem. The query generate duplicates (a guy can refer few users ...). If I add a group by statement the query jump back to 1-2 sec. Using distinct is even worse.
Erick
Could you please post some sample data?
Quassnoi
A: 

The proper answerr(s) depends as much on the distribution of data (record counts, cardinality of fields and field combinations, etc.) amd the schema, as on the query expression. Even given that information, we could only provide suggestions for testing, which would only lead to more suggestions for testing.

But we could start with a first cut with a schema of the tables involved, plus the results of current EXPLAIN (run twise, second and first results).

le dorfier
A: 

Generally speaking, you need to ensure that your indecies are set up correctly - not just for the primary keys, but for the foreign keys used in the table joins.

In addition, it is usually preferable to have defined indecies for any field you filter / order on - so, again, make sure these are set appropriately.

However, I think the big performance hit here could be the fact you're sorting 207k records to retrieve the last 15 inserted - can you achieve the same in a different way?

BrynJ