views:

72

answers:

3

Hi there,

I’m currently using the following query for jsPerf. In the likely case you don’t know jsPerf — there are two tables: pages containing the test cases / revisions, and tests containing the code snippets for the tests inside the test cases.

There are currently 937 records in pages and 3817 records in tests.

As you can see, it takes quite a while to load the “Browse jsPerf” page where this query is used.

The query takes about 7 seconds to execute:

SELECT
 id AS pID,
 slug AS url,
 revision,
 title,
 published,
 updated,
 (
  SELECT COUNT(*)
  FROM pages
  WHERE slug = url
  AND visible = "y"
 ) AS revisionCount,
 (
  SELECT COUNT(*)
  FROM tests
  WHERE pageID = pID
 ) AS testCount
 FROM pages
 WHERE updated IN (
  SELECT MAX(updated)
  FROM pages
  WHERE visible = "y"
  GROUP BY slug
 )
 AND visible = "y"
 ORDER BY updated DESC

I’ve added indexes on all fields that appear in WHERE clauses. Should I add more?

How can this query be optimized?

P.S. I know I could implement a caching system in PHP — I probably will, so please don’t tell me :) I’d just really like to find out how this query could be improved, too.

A: 

You may want to try the subqueries one at a time to see which one is slowest.

This query:

SELECT MAX(updated)
  FROM pages
  WHERE visible = "y"
  GROUP BY slug

Makes it sort the result by slug. This is probably slow.

Sjoerd
The subquery you mentioned executes in 0.0125 seconds (‘stand-alone’, that is). I’ll test the other ones now!
Mathias Bynens
+1  A: 

You want to learn how to use EXPLAIN. This will execute the sql statement, and show you which indexes are being used, and what row scans are being performed. The goal is to reduce the number of row scans (ie, the database searching row by row for values).

GrandmasterB
Here are the results of `EXPLAIN [query]`: http://i.imgur.com/tOErB.png
Mathias Bynens
"The goal is to reduce the number of row scans" : No, that's not the problem here. He has the indexes he needs so it won't use table/index scans. The problem is due to a known bug in MySQL.
Mark Byers
+1  A: 

Use:

   SELECT x.id AS pID,
          x.slug AS url,
          x.revision,
          x.title,
          x.published,
          x.updated,
          y.revisionCount,
          COALESCE(z.testCount, 0) AS testCount
     FROM pages x
     JOIN (SELECT p.slug,
                  MAX(p.updated) AS max_updated,
                  COUNT(*) AS revisionCount
             FROM pages p
            WHERE p.visible = 'y'
         GROUP BY p.slug) y ON y.slug = x.slug
                           AND y.max_updated = x.updated
LEFT JOIN (SELECT t.pageid,
                  COUNT(*) AS testCount
             FROM tests t
         GROUP BY t.pageid) z ON z.pageid = x.id
 ORDER BY updated DESC
OMG Ponies
Thanks, that worked like a charm! I had never heard of `COALESCE` before. These `JOIN`s surely seem to be faster than the subqueries I was using before.
Mathias Bynens
@Mathias Bynens: If you can't beat 'em, JOIN 'em =)
OMG Ponies