views: 238

answers: 7

I am using the following MySQL query in a PHP script on a database that contains over 370,000,000 (yes, three hundred and seventy million) rows. I know that it is extremely resource intensive and it takes ages to run this one query. Does anyone know how I can either optimise the query or get the information in another way that's quicker?

Table Info:

game | bigint, unsigned, Primary Key
win  | bit(1)
loss | bit(1)

Query:

SELECT MID(game,{$len},1) AS move,
       COUNT(*) AS games,
       SUM(win) AS wins,
       SUM(loss) AS losses
FROM games
WHERE game>{$something} AND game<{$something_else}
GROUP BY move

Thanks in advance for your help!

+5  A: 

The only suggestion I can make is to use a summary table that precalculates all the counts and sums, and to update it with a trigger whenever the games table changes.
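
A minimal sketch of that idea, assuming a hypothetical `move_stats` table keyed on the game prefix and the digit that follows it (all names here are illustrative, not from the question):

-- Hypothetical summary table: one row per (game prefix, next digit).
CREATE TABLE move_stats (
    prefix VARCHAR(14)     NOT NULL,  -- first len-1 digits of the game number
    move   CHAR(1)         NOT NULL,  -- the digit at position len
    games  BIGINT UNSIGNED NOT NULL DEFAULT 0,
    wins   BIGINT UNSIGNED NOT NULL DEFAULT 0,
    losses BIGINT UNSIGNED NOT NULL DEFAULT 0,
    PRIMARY KEY (prefix, move)
);

-- Keep it current as games are inserted; the 15 positions are assumed
-- from the 15-digit game identifier mentioned in the comments below.
DELIMITER //
CREATE TRIGGER games_after_insert AFTER INSERT ON games
FOR EACH ROW
BEGIN
    DECLARE p TINYINT UNSIGNED DEFAULT 1;
    WHILE p <= 15 DO
        INSERT INTO move_stats (prefix, move, games, wins, losses)
        VALUES (MID(NEW.game, 1, p - 1), MID(NEW.game, p, 1),
                1, NEW.win, NEW.loss)
        ON DUPLICATE KEY UPDATE
            games  = games  + 1,
            wins   = wins   + NEW.win,
            losses = losses + NEW.loss;
        SET p = p + 1;
    END WHILE;
END//
DELIMITER ;

The stats query then becomes a primary-key lookup on `move_stats` instead of a scan over millions of `games` rows.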

slipbull
That would require hundreds of millions more rows than are currently used and more space per row too. The table's already almost 10GiB as it is, so I don't think that's viable.
PhantomCode
But your main problem is that the calculations are too costly. Using some more disk space to build a 'cache' table may help.
slipbull
OK, I think I'll look into building some sort of cache system. I'm still worried by the space it'd take up, though.
PhantomCode
A: 

I would start by running EXPLAIN on the query, or by profiling it.
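
For example, on the query from the question (the range bounds stay as placeholders):

EXPLAIN
SELECT MID(game,{$len},1) AS move,
       COUNT(*) AS games,
       SUM(win) AS wins,
       SUM(loss) AS losses
FROM games
WHERE game>{$something} AND game<{$something_else}
GROUP BY move;

That shows whether the primary key is used for the range, how many rows MySQL expects to scan, and whether it needs a temporary table and a filesort for the GROUP BY. To time the individual execution stages (assuming a MySQL version with profiling support):

SET profiling = 1;
-- run the query here
SHOW PROFILES;
SHOW PROFILE FOR QUERY 1;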

Sam Brightman
A: 

If you're read heavy, consider keeping and maintaining an aggregated table over the data you commonly query.

nos
Unfortunately I use all the different values for `$game` with approximately equal frequency.
PhantomCode
That just means you aggregate upon the different values of $games that you will need. Even if that ranges into the millions, it'll likely be a lot less than the individual records.
nos
A: 

Sounds to me like you could denormalize this and create a "moves" table, recording stats per "move", not only per "game".
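
A rough sketch of what that could look like, with hypothetical names (one row per game and move position, written when the game is recorded):

-- Hypothetical denormalised table: one row per (game, position).
CREATE TABLE moves (
    game BIGINT UNSIGNED  NOT NULL,
    pos  TINYINT UNSIGNED NOT NULL,  -- 1..15
    move CHAR(1)          NOT NULL,  -- the digit at that position
    PRIMARY KEY (game, pos),
    KEY idx_pos_move (pos, move)
);

The aggregate can then GROUP BY an indexed `move` column instead of a MID() expression, at the cost of roughly 15 rows per game.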

Sorin Mocanu
A: 

You can "buy speed" by sacrificing storage space, or conserve storage space at the cost of performance. As your problem is speed, you'll need some precalculation. And yes, some profiling of the query.

BTW, the "big ones" tend to have different configurations (different hardware and settings) for OLTP (serving the actual transactions in real time) and DW (analysing large amounts of data).

Csaba Kétszeri
A: 

The MID() function is what is killing this query. MySQL has to create a temporary table in memory to evaluate the MID() expression, and then filesort that table because of the GROUP BY.

I'm assuming that $game is the type of game. (checkers, chess, tic tac toe)

I would hang another table off it for the type of game. That allows your GROUP BY to make use of an index, which would be much faster.

I suggest something like:

[game]
game bigint unsigned
win bit
loss bit
game_type_id bigint unsigned

[game_type]
game_type_id bigint unsigned
game_type_desc varchar(13)
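
In DDL form that would be roughly (a sketch, using the same assumed names as above):

CREATE TABLE game_type (
    game_type_id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    game_type_desc VARCHAR(13)     NOT NULL,
    PRIMARY KEY (game_type_id)
);

CREATE TABLE game (
    game         BIGINT UNSIGNED NOT NULL,
    win          BIT(1)          NOT NULL,
    loss         BIT(1)          NOT NULL,
    game_type_id BIGINT UNSIGNED NOT NULL,
    PRIMARY KEY (game),
    KEY idx_game_type (game_type_id)  -- lets the GROUP BY use an index
);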

Be careful with ALTER statements on a table this large. Always make a backup before you issue an ALTER.

txyoji
Actually, `$game` is a 15-digit quinary number. :P Also, as it says in the question, MID() has to be able to accept 5 different length values, so that would be one HUGE extra table!
PhantomCode
It's creating part of that huge extra table in memory every time the query is run, and not giving you the benefit of an index.
txyoji
+1  A: 

Straight off the bat, I would stop using the MID() call in both the SELECT expression and the GROUP BY. Depending on your query conditions, MySQL will not necessarily cache that as a single expression whilst parsing, so at least try this:

SELECT MID(game,{$len},1) AS move,
       COUNT(*) AS games,
       SUM(win) AS wins,
       SUM(loss) AS losses
FROM games
WHERE game LIKE '{$game}%'
GROUP BY move;

Not the biggest change in the world, but it should make a small difference. Other than that, I second that the only real way to optimise this, short of changing the way you store the data, is to precalculate these values and increment them when a game finishes.
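
For example, with a summary table like the one sketched in the first answer, finishing a game would just bump a handful of counters (the literal values here are illustrative):

INSERT INTO move_stats (prefix, move, games, wins, losses)
VALUES ('12340', '2', 1, 1, 0)
ON DUPLICATE KEY UPDATE
    games  = games  + 1,
    wins   = wins   + VALUES(wins),
    losses = losses + VALUES(losses);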

Nathan Kleyn
Oh, great, thanks. I didn't realise I could `GROUP BY` a column I just made up. Also, I should note that I've changed the `LIKE` to a `> x AND < y` condition.
PhantomCode
Your query should run a lot faster with the GROUP BY change, and with >'s and <'s instead of LIKE. LIKE has to do pattern matching, whereas > and < are simple comparisons that the optimizer can evaluate directly as an index range scan.
Nathan Kleyn