ansaurus

Question

Answer 1

+4 A:

First, have an index on the game field... :)

The query seems simple and straightforward, but it hides the fact that a database design change is probably required.

In such cases I always prefer to maintain a field that holds aggregated data, either per day, per user, or per any other axis. This way you can have a daily task that aggregates the relevant data and saves it in the database.

If you do run this query often, trade some insertion efficiency for retrieval efficiency: do a little extra work on each write so that reads become cheap.

Roee Adler 2009-07-29 12:33:49

Answer 2

+1 A:

The query is simple and, aside from making sure there are all the necessary indexes ("game" field obviously), there may be no obvious way to make it faster by rewriting the query only. Some modification of data structures will probably be necessary.

One way: precalculate the sums. Each of these records will most likely have a create_date or an auto-incremented key field. Precalculate the sums for all records where this field is ≤ some X and put the results in a side table. Then you only need to calculate the sums for records > X and add those partial results to the precalculated ones.
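A minimal sketch of that idea, written in Python against an in-memory SQLite database so it can be run as-is. SQLite stands in for MySQL here (substr replaces MID), and the games_summary table, the id column, and the helper names are assumptions made for illustration, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE games (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,
        game TEXT NOT NULL,
        win  INTEGER NOT NULL,
        loss INTEGER NOT NULL
    );
    -- Hypothetical side table: totals for all rows with id <= X.
    CREATE TABLE games_summary (
        move   TEXT PRIMARY KEY,   -- first 14 characters of game
        games  INTEGER NOT NULL,
        wins   INTEGER NOT NULL,
        losses INTEGER NOT NULL
    );
""")

def add_game(conn, game, win, loss):
    conn.execute("INSERT INTO games (game, win, loss) VALUES (?, ?, ?)",
                 (game, win, loss))

def refresh_summary(conn):
    """Rebuild the side table from the rows present now; return the watermark X."""
    x = conn.execute("SELECT COALESCE(MAX(id), 0) FROM games").fetchone()[0]
    conn.execute("DELETE FROM games_summary")
    conn.execute(
        "INSERT INTO games_summary (move, games, wins, losses) "
        "SELECT substr(game, 1, 14), COUNT(*), SUM(win), SUM(loss) "
        "FROM games WHERE id <= ? GROUP BY substr(game, 1, 14)", (x,))
    return x

def totals(conn, prefix, x):
    """Add precalculated totals (id <= x) to a fresh scan of rows with id > x."""
    return conn.execute(
        "SELECT move, SUM(games), SUM(wins), SUM(losses) FROM ("
        "  SELECT move, games, wins, losses FROM games_summary WHERE move LIKE ?"
        "  UNION ALL"
        "  SELECT substr(game, 1, 14), 1, win, loss FROM games"
        "  WHERE id > ? AND game LIKE ?"
        ") GROUP BY move ORDER BY move",
        (prefix + "%", x, prefix + "%")).fetchall()

# Made-up sample data: four old rows, then one row added after the snapshot.
for g, w, l in [("1112223334000155", 1, 0), ("1112223334000177", 0, 1),
                ("1112223334000299", 1, 0), ("9992223334000100", 1, 0)]:
    add_game(conn, g, w, l)
x = refresh_summary(conn)
add_game(conn, "1112223334000188", 1, 0)
result = totals(conn, "1112223334", x)
```

The refresh step could run from a scheduled job; between refreshes only the rows newer than X are scanned.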

ttarchala 2009-07-29 12:36:37

Answer 3

+1 A:

It looks like the game column is storing two (or possibly more) different things that this query is using:

Filtering by the start of game (first 10 characters)
Grouping by and returning MID(game,1,14) (I'm assuming one of the MID expressions is a typo).

I'd split that up so that you don't have to use string operations on the game column, and also put indexes on the new columns so you can filter and group them properly.

This query is doing a lot of conversions (long to string) and string manipulations that wouldn't be necessary if the table were normalized (as in one piece of information per column instead of multiple like it is now).

Leave the game column the way it is, and create a game_filter string column based on it to use in your WHERE clause. Then set up a game_group column and populate it with the MID expression on insert. Set up these two columns as your clustered index, first game_filter, then game_group.
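As a rough illustration of the two extra columns, here is a sketch in Python using an in-memory SQLite database as a stand-in for MySQL. An ordinary composite index takes the place of the clustered index, and the index name and helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE games (
        game        TEXT NOT NULL,
        game_filter TEXT NOT NULL,   -- first 10 characters of game
        game_group  TEXT NOT NULL,   -- first 14 characters of game
        win         INTEGER NOT NULL,
        loss        INTEGER NOT NULL
    );
    -- Filter column first, then grouping column, as suggested above.
    CREATE INDEX ix_games_filter_group ON games (game_filter, game_group);
""")

def insert_game(conn, game, win, loss):
    """Populate the derived columns once, at insert time."""
    conn.execute(
        "INSERT INTO games (game, game_filter, game_group, win, loss) "
        "VALUES (?, ?, ?, ?, ?)",
        (game, game[:10], game[:14], win, loss))

def stats(conn, game_filter):
    """Filter and group on plain columns; no string functions at query time."""
    return conn.execute(
        "SELECT game_group, COUNT(*), SUM(win), SUM(loss) FROM games "
        "WHERE game_filter = ? GROUP BY game_group ORDER BY game_group",
        (game_filter,)).fetchall()

# Made-up sample data.
for g, w, l in [("1112223334000155", 1, 0), ("1112223334000177", 0, 1),
                ("1112223334000299", 1, 0), ("9992223334000100", 1, 0)]:
    insert_game(conn, g, w, l)
rows = stats(conn, "1112223334")
```

The WHERE clause becomes an equality test on game_filter, so the LIKE and MID calls disappear from the read path.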

Welbog 2009-07-29 12:37:04

Answer 4

A:

SELECT  MID(`game`,14,1) AS `move`,
        COUNT(*) AS `games`,
        SUM(`win`) AS `wins`,
        SUM(`loss`) AS `losses`
FROM    `games`
WHERE   `game` LIKE '1112223334%'

Create an index on game:

CREATE INDEX ix_games_game ON games (game)

and rewrite your query as this:

-- For each distinct 14-character prefix, aggregate over the index range [move, next prefix)
SELECT  move,
        (
        SELECT  COUNT(*)
        FROM    games
        WHERE   game >= move
                AND game < CONCAT(SUBSTRING(move, 1, 13), CHAR(ASCII(SUBSTRING(move, 14, 1)) + 1))
        ) AS games,
        (
        SELECT  SUM(win)
        FROM    games
        WHERE   game >= move
                AND game < CONCAT(SUBSTRING(move, 1, 13), CHAR(ASCII(SUBSTRING(move, 14, 1)) + 1))
        ) AS wins,
        (
        SELECT  SUM(loss)
        FROM    games
        WHERE   game >= move
                AND game < CONCAT(SUBSTRING(move, 1, 13), CHAR(ASCII(SUBSTRING(move, 14, 1)) + 1))
        ) AS losses
FROM    (
        SELECT  DISTINCT SUBSTRING(game, 1, 14) AS move
        FROM    games
        WHERE   game LIKE '1112223334%'
        ) q

This will help to use the index on game more efficiently.
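The range condition in the subqueries relies on the fact that "starts with a prefix" can be written as a half-open interval, which an ordered index can scan directly. A small Python sketch of that bound calculation follows; prefix_range is a hypothetical helper, and it assumes plain code-point ordering and a last character that is not the maximum code point.

```python
def prefix_range(prefix):
    """Return (low, high) such that low <= s < high exactly when s starts with prefix.

    high is the prefix with its last character bumped by one, which mirrors
    CONCAT(SUBSTRING(move, 1, 13), CHAR(ASCII(SUBSTRING(move, 14, 1)) + 1)).
    """
    if not prefix:
        raise ValueError("prefix must be non-empty")
    # chr() raises ValueError if the last character is already the maximum code point.
    return prefix, prefix[:-1] + chr(ord(prefix[-1]) + 1)

def in_prefix_range(s, prefix):
    low, high = prefix_range(prefix)
    return low <= s < high

# Made-up sample strings: range membership should agree with startswith.
samples = ["1112223334", "11122233340001", "1112223335", "1112223333999", "111222333"]
checks = [(s, in_prefix_range(s, "1112223334")) for s in samples]
```

Under a database collation that does not order strings by code point, the computed upper bound may not behave the same way, so this is worth checking against the actual column collation.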

Quassnoi 2009-07-29 12:37:26

Why drop the GROUP BY clause? He wants the COUNT and SUM split out by the 14th digit of the game-column.

mlarsen 2009-07-29 12:41:38

@mlarsen: I didn't get it at first and deleted the answer. Now it's all rewritten.

Quassnoi 2009-07-29 12:46:26

Answer 5

+1 A:

You could precompute MID(game,14,1) and MID(game,1,14), and store the first ten digits of the game in a separate, indexed gameid column.

It might also be an idea to investigate if you could just store an aggregate table of the precomputed values so you increment the count and wins or losses column on insert instead.
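One way this could look, sketched in Python with an in-memory SQLite database standing in for MySQL. The game_stats table, its prefix14 column, and the record_game helper are assumptions for illustration; in MySQL the update-or-insert step could likely be collapsed into a single INSERT ... ON DUPLICATE KEY UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE games (
        game TEXT NOT NULL,
        win  INTEGER NOT NULL,
        loss INTEGER NOT NULL
    );
    -- Hypothetical aggregate table, one row per 14-character prefix.
    CREATE TABLE game_stats (
        gameid   TEXT NOT NULL,      -- first 10 characters of game
        prefix14 TEXT NOT NULL,      -- first 14 characters of game
        games    INTEGER NOT NULL,
        wins     INTEGER NOT NULL,
        losses   INTEGER NOT NULL,
        PRIMARY KEY (gameid, prefix14)
    );
""")

def record_game(conn, game, win, loss):
    """Insert the raw row and bump the matching aggregate row in one transaction."""
    gameid, prefix14 = game[:10], game[:14]
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO games (game, win, loss) VALUES (?, ?, ?)",
                     (game, win, loss))
        cur = conn.execute(
            "UPDATE game_stats SET games = games + 1, wins = wins + ?, "
            "losses = losses + ? WHERE gameid = ? AND prefix14 = ?",
            (win, loss, gameid, prefix14))
        if cur.rowcount == 0:  # first game seen for this prefix
            conn.execute(
                "INSERT INTO game_stats (gameid, prefix14, games, wins, losses) "
                "VALUES (?, ?, 1, ?, ?)", (gameid, prefix14, win, loss))

# Made-up sample data.
for g, w, l in [("1112223334000155", 1, 0), ("1112223334000177", 0, 1),
                ("1112223334000299", 1, 0)]:
    record_game(conn, g, w, l)
agg = conn.execute(
    "SELECT prefix14, games, wins, losses FROM game_stats "
    "WHERE gameid = ? ORDER BY prefix14", ("1112223334",)).fetchall()
```

Reads then touch only a handful of aggregate rows, at the cost of one extra write per inserted game.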

mlarsen 2009-07-29 12:38:37

Answer 6

A:

Please note I have updated the info above...

2009-07-29 12:59:07

Answer 7

A:

Can you cache the result set with Memcache or something similar? That would help with repeated hits. Even if you only cache the result set for a few seconds, you might be able to avoid a lot of DB reads.
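To make the short-lived caching idea concrete, here is a minimal in-process sketch in Python. A plain dictionary stands in for Memcache, and the TTLCache class and its method names are hypothetical; a real deployment would talk to a shared cache server instead.

```python
import time

class TTLCache:
    """Keep each computed value for ttl_seconds, then recompute on the next request."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested without sleeping
        self._entries = {}           # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]            # still fresh: skip the database read
        value = compute()            # expired or missing: run the real query
        self._entries[key] = (now + self._ttl, value)
        return value

# Demo with a fake clock and a stand-in for the expensive query.
fake_now = [0.0]
db_reads = []

def run_query():
    db_reads.append(1)
    return [("11122233340001", 2, 1, 1)]

cache = TTLCache(ttl_seconds=5, clock=lambda: fake_now[0])
first = cache.get_or_compute("1112223334", run_query)
second = cache.get_or_compute("1112223334", run_query)   # served from the cache
reads_before_expiry = len(db_reads)
fake_now[0] = 6.0                                         # past the 5-second TTL
third = cache.get_or_compute("1112223334", run_query)     # recomputed
```

Even a TTL of a few seconds caps the query rate for a given key at roughly one database read per TTL window.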

Ted Pennings 2009-07-29 19:54:21


How can I optimise this MySQL query?
