Lately my queries (run before the results get cached into memcache) have been taking forever to process! In this example, it took 10 seconds. All I am trying to do in this case is get the 10 most recent hits.

I am getting the feeling that it loads all 125,592 rows then only returns 10, am I right?

# User@Host: root[root] @ localhost []
# Query_time: 10  Lock_time: 0  Rows_sent: 10  Rows_examined: 125592
SELECT * FROM hits WHERE campaign_id = 30 ORDER BY id DESC LIMIT 10;

Here is another slow query:

# Time: 090214  5:00:40
# User@Host: root[root] @ localhost []
# Query_time: 3  Lock_time: 0  Rows_sent: 1  Rows_examined: 128879
SELECT count(DISTINCT(ip_address)) AS count_distinct_ip_address FROM `hits` WHERE (campaign_id = 30);

When running the query in phpMyAdmin, it takes 1.3395 seconds, although just doing a SELECT * FROM hits only takes 0.0001 seconds. I find it very odd that returning all of the hits takes less time than sorting through them. Or is that exactly it: the slow part is the sorting?

For those who want to see my table:

CREATE TABLE `hits` (
  `id` int(11) unsigned NOT NULL auto_increment,
  `hostname` varchar(255) NOT NULL,
  `url` tinytext NOT NULL,
  `user_agent` tinytext NOT NULL,
  `created_at` timestamp NOT NULL default CURRENT_TIMESTAMP,
  `ip_address` varchar(15) NOT NULL,
  `campaign_id` int(11) NOT NULL,
  PRIMARY KEY  (`id`),
  KEY `campaign_id` (`campaign_id`),
  KEY `ip_address` (`ip_address`)
);
+2  A: 

An index on (campaign_id, id) should take care of the first query fairly well. But the distinct one is a bit trickier...

Edit: MySQL doesn't use multiple indexes for one query; so yes, you need one index that covers all the fields involved in the query.
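
A minimal sketch of that composite index against the table from the question (the index name is my own):

CREATE INDEX idx_campaign_id_id ON hits (campaign_id, id);

-- MySQL can now read the newest 10 rows for the campaign straight
-- off the index instead of fetching every match and sorting it:
SELECT * FROM hits WHERE campaign_id = 30 ORDER BY id DESC LIMIT 10;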

womble
I already do though; see the primary key, and the key for campaign_id?
Garrett
It's the lack of an index also on id that is the issue. WHERE campaign_id = 30 is covered by the index, but ORDER BY id DESC isn't, so the server has to load all rows matching campaign_id = 30, sort them by id, then grab the first 10.
Kristen
MySQL does indeed do multiple index combination in one query. For example, a WHERE clause on two columns, where there are 2 separate queries - it can use both indexes.
Chris KL
@Chris: In what way is "2 separate queries" performing "multiple index combination in one query"?
womble
+1  A: 

If a query is taking too long to process, it is usually because of missing indexes, poor disk I/O, or some other bottleneck. A table with 120,000 rows isn't a lot of data, and the query really shouldn't take that long. I really would check disk I/O.

Answer 1 above is a way to speed up query 1. To speed up query 2 you may need to create an aggregate table that gets updated with every hit, or that gets updated in a nightly batch run, so that you only have to add in the day's hits that have not yet been aggregated. An index on the date range should make this relatively quick. One way to lay this out is sketched below.
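
A rough sketch of such an aggregate table; all table and column names below are hypothetical except hits, which is from the question:

-- One row per (campaign, IP) pair ever seen; the primary key makes
-- duplicate inserts a no-op via INSERT IGNORE. Seed it once from
-- the full hits table before the first nightly run.
CREATE TABLE campaign_ips (
  campaign_id int(11) NOT NULL,
  ip_address varchar(15) NOT NULL,
  PRIMARY KEY (campaign_id, ip_address)
);

-- Nightly batch: fold in yesterday's hits. We store the pairs
-- themselves rather than per-day counts, because distinct counts
-- are not additive across days. An index on created_at would speed
-- up this date-range filter.
INSERT IGNORE INTO campaign_ips (campaign_id, ip_address)
SELECT DISTINCT campaign_id, ip_address
FROM hits
WHERE created_at >= CURDATE() - INTERVAL 1 DAY
  AND created_at < CURDATE();

-- The expensive COUNT(DISTINCT ...) then becomes a cheap index scan:
SELECT COUNT(*) FROM campaign_ips WHERE campaign_id = 30;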

You should also run EXPLAIN against your query and see what indexes it is using, if any. What storage engine are you using for MySQL? This can also have an impact. If you are using the MyISAM storage engine and doing inserts and reads concurrently, that can be a big performance hit.

Make sure your table stats are up to date by running ANALYZE TABLE against the larger tables on a regular basis. This helps the query engine select the optimal query plan. Both EXPLAIN and ANALYZE TABLE are sketched below.
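
For instance (a sketch, using the hits table from the question):

-- See which index, if any, the optimizer picks, and whether
-- "Using filesort" shows up in the Extra column.
EXPLAIN SELECT * FROM hits WHERE campaign_id = 30 ORDER BY id DESC LIMIT 10;

-- Refresh the index statistics the optimizer relies on.
ANALYZE TABLE hits;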

mxc
I am using MyISAM; what do you suggest I change it to?
Garrett
Also to note, I have this in my my.cnf file: skip-external-locking and skip-locking
Garrett
+4  A: 

It seems your campaign_id index has low selectivity, i.e. there are lots of records with this value.

Ordering that many records takes a lot of time.

Try to use an INDEX SCAN on the PRIMARY KEY for ordering. Wrapping campaign_id in IFNULL below makes the condition opaque to the optimizer, so it cannot use the campaign_id index and instead walks the primary key in descending order, filtering rows as it goes:

/* Edited, as MySQL does not use live feed from the derived source with ORDER BY */
SELECT *
FROM hits
WHERE IFNULL(campaign_id, campaign_id) = 30
ORDER BY id DESC
LIMIT 10;

As for your second query, there is not much that can be done, as you need a complete scan of everything with campaign_id = 30 anyway, be it a TABLE SCAN or an INDEX SCAN.

In fact, the TABLE SCAN can be even faster:

SELECT count(DISTINCT(ip_address)) AS count_distinct_ip_address
FROM `hits`
WHERE IFNULL(campaign_id, campaign_id) = 30;

If it is not, you may create an index on (campaign_id, ip_address) and use a trick to imitate INDEX GROUP BY on this index:

CREATE INDEX ix_hits_campaign_ip ON hits (campaign_id, ip_address);

SELECT SUM(cnt)
FROM (
  SELECT CASE WHEN @r = ip_address THEN 0 ELSE 1 END AS cnt,
         @r := ip_address
  FROM
    (SELECT @r := '') r,
    (
      SELECT ip_address
      FROM hits
      WHERE campaign_id = 30
      ORDER BY ip_address
    ) i
) o;

The trick here is simple: we don't need the actual rows, just a count, so there is no need to scan for the actual values. An index scan will suffice.

Unfortunately, despite what MySQL documentation says here on loose index scans, they do not actually work on composite indices. That's why we need to imitate an INDEX SCAN WITH GROUP BY.

We do it by forcing MySQL to use an INDEX RANGE SCAN that retrieves all records with campaign_id = 30 sorted by ip_address. Then we count the DISTINCT ip_address values using a session variable @r, initialized to an empty string in the first subquery.

In the first field we emit 0 when the previous ip_address (stored in the variable) equals the current one; otherwise we emit 1. In the second field we assign the current value of ip_address to the variable.

Finally we retrieve the SUM of the first field, which of course gives us COUNT(DISTINCT ip_address). For the sorted sequence 1.1.1.1, 1.1.1.1, 2.2.2.2, for example, the first field yields 1, 0, 1, and the SUM is 2.

Quassnoi
These queries took longer than the queries above; I am just going to have to mess around some more.
Garrett
What percent of your rows has campaign_id = 30?
Quassnoi
See updated post; I fixed the queries a little.
Quassnoi
I appreciate it! It took 0.00 seconds compared to 0.03 so I can see some performance improvement for sure.
Garrett
Care to explain the trick in your last query in more detail? I'd be interested in how it works.
Tomalak
See updated post.
Quassnoi
A: 

Just a guess.

SELECT * FROM hits WHERE (campaign_id = 30 AND id > 0) ORDER BY id DESC LIMIT 10;

Hopefully, MySQL will merge the indexes. Good luck.
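
A quick way to check whether the merge actually happens (a sketch):

EXPLAIN SELECT * FROM hits
WHERE (campaign_id = 30 AND id > 0)
ORDER BY id DESC LIMIT 10;
-- Look for index_merge in the type column; if it isn't there, the
-- optimizer picked a single index (or none) and the guess didn't pay off.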

+1  A: 

You need to use EXPLAIN to find out how MySQL is executing your queries. You need to do it with production or production-like data, but obviously shouldn't do it on a production system (you need to use identical software on development and production for this exercise, of course). The numbers above suggest it's doing a full table scan; this is likely because there either aren't any indexes it could use, or it's choosing not to use them because they have low cardinality, etc.

You then need to evaluate what indexes could be added to improve it, try adding them, test again, and then QA the change by checking that adding the index won't break anything else in your application and doesn't regress performance elsewhere. You will want to analyse the space and performance impact; again, this can be done with production-like data on your test system (performance testing needs to be done on production-spec hardware, of course).

Once you're sure adding the indexes is the right thing to do, you can roll those changes into a software release as you normally would. Beware of ALTER TABLE on large tables, though: it can take some time and will block writes to the table (120k rows is probably not a large table, however). Be sure you know how long it will take and what impact it will have on production before rolling the changes out. A quick way to gauge this is sketched below.
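
For a rough idea of how long the ALTER might block, check the table's size first (a sketch):

-- Row count and data length give a feel for how long a table
-- rebuild (which ALTER TABLE ... ADD INDEX typically performs
-- on MyISAM) will take.
SHOW TABLE STATUS LIKE 'hits';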

MarkR