[Warning: long post ahead!]
I've banging my head at this for quite some time now but can't get on a common denominator what is going on. I've found a solution to workaround, see at the end, but my inner Zen is not satisfied yet.
I've a main table with forum messages (it's from Phorum), simplified looks like this (ignore the anon_user_id
for the moment, I will get later to it):
CREATE TABLE `test_msg` (
`message_id` int(10) unsigned NOT NULL auto_increment,
`status` tinyint(4) NOT NULL default '2',
`user_id` int(10) unsigned NOT NULL default '0',
`datestamp` int(10) unsigned NOT NULL default '0',
`anon_user_id` int(10) unsigned NOT NULL default '0',
PRIMARY KEY (`message_id`)
);
Messages can be anonymized by the software, in which case the user_id
is set to 0
. The software also allows posting complete anonymous messages which we endorse. In our case we need to still know which user posted a message, so through the hook system provided by Phorum we have a second table we update accordingly:
CREATE TABLE `test_anon` (
`message_id` bigint(20) unsigned NOT NULL,
`user_id` bigint(20) unsigned NOT NULL,
KEY `fk_user_id` (`user_id`),
KEY `fk_message_id` (`message_id`)
);
For the view in the profile, I need to get the a list of messages from a user, no matter whether they've have been anonmized by him or not.
A user itself has always the right to see the message he wrote anonymously or later anonymized.
Because user_id
gets set to 0
if anonymized, we can't simply use WHERE for it; we need to join with our own second table. Formulate the above into SQL looks like this (the status = 2
is required, other states would mean the post is hidden, pending approval, etc.):
SELECT * FROM test_msg AS m
LEFT JOIN test_anon ON test_anon.message_id = m.message_id
WHERE (test_anon.user_id = 20 OR m.user_id = 20)
AND m.status = 2
ORDER BY m.datestamp DESC
LIMIT 0,10
This query by itself, whenever the query cache is empty, takes a few second, something along 4 seconds currently. Things get worse when multiple users issue the query and the query cache is empty (which just happens; people post messages and the cached queries are invalid); we faced in our internal testing phase and reports were that the system sometimes slows down. We've seen queries taking 30 to 60 seconds because of the concurrent-ness. I don't want to start imagine what happens when we expand our user base ...
Now it's not like I didn't to any analysis about the bottleneck.
I tried rewriting the WHERE clause, adding indice and deleting them like hell.
This is when I found out that when I do not use any index, the query performs lighting fast under certain conditions. Using no index, the query looks like:
SELECT * FROM test_msg AS m USE INDEX()
LEFT JOIN test_anon ON test_anon.message_id = m.message_id
WHERE (test_anon.user_id = 20 OR m.user_id = 20)
AND m.status = 2
ORDER BY m.datestamp DESC
LIMIT 0,10
Now here comes the certain condition: the LIMIT limits the result to 10 rows. Assume my complete result n = 26
. Using a LIMIT 0,10
to LIMIT 16,0
takes zero seconds (something along < 0.01s): these are the cases were the result is always 10 rows.
Starting with LIMIT 17,10
, the result will be only 9 rows. Starting at this point, the query starts taking around four seconds again. The is applicable for all results where the result set is smaller then the number of maximum rows limited through LIMIT
. Irritating!
Going back to the first CREATE TABLE statement, I also conducted tests without the LEFT JOIN; we just assume user_id=0
and anon_user_id=<the previous user_id>
for anonymized messages, in other words, completely bypassing the second table:
SELECT * FROM test_msg
WHERE status = 2 AND (user_id = 20 OR anon_user_id = 20)
ORDER BY m.datestamp DESC
LIMIT 20,10
Result: it did not matter. The performance is still within 4 or 5 seconds; forcing to not use an index with USE INDEX()
does not speed up this query.
This is were I really got puzzled now. Index will always only be used for the status
column, the OR
prevents other indices from being used, this is also what the MySQL documentation told me in this regard.
An alternate solution I tried: do not use the test_anon
table to only relate to anonymized messages, but simply to all messages. This allows me to write a query like this:
SELECT * FROM test_msg AS m, test_anon AS t
WHERE m.message_id = t.message_id
AND t.user_id = 20
AND m.status = 2
ORDER BY m.datestamp DESC
LIMIT 20,10
This query always gave me instant results (== < 0.01 seconds), no matter what LIMIT, etc.
Yes, I've found a solution. I've not yet rewritten the whole application to the model though.
But I'ld like to better understand what the rational is behind my observed behavior (especially forcing no index speeding up queries). On paper nothing looked wrong with the original approach.
Some numbers (they aren't that big anyway):
- ~ one million messages
- message table data size is around 600MB
- message table index size is around 350MB
- number of anonymized messages in
test_anon
< 3% of all messages - number of messages from registered users < 25% of all messages
All tables are MyISAM; I tried with InnnoDB but performance was much more worse.