Hi all. I've found myself in a bit of a predicament. I have a table used for page hit tracking with nearly 105 million rows (!). It looks like this:
CREATE TABLE `media_hits` (
`id` int(10) unsigned NOT NULL auto_increment,
`media_code` char(7) NOT NULL,
`day` date NOT NULL,
`hits` int(10) unsigned NOT NULL default '0',
PRIMARY KEY (`id`),
UNIQUE KEY `media_code` (`media_code`,`day`)
) ENGINE=InnoDB;
As you can imagine, running any kind of query on this table takes a long time. A typical query would be the following:
SELECT DISTINCT(`media_code`), COUNT(*) AS c
FROM `media_hits`
WHERE `day` >= DATE_SUB(NOW(), INTERVAL 1 DAY)
GROUP BY(`media_code`)
ORDER BY c DESC
LIMIT 200;
This query takes forever, and EXPLAIN on it gives me this:
id: 1
select_type: SIMPLE
table: media_hits
type: index
possible_keys: NULL
key: media_code
key_len: 10
ref: NULL
rows: 104773158
Extra: Using where; Using index; Using temporary; Using filesort
That's just plain awful. So my question is: What can I do about this? Trying to add proper indexes now is impossible. The ALTER TABLE query would probably take over a week to run. I tried deleting rows older than 6 months, but 24 hours later that query was still running.
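For reference, the kind of index I'd want to add (just my guess at what would match the query's WHERE/GROUP BY pattern; this is exactly what I'm asking for advice on) would be something like:

```sql
-- Hypothetical index with `day` first so the one-day range scan doesn't
-- have to walk all ~105M index entries. This is the ALTER TABLE I estimate
-- would take over a week on the live table.
ALTER TABLE `media_hits`
  ADD INDEX `day_media_code` (`day`, `media_code`);
```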
I need to fix this somehow. The only thing that crosses my mind is creating a new table with proper indexes and starting to record hits in that table. In the background I could have a script slowly inserting records from the old media_hits table. Can anyone offer suggestions on how to index this table, and possibly some hints on which columns I should index?
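To make the plan concrete, here's a rough sketch of what I have in mind (the new table name, the extra key name, and the batch size are just placeholders for illustration):

```sql
-- Same schema as media_hits, plus an extra index that matches the
-- query (range on `day`, then group by `media_code`).
CREATE TABLE `media_hits_new` (
  `id` int(10) unsigned NOT NULL auto_increment,
  `media_code` char(7) NOT NULL,
  `day` date NOT NULL,
  `hits` int(10) unsigned NOT NULL default '0',
  PRIMARY KEY (`id`),
  UNIQUE KEY `media_code` (`media_code`,`day`),
  KEY `day_media_code` (`day`,`media_code`)
) ENGINE=InnoDB;

-- Backfill from the old table in small id ranges, so each statement is a
-- short transaction and the script can run in the background. The script
-- would repeat this, advancing the id window each pass:
INSERT INTO `media_hits_new` (`id`, `media_code`, `day`, `hits`)
SELECT `id`, `media_code`, `day`, `hits`
FROM `media_hits`
WHERE `id` BETWEEN 1 AND 100000;
```

Does that approach sound sane, or is there a better way to migrate the data?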