I have a big MySQL InnoDB table (about 1 milion records, increase by 300K weekly) let's say with blog posts. This table has an url field with index.
By adding new records in it I'm checking for existent records with the same url. Here is how query looks like:
SELECT COUNT(*) FROM `tablename` WHERE url='http://www.google.com/';
Currently system produces about 10-20 queries per second and this amount will be increased. I'm thinking about improving performance by adding additional field which is MD5 hash of the URL.
SELECT COUNT(*) FROM `tablename` WHERE md5url=MD5('http://www.google.com/');
So it will be shorter and with constant length which is better for index compared to URL field. What do you guys think about it. Does it make sense?
Another suggestion by friend of mine is to use CRC32 instead of MD5, but I'm not sure about how unique will be result of CRC32. Let me know what you think about CRC32 for this role.
UPDATE: the URL column is unique for each row.