The Scenario
I am building a web application where reports can be generated on the fly, based on information retrieved from an SQL database. These reports will contain charts, which can also be generated on the fly. Because these charts contain sensitive information, using a third-party chart API (e.g. Google Charts) is out of the question.
The Problem
I am using PHP's GD extension to generate these charts. It is pretty slow. Caching seems like the way to go, but the problem is that the number of possible charts is huge; that said, I believe the majority of requests will be for charts that have already been generated.
Partial Solution
Charts are generated with data and other information (size, chart type, etc.). Because these can uniquely identify a chart, I give each chart a unique hash based on this information and save it. Now I can compute the hash for a newly requested chart and see if I already have it rendered.
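A minimal sketch of this keying scheme, written in Python for illustration (the actual implementation would be PHP). The key point is that the parameters must be serialized deterministically, e.g. with sorted keys, so that logically identical requests always hash to the same value; the parameter names here (`data`, `size`, `chart_type`) are assumptions for the example.

```python
import hashlib
import json

def chart_cache_key(data, size, chart_type):
    """Build a deterministic cache key from everything that uniquely
    identifies a chart. Keys are sorted and separators fixed so that
    logically identical parameter sets always serialize to the same
    bytes regardless of dict ordering."""
    payload = json.dumps(
        {"data": data, "size": size, "type": chart_type},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()

# Same inputs always produce the same 40-hex-digit key.
key = chart_cache_key([["Q1", 40], ["Q2", 55]], [640, 480], "bar")
```

The rendered image can then be saved under a filename derived from the key, and a new request only triggers GD rendering when no file with that name exists.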
The problem with this is the possibility of a collision. To guard against that, I am thinking of saving both the hash and a serialized form of the chart parameters in an SQL table. Then, on a cache hit, I would still compare the stored parameters against the requested ones before serving the cached chart.
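The hash-plus-verification lookup described above could be sketched as follows (again in Python, with SQLite standing in for the real SQL database; the table name `chart_cache` and the column layout are assumptions for the example):

```python
import hashlib
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS chart_cache ("
    " hash TEXT PRIMARY KEY,"
    " params TEXT NOT NULL,"
    " image_path TEXT NOT NULL)"
)

def lookup_or_store(params, image_path):
    """Return (path, hit) for the chart described by `params`.

    On a hash hit, the stored serialized parameters are compared
    against the incoming ones; a mismatch would mean a genuine
    SHA-1 collision, which is treated as a miss rather than
    serving the wrong chart."""
    payload = json.dumps(params, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha1(payload.encode("utf-8")).hexdigest()
    row = conn.execute(
        "SELECT params, image_path FROM chart_cache WHERE hash = ?",
        (digest,),
    ).fetchone()
    if row is not None and row[0] == payload:  # hit: data really matches
        return row[1], True
    # Miss (or collision): render the chart, then record the new entry.
    conn.execute(
        "INSERT OR REPLACE INTO chart_cache VALUES (?, ?, ?)",
        (digest, payload, image_path),
    )
    return image_path, False
```

In practice the `params` comparison almost never fires, so the extra column mainly buys peace of mind at the cost of storing one serialized blob per cached chart.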
Am I over-engineering this? (SHA-1 is a 160-bit hash, after all.)
Is there a better way to handle this?