



I currently have a Web API running on Heroku that constantly writes information we collect from other data sources (there is currently about half a GB of data, and it's growing very quickly). We want to add a reporting system on top of the current database so we can extract useful information from it. The problem is that running reports locks the DB, and any other sites communicating with the DB time out. Does anyone have suggestions for solving this type of issue? Amazon RDS seems to have some interesting database replication features, but I don't know whether that will solve my problem.

Any advice would be greatly appreciated.



Be sure you are running InnoDB tables and not the old ISAM or MyISAM tables. InnoDB uses row-level locks, which scale much better than the table-level locks MyISAM takes.
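
If this is MySQL, a rough way to act on that is to list each table's storage engine and generate a conversion statement for anything that is not InnoDB. The sketch below is only an illustration: the helper name and the sample table names are made up, and it assumes you have already fetched (table name, engine) rows yourself, for example with `SELECT TABLE_NAME, ENGINE FROM information_schema.TABLES WHERE TABLE_SCHEMA = DATABASE()`.

```python
def innodb_conversion_statements(tables):
    """Build ALTER TABLE statements for tables that are not on InnoDB.

    `tables` is an iterable of (table_name, engine) pairs, such as rows
    read from information_schema.TABLES. Rows with no engine (views
    report NULL there) are skipped.
    """
    return [
        f"ALTER TABLE `{name}` ENGINE=InnoDB;"
        for name, engine in tables
        if engine is not None and engine.lower() != "innodb"
    ]


# Hypothetical sample rows; real ones would come from your own database.
sample_rows = [("events", "MyISAM"), ("sources", "InnoDB"), ("daily_view", None)]
for statement in innodb_conversion_statements(sample_rows):
    print(statement)
```

Converting a large table rewrites it, so it is probably worth trying on a copy of the data first.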

Make sure you have indexes defined on all your joining/foreign keys; joins without indexes will grind. Also make sure you have indexes where appropriate on columns that you search or sort on (as long as the data is diverse, not boolean or limited to a small number of values).
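
To see what an index on a foreign key changes, you can compare the query plan before and after creating it. The sketch below uses Python's built-in SQLite purely as a stand-in so it runs anywhere; the table and index names are invented, and on MySQL or Postgres you would run `EXPLAIN` against your own report queries instead.

```python
import sqlite3


def plan_for(conn, sql):
    """Return SQLite's query plan for `sql` as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sources (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        source_id INTEGER,   -- foreign key to sources.id, unindexed at first
        payload TEXT
    );
""")

# A typical report-style lookup on the foreign key column.
query = "SELECT payload FROM events WHERE source_id = 42"

plan_before = plan_for(conn, query)   # no index yet: full scan of events
conn.execute("CREATE INDEX idx_events_source_id ON events (source_id)")
plan_after = plan_for(conn, query)    # the planner can now use the index

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan differs between database engines and versions, but the thing to look for is the same: a full table scan turning into an index lookup.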

Replication is another good idea: you could point the reports at a secondary server in read-only mode, and it will simply catch up once it unlocks. Half a GB of data should not really be locking things up yet, so I'd look at the indexes and InnoDB first.


One solution is to have a replica of the database, so that your normal traffic goes to the master database while long-running queries execute on the slave. I'm not sure how much control you get over the database on Heroku, though; they may not support replication.
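
On the application side, the split can be as simple as choosing a connection URL per workload. This is a minimal sketch under assumptions: the class name, the workload labels, and both URLs are hypothetical placeholders, and it says nothing about how the replica itself is set up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseRouter:
    """Pick a database URL depending on the kind of work being done."""

    primary_url: str   # receives all writes and normal API traffic
    replica_url: str   # read-only copy used for long-running reports

    def url_for(self, workload: str) -> str:
        # Only reporting goes to the replica; everything else, including
        # unknown workload labels, stays on the primary to be safe.
        if workload == "report":
            return self.replica_url
        return self.primary_url


# Placeholder URLs for illustration only.
router = DatabaseRouter(
    primary_url="postgres://primary.example.invalid/app",
    replica_url="postgres://replica.example.invalid/app",
)

print(router.url_for("api"))
print(router.url_for("report"))
```

Keep in mind that a replica can lag behind the master, so reports run this way may be slightly out of date.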

However, have you considered that the Heroku setup may be the problem here? A 500 MB database shouldn't really have performance issues unless you're performing really complex queries.

If you're happy using MySQL instead of Postgres, Engine Yard supports database replication (although generally it may not be as easy to use as Heroku).

Alex - Aotea Studios