I'm building something akin to Google Analytics, and currently I'm doing real-time database updates. Here's the workflow for my app:
1. User makes a RESTful API request.
2. I find a record in the database and return it as JSON.
3. I record the request count for the user in the database (e.g., if a user makes 2 API calls, I increment that user's request counter by 2).
#1 and #2 are really fast in SQL terms: they're SELECTs. #3 is really slow, because it's an UPDATE. In the real world, my MySQL database is NOT scaling: according to New Relic, #3 takes most of the request time, up to 70%!
My thinking is that I need to stop doing synchronous DB writes. In the short term I want to reduce them, so I'm considering a global hash (declared in, say, environment.rb) that is accessible from my controllers and models, and that I write to in lieu of the DB. Every so often, a task would flush the accumulated updates to the DB.
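A minimal sketch of that idea, assuming a single threaded app process (`RequestCounter`, the 30-second interval, and the background thread are my own placeholder names and numbers, not existing code):

```ruby
require 'thread'

# Counts accumulate in an in-memory hash guarded by a Mutex; flush
# applies them to MySQL as one UPDATE per user instead of one per request.
class RequestCounter
  @counts = Hash.new(0)
  @mutex  = Mutex.new

  class << self
    # Call from a controller in place of the synchronous UPDATE.
    def increment(user_id, by = 1)
      @mutex.synchronize { @counts[user_id] += by }
    end

    # Call periodically (background thread, cron'd rake task, etc.).
    def flush
      pending = @mutex.synchronize do
        snapshot = @counts
        @counts  = Hash.new(0)
        snapshot
      end
      pending.each do |user_id, n|
        ActiveRecord::Base.connection.update(
          "UPDATE statistics_api " \
          "SET count_request = COALESCE(count_request, 0) + #{n.to_i} " \
          "WHERE id = #{user_id.to_i}"
        )
      end
    end
  end
end

# e.g. flush every 30 seconds from a background thread:
Thread.new { loop { sleep 30; RequestCounter.flush } }
```

One thing I can already see: with several app processes (e.g. a pack of Mongrels), each keeps its own hash, though the flushed increments still sum correctly in the DB; and any counts not yet flushed are lost on a crash or restart.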
Questions:
- Does this sound reasonable? Any gotchas?
- Will I run into any concurrency problems?
- How does this compare with writing logs to the file system and importing later? (See the sketch after this list.)
- Should I be using some message queuing system instead, like Starling? Any recommendations?
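For concreteness, the log-and-import option in the third question could look roughly like this (a sketch only; `current_user`, the log path, and the cron'd importer are assumptions):

```ruby
# In the controller, in place of the synchronous UPDATE: append one
# line per request. Appends are cheap, and the data survives a crash,
# unlike an in-memory hash.
File.open("#{RAILS_ROOT}/log/api_counts.log", "a") do |f|
  f.puts current_user.id
end

# Later, e.g. from a cron'd rake task: tally the log and apply one
# batched UPDATE per user. (A real version would rotate the log first,
# so lines written during the import aren't lost.)
counts = Hash.new(0)
File.foreach("#{RAILS_ROOT}/log/api_counts.log") do |line|
  counts[line.to_i] += 1
end
counts.each do |user_id, n|
  ActiveRecord::Base.connection.update(
    "UPDATE statistics_api " \
    "SET count_request = COALESCE(count_request, 0) + #{n} " \
    "WHERE id = #{user_id}"
  )
end
```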
PS: Here's the offending query -- all columns of interest are indexed:
```sql
UPDATE statistics_api
SET count_request = COALESCE(count_request, ?) + ?
WHERE id = ?;
```
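If the batching works out, I assume the per-request statement above could collapse into one round trip per flush for many users at once, using MySQL's INSERT ... ON DUPLICATE KEY UPDATE (the values below are illustrative, and this assumes the table's other columns have defaults):

```sql
INSERT INTO statistics_api (id, count_request)
VALUES (1, 2), (7, 5), (9, 1)
ON DUPLICATE KEY UPDATE
  count_request = COALESCE(count_request, 0) + VALUES(count_request);
```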