I am building a Ruby on Rails application that needs to consume a REST API which returns data as an Atom feed. The REST API limits the number of calls per second as well as per day, and given the traffic my application may receive, I would easily exceed those limits.
My plan is to cache the REST API's feed response locally and expose a local Sinatra service that serves the cached feed exactly as it was received from the REST API. A sweeper would periodically refresh the cached feed.
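The read-through part of that plan could look something like the sketch below (names like `cached_fetch` and the cache location are mine, not from any library; in the Sinatra app the route handler would just call this helper):

```ruby
require 'fileutils'
require 'tmpdir'

# Illustrative cache location; in the real app this would be a
# directory the Sinatra process owns.
CACHE_DIR = File.join(Dir.tmpdir, 'feed-cache')

# Read-through cache: return the cached feed if a copy exists,
# otherwise run the block (which would hit the REST API), store
# the result on disk, and return it.
def cached_fetch(key)
  FileUtils.mkdir_p(CACHE_DIR)
  path = File.join(CACHE_DIR, "#{key}.atom")
  return File.read(path) if File.exist?(path)

  body = yield
  File.write(path, body)
  body
end

# A Sinatra route would wrap it roughly like this:
#   get '/search' do
#     content_type 'application/atom+xml'
#     cached_fetch(some_key) { Net::HTTP.get(URI(upstream_url)) }
#   end
```

The second call with the same key never touches the upstream API, which is the whole point of staying under the rate limit.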
There are two problems here.
1) One of the REST APIs is a search API whose results are returned as an Atom feed. The API takes several parameters, including the search query. What should my caching strategy be so that a cached feed can be uniquely identified by its parameters? For example, if I search with
/search?q=Obama&page=3&per_page=25&api_version=4
and get a feed response for these parameters, how do I cache the feed so that a later call with exactly the same parameters returns the cached feed, while a call with different parameters triggers a new request to the REST API?
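One approach I'm considering (the method name is mine, purely illustrative): sort the parameters so their order doesn't matter, then hash the normalized query string into a fixed-length key that is safe to use as a cache file name:

```ruby
require 'digest'

# Build a deterministic cache key from the search parameters.
# Sorting first means ?q=Obama&page=3 and ?page=3&q=Obama map to
# the same key; hashing makes the key fixed-length and
# filesystem-safe, so it can double as a page-cache file name.
def cache_key_for(params)
  normalized = params.sort.map { |k, v| "#{k}=#{v}" }.join('&')
  Digest::SHA1.hexdigest(normalized)
end
```

So `cache_key_for('q' => 'Obama', 'page' => '3', 'per_page' => '25', 'api_version' => '4')` yields the same key regardless of parameter order, and any change to a value produces a different key, which would force a fresh call to the REST API.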
2) The other problem concerns the sweeper. I don't want to sweep a cached feed that is rarely used. For example, the search query "Best burgers in Somalia" would obviously be requested far less often than "Barack Obama". I do have data on how many consumers have subscribed to each feed. The strategy should be: given the number of subscribers to a search query, sweep its cached feed based on how large that number is. Since the caching needs to happen in the Sinatra application, how would one implement this kind of sweeping strategy? Some code would help.
I am open to any ideas here. I want these mechanisms to perform very well. Ideally I would do this without a database, using pure page caching, but I am open to trying other approaches.