I am building a Ruby on Rails application where I need to consume a REST API to fetch some data in (Atom) feed format. The REST API has a limit on the number of calls made per second as well as per day, and considering the amount of traffic my application may get, I would easily exceed that limit.

The solution to that would be to cache the REST API response feed locally and expose a local service (Sinatra) that provides the cached feed as it is received from the REST API. And of course a sweeper would periodically refresh the cached feed.

There are two problems here.

1) One of the REST APIs is a search API whose results are returned as an Atom feed. The API takes several parameters, including the search query. What should my caching strategy be so that a cached feed can be uniquely identified by its parameters? For example, if I search with

/search?q=Obama&page=3&per_page=25&api_version=4

I get a feed response for these parameters. How do I cache the feed so that when the exact same parameters are passed in a later call, the cached feed is returned, and when the parameters differ, a new call is made to the REST API?
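To make the question concrete, here is a minimal sketch of what I have in mind on the Sinatra side: the cache key is derived from the normalized query parameters, and the upstream URL and cache directory are placeholders, not real values.

```ruby
require 'sinatra'
require 'digest/md5'
require 'fileutils'
require 'net/http'

CACHE_DIR    = File.expand_path('cache', __dir__)     # placeholder cache location
UPSTREAM_URI = URI('https://api.example.com/search')  # placeholder REST API

# Sort the params so ?q=Obama&page=3 and ?page=3&q=Obama produce the same key.
def cache_key(params)
  normalized = params.sort.map { |k, v| "#{k}=#{v}" }.join('&')
  Digest::MD5.hexdigest(normalized)
end

get '/search' do
  path = File.join(CACHE_DIR, "#{cache_key(params)}.atom")
  content_type 'application/atom+xml'

  # Same parameters as an earlier call: serve the cached feed, no API call.
  return File.read(path) if File.exist?(path)

  # New parameter combination: call the REST API and cache the raw Atom body.
  uri = UPSTREAM_URI.dup
  uri.query = URI.encode_www_form(params)
  feed = Net::HTTP.get(uri)

  FileUtils.mkdir_p(CACHE_DIR)
  File.write(path, feed)
  feed
end
```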

2) The other problem is the sweeper. I don't want to sweep a cached feed that is rarely used. For instance, the search query Best burgers in Somalia would obviously be far less in demand than, say, Barack Obama. I do have data on how many consumers have subscribed to each feed. The strategy here should be to sweep cached feeds based on how many subscribers their search query has. Since the caching needs to happen in the Sinatra application, how would one go about implementing this kind of sweeping strategy? Some code will help.
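For the sweeping part, this is roughly what I picture: a periodic job that deletes cached feeds whose age exceeds a TTL derived from the subscriber count, so popular queries are refreshed often while rare ones barely touch the API call budget. The subscriber lookup below is a stand-in for the subscription data I already have, and the TTL thresholds are made up.

```ruby
require 'fileutils'

CACHE_DIR = File.expand_path('cache', __dir__)   # same cache directory as above

# Stand-in for the real subscription data: cache key => number of subscribers.
SUBSCRIBER_COUNTS = Hash.new(0)

# Popular feeds get a short TTL (refreshed often); rarely used feeds keep
# their cached copy much longer.
def ttl_for(subscribers)
  case subscribers
  when 0..10   then 24 * 3600   # at most one refresh per day
  when 11..100 then 6 * 3600
  else              15 * 60     # hot queries refreshed every 15 minutes
  end
end

# Run from cron or a background thread. Deleting the file is enough:
# the next request with those parameters re-fetches from the REST API.
def sweep!
  Dir.glob(File.join(CACHE_DIR, '*.atom')).each do |path|
    key = File.basename(path, '.atom')
    age = Time.now - File.mtime(path)
    FileUtils.rm_f(path) if age > ttl_for(SUBSCRIBER_COUNTS[key])
  end
end
```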

I am open to any ideas here. I want these mechanisms to perform very well. Ideally I would do this without a database, using pure page caching, but I am open to trying other things.

A: 

Why would you want to replicate the REST service as a Sinatra app? You could easily just make a model inside your existing Rails app to cache the Atom feeds (storing the whole feed as a string, for example).

For example, a CachedFeed model which is refreshed when its updated_at timestamp is old enough to warrant renewal. You could even use static page caching for your CachedFeed controller to reduce the strain on your system.

Having the cache inside your Rails app would greatly reduce complexity when deciding when to renew your cache, or even when counting the requests made against the REST API you query.

You could have model logic that distributes the API calls you have available to the most popular feeds. The search parameters could just be an attribute of your model, so you can easily find and distinguish the feeds.
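A rough, untested sketch of what I mean (column names and the upstream URL are just illustrations):

```ruby
require 'net/http'

# Assumed columns: query_string:string, body:text, subscribers:integer, timestamps.
class CachedFeed < ActiveRecord::Base
  def self.fetch(query_string)
    feed = find_or_initialize_by(query_string: query_string)
    feed.refresh! if feed.new_record? || feed.stale?
    feed.body
  end

  # Popular feeds go stale sooner, so the limited API calls are spent on
  # the queries people actually subscribe to.
  def stale?
    updated_at < (subscribers.to_i > 100 ? 15.minutes.ago : 6.hours.ago)
  end

  def refresh!
    uri = URI("https://api.example.com/search?#{query_string}") # placeholder URL
    update!(body: Net::HTTP.get(uri))
  end
end
```

Your referring controller would then just render CachedFeed.fetch(params.to_query), and that action is what you page-cache.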

elmac
The reason I was thinking about Sinatra was to avoid using a database and rely on page caching instead of storing things in the database. The Rails app talks to Sinatra, and the Sinatra app decides whether to serve the cached page or make a new remote call to the REST API. The benefit of this kind of model is that the Rails application doesn't have to care about where the response is coming from; separation of concerns is achieved by delegating the responsibility of caching and making remote calls. A much more extensible architecture, in my opinion. I understand your point though.
Chirantan
But having the cache as a model in your DB gives you the perfect way to use page_cache in the referring controller, which in turn means you have less code to care about and fewer points of failure in your architecture. As for "where" the response is coming from: you'll have to write failsafe Ruby code to fetch your feed anyway, so why not do it inside your app?
elmac