Hello, I'm building a site with Django that lets users move content between a bunch of photo services. As you can imagine, the application makes a lot of API calls.

For example: a user connects Picasa, Flickr, Photobucket, and Facebook to their account. Now we need to pull content from 4 different APIs to keep this user's data up to date.

Right now I have a function that updates each API, and I run them all simultaneously via threading. (The functions for APIs that are not enabled return False on their second line, so it's not much overhead to run them all.)
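A minimal sketch of the setup described above, for readers who want something concrete. The updater functions, the `enabled` field, and the `update_all` name are all hypothetical stand-ins, not the actual code from the site; it uses `concurrent.futures` from the standard library rather than raw threads.

```python
from concurrent.futures import ThreadPoolExecutor


def make_updater(service):
    """Build a hypothetical updater for one service."""
    def update(user):
        # Bail out immediately when the service is not connected.
        if service not in user["enabled"]:
            return False
        # The real API calls for `service` would go here.
        return True
    return update


# Assumed registry of one updater per supported service.
UPDATERS = {name: make_updater(name)
            for name in ("picasa", "flickr", "photobucket", "facebook")}


def update_all(user):
    """Run every updater in its own thread and collect results by service."""
    with ThreadPoolExecutor(max_workers=len(UPDATERS)) as pool:
        futures = {name: pool.submit(fn, user) for name, fn in UPDATERS.items()}
        return {name: future.result() for name, future in futures.items()}
```

Because the disabled updaters return right away, the total time is roughly that of the slowest enabled service.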

Here is my question:

What is the best strategy for keeping content up to date using these APIs?

I have two ideas that might work:

  1. Update from the APIs periodically (e.g. with a cron job), and whatever we have at the time is what the user gets.

    benefits:

    • It's easy and simple to implement.
    • We'll always have pretty good data when a user loads their first page.

    pitfalls:

    • We have to hit the APIs all the time, even for users who are not active, which wastes a lot of bandwidth.
    • It will probably make the API providers unhappy.
  2. Trigger the updates when the user logs in (on a page load).

    benefits:

    • We save a bunch of bandwidth and run less risk of pissing off the API providers.
    • It doesn't require NEARLY as many resources on our servers.

    pitfalls:

    • We either have to do the update asynchronously (and won't have anything to show on first login) or...
    • The first page will take a very long time to load because we're fetching all the API data before rendering (I've measured 26 seconds this way).

Edit: the design is very light; the page has only two images, an external CSS file, and two external JavaScript files.

Also, the 26-second figure comes from the Firebug network monitor running on a machine on the same LAN as the server.

+1  A: 

Personally, I would opt for the second method you mention. The first time you log in, you can query each of the services asynchronously, showing the user some kind of activity/status bar while the processes are running. You can then populate the page as you get the results back from each of the services.
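As a rough illustration of handling results as each service responds, here is a sketch using `concurrent.futures.as_completed`. The `fetch_photos` function and its simulated delays are assumptions made up for the example; in a real page, each yielded result would be pushed to the browser however you choose (for instance via an endpoint the page polls).

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_photos(service, delay):
    """Hypothetical fetcher: sleeps to mimic network latency, then returns data."""
    time.sleep(delay)
    return [f"{service}-photo-1", f"{service}-photo-2"]


def fetch_incrementally(services):
    """Yield (service, photos) pairs in the order the services finish.

    `services` maps a service name to its simulated delay in seconds.
    """
    with ThreadPoolExecutor(max_workers=len(services)) as pool:
        futures = {pool.submit(fetch_photos, name, delay): name
                   for name, delay in services.items()}
        for future in as_completed(futures):
            yield futures[future], future.result()
```

The fastest service shows up first, so the user sees content well before the slowest API has answered.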

You can then cache the results of those calls per user so that you don't have to call the APIs on every page load.
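One way to sketch the per-user caching, assuming a simple in-process dictionary and a 15-minute freshness window (both are illustrative choices, as are the names `get_photos` and `fetch`). In a Django project, the built-in cache framework (`django.core.cache`) would likely be a better home for this than a module-level dict.

```python
import time

CACHE_TTL_SECONDS = 15 * 60  # assumed freshness window; tune per service
_cache = {}  # (user_id, service) -> (fetched_at, data)


def get_photos(user_id, service, fetch, now=time.time):
    """Return cached data for this user and service, refetching when stale.

    `fetch` is a hypothetical callable (user_id, service) -> data that
    performs the real API call; `now` is injectable to make testing easy.
    """
    key = (user_id, service)
    entry = _cache.get(key)
    if entry is not None and now() - entry[0] < CACHE_TTL_SECONDS:
        return entry[1]  # still fresh, no API call needed
    data = fetch(user_id, service)
    _cache[key] = (now(), data)
    return data
```

With this in place, only the first request after the window expires pays the cost of the API call.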

That lightens the load on your servers, loads your page fast, and gives the user some indication of activity (along with incremental updates to the page as their content loads). I think those add up to the best user experience you can provide.

Justin Niessner