What would be best practice for the following situation? I have an ecommerce store that pulls inventory levels from a distributor. Should the site call the third-party API every time a user loads a product detail page, so that it always shows the most up-to-date data? Or should the site call the third-party API, store that data in its own system for a certain amount of time, and update it periodically?

To me it seems obvious that the data should be updated every time the product detail page is loaded, but what about high-traffic ecommerce stores? Are completely different solutions used in that case?

A: 

Actually, there is another solution. Your distributor keeps the product catalog on their servers and gives you access to it via the Open Catalog Interface (OCI). When a user wants to place an order, they are redirected in place to the distributor's catalog, choose items, and then transfer the selection back to your shop.

It is widely used in the SRM (Supplier Relationship Management) field.

Developer Art
Unfortunately, the distributor does not currently supply an OCI (abbr?). All we have access to are a few SOAP method calls, which deliver the data back in XML format. Very interesting though, thanks for pointing that out.
Brian Wigginton
A: 

It depends on many factors: the traffic to your site, how often the inventory levels change, the business impact of displaying outdated data, how often the suppliers allow you to call their API, their API's SLA in terms of availability and performance, and so on.

Once you have these answers, there are of course many possibilities. For example, for a low-traffic site where getting the inventory right is important, you may want to call the 3rd-party API on every request, but revert to some alternative behavior (such as using cached data) if the API does not respond within a certain timeout.
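
As a rough illustration of that "live call with a cached fallback" idea, here is a minimal Python sketch. Everything in it is an assumption made for the example: `get_stock`, the in-memory `_stock_cache`, the `fetch_live` callable standing in for the distributor's API call, and the 0.5 second timeout are all hypothetical, not something the distributor provides.

```python
import concurrent.futures
import time

# Hypothetical in-memory cache: product number -> (quantity, time stored).
_stock_cache = {}

# Shared worker pool; it lets us bound how long we wait for the distributor.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def get_stock(product_num, fetch_live, timeout_s=0.5):
    """Return (quantity, is_fresh), preferring live data over cached data.

    fetch_live is an assumed callable that wraps the distributor's API call.
    """
    future = _pool.submit(fetch_live, product_num)
    try:
        qty = future.result(timeout=timeout_s)
    except Exception:
        # Timeout or API failure: fall back to the last known value, if any.
        cached = _stock_cache.get(product_num)
        if cached is None:
            return None, False
        return cached[0], False
    # Live call succeeded: remember the value for future fallbacks.
    _stock_cache[product_num] = (qty, time.time())
    return qty, True
```

The `is_fresh` flag lets the page decide how to present a stale number (for example, "in stock" rather than an exact count). Note that a call that times out keeps running in its worker thread; a real implementation would probably also set a timeout on the underlying network request.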

Sometimes, well-designed APIs will include hints as to the validity period of the data. For example, some REST-over-HTTP APIs support various HTTP cache-control headers that can be used to specify a validity period, or to retrieve data only if it has changed since the last request.
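
If the API in question does send such headers, honoring them might look roughly like the sketch below. The helper names `ttl_from_headers` and `conditional_headers` and the 180 second default are made up for illustration, and the headers are assumed to arrive as a plain dict with canonical capitalization. `Cache-Control`, `If-None-Match` and `If-Modified-Since` are standard HTTP headers, but whether a given distributor API supports them is something you would have to check.

```python
import re


def ttl_from_headers(headers, default_ttl=180):
    """Return how many seconds a response may be cached, per Cache-Control."""
    cache_control = headers.get("Cache-Control", "").lower()
    # no-store forbids caching; no-cache requires revalidation before reuse.
    # Both are treated here as "do not reuse without asking again".
    if "no-store" in cache_control or "no-cache" in cache_control:
        return 0
    match = re.search(r"max-age=(\d+)", cache_control)
    if match:
        return int(match.group(1))
    # No hint from the server: fall back to our own (assumed) default.
    return default_ttl


def conditional_headers(etag=None, last_modified=None):
    """Build request headers that ask for data only if it has changed."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers
```

A server that supports conditional requests answers `304 Not Modified` with an empty body when nothing has changed, in which case the cached copy can simply be kept.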

Eugene Osovetsky
+1  A: 

In this case I would definitely cache the results from the distributor's site for some period of time, rather than hitting them every time you get a request. However, I would not simply use a blanket 5-minute or 30-minute timeout for all cache entries. Instead, I would use some heuristics. If possible, for instance if your application is written in a language like Python, you could attach a simple script to every product that implements the timeout.

This way, if it is an item that is requested infrequently, or one that has a large amount in stock, you could cache for a longer time.

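    # Popular or low-stock items go stale quickly, so drop the cached value.
    if product.popularityrating > 8 or product.lastqtyinstock < 20:
        cache.expire(productnum)
    # Shown for context; this call would normally live in the main app.
    distributor.checkstock(productnum)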

This gives you flexibility that you can call on if you need it. Initially, you can set all the rules to something like:

    # Initial blanket rule: a 3-minute expiry for every product.
    cache.expireover("3m", productnum)
    distributor.checkstock(productnum)

In actual fact, the script would probably not include the checkstock function call because that would be in the main app, but it is included here for context. If Python seems too heavyweight to include just for this small amount of flexibility, then have a look at Tcl, which was specifically designed for this type of job. Both can be embedded easily in C, C++, C# and Java applications.
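
For anyone who wants something they can actually run, here is one possible self-contained version of the per-product heuristic. It is only a sketch: the `Product` and `StockCache` classes, the `cache_ttl_seconds` rule and the `check_stock` callable are all hypothetical stand-ins. The thresholds (popularity above 8, fewer than 20 in stock) and the 3-minute default are taken from the snippets above.

```python
import time
from dataclasses import dataclass


@dataclass
class Product:
    num: str
    popularity_rating: int
    last_qty_in_stock: int


def cache_ttl_seconds(product):
    """Per-product rule: popular or low-stock items are never served stale."""
    if product.popularity_rating > 8 or product.last_qty_in_stock < 20:
        return 0
    return 180  # default: three minutes


class StockCache:
    def __init__(self, clock=time.monotonic):
        self._entries = {}  # product number -> (quantity, expiry time)
        self._clock = clock

    def get(self, product, check_stock):
        """Return the stock level, calling check_stock only when needed."""
        now = self._clock()
        entry = self._entries.get(product.num)
        if entry is not None and now < entry[1]:
            return entry[0]
        # Cache miss or expired entry: ask the distributor (assumed callable).
        qty = check_stock(product.num)
        product.last_qty_in_stock = qty
        self._entries[product.num] = (qty, now + cache_ttl_seconds(product))
        return qty
```

Keeping the rule in one small function means it can start out as a flat three minutes and be tuned later without touching the rest of the caching code.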

Michael Dillon