I have a restaurant locator web application that plots the locations of restaurants on a Google Map.

I use jQuery sliders to limit the number of restaurants shown on the map, with search filters such as price, type of food, and locale.

These jQuery sliders call back via AJAX to an API I created, which updates the map without the web page having to refresh.

jQuery calls a RESTful API like so:

http://example.com/search/?city=NYC&max-price=50&cuisine=french

This returns a JSON string of the restaurants that match these criteria, so that my web application can display all the matching restaurants on the map.
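For reference, here is a minimal sketch of how the server side might read those filters from the request URL. It assumes a Python backend and that every filter arrives as a `key=value` pair; the function name `parse_search_filters` is made up for illustration and is not part of my actual API.

```python
from typing import Dict, Optional, Union
from urllib.parse import parse_qs, urlsplit


def parse_search_filters(url: str) -> Dict[str, Union[str, int, None]]:
    """Extract the city, max-price and cuisine filters from a search URL."""
    params = parse_qs(urlsplit(url).query)

    def first(name: str) -> Optional[str]:
        # parse_qs returns a list per key; take the first value if present.
        values = params.get(name)
        return values[0] if values else None

    raw_price = first("max-price")
    # Ignore a missing or non-numeric price rather than raising.
    max_price = int(raw_price) if raw_price is not None and raw_price.isdigit() else None

    return {"city": first("city"), "max_price": max_price, "cuisine": first("cuisine")}
```

A parameter written without an `=` sign is simply dropped by `parse_qs`, so a malformed filter ends up as `None` here instead of causing an error.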

What I don't want is for someone to come along, figure out my API, and dump ALL of my restaurant listings.

Is there a way to limit who can call the above HTTP API, so that only my web application calls the URL and not spammers/hackers looking to dump my database?

Thanks

+1  A: 

All the big REST APIs tend to use tokenized authentication: before you make a REST request, you first send a separate request to a token service to fetch a token, which you then include with your data request. Bing Maps does this, Amazon does this, Flickr does this, and so on.

I don't know too much about it other than having worked with Bing Maps. You'll need to read up on tokenized authentication with REST. Here's a blog post to get you started: http://www.naildrivin5.com/daveblog5000/?p=35
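To give a rough idea of the token approach, here is a minimal sketch in Python of a token service that signs a short-lived token with HMAC and a check the data endpoint could run. This is not how Bing Maps or any other named service does it; `SECRET_KEY`, `TOKEN_LIFETIME_SECONDS`, `issue_token` and `verify_token` are all hypothetical names chosen for illustration.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical server-side secret; load it from configuration in practice.
SECRET_KEY = b"change-me"
# Hypothetical lifetime; short enough that a leaked token expires quickly.
TOKEN_LIFETIME_SECONDS = 300


def _sign(payload: str) -> str:
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(client_id: str, now: Optional[float] = None) -> str:
    """Token service: return 'client_id:expiry:signature'."""
    now = time.time() if now is None else now
    payload = f"{client_id}:{int(now) + TOKEN_LIFETIME_SECONDS}"
    return f"{payload}:{_sign(payload)}"


def verify_token(token: str, now: Optional[float] = None) -> bool:
    """Data endpoint: accept only unexpired tokens with a valid signature."""
    now = time.time() if now is None else now
    try:
        client_id, expiry_text, signature = token.rsplit(":", 2)
        expiry = int(expiry_text)
    except ValueError:
        return False  # wrong shape or non-numeric expiry
    expected = _sign(f"{client_id}:{expiry_text}")
    # Constant-time comparison avoids leaking how much of the signature matched.
    return hmac.compare_digest(signature.encode(), expected.encode()) and now < expiry
```

Note that a token handed to a browser can still be copied by whoever is using that browser, so this limits casual scraping rather than preventing it.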

womp
+1  A: 

First, declare your intentions in robots.txt.

Then, send a Set-Cookie header with a nonce or some kind of unique ID on the main page, but not on your API responses. If the cookie is never sent to your API endpoint, return a 401 Unauthorized response, because it's a bot, a very broken browser, or somebody who is rejecting your cookies. The Referer header can also be used as an additional check, but it's trivial to fake. Keep track of how many API calls have been made by that ID. You may also want to match IDs to IP addresses. If the count goes above your threshold, spit back a 403 Forbidden response. Make your threshold high enough that legitimate users don't get caught by it.
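As a rough sketch of that flow, assuming a Python backend: the main page hands out an ID, and the API endpoint decides which status code to answer with. The names `issue_visitor_id`, `check_api_request` and `MAX_CALLS_PER_ID` are made up for illustration, the threshold value is arbitrary, and the in-memory storage would need to be replaced by a shared store in a real deployment.

```python
import secrets
from collections import defaultdict
from typing import Dict, Optional, Set

# Hypothetical threshold; pick it from real traffic so normal users never hit it.
MAX_CALLS_PER_ID = 1000

issued_ids: Set[str] = set()                      # IDs handed out by the main page
calls_by_id: Dict[str, int] = defaultdict(int)    # API calls seen per ID


def issue_visitor_id() -> str:
    """Main page: create the unique ID to send in a Set-Cookie header."""
    visitor_id = secrets.token_urlsafe(16)
    issued_ids.add(visitor_id)
    return visitor_id


def check_api_request(visitor_id: Optional[str]) -> int:
    """API endpoint: return the HTTP status code to respond with."""
    if visitor_id is None or visitor_id not in issued_ids:
        return 401  # cookie missing or not one we issued
    calls_by_id[visitor_id] += 1
    if calls_by_id[visitor_id] > MAX_CALLS_PER_ID:
        return 403  # over the usage threshold
    return 200
```

Matching IDs to IP addresses could be added by storing the IP alongside each issued ID and comparing it on every call.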

Keep good logs, and highlight 401 and 403 responses.
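One simple way to highlight those responses, sketched in Python under the assumption that your log entries can be reduced to `(ip, status)` pairs. The function name `flag_rejected_clients` is hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def flag_rejected_clients(
    entries: Iterable[Tuple[str, int]], min_hits: int = 1
) -> List[Tuple[str, int]]:
    """Count 401/403 responses per IP, most frequent offenders first."""
    rejected = Counter(ip for ip, status in entries if status in (401, 403))
    return [(ip, hits) for ip, hits in rejected.most_common() if hits >= min_hits]
```

Raising `min_hits` filters out the occasional visitor who merely has cookies disabled.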

Realistically, if someone is determined enough, they WILL be able to dump this information. Your goal shouldn't be to make this impossible, because you will never succeed. (See all the usual adages about achieving perfect security.) Instead, you want to make it abundantly clear that:

  • This behavior violates the terms of service.
  • You are actively trying to prevent this.
  • You know that the offender exists and roughly who they are.
  • Scary lawyers might start getting involved if this continues.

(You do have a lawyer, right?)

To achieve this, be sure the body of your 403 Forbidden response conveys a scary sounding message along the lines of "This request exceeds the maximum allowed usage of the API. Your IP address has been logged. Please refer to the terms of service and obey the directives in robots.txt."
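A minimal sketch of building such a response in Python, assuming the API answers in JSON like the rest of your endpoint. The function name `forbidden_response` and the JSON field names are made up for illustration.

```python
import json
from typing import Dict, Tuple

WARNING_MESSAGE = (
    "This request exceeds the maximum allowed usage of the API. "
    "Your IP address has been logged. Please refer to the terms of service "
    "and obey the directives in robots.txt."
)


def forbidden_response(client_ip: str) -> Tuple[int, Dict[str, str], str]:
    """Return (status, headers, body) for a client over the usage threshold."""
    body = json.dumps({"error": WARNING_MESSAGE, "logged_ip": client_ip})
    return 403, {"Content-Type": "application/json"}, body
```

Echoing the logged IP back in the body makes it clear to the caller that the warning is not an empty one.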

IANAL, but I believe the DMCA can be made to apply in this situation if you claim copyright on your database. This essentially means that if you can track illegal usage of your API to an IP address, you can send a nastygram to their ISP. This should always be a last resort of course.

I don't encourage the use of assigned API keys/tokens because they turn out to be a barrier to adoption and kind of a pain in the neck to manage. As a counter-point to @womp's answer, Google is slowly moving away from their use. Also, I don't think they actually apply in this case, because it sounds like your "API" is more like a JSON call that's used mainly on your own site.

Bob Aman