I am designing an API, and I'd like to ask a few questions about how best to secure access to the data.

Suppose the API exposes artists. Artists have albums, which in turn have songs.

The users of the API have access to a subset of all the artists. If a user calls the API asking for some artist, it is easy to check if the user is allowed to do so.

Next, if the user asks for an album, the API has to check if the album belongs to an artist that the user is allowed to access. Accessing songs means that the API has to check the album and then the artist before access can be granted.

In database terms, I am looking at an increasing number of joins between tables for each additional layer that is added. I don't want to do all those joins, and I also don't want to store the user id everywhere in order to limit the number of joins.
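For reference, the join-based check I would like to avoid looks roughly like the sketch below. The schema is only an assumption for illustration: the table names (`artists`, `albums`, `songs`), the `user_artists` grant table, and the helper names are all hypothetical.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE albums (id INTEGER PRIMARY KEY, artist_id INTEGER, title TEXT);
        CREATE TABLE songs (id INTEGER PRIMARY KEY, album_id INTEGER, title TEXT);
        -- Which user may access which artist (assumed permission model).
        CREATE TABLE user_artists (user_id INTEGER, artist_id INTEGER);

        INSERT INTO artists VALUES (1, 'The Beatles'), (2, 'Other Artist');
        INSERT INTO albums VALUES (10, 1, 'White Album'), (20, 2, 'Other Album');
        INSERT INTO songs VALUES (100, 10, 'Happiness Is a Warm Gun'),
                                 (200, 20, 'Other Song');
        INSERT INTO user_artists VALUES (7, 1);
        """
    )
    return conn


def user_can_access_song(conn: sqlite3.Connection, user_id: int, song_id: int) -> bool:
    """Walk song -> album -> artist grant in a single query."""
    row = conn.execute(
        """
        SELECT 1
        FROM songs AS s
        JOIN albums AS al ON al.id = s.album_id
        JOIN user_artists AS ua ON ua.artist_id = al.artist_id
        WHERE s.id = ? AND ua.user_id = ?
        LIMIT 1
        """,
        (song_id, user_id),
    ).fetchone()
    return row is not None
```

Every additional layer below songs would add one more `JOIN` to this query, which is the growth I am worried about.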

To work around this, I came up with the following approach.

The API gives the user a reference to an object, for instance an artist object. The user can then ask that artist object for the albums, which returns a list object. The list object can be traversed, and album objects can be obtained from it. Likewise, from an album object a songlist object can be obtained and from that, the individual song objects.

Since the API trusts the artist object, it also trusts any objects (albums in this case) that the user gets from it, without further checks. And so forth for all the other objects. So I am delegating the security/trust to objects down the chain.
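To make the idea concrete, here is a minimal sketch of the delegation, assuming an in-memory catalog. The class names (`Api`, `Artist`, `Album`, `Song`), the `grants` mapping, and `build_demo_api` are hypothetical and only there for illustration.

```python
class Song:
    def __init__(self, title):
        self.title = title


class Album:
    def __init__(self, title, songs):
        self.title = title
        self._songs = list(songs)

    def songs(self):
        # No permission check: holding the album object is the proof of access.
        return list(self._songs)


class Artist:
    def __init__(self, name, albums):
        self.name = name
        self._albums = list(albums)

    def albums(self):
        # No permission check: the artist object was already authorized.
        return list(self._albums)


class Api:
    def __init__(self, catalog, grants):
        self._catalog = catalog  # artist name -> Artist
        self._grants = grants    # user id -> set of artist names

    def artist(self, user_id, name):
        # The only place where access is checked.
        if name not in self._grants.get(user_id, set()):
            raise PermissionError(f"user {user_id} may not access {name!r}")
        return self._catalog[name]


def build_demo_api():
    song = Song("Happiness Is a Warm Gun")
    album = Album("White Album", [song])
    catalog = {
        "The Beatles": Artist("The Beatles", [album]),
        "Other Artist": Artist("Other Artist", []),
    }
    return Api(catalog, grants={7: {"The Beatles"}})
```

A caller would write something like `build_demo_api().artist(7, "The Beatles").albums()[0].songs()`. This only holds up as long as the objects cannot be forged or handed over to another user, which is part of what I am unsure about.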

I would like to ask you what you think of it, what's good or bad about it, and of course, how you would solve this "problem".

Second, how would you approach this if the API should be RESTful? My approach seems less applicable in that case.

A: 

The joins may well be much faster than your object approach, even though the object approach is more elegant. With the joins you make only one database request; with the objects you make many. (Or you have to retrieve all the potentially needed data in the first request, which could also slow things down.)

I recommend doing the joins. If you run into a problem with the SQL, you can ask on Stack Overflow :D

Another idea:

If you make urls like "/beatles/whitealbum/happinesisawarmgun"

then you would know the artist at the beginning of the request and could check the permission at once, without traversing, because the URL contains the traversal information. Just a thought.
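A rough sketch of that idea, assuming the catalog is a nested mapping keyed by URL segment. The names `resolve`, `CATALOG`, and `allowed_artists` are made up for the example. Note that the lookup has to be scoped to the artist named in the URL; otherwise a user could put an allowed artist in front of someone else's album.

```python
# Hypothetical catalog: artist slug -> album slug -> song slug -> song title.
CATALOG = {
    "beatles": {
        "whitealbum": {
            "happinesisawarmgun": "Happiness Is a Warm Gun",
        },
    },
}


def resolve(path, catalog, allowed_artists):
    """Check the first URL segment once, then walk down inside that artist only."""
    segments = [s for s in path.split("/") if s]
    if not segments:
        raise ValueError("path must name at least an artist")

    artist = segments[0]
    if artist not in allowed_artists:
        raise PermissionError(f"no access to artist {artist!r}")

    # Every further lookup starts from the authorized artist, so an album
    # that belongs to a different artist simply is not found (KeyError).
    node = catalog[artist]
    for segment in segments[1:]:
        node = node[segment]
    return node
```

In a database-backed version the same scoping would probably mean filtering each lookup by the parent's key, so it does not remove the parent relationship, it just makes the permission check a single step.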

A: 

Is this a real program or rather a sample to illustrate a question? Because it is not clear why you would restrict access to the artists and albums rather than just to individual media items or even tracks.

I don't think the joins should cost you that much; any half-decent DB system will do them cheaply enough when you are making a fairly simple criteria match across multiple tables.

IMHO, the problem with putting that much security logic into queries is that it limits your ability to handle the more complex DRM issues that are sure to come up. For example, what if the album is a collection from multiple artists? What if the album contains a track that is a duet and I only have access to one of the artists? And so on.

My view is that in those situations, a convenient programming model with sensible exceptions is much more important than the performance of individual queries, which you can always cache or optimize later. What you are trying to do with the queries sounds like premature optimization.

Design your programming model to be as flexible as possible. Define a sensible set of extensions, then implement the database and optimize the queries after profiling the real system.

Uri

A: 

It is a good idea to include a security descriptor for each resource, not only for the top-level one. In your example the security descriptor is simply the artist's ID, or a list of artists' IDs if you support duets and the like. So I would think about adding the list of IDs to both the albums and the songs tables. You can add a string field in which the artist IDs for the resource are written as a comma-separated list.

Such a solution scales well: you can add more layers without increasing the time needed for the security check. Adding a new resource doesn't carry any additional cost either, except for one more field to insert (copied from the resource's parent). And of course, this solution supports the special situations described above (such as more than one artist).
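A minimal sketch of what the check could look like, assuming the descriptor is stored as a string such as `"1,2"`. The function names and the `require_all` flag are hypothetical; whether a duet needs access to one artist or to all of them is a policy decision that you would have to make yourself.

```python
def parse_descriptor(descriptor):
    """Turn a comma-separated descriptor such as '1,2' into a set of artist IDs."""
    return {int(part) for part in descriptor.split(",") if part.strip()}


def can_access(descriptor, allowed_artist_ids, require_all=False):
    """Check a resource's own descriptor; no joins up to the parent are needed.

    Assumed policy: by default, access to any one listed artist is enough.
    With require_all=True, the user must hold every listed artist.
    """
    resource_artists = parse_descriptor(descriptor)
    if require_all:
        return resource_artists <= set(allowed_artist_ids)
    return not resource_artists.isdisjoint(allowed_artist_ids)


def child_descriptor(parent_descriptor):
    """A new resource simply inherits the descriptor of its parent."""
    return parent_descriptor
```

For example, `can_access("1,2", {2})` grants access to a duet when the user holds either artist, while `can_access("1,2", {2}, require_all=True)` refuses it. The check costs the same no matter how deep the resource sits in the hierarchy.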

This kind of solution also doesn't violate a RESTful architecture.

And because each resource carries its own security descriptor, its access permissions are generalized, which makes it possible to implement a completely different security policy in the future (for example, making access permissions more granular, based on albums and not only on artists).

Dmitry Perets