I am looking into porting a website in CouchDB and it looks very interesting.

However, a big problem is that CouchDB does not seem to support read authentication; all documents within a database are accessible to all readers.

It is suggested elsewhere to use different databases for different reader-groups or to implement reader authentication in another (middle) tier, neither of which is an option for this project where the access is determined by complex, per document ACLs.

I was thinking of implementing the authentication in lists and restricting all access to CouchDB to these lists. This restriction could be enforced by simple mod_rewrite clauses in the Apache server used as a reverse proxy. The lists would simply fetch each row and check the userCtx against the document's ACL. Something like:

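function(head, req) {
  var row;
  while (row = getRow()) {
    // Send the row only if the requesting user appears in the document's ACL.
    if (row.value.ACL && row.value.ACL[req.userCtx.name]) {
      send(JSON.stringify(row.value));
    } else {
      throw({unauthorized : "You are not allowed to access this resource"});
    }
  }
}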

Since I have no experience with CouchDB, and I haven't read about this approach anywhere, I'd like to know whether this approach could work.

Is this a way to implement read access or am I abusing lists for the wrong purpose? Should I not expect such a simple solution is possible with CouchDB?

A: 

I'm not sure using a list is the best option to restrict access to resources, since lists are functions used to render the output of a view in a specific format (RSS, CSV, config files, HTML, ...).

Have you considered using a document containing users and their permissions? I found a post by Kore Nordmann which explains how to convert the classical user/group/permissions from relational databases to the CouchDB model:

[image]

Depending on its permissions, a user would have access to only a set of defined views.

CouchDB offers validation functions but they only get called when a document is created or updated. The O’Reilly book states that "The authentication system is pluggable, so you can integrate with existing services to authenticate users to CouchDB using an http layer, LDAP integration, or through other means". But since you mentioned a middle tier is not an option, the list could be a temporary solution until more authentication support is added to CouchDB.

jdecuyper
Thanks. The post by Kore Nordmann uses user/permission tables as an example of data and how it is transformed to CouchDB. It does not seem to address the problem of actually restricting access to resources.
Tomas
Also, I understand that the purpose of lists is primarily to change the format, but they are also used for filtering beyond the B-tree, right?
Tomas
You're right that the post does not offer a complete solution to the permissions problem, but it is a good starting point and, as shown in the picture, you could use the "permissions" structure to hold the views available to one particular user. I didn't know about the list/B-tree relationship, but I'm looking into it right now. Do you have any link that explains it in more detail? http://books.couchdb.org/relax/appendix/btrees explains how the B-tree is (basically) implemented.
jdecuyper
What I mean: a view offers sorting and filtering linearly on a key. If you want additional filtering within that view, you need a list, right? I don't completely understand yet, but this is what I gather from the book you're referencing.
Tomas
I updated my answer a bit. And yes, I think using a list could be an option to restrict access to some resources.
jdecuyper
+2  A: 

Apache mod_rewrite is a middle tier, so it is not clear what you mean when you say a middle tier is not an option.

Implementing your security policy based on data in CouchDB is perfectly fine. However, the cost is that you are responsible for the correctness of the implementation. It's not as bad as it sounds. Remember, people have been doing this with MySQL web apps for a long time.

The thing to keep in mind is that CouchDB does not support document-level read permissions because it is impractical to track those permissions as the data weaves through all the maps and reduces of the views. For example, say we have a bidding system.

  • There are two bids, mine and yours
  • I have read access to my bid which is $10, but I cannot read your bid document due to middleware policy
  • However I discover a view which computes the average of all bids. The average is $7.50. Therefore I know you bid $5 and I will lower my bid to $6
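The arithmetic behind that leak can be sketched in a few lines. The helper name `inferOtherBid` is hypothetical and only restates the example above: with the average, the number of bids, and my own bid known, the hidden bid follows directly.

```javascript
// Hypothetical helper: recover the sum of the hidden bids from a published average.
// With exactly one hidden bid, the result is that bid.
function inferOtherBid(average, bidCount, knownBids) {
  var knownTotal = knownBids.reduce(function (sum, bid) {
    return sum + bid;
  }, 0);
  // total of all bids minus the bids I can already read
  return average * bidCount - knownTotal;
}

var leakedBid = inferOtherBid(7.50, 2, [10]);
console.log(leakedBid); // 5
```

Any aggregate view that mixes readable and unreadable documents can leak information in this way, even when the documents themselves are protected.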

In other words, if you are wrapping the CouchDB API, you will at least need to whitelist those queries which are allowed. And remember, the vhost and rewrite rules run within CouchDB so simply looking at the incoming query may not be enough.
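A minimal sketch of such a whitelist check, as it might run in whatever wraps the CouchDB API, is shown here. The database, design document, and list names are invented for illustration, and `isAllowedRequest` is a hypothetical helper, not part of CouchDB or Apache. As noted above, a check on the incoming path alone may not be sufficient if CouchDB's own rewrites map it to something else.

```javascript
// Hypothetical whitelist: only these list endpoints may be reached from outside.
// "mydb", "app" and "by_acl" are made-up names for illustration.
var ALLOWED_PREFIXES = [
  "/mydb/_design/app/_list/by_acl/"
];

function isAllowedRequest(method, path) {
  // Only reads are considered in this sketch.
  if (method !== "GET") {
    return false;
  }
  // Reject obvious path tricks before comparing prefixes.
  if (path.indexOf("..") !== -1) {
    return false;
  }
  return ALLOWED_PREFIXES.some(function (prefix) {
    return path.indexOf(prefix) === 0;
  });
}

console.log(isAllowedRequest("GET", "/mydb/_design/app/_list/by_acl/docs"));  // true
console.log(isAllowedRequest("GET", "/mydb/_design/app/_view/average_bid")); // false
```

Everything not explicitly listed, including raw views and direct document URLs, is refused by default.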

Hopefully that sheds some light on why read control is at the database level.

jhs
Thanks. What do you mean by "the vhost and rewrite rules run within CouchDB" ?
Tomas
Could you (or someone) more specifically address whether LISTS are a good way to implement this?
Tomas
A: 

Usually it is sufficient to restrict access to certain views, which can be done via lists as you proposed (thanks for the idea). By using unguessable IDs for documents, you already have some kind of access control for documents. I would avoid iterating through the rows and checking for permissions there, but I don't think that's much of a problem either.

Some have mentioned here that the purpose of lists is to change the format. I don't agree, as even the official CouchDB guide states that lists can produce JSON documents.

Another way is to restrict users per database and use selective replication, so that each database contains only the data a certain group of users is allowed to access. See http://stackoverflow.com/questions/2765027/couchdb-read-authentication/2766831#2766831. This is not actually per-user, but it may still be an option for you. For details on filtered replication, see http://wiki.apache.org/couchdb/Replication

Edit: I just came up with a great idea to enforce per document user permissions via lists with better performance:

  1. You pass the user name as an argument to the view and filter accordingly.
  2. In the list using the view, you check whether the given argument user name is identical to the actual user.

The advantage is that CouchDB, as far as I know, internally uses caching for views. I'm not sure about how the caching works with lists. Also I think iterating and filtering in views is generally faster than in lists.
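The two steps can be sketched in plain JavaScript that runs outside CouchDB. This is only an illustration under assumptions: `mapByUser`, `checkRequestedUser`, and `queryView` are hypothetical names, `emit` is passed in as a parameter so the code is self-contained, the `ACL` object follows the shape used in the question, and the in-memory `queryView` stands in for the real view index. How the argument actually reaches the list function inside CouchDB is not shown.

```javascript
// Step 1: a map function keyed by user name, in the shape of a CouchDB map function.
// `emit` is a parameter here so the sketch runs outside CouchDB.
function mapByUser(doc, emit) {
  Object.keys(doc.ACL || {}).forEach(function (userName) {
    if (doc.ACL[userName]) {
      emit(userName, doc);
    }
  });
}

// Step 2: the check a list function would make before sending any rows.
// `requestedUser` stands for the user name passed as the view argument.
function checkRequestedUser(requestedUser, userCtx) {
  if (!userCtx.name || requestedUser !== userCtx.name) {
    throw { unauthorized: "You are not allowed to access this resource" };
  }
}

// In-memory stand-in for querying the view by key, for illustration only.
function queryView(docs, requestedUser, userCtx) {
  checkRequestedUser(requestedUser, userCtx);
  var rows = [];
  docs.forEach(function (doc) {
    mapByUser(doc, function (key, value) {
      if (key === requestedUser) {
        rows.push(value);
      }
    });
  });
  return rows;
}

var docs = [
  { _id: "a", ACL: { alice: true } },
  { _id: "b", ACL: { alice: true, bob: true } },
  { _id: "c", ACL: { bob: true } }
];

// alice asking for her own rows gets documents "a" and "b".
console.log(queryView(docs, "alice", { name: "alice" }).map(function (d) {
  return d._id;
}));
```

If alice asks for bob's rows, `checkRequestedUser` throws before any row is read, so the per-row ACL check from the question is no longer needed.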

Johann Philipp Strathausen
Thanks for your answer. About your edited idea: this seems useful only in the limited case where no ordering or filtering other than on username is required, correct?
Tomas
No, you can use it any way you want, because CouchDB allows sorting with multiple values. Just do something like "emit([doc.user_id, doc.sort_value], doc)".
Johann Philipp Strathausen