views:

87

answers:

2

We are currently working on a project where the main domain objects are content nodes and we are using an ACL-like system where each node in the hierarchy can contain rules that override or complement those on their parents. Everything is as well based on roles and actions, for example.

Node 1 - {Deny All, Allow Role1 View}
\- Node 2 - {Allow Role2 View}
   \- Node 3 - {Deny Role1 View}

In that case, rules will be read in order from top to bottom so the Node 3 can be viewed only by Role2. It's not really complicated in concept.

Retrieving the rules for a single node can result in some queries, getting all the parents and then recreating the list of rules and evaluating them, but this process can be cumbersome because the hierarchy can become quite deep and there may be a lot of rules on each node.

I have been thinking on preparing a table with precalculated rules for each node which could be recreated whenever a permission is changed and propagated to all the leaf nodes of the updated one.

Do you think of any other strategy to speed up the retrieval and calculation of the rules? Ideally it should be done in a single query, but trees are not the best structures for that.

+3  A: 

I would think that an Observer Pattern might be adapted.

The idea would be that each Node maintains a precomputed list and is simply notified by its parent of any update so that it can recompute this list.

This can be done in 2 different ways:

  1. notify that a change occurred, but don't recompute anything
  2. recompute at each update

I would advise going with 1 if possible, since it does not involve recomputing the whole world when root is updated, and only recomputing when needed (lazy eval in fact) but you might prefer the second option if you update rarely but need blazing fast retrieval (there are more concurrency issues though).

Let's illustrate Solution 1:

Root ___ Node1 ___ Node1A
     \         \__ Node1B
      \_ Node2 ___ Node2A
               \__ Node2B

Now, to begin with, none of them has precomputed anything (there are all in a dirty state), if I ask for Node2A rules:

  • Node2A realizes it is dirty: it queries Node2 rules
  • Node2 realizes it is dirty: it queries Root
  • Root does not have any parent, so it cannot be dirty, it sends its rules to Node2
  • Node2 caches the answer from Root, merges its rules with those received from Root and cleans the dirty bit, it sends the result of the merge (cached now) to Node2A
  • Node2A caches, merges, cleans the dirty bit and returns the result

If I subsequently asks for Node2B rules:

  • Node2B is dirty, it queries Node2
  • Node2 is clean, it replies
  • Node2B caches, merges, cleans the dirty bit and returns the result

Note that Node2 did not recomputed anything.

In the update case:

  • I update Node1: I use the Root cached rules to recompute the new rules and send a beat to Node1A and Node1B to notify them their cache is outdated
  • Node1A and Node1B set their dirty bit, they would also have propagated this notification had they had children

Note that because I cached Root rules I don't have to query the Root object, if it's a simple enough operation, you might prefer not to cache them at all: if you're not playing distributed here, and querying Root only involves a memory round-trip you might prefer not to duplicate it in order to save up some memory and book-keeping.

Hopes it gets you going.

Matthieu M.
Very helpful response, I suppose that you are implicitly saying: precompute all (in a more or less complex strategy).
Marc Climent
Not at all, it's wasteful. I mean to compute lazily (compute on need, and cache the result) and to use the observer pattern to know when the cached result is deprecated.
Matthieu M.
+1  A: 

Your version of pre-computation appears to store all the permissions relevant to each role at each node. You can save a little time and space by traversing the tree, numbering the nodes as you reach them, and producing, for each role, an array of the node numbers and permission changes just for the nodes at which the permissions relevant to that role change. This produces output only linear in the size of the input tree (including its annotations). Then when you come to check a permission for a role at a node, use the number of that node to search the array to find the point in the array that represents the most recent change of permission when you visited that node during the tour.

This may be associated in some way with http://en.wikipedia.org/wiki/Range_Minimum_Query and http://en.wikipedia.org/wiki/Lowest_common_ancestor, but I don't really know if those references will help or not.

mcdowella