An upcoming project of mine is considering a design that involves (what I'm calling) "abstract entity references". It's quite a departure from a more common data model design, but it may be necessary to achieve the flexibility we want. I'm wondering if other architects have experience with systems like this and where the caveats are.

The project has a requirement to control access to various entities (logically: business objects; physically: database rows) by various people. For example, we might want to create rules like:

  • User Alice is a member of Company Z
  • User Bob is the manager of Group Y, which has users Charlie, Dave, and Eve.
  • User Frank may enter data for [critical business object] X, and also the [critical business objects] in [critical business object group] U.
  • User George is not a member of Company T but may view the reports for Company T.

The idea is that we have a lot of different securable objects, roles, groups, and permissions, and we want a system to handle this. Ideally this system would require little to no coding for new situations once it's launched; it should be very flexible.

In a "traditional" data design, we might have entities/tables like this:

  • User
  • Company
  • User/Company Cross-Reference
  • UserGroup
  • User/UserGroup Cross-Reference
  • CBO ("Critical Business Object")
  • User/CBO Cross-Reference
  • CBOGroup
  • User/CBOGroup Cross-Reference
  • CBO/CBOGroup Cross-Reference
  • ReportAccess, which is a cross-reference between User and Company specifically for access to reports

Note the large number of cross-reference tables. This system isn't terribly flexible: any time we want to add a new means of access, we'd need to introduce a new cross-reference table, which in turn means additional coding.

The proposed system has all of the major entities (User, Company, CBO) reference a value in a new table called Entity. (In the code we'd probably make all of these entities subclasses of an Entity superclass.) Then there are two additional tables that reference Entity:

  • Group, which is also an Entity "subclass".
  • EntityRelation, which is a relation between two entities of any type (including Group). This will probably also have some sort of "Relationship Type" field to explain/qualify the relationship.

This system, at least at first glance, looks like it would meet a lot of our requirements. We might introduce new Entities down the road, but we'd never need to add tables to handle the grouping and relationships between these entities, because Group and EntityRelation can already handle that.
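To make the proposal concrete, here is a minimal sketch of what the Entity/EntityRelation tables might look like, using Python's built-in sqlite3. The table and column names (`entity`, `kind`, `entity_relation`, `relation_type`) and the relation names are assumptions for illustration, not part of the actual design, and Group is simplified to just another `kind` of entity rather than its own table.

```python
import sqlite3

# Hypothetical schema: every name below is illustrative only.
SCHEMA = """
CREATE TABLE entity (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL              -- 'user', 'company', 'cbo', 'group'
);
CREATE TABLE entity_relation (
    from_id       INTEGER NOT NULL REFERENCES entity(id),
    to_id         INTEGER NOT NULL REFERENCES entity(id),
    relation_type TEXT NOT NULL,    -- e.g. 'member_of', 'manages'
    PRIMARY KEY (from_id, to_id, relation_type)
);
"""

def build_demo_db():
    """Create an in-memory database with one user, one company, one group."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO entity (id, kind) VALUES (?, ?)",
        [(1, "user"), (2, "company"), (3, "group")],
    )
    conn.executemany(
        "INSERT INTO entity_relation (from_id, to_id, relation_type) VALUES (?, ?, ?)",
        [(1, 2, "member_of"), (1, 3, "manages")],
    )
    return conn

def relations_of(conn, entity_id):
    """Return (relation_type, to_id) pairs for one entity, sorted by type."""
    rows = conn.execute(
        "SELECT relation_type, to_id FROM entity_relation "
        "WHERE from_id = ? ORDER BY relation_type",
        (entity_id,),
    )
    return rows.fetchall()
```

Note that nothing in this schema stops a row that relates a company to a user as `manages`; any such rule would have to live in application code.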

I'm concerned, however, that this might not work very well in practice. The relationships between entities would become very complex and might be very hard for people (users and developers alike) to understand. Also, they'd be very recursive, which would make things more difficult for our SQL-dependent report-writing staff.
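As an illustration of the recursion concern, the sketch below resolves transitive membership with a recursive common table expression in SQLite. The `entity_relation` table, its columns, and the `member_of` relation name are hypothetical; report writers would need to be comfortable with queries of roughly this shape.

```python
import sqlite3

def make_demo_db():
    """Hypothetical data: 1 is in group 2, 2 is in group 3, and 3 loops back to 2."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entity_relation "
        "(from_id INTEGER, to_id INTEGER, relation_type TEXT)"
    )
    conn.executemany(
        "INSERT INTO entity_relation VALUES (?, ?, ?)",
        [(1, 2, "member_of"), (2, 3, "member_of"),
         (3, 2, "member_of"), (1, 9, "manages")],
    )
    return conn

def reachable_groups(conn, entity_id):
    """Return every entity reachable from entity_id via 'member_of' links."""
    query = """
    WITH RECURSIVE reach(id) AS (
        SELECT to_id FROM entity_relation
        WHERE from_id = ? AND relation_type = 'member_of'
        UNION  -- UNION (not UNION ALL) drops duplicates, so cycles terminate
        SELECT r.to_id FROM entity_relation r
        JOIN reach ON r.from_id = reach.id
        WHERE r.relation_type = 'member_of'
    )
    SELECT id FROM reach ORDER BY id
    """
    return [row[0] for row in conn.execute(query, (entity_id,))]
```

With the demo data, `reachable_groups(make_demo_db(), 1)` walks 1 → 2 → 3 and stops at the cycle back to 2.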

Does anyone have experiences with a similar system?

+3  A: 

I have a weird experience with this, which goes as follows:

An architect/programmer designs an extremely symmetrical, generic model that looks really, really neat and is very tree-ish and recursive.

When it comes to user interface design, the customer or user insists that real usage is much simpler and that they would be satisfied with these two simple screens (which the user/customer draws on a blackboard for you as you listen).

At this stage I consistently find that the solution tends to get very bloated when the underlying model supports very general use cases that no one really wants or needs. So my basic advice is to always listen very closely to the customer and stick very close to what the real requirements are. Make sure your personal desires for neat structures are not the driving force here.

And yes, I have experienced this a multitude of times: in my most recent experience, all the developers were absolutely sure that we were talking about a hierarchical tree structure. But the customer decidedly wanted this to be a flat, list-like structure in all regards. We had to go full circle (implement tree first, then list) before we caved in.

I'm not entirely sure about the generic model you suggest, but it has all the smells that set me off talking about overly generic models. At the very least, I would make sure to model both alternatives in full detail before selecting one.

krosenvold
Yep, I have had similar experiences. It's best to stick to your user stories!
Bill Karwin
Somehow our desires to make things generic backfire a bit too often for my liking. When I was fresh out of university I tended to make everything generalized. Nowadays I almost always go for the most explicit, simplest solution. Every time something becomes overly generic, it bulges out somewhere else.
krosenvold
I call that "waterbed architecture." http://karwin.blogspot.com/2007/11/in-support-of-relational-model.html
Bill Karwin
Agreed on not letting "neat structures" drive the development, but this really is customer-desire driven (although they didn't ask specifically for this; they're not that savvy).
Craig Walker
Also: agreed on modeling both alternatives. This is a big decision, and I want to go in with eyes open (hence the SO post)
Craig Walker
FWIW, I'm hoping to hide all of the complexity of the system behind a very user-friendly UI (and probably a dev-friendly API as well). I know that there's no way the users would really understand the intricacies of the model. They still want the benefits of flexibility though.
Craig Walker
Flexibility can be achieved in a number of ways. Explicit and simple models have flexibility of change. Generic models are much harder to change IMO.
krosenvold
+3  A: 

You're modeling a set of business rules in the real world that are themselves complex. So it's not surprising that your model is going to be complex no matter how you do it.

I would recommend that you choose a database design that describes the relationships more accurately, instead of trying to be clever. Your clever design may result in fewer tables (though not by an order of magnitude, actually); however, your trade-off is a lot more application code to manage it.

For example, you already know that it's going to cause confusion for users and for report designers. Another weakness is making sure the "relationship type" column contains only meaningful strings for the entities involved in the relationship. E.g. it makes sense to say Bob IsMemberOf UserGroup4, but what does it mean if CBO CanViewReportsOf Bob? Also how do you prevent mutually exclusive conditions, such as Bob IsMemberOf Company1 and Bob IsMemberOf Company2?

You have to write application code to validate the data before inserting it, and after fetching it (because your code can never be sure another part of the code hasn't introduced a data integrity bug). You may also need to write application code to perform quality control checks on the whole database, and clean up anomalies when they occur.
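A minimal sketch of the kind of validation code this implies is shown below. The entity kinds, the relation names, and the `ALLOWED_RELATIONS` whitelist are all hypothetical; the point is only that the rule "this relationship type makes sense between these two kinds of entity" has to be written and maintained by hand.

```python
# Hypothetical whitelist of (from_kind, relation_type, to_kind) triples.
ALLOWED_RELATIONS = {
    ("user", "IsMemberOf", "usergroup"),
    ("user", "IsMemberOf", "company"),
    ("user", "CanViewReportsOf", "company"),
}

def validate_relation(from_kind, relation_type, to_kind):
    """Raise ValueError if the relation is not meaningful for these kinds."""
    triple = (from_kind, relation_type, to_kind)
    if triple not in ALLOWED_RELATIONS:
        raise ValueError(f"meaningless relation: {from_kind} {relation_type} {to_kind}")
    return triple

def is_valid_relation(from_kind, relation_type, to_kind):
    """Boolean convenience wrapper, e.g. for auditing rows already stored."""
    try:
        validate_relation(from_kind, relation_type, to_kind)
        return True
    except ValueError:
        return False
```

The same check would need to run both before inserts and when auditing existing rows, since the database itself accepts any string in the relationship type column.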

Compare with a database design in which it's impossible to enter invalid relationships, because the database metadata includes constraints that prevent it. This would simplify your application code a great deal.
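For comparison, here is a sketch of an explicit design where the database itself rejects invalid rows. It assumes a hypothetical `company_member` table and the rule that a user may belong to only one company; a UNIQUE constraint enforces that rule and foreign keys reject references to nonexistent rows. SQLite only enforces foreign keys when the `foreign_keys` pragma is switched on for the connection.

```python
import sqlite3

def make_db():
    """Hypothetical explicit schema with constraints instead of a generic relation table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.executescript("""
        CREATE TABLE company  (id INTEGER PRIMARY KEY);
        CREATE TABLE app_user (id INTEGER PRIMARY KEY);
        CREATE TABLE company_member (
            -- UNIQUE: a user can be a member of at most one company
            user_id    INTEGER NOT NULL UNIQUE REFERENCES app_user(id),
            company_id INTEGER NOT NULL REFERENCES company(id)
        );
        INSERT INTO company  (id) VALUES (1), (2);
        INSERT INTO app_user (id) VALUES (10);
    """)
    return conn

def try_add_member(conn, user_id, company_id):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO company_member (user_id, company_id) VALUES (?, ?)",
                (user_id, company_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Here a second membership for user 10, or a membership for a user that does not exist, fails at insert time with no application-side validation.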

You also need to work out hierarchical access privileges: if Bob CanViewReportsOf Company1, should he be able to view reports of any UserGroup or CBO that is a member of that company? Or do you need to enter a separate row for every entity whose reports Bob can read? These are policy problems that will exist regardless of which design you use.


To reply to your comments:

I can certainly empathize with byzantine exception-cases and evolving requirements making it hard to design simple solutions.

I worked on systems that tried to model real-world policies that grew so complex that it seemed foolish to try to codify them in software. Ultimately, the client who hired me would have used their money more effectively to hire one or two full-time administrative assistants to track their projects using paper and pencil. New exception cases that took me weeks to implement in software would have taken minutes to describe to the AA.

Automation is harder than doing things manually. The only way automation is justified is if the information needs to be tracked faster, or with higher volume, than a human could do.

Bill Karwin
Very good answer to you too ;) I skipped the part about explicit modeling because I sort-of implied it. There's a hidden cost to those smart solutions that you only see over time. +1
krosenvold
I like Joe Celko's phrase for it: "mixing data with metadata." That is, when you store an attribute name as string data, you're asking for trouble.
Bill Karwin
See my comments on @krosenvold's post. I definitely want to avoid the "clever" trap... but I do want something that's future-proof. I think that the proposed solution is frighteningly un-obvious, but it still appears to give the most flexibility. :-\
Craig Walker
On mutually exclusive conditions: actually that's kind of the driving force behind this. What was assumed to be a MEC in dev turned out to not be after launch (ie: poor/changing requirements), requiring a bunch of dev work to fix.
Craig Walker
Again on MEC: I think after-the-fact data checks are probably the way to go. There are so many exceptions to the "rules" that any DB-level constraints would have to be so generic as to be useless. Code-based rules could probably do better as they allow more complexity.
Craig Walker
(Not to rag on DB constraints; I think they're great and should be used whenever possible. But I don't think they're flexible enough here.)
Craig Walker
+2  A: 

Your Entity/Relation proposal is so "meta" that it would be flexible enough to handle all the crazy permutations - heck, you're one step away from a single table with a single column that contains the path to the class which implements its logic (been there, done that) - but as you point out, administering it directly would result in crazy confusion. You'd need to put a nice pretty wrapper on it from the business object layer (re: single table inheritance?) to hide all the abstraction. But before you go through all that trouble, check out other already-established systems out there. Most times I find myself falling down this rabbit hole, I end up implementing Unix file system permissions, which unsurprisingly have stood the test of time.
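For readers unfamiliar with the model, a Unix-style check boils down to something like the sketch below: each object has an owner, a group, and a nine-bit mode, and exactly one of the owner/group/other triplets applies to a given user. The function and parameter names are made up for illustration, and this ignores extras such as superusers and ACLs.

```python
READ, WRITE, EXECUTE = 4, 2, 1  # the classic rwx bit values

def is_allowed(mode, owner, group, user, user_groups, wanted):
    """Check `wanted` bits against a Unix-style mode such as 0o640."""
    if user == owner:
        bits = (mode >> 6) & 7      # owner triplet
    elif group in user_groups:
        bits = (mode >> 3) & 7      # group triplet
    else:
        bits = mode & 7             # "other" triplet
    return (bits & wanted) == wanted
```

For example, with mode `0o640`, owner `"alice"`, and group `"staff"`, Alice may read and write, members of `staff` may only read, and everyone else is denied.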

Teflon Ted
+2  A: 

At a previous job we went down a similar path and ended up effectively implementing Active Directory-style permissions for data entities.

Each table that required permissions would have a foreign key to a SecurityObject table. Rows in UserPermission and GroupPermission tables indicated the type of permissions various users had for that SecurityObject row. The data in the SecurityObject table was hierarchical - each row by default inherited permissions from its parent if it had one.
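A rough in-memory sketch of that inheritance rule follows. The class shape, the field names, and the assumption that the nearest explicit entry wins over anything higher up are illustrative guesses, not a description of the actual system; group permissions are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional

@dataclass
class SecurityObject:
    """Hypothetical stand-in for a row in the SecurityObject table."""
    name: str
    parent: Optional["SecurityObject"] = None
    # user name -> explicitly granted permissions on this object
    user_permissions: Dict[str, FrozenSet[str]] = field(default_factory=dict)

def effective_permissions(obj, user):
    """Walk up the parent chain until an explicit entry for `user` is found."""
    node = obj
    while node is not None:
        if user in node.user_permissions:
            return node.user_permissions[user]  # assumed: nearest entry wins
        node = node.parent
    return frozenset()  # no entry anywhere in the chain
```

Each step up the chain would be a separate lookup if the hierarchy lived only in the database, which hints at why deep hierarchies make permission checks expensive.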

The corresponding data entity classes implemented a common interface so that the security api could work with any "securable" data without having to know exactly what it was.

UI components for controlling group and user permissions provided a common interface for managing the entity permissions.

This was used to set up permissions for an in-house database driven CMS, used to build a number of sites for use internally and externally by partner companies, and also other types of data such as access to client records.

One of the main problems we had when building it was to minimize the number of database hits. This was problematic as the hierarchical data in the SecurityObject and Group tables made efficient querying difficult - a big problem when you are doing permission checks on a lot of data entities.

This led to pretty heavy caching of data, with its own set of problems if you have your applications distributed over a number of machines.
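The sketch below shows the general shape of such a cache and why distribution complicates it. `PermissionCache`, its `loader` callback, and the `invalidate` method are hypothetical names; the cache lives in one process, so a permission change made on another machine is invisible here until something calls `invalidate` on every copy.

```python
class PermissionCache:
    """Per-process cache in front of an expensive permission lookup."""

    def __init__(self, loader):
        self._loader = loader   # hypothetical callable: (user, object_id) -> bool
        self._cache = {}
        self.misses = 0         # counts how often the loader (database) is hit

    def is_allowed(self, user, object_id):
        key = (user, object_id)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._loader(user, object_id)
        return self._cache[key]

    def invalidate(self, object_id):
        """Drop every cached answer for one object after its permissions change."""
        stale = [key for key in self._cache if key[1] == object_id]
        for key in stale:
            del self._cache[key]
```

In a multi-machine deployment, each process would need to be told to invalidate, for example through a shared message channel or short expiry times; which one fits depends on how stale a permission answer is allowed to be.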

As a result of all this I would tend to agree with some of the other posters that you need to be very sure that this is what the business requires or you may find you are wasting your time in a painful manner.

If I were doing this again, I would be sure to use stored procedures for checking and managing the security objects and permissions - insisting on using ORM entities most likely made my job a lot harder :]

Andrew Kennan