ansaurus

Question

How are viewing permissions usually implemented in a relational database?

Answer 1

+1 A:

This looks like a typical many-to-many relationship. I don't see any restriction in what you want that would allow space savings over the usual relational DB idiom for those, i.e., a table with two columns (both foreign keys, one into users and one into tweets). Since the set of current followers can and does change all the time, posting a tweet to all the followers who are current at the instant of posting (I assume that's what you mean?) does mean adding that many (extremely short) rows to that relationship table. The alternative, keeping a timestamped history of follower sets so you can reconstruct who was a follower at any given tweet-posting time, appears definitely worse in time and not substantially better in space.
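
A minimal sketch of that post-time idiom, using Python's built-in sqlite3 so it can run anywhere. The table and column names (users, tweets, followers, canread) and the helper post_to_current_followers are assumptions for illustration, not something the question specified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users     (userid  INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tweets    (tweetid INTEGER PRIMARY KEY,
                            author  INTEGER REFERENCES users(userid),
                            body    TEXT);
    CREATE TABLE followers (follower INTEGER REFERENCES users(userid),
                            followee INTEGER REFERENCES users(userid),
                            PRIMARY KEY (follower, followee));
    -- The two-column relationship table: who may read which tweet.
    CREATE TABLE canread   (reader INTEGER REFERENCES users(userid),
                            tweet  INTEGER REFERENCES tweets(tweetid),
                            PRIMARY KEY (reader, tweet));
""")

def post_to_current_followers(conn, author, body):
    """Store a tweet and add one canread row per follower at posting time."""
    tweetid = conn.execute(
        "INSERT INTO tweets (author, body) VALUES (?, ?)", (author, body)
    ).lastrowid
    # Snapshot of the follower set as of right now: N followers, N rows.
    conn.execute(
        "INSERT INTO canread (reader, tweet) "
        "SELECT follower, ? FROM followers WHERE followee = ?",
        (tweetid, author),
    )
    return tweetid

conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "alice"), (2, "bob"), (3, "carol"), (4, "dave")])
conn.executemany("INSERT INTO followers VALUES (?, ?)", [(2, 1), (3, 1)])

tweetid = post_to_current_followers(conn, 1, "hello")
conn.execute("INSERT INTO followers VALUES (4, 1)")  # dave follows after the post

readers = [r for (r,) in conn.execute(
    "SELECT reader FROM canread WHERE tweet = ? ORDER BY reader", (tweetid,))]
print(readers)  # [2, 3]: dave followed too late to be included
```

The cost is visible in the INSERT ... SELECT: every post writes as many rows as the author has followers at that moment.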

If, on the other hand, you want to check followers at the time of viewing (rather than at the time of posting), then you could make a special userid artificially meaning "all followers of the current user" (just like you'll have one meaning "all users on Twitter"); the needed SQL to make the lookup fast, in that case, looks hairy but feasible (a UNION or OR with "all tweets for which I'm a follower of the author and the tweet is readable by [the artificial userid representing] all followers"). I'm not getting deep into that maze of SQL until and unless you confirm that it is this peculiar meaning that you have in mind (rather than the simple one which seems more natural to me but doesn't allow any space savings on the relationship table for the action of "post tweet to all followers").

Edit: the OP has clarified they mean the approach I mention in the second paragraph.

Then, assume userid is the primary key of the Users table; the Tweets table has a primary key tweetid and a foreign key author holding the userid of each tweet's author; the Followers table is a typical many-to-many relationship table with two columns, follower and followee (both foreign keys into Users); and the Canread table is a not-so-typical many-to-many relationship table, still with two columns: reader is a foreign key into Users and tweet is a foreign key into Tweets (phew;-). Two special users, @everybody and @allfollowers, are defined with the above meanings, so that posting to everybody, to all followers, or to "just myself" each adds only one row to Canread; only selective posting to a specific list of N people adds N rows.
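
To make the row counts concrete, here is a rough sketch of that schema in Python's built-in sqlite3. The reserved userids 1 and 2 for the two special users, the name and body columns, and the post helper are all assumptions made for the example; the table and key names follow the description above.

```python
import sqlite3

# Hypothetical reserved userids standing in for @everybody and @allfollowers.
EVERYBODY, ALLFOLLOWERS = 1, 2

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users     (userid  INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Tweets    (tweetid INTEGER PRIMARY KEY,
                            author  INTEGER NOT NULL REFERENCES Users(userid),
                            body    TEXT);
    CREATE TABLE Followers (follower INTEGER REFERENCES Users(userid),
                            followee INTEGER REFERENCES Users(userid),
                            PRIMARY KEY (follower, followee));
    CREATE TABLE Canread   (reader INTEGER REFERENCES Users(userid),
                            tweet  INTEGER REFERENCES Tweets(tweetid),
                            PRIMARY KEY (reader, tweet));
""")
conn.executemany("INSERT INTO Users VALUES (?, ?)", [
    (EVERYBODY, "@everybody"), (ALLFOLLOWERS, "@allfollowers"),
    (10, "alice"), (11, "bob"), (12, "carol"),
])

def post(conn, author, body, readers):
    """Insert a tweet plus one Canread row per entry in readers."""
    tweetid = conn.execute(
        "INSERT INTO Tweets (author, body) VALUES (?, ?)", (author, body)
    ).lastrowid
    conn.executemany("INSERT INTO Canread (reader, tweet) VALUES (?, ?)",
                     [(reader, tweetid) for reader in readers])
    return tweetid

post(conn, 10, "public",         [EVERYBODY])     # one row
post(conn, 10, "followers only", [ALLFOLLOWERS])  # one row, however many followers
post(conn, 10, "note to self",   [10])            # one row
post(conn, 10, "for bob, carol", [11, 12])        # N = 2 rows

per_tweet = conn.execute(
    "SELECT tweet, COUNT(*) FROM Canread GROUP BY tweet ORDER BY tweet"
).fetchall()
print(per_tweet)  # [(1, 1), (2, 1), (3, 1), (4, 2)]
```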

So the SQL for the set of tweet IDs a user @me can read is, I think, something like:

SELECT Tweets.tweetid 
  FROM Tweets
  JOIN Canread ON(Tweets.tweetid=Canread.tweet)
 WHERE Canread.reader IN (@me, @everybody)

UNION

SELECT Tweets.tweetid 
  FROM Tweets
  JOIN Canread ON(Tweets.tweetid=Canread.tweet)
  JOIN Followers ON(Tweets.author=Followers.followee)
 WHERE Canread.reader=@allfollowers
   AND Followers.follower=@me
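
For anyone who wants to try the query, here is a small self-contained check using Python's built-in sqlite3. It is only a sketch: the reserved userids 1 and 2 for the special users, the sample rows, and the readable helper are assumptions, and the @me-style variables are written as :me-style named parameters.

```python
import sqlite3

# Hypothetical reserved userids standing in for @everybody and @allfollowers.
EVERYBODY, ALLFOLLOWERS = 1, 2

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users     (userid  INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Tweets    (tweetid INTEGER PRIMARY KEY,
                            author  INTEGER REFERENCES Users(userid));
    CREATE TABLE Followers (follower INTEGER, followee INTEGER,
                            PRIMARY KEY (follower, followee));
    CREATE TABLE Canread   (reader INTEGER, tweet INTEGER,
                            PRIMARY KEY (reader, tweet));
""")
conn.executemany("INSERT INTO Users VALUES (?, ?)", [
    (EVERYBODY, "@everybody"), (ALLFOLLOWERS, "@allfollowers"),
    (10, "alice"), (11, "bob"), (12, "carol"),
])
# alice (10) writes four tweets; bob (11) follows her, carol (12) does not.
conn.executemany("INSERT INTO Tweets VALUES (?, ?)",
                 [(1, 10), (2, 10), (3, 10), (4, 10)])
conn.execute("INSERT INTO Followers VALUES (11, 10)")
conn.executemany("INSERT INTO Canread VALUES (?, ?)", [
    (EVERYBODY, 1),     # tweet 1: public
    (ALLFOLLOWERS, 2),  # tweet 2: all followers, checked at viewing time
    (10, 3),            # tweet 3: just alice herself
    (12, 4),            # tweet 4: carol only
])

READABLE_SQL = """
SELECT Tweets.tweetid
  FROM Tweets
  JOIN Canread ON (Tweets.tweetid = Canread.tweet)
 WHERE Canread.reader IN (:me, :everybody)
UNION
SELECT Tweets.tweetid
  FROM Tweets
  JOIN Canread ON (Tweets.tweetid = Canread.tweet)
  JOIN Followers ON (Tweets.author = Followers.followee)
 WHERE Canread.reader = :allfollowers
   AND Followers.follower = :me
"""

def readable(conn, me):
    """Return the sorted tweet ids that user `me` may read."""
    params = {"me": me, "everybody": EVERYBODY, "allfollowers": ALLFOLLOWERS}
    return sorted(tweetid for (tweetid,) in conn.execute(READABLE_SQL, params))

print(readable(conn, 11))  # [1, 2]: public tweet plus the follower-only one
print(readable(conn, 12))  # [1, 4]: public tweet plus the one addressed to carol
```

Because the follower check happens in the second SELECT at read time, deleting bob's row from Followers removes tweet 2 from his results without touching Canread.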

Alex Martelli 2010-06-26 05:17:41

Sorry, my bad for ambiguity—I mean current as in changing. See updated question.

aharon 2010-06-26 05:20:43

I just think adding so many rows per post can't be the optimal way of doing things. Facebook has a similar (real) ability, and I doubt that they do it that way, if only because it seems like a lot of work.

aharon 2010-06-26 05:23:16

@aharon, does Facebook use a relational DB? I thought, like all very-high-volume sites (inc. Twitter), they had jumped on the NoSQL bandwagon (exactly for reasons of scalability).

Alex Martelli 2010-06-26 05:25:20

Facebook supposedly uses a huge MySQL cluster, as of 2008 at least. Do you know of any non-SQL databases that can deal with this kind of query/relation better?

aharon 2010-06-26 05:26:30

@aharon, yes there are (where "better" means "far more scalable"). Of course NoSQL DBs always have to be carefully denormalized for maximum performance and scalability, in ways that vary entirely among them, but Google's Bigtable (not the much simpler facade exposed in App Engine) would make mincemeat of this, and I'd be astonished if, say, Cassandra couldn't (but as a Googler my work's on Bigtable, so that's the one I'm really familiar with). Anyway, see my SQL attempt above; it doesn't look too bad to me (with all the needed indices of course).

Alex Martelli 2010-06-26 05:41:53

Thanks so much Alex; the MySQL seems to work, I sort of understand how it works :) and I'll have a look at Bigtable and Cassandra.

aharon 2010-06-26 17:49:21

@aharon, about Bigtable you can read whitepapers such as http://labs.google.com/papers/bigtable.html , but to _use_ it you should apply to work at Google (hint: we're hiring, email me if interested;-), as I don't think we make it available outside (except through the nice simple facade in App Engine). Cassandra, MongoDB, Hypertable, CouchDB (etc), open source, or Amazon's Dynamo (commercial closed source), may be more useful to you otherwise, as you get to _use_ them, not just _study_ them;-). Glad to hear the SQL I proposed works for you, anyway!

Alex Martelli 2010-06-26 18:12:40
