views:

128

answers:

2

Hi --

I have a question on how to optimize a query. Actually, as I'm going to be running the query frequently, I was thinking of using a materialized or indexed view (is this a good idea here?) or denormalizing.

Consider the following four tables (with irrelevant fields omitted):

  • Users (int userId)
  • Groups (int groupId)
  • GroupMemberships (int userId, int groupId, bool isSharing)
  • Computers (int userId)

The relationships are that a user can have 0..n computers (one user to many computers) and can be a member of 0..n groups. A group can have 0..n users (many users to many groups). The "isSharing" denotes whether a user is sharing to that group or is a "read-only" member of that group (that is, can see sharing members' computers, but does not share her own).

The query is to find, for a given user what computers that user can see. A user can see all of her own computers. She can also see any computers of other users who are in groups that she is a meber of and are sharing to that group. Okay, that doesn't make much sense, so here's the goal in O(n^3) psudocode:

List<Computer> l
foreach(Computer c in Computers)
    if(c.userId == current_user_id)
        add c to l
    else
        foreach(GroupMembership m where m.userId == current_user_id)
            foreach(GroupMembership m2 where c.userId == m2.userId && m.groupId == m2.groupId)
                if(m2.isSharing)
                    add c to l

Right now I'm using an ORM mapper and basically doing the above (I'm not too good on the whole SQL thing), but that is obviously a less-than-ideal solution. I have indexes on every field I listed there (except isShared) and an extra index on GroupMembership's (userId, groupId) tuple. But can any database wizards out there think of a better solution?

The project's not live yet, but I'm guessing there would be an average of maybe 1.2 computers per user (everyone will have one, a few may have more) and maybe 0.75 group memberships per user (many users won't use the groups feature, but those who do will likely be members of multiple groups). Also, all these associated tables will receive frequent additions, which may make materialized views a less practical solution. I'm using SQL Server 2008.

Thanks, All the best, Robert

+1  A: 

OK, i take it you want the table and queries for the above specification?

I took from the specs that a computer is "assigned" to a given user, but can be share?

Computers (int userId)

Have a look at this and let me know if you want to change any specs.

DECLARE @Users TABLE(
     UserID INT
)

DECLARE @Computers TABLE(
     ComputerID INT,
     UserID INT
)

DECLARE @Groups TABLE(
     GroupID INT
)

DECLARE @GroupMemberships TABLE(
     UserID INT,
     GroupID INT,
     IsSharing INT
)

INSERT INTO @Users (UserID) SELECT 1
INSERT INTO @Users (UserID) SELECT 2

INSERT INTO @Computers (ComputerID, UserID) SELECT 1, 1
INSERT INTO @Computers (ComputerID, UserID) SELECT 2, 1
INSERT INTO @Computers (ComputerID, UserID) SELECT 3, 1
INSERT INTO @Computers (ComputerID, UserID) SELECT 4, 2
INSERT INTO @Computers (ComputerID, UserID) SELECT 5, 2

INSERT INTO @Groups (GroupID) SELECT 1
INSERT INTO @Groups (GroupID) SELECT 2
INSERT INTO @Groups (GroupID) SELECT 3

INSERT INTO @GroupMemberships (UserID,GroupID,IsSharing) SELECT 1, 1, 0
INSERT INTO @GroupMemberships (UserID,GroupID,IsSharing) SELECT 1, 2, 1
INSERT INTO @GroupMemberships (UserID,GroupID,IsSharing) SELECT 2, 2, 0
INSERT INTO @GroupMemberships (UserID,GroupID,IsSharing) SELECT 2, 3, 0

DECLARE @UserID INT
--SELECT @UserID = 1
SELECT @UserID = 2

SELECT  DISTINCT 
     ComputerID
FROM    @Computers
WHERE   UserID = @UserID
UNION
SELECT  DISTINCT 
     ComputerID
FROM    @Computers c INNER JOIN
     (
            SELECT  DISTINCT 
                    gm.UserID
            FROM    @GroupMemberships gm INNER JOIN
     @GroupMemberships ThisUserGroups ON gm.GroupID = ThisUserGroups.GroupID
              AND ThisUserGroups.UserID = @UserID
            WHERE   gm.UserID != @UserID
            AND             gm.IsSharing = 1
    ) OtherUsersInSharedGroups ON c.UserID = OtherUsersInSharedGroups.UserID
astander
Ah, thanks, that looks like what I'm doing right now with ORM... but with two subquieries, will this query be efficient? And is it worth making it a materialized view?
Robert Fraser
Yes, there's only one user per computer, but possibly many computers per user; thanks!
Robert Fraser
The sub query is not required, you can modify it, but that was how i typed it as i read your question X-).If the indexing on the tables are good, i dont think you will have too many problems. Also, you would probably want to use a query or table function using the param.Also, if the values do not change regularly, whuy not cache the values, you could even add an extra field to the select, indicating wether the computer is direct, or shared from another. Caching the values might make things a lot quicker, but remember to clear the cache on updates, deletes and inserts
astander
+1  A: 

I think this would do it without any subqueries. Disclaimer: This is off the top of my head, not tested.

select distinct computerId
from groupMemberships m1
join groupMemberships m2 on m2.groupId=m1.groupId
  and (m2.isSharing or m2.userId=m1.userId)
join computers c on c.userId=m2.userId
where m1.userId=?

There's no need to read the Group of User tables unless there's other data from those tables that you want to include in the select that you haven't mentioned.

The "isSharing or userId" should get you your own computers plus any shared computers. This might be unnecessarily clever: a simple union might be more effective.

Jay