views:

25

answers:

2

Say I have four tables, users, contacts, files, and userfiles.

Users can upload files and have contacts. They can choose to share their uploaded files with their contacts.

When a user selects one or more of their uploaded files, I want to show a list of their contacts that they are not already sharing all of their selected files with. So if they selected one file, it'd show the contacts that can't already see that file. If the selected multiple files, it'd show the contacts that can't already see all of the files.

Right now I'm trying a query like this (using sqlite3):

select users.user_id, users.display_name
from users, contacts, userfiles
where contacts.user_id = :user_id
and contacts.contact_id = users.user_id
and (
    userfiles.user_id != users.user_id
    and userfiles.file_id != :file_id
);

Note that the last line is auto-generated in a loop in the case of multiple selected files.

Where :user_id is the user trying to share the file, and :file_id is the file which, if a user can already see that file, they are omitted from the result. What I end up with is a list of contacts which are sharing any files other than the selected one, so if the user is sharing multiple files with any one contact, that contact shows up in the list multiple times.

How can I avoid the duplicates? I just want to check if the file is already being shared, not grab all of the contents of userfiles that don't involve a particular file or files.

+1  A: 

Hi Carson! Try this:

select users.user_id, users.display_name
from users, contacts as c
where c.user_id = :user_id
and c.contact_id = users.user_id
and not exists (
    select user_id
    from userfiles as uf
    where uf.user_id = c.contact_id
    and uf.file_id in (:file_ids)
);

Note that :file_ids is all your file_id's, seperated with commas. No more looping to run multiple queries!

EDIT:

This is the data I'm running as a test:

create table users (user_id integer primary key, display_name text);
insert into users values (1,"bob");
insert into users values (2,"jim");
insert into users values (3,"bill");
insert into users values (4,"martin");
insert into users values (5,"carson");

create table contacts values (user_id integer, contact_id integer);
insert into contacts select u1.user_id, u2.user_id from users u1, users u2 where ui.user_id!=u2.user_id;

create table userfiles (user_id integer, file_id integer);
insert into userfiles values (1,10);
insert into userfiles values (2,10);
insert into userfiles values (3,10);
insert into userfiles values (4,10);
insert into userfiles values (1,20);
insert into userfiles values (2,30);

Then, if I run my query with :user_id = 5 and :files_id = 20,30, I get:

select users.user_id, users.display_name
from users, contacts as c
where c.user_id = 5
and c.contact_id = users.user_id
and not exists (
    select user_id
    from userfiles as uf
    where uf.user_id = c.contact_id
    and uf.file_id in (20,30)
);

UserID|Display_Name
3     |bill
4     |martin

That seems like what you want, as I understand it, that is, the only users who do not have all the file ID's. If I misunderstood something, please let me know.

MPelletier
It works perfect, thanks! I forgot about `not exists` :)
Carson Myers
@MPelletier Actually, it seems this doesn't quite work, but it's closer than I was. If I have multiple `file_ids`, then it will exclude all contacts that match any of them. I only want to exclude the ones that match them all -- I'm not sure how I would do that, without possibly adding a tonne of `and not exists` clauses to the first one. I may have to just get all of the contacts into a dict, and then remove the ones that match all of the `file_ids`.
Carson Myers
@MPelletier Got it! Check out my answer :) please do comment on it if there is something less-than-good about it, I'm still getting my hands wet with SQL
Carson Myers
I ran your queries, and reproduced your results, but unfortunately my desired result in this case would be for users 1, 2, 3 and 4 to show up. I only want to exclude users which can already see _all_ selected files (in this case, none of the users can see both files 20 and 30, so they would all be displayed). I went for count(*) since whether or not a user is displayed depends on several of the rows in the userfiles table and not just whether or not one of them exists. Thanks so much for the help though, it got me thinking and helped me solve the problem :)
Carson Myers
oh, and I did get the query working, it was a subtle escaping problem
Carson Myers
Oh, I get it! Then yes, I think your query is safe to use, even if a user has more files than others demanded, I see it is carefully restricted to those files you want. Good work!
MPelletier
+1  A: 

This seemed to work, not sure if it's optimal but it is the only way I could figure it out:

select users.user_id, users.display_name
from users, contacts
where contacts.user_id = :user_id
and contacts.contact_id = users.user_id
and (
    select count(*)
    from userfiles
    where userfiles.user_id = users.user_id
    and userfiles.file_id in (:file_ids)
) < :number_of_files;

It selects all contacts, except the ones that match all of the file_ids. It does select the contacts which match some of the file_ids, since it grabs the count of contacts matching the specified IDs, and then checks if that is less than the number of ids that were provided.

Carson Myers
I'm going to test this out tonight. My understanding is that if Bob is shared 10,000 files, he will show up in that query if you select the few files he doesn't have. I think...One thing you could do is take my query (or your original one for that matter) and tweak it a bit. Use your loop to go through each file and do a large `INTERSECT` query.
MPelletier
you are right but, that is the behavior I want: I only want to exclude the users that can already view every selected file. In any case, it's the damndest thing: this query works in the sqlite3 interpreter, but not in python. I can't figure it out.
Carson Myers
Ah, with Python I am no use. Please see my edit above, I'm giving a sample of test data I used to verify what I thought was proper behaviour.
MPelletier