This post is taking a substantial amount of time to type because I'm trying to be as clear as possible, so please bear with me if it is still unclear.
Basically, what I have are a table of posts in the database which users can add privacy settings to.
ID | owner_id | post | other_info | privacy_level (int value)
From there, users can add their privacy details, allowing it to be viewable by all [privacy_level = 0), friends (privacy_level = 1), no one (privacy_level = 3), or specific people or filters (privacy_level = 4). For privacy levels specifying specific people (4), the query will reference the table "post_privacy_includes_for" in a subquery to see if the user (or a filter the user belongs to) exists in a row in the table.
ID | post_id | user_id | list_id
Also, the user has the ability to prevent some people from viewing their post in within a larger group by excluding them (e.g., Having it set for everyone to view but hiding it from a stalker user). For this, another reference table is added, "post_privacy_exclude_from" - it looks identical to the setup as "post_privacy_includes_for".
My problem is that this does not scale. At all. At the moment, there are about 1-2 million posts, the majority of them set to be viewable by everyone. For each post on the page it must check to see if there is a row that is excluding the post from being shown to the user - this moves really slow on a page that can be filled with 100-200 posts. It can take up to 2-4 seconds, especially when additional constraints are added to the query.
This also creates extremely large and complex queries that are just... awkward.
SELECT t.*
FROM posts t
WHERE ( (t.privacy_level = 3
AND t.owner_id = ?)
OR (t.privacy_level = 4
AND EXISTS
( SELECT i.id
FROM PostPrivacyIncludeFor i
WHERE i.user_id = ?
AND i.thought_id = t.id)
OR t.privacy_level = 4
AND t.owner_id = ?)
OR (t.privacy_level = 4
AND EXISTS
(SELECT i2.id
FROM PostPrivacyIncludeFor i2
WHERE i2.thought_id = t.id
AND EXISTS
(SELECT r.id
FROM FriendFilterIds r
WHERE r.list_id = i2.list_id
AND r.friend_id = ?))
OR t.privacy_level = 4
AND t.owner_id = ?)
OR (t.privacy_level = 1
AND EXISTS
(SELECT G.id
FROM Following G
WHERE follower_id = t.owner_id
AND following_id = ?
AND friend = 1)
OR t.privacy_level = 1
AND t.owner_id = ?)
OR (NOT EXISTS
(SELECT e.id
FROM PostPrivacyExcludeFrom e
WHERE e.thought_id = t.id
AND e.user_id = ?
AND NOT EXISTS
(SELECT e2.id
FROM PostPrivacyExcludeFrom e2
WHERE e2.thought_id = t.id
AND EXISTS
(SELECT l.id
FROM FriendFilterIds l
WHERE l.list_id = e2.list_id
AND l.friend_id = ?)))
AND t.privacy_level IN (0, 1, 4))
AND t.owner_id = ?
ORDER BY t.created_at LIMIT 100
(mock up query, similar to the query I use now in Doctrine ORM. It's a mess, but you get what I am saying.)
I guess my question is, how would you approach this situation to optimize it? Is there a better way to set up my database? I'm willing to completely scrap the method I have currently built up, but I wouldn't know what to move onto.
Thanks guys.
Updated: Fix the query to reflect the values I defined for privacy level above (I forgot to update it because I simplified the values)