Hello everyone!
I recently ran into a quite complex problem and after looking around a lot I couldn't find a solution to it. I've found answers to my questions many times before on stackoverflow.com, so I decided to post here.
So I'm making a user/group managment system for a web-based project, and I'm storing all related data into a postgreSQL database. This system relies on three tables:
- USERS (Contains the primary key "USER_ID")
- GROUPS (Contains the primary key "GROUP_ID")
- GROUP_USERS
The two first tables simply define all the users and all the groups on the site, and the last table, GROUP_USERS, stores the groups every user is part of. It only has two columns:
- USER_ID
- GROUP_ID
Since every user can be a member of several groups, I decided to make a separate table for this purpose, rather than storing a comma separated column in the USERS-table.
Now, both columns are foreign keys, and I want to make them a composite primary key as well, this since each combination of USER_ID and GROUP_ID has to be unique. But now I am stuck with what seems to be a lot of indexes and relations to a very small table only containing numbers. In the end, I want this table to be as fast as possible, even if containing tens of thousands of rows. Size on disk shouldn't be a problem since its just all numbers anyway, but it feels quite stupid to have a full-sized index refering to a smaller table.
Should I stick with my current solution, store comma-separated values in a column in the USERS-table or is there any other solution I should be aware of. What I am looking for is best possible performance. This table could potentially (but not likely or commonly) be queried several hundreds of times on a single page load.
I don't want to use an array-column, even if they are supported by postgreSQL. I want to be as generic as possible so I can switch database later on, if necessary.
EDIT: In other words, will using a composite primary key and two foreign keys in one table with only two columns have a negative impact on performance rather than the opposite due to the size of the generated index?
EDIT2: Clarifications.
Thank you!