views:

48

answers:

3

Hi guys,

I have several tables in my ERD which which I would like to combine in a relational manner.

I have several use cases, but I completely lost track of what kind of relations to use between the tables.

  • Every user can work on multiple projects.
  • Every user has one specific role per project (Manager, Contributor, User)
  • Every project has multiple datasets (currDataXXX columns in 'projects') which need to be linked to the table data.
  • The application will keep track of datasets that have been added by users. Thus I assume I need a relation between tables 'users' and 'data' too?

I used a bridge model in table 'roles' with 2 PK's to link the users and projects together and defining a role for that user and project at the same time (is this the correct way?).

Could somebody please help me assist the correct relations between the tables? and maybe suggest columns (which are missing) for tables (or tables as a whole of course).

Kind of lost sight of this.

With kind regards,

B.

REVISED ERD: alt text (Original image: http://i55.tinypic.com/2mq2ejs.jpg)

A: 

Every user can work on multiple projects.

This is a many-to-many relationship, so it has to be with a middle table as you've done with the ID of the user and the project.

Every user has one specific role per project (Manager, Contributor, User)

You should add third field in this table mentioned above and call it "RoleID" and have another table called "Roles" containing two fields "RoleID" and "Role"

Ahmad Farid
Thank you, I will make those changes and see how it works. About the data table; Do I have to integrate foreign keys and link them to the users table as well as the projects table?
Rhizosis
You're welcome :)As much as I understand, if the data is related only to the project, then you should integrate the foreign keys to the projects table only. The users will be connected to the data through the projects and not directly.
Ahmad Farid
A: 

You don't design your tables then work out the relationships - they should both evolve at the same time.

Particularly when you're using a database that does not require foreign-key declarations, I'd recommend using a strict naming convention. Although you seem to be doing that for the tables you've shown relationships for, the remainder of the field names seem somewhat hap-hazard.

But without seeing all the analysis (and knowing it is correct) its not really possible for anyone to tell you how the tables should be joined, nor what may be missing

symcbean
A: 

Every project has multiple datasets (currDataXXX columns in 'projects')

This means that there is a one-to-many relationship from project to dataset. As such, the project ID should be a foreign key on the dataset table, while the currDataXX columns should be removed from the project table. (The current design represents a denormalisation.)

Having done this, you now have two many-to-many relationships between project and user - one with a link entity of roles, and one with a link entity of dataset. Such relationships are normally redundant - in this case, I would assume that a) only users with a role on a project can add datasets, and b) that a single user can add many datasets for a single project.

If both of these assumptions are correct, then there is actually a one to many relationship from role to dataset, and therefore the role ID should be a foreign key on the dataset table, while the user and project IDs become redundant on the dataset table and can be removed. Like this:

+----------+    +----------+    +----------+
|   User   +---<|   Role   |>---+  Project |
+----------+    +----+-----+    +----------+
                     |
                    /|\
                +----------+
                |  Dataset |
                +----------+

[If assumption b) is incorrect, then role and dataset can be combined into a single table, while if assumption a) is incorrect then role and dataset remain two distinct linking entities, unrelated to each other.]

EDIT - updated proposed structure, following Rhizosis' edits:

                +----------+
                |   Role   |
                +----+-----+
                     |
                    /|\
+----------+    +----------+    +----------+
|   User   +---<|  Users/  |>---+  Project |
|          |    | Projects |    |          |
+----------+    +----+-----+    +----------+
                     |
                    /|\
                +----------+
                |  Dataset |
                +----------+
Mark Bannister
Thank you Mark, this gave me a good insight. You are indeed correct when you stated that eventually a user that has a specific role, will have to be able to add the 3 datasets to one specific project. Would having two many-to-many relationship be the best approach for this situation?
Rhizosis
Sorry mark, maybe I used the term 'dataset' wrongly. The 'data' table will not hold it data itself. I joined a project which started of really bad and has a real bad conceptual design of data storage. What happened is that the 3 data sets are stored in 3 different formats (1. MySQL DB, 2. Excel sheet, 3. CSV file). If it was up to me I would have stored everything in a SQL database... At this point I could not change where the data is stored. Thus in this ERD I have the table 'data' which stores the different locations where the data can be found and what type of data it is.
Rhizosis
@Rhizosis: in your first comment, it sounds as though you are confirming assumption a). In which case, I think the one-to-many role-to-dataset relationship proposed is optimum. Having two separate many-to-many relationships between User and Project should only be considered if there is absolutely no relationship between role and dataset - which does not appear to be the case, here. (You may want to add a dataset type field to the dataset table.)
Mark Bannister
There is no relationship between role and dataset. Maybe the revised ERD will explain it better, what my general concept is.
Rhizosis
Thanks for your time and input Mark. Much appreciated
Rhizosis