ansaurus

Question

ERD diagram and SQL relationships linking user, project and dataset tables.

Answer 1

A:

Every user can work on multiple projects.

This is a many-to-many relationship, so it has to be with a middle table as you've done with the ID of the user and the project.

Every user has one specific role per project (Manager, Contributor, User)

You should add third field in this table mentioned above and call it "RoleID" and have another table called "Roles" containing two fields "RoleID" and "Role"

Ahmad Farid 2010-10-29 11:14:58

Thank you, I will make those changes and see how it works. About the data table; Do I have to integrate foreign keys and link them to the users table as well as the projects table?

Rhizosis 2010-10-29 11:24:46

You're welcome :)As much as I understand, if the data is related only to the project, then you should integrate the foreign keys to the projects table only. The users will be connected to the data through the projects and not directly.

Ahmad Farid 2010-10-29 13:36:14

Answer 2

A:

You don't design your tables then work out the relationships - they should both evolve at the same time.

Particularly when you're using a database that does not require foreign-key declarations, I'd recommend using a strict naming convention. Although you seem to be doing that for the tables you've shown relationships for, the remainder of the field names seem somewhat hap-hazard.

But without seeing all the analysis (and knowing it is correct) its not really possible for anyone to tell you how the tables should be joined, nor what may be missing

symcbean 2010-10-29 11:37:10

Answer 3

A:

Every project has multiple datasets (currDataXXX columns in 'projects')

This means that there is a one-to-many relationship from project to dataset. As such, the project ID should be a foreign key on the dataset table, while the currDataXX columns should be removed from the project table. (The current design represents a denormalisation.)

Having done this, you now have two many-to-many relationships between project and user - one with a link entity of roles, and one with a link entity of dataset. Such relationships are normally redundant - in this case, I would assume that a) only users with a role on a project can add datasets, and b) that a single user can add many datasets for a single project.

If both of these assumptions are correct, then there is actually a one to many relationship from role to dataset, and therefore the role ID should be a foreign key on the dataset table, while the user and project IDs become redundant on the dataset table and can be removed. Like this:

+----------+    +----------+    +----------+
|   User   +---<|   Role   |>---+  Project |
+----------+    +----+-----+    +----------+
                     |
                    /|\
                +----------+
                |  Dataset |
                +----------+

[If assumption b) is incorrect, then role and dataset can be combined into a single table, while if assumption a) is incorrect then role and dataset remain two distinct linking entities, unrelated to each other.]

EDIT - updated proposed structure, following Rhizosis' edits:

                +----------+
                |   Role   |
                +----+-----+
                     |
                    /|\
+----------+    +----------+    +----------+
|   User   +---<|  Users/  |>---+  Project |
|          |    | Projects |    |          |
+----------+    +----+-----+    +----------+
                     |
                    /|\
                +----------+
                |  Dataset |
                +----------+

Mark Bannister 2010-10-29 11:39:01

Thank you Mark, this gave me a good insight. You are indeed correct when you stated that eventually a user that has a specific role, will have to be able to add the 3 datasets to one specific project. Would having two many-to-many relationship be the best approach for this situation?

Rhizosis 2010-10-29 11:49:05

Sorry mark, maybe I used the term 'dataset' wrongly. The 'data' table will not hold it data itself. I joined a project which started of really bad and has a real bad conceptual design of data storage. What happened is that the 3 data sets are stored in 3 different formats (1. MySQL DB, 2. Excel sheet, 3. CSV file). If it was up to me I would have stored everything in a SQL database... At this point I could not change where the data is stored. Thus in this ERD I have the table 'data' which stores the different locations where the data can be found and what type of data it is.

Rhizosis 2010-10-29 11:56:52

@Rhizosis: in your first comment, it sounds as though you are confirming assumption a). In which case, I think the one-to-many role-to-dataset relationship proposed is optimum. Having two separate many-to-many relationships between User and Project should only be considered if there is absolutely no relationship between role and dataset - which does not appear to be the case, here. (You may want to add a dataset type field to the dataset table.)

Mark Bannister 2010-10-29 12:55:54

There is no relationship between role and dataset. Maybe the revised ERD will explain it better, what my general concept is.

Rhizosis 2010-10-29 13:57:21

Thanks for your time and input Mark. Much appreciated

Rhizosis 2010-10-31 16:23:52

ansaurus

tags:

views:

answers:

ERD diagram and SQL relationships linking user, project and dataset tables.

related questions