In my database, I have a lot of courses that are compulsory. Some are elective. However, there are courses of a third kind: A list from which you have to choose X courses. The list (and the number X) is different for each study program. How would you represent this relationally?
You could consider these courses as a "set", and then have a separate COURSES_SET
table with the IDs of the sets. Your courses could have an optinal set_id
that references the table of sets of courses. That's just one way to do it...
Example:
COURSES_SET CourseSetID Name ----------- --------------- 1 Early Renaissance medical techniques 2 Jurassic theological certificate program 3 Mad Science COURSES CourseID Name CourseSetID CourseSequenceNumber -------- -------------------- ----------- --------------------- 1001 The joys of leeches 1 1 2011 How to keep your patient from dying 1 2 1700 Is there a T-Rex? Arguements for and Against 2 1 1301 Intro to Algorithms (NULL) (NULL) 3301 Cackling: An advanced 3 3 course
This model allows you to associate a course with a set (though you could also call it a "program") of courses, and also assign a sequence number to ensure that students take them in the correct order. If two courses could be taken at the same time (sequence not important between the two), you could assign them the same sequence number.
Then you could have a separate STUDY_PROGRAM
table that is referenced by COURSES_SET
so you know which study program a set of courses belongs to.
You need 3 tables here: StudyPrograms, Courses and Components. Components represents the Courses which comprise each StudyProgram and is a junction table between Courses and StudyPrograms.
Each Component record can contain a field indicating if the Course is a compulsory part of the StudyProgram. You can also include a field to indicate whether the Course is one of a list that can be chosen.
There is no way to represent the 'X of Y' relationally, you will need some logic in your stored procedures to ensure this business rule is followed (or possibly in your data access code layer, depending on how you want to organise the application).
|--------| |-----------------| |---------------|
|Program | |CoursesByProgram | |Courses |
|------ | |-----------------| |---------------|
|Id PK |---------|ProgramId PK | |CourseId PK |
|Name | |CourseId PK |-------------|Name |
|--------| |ListId FK NULL | |---------------|
|IsCompulsory bit|
|IsElective bit|
-------------------
|
|
|-----------------|
|List |
|-----------------|
|Id PK |
|Name |
------------------
I would do the following:
First create your Course table:
courseId courseName IsCompulsory OptionalCoursesId
======== ========== ============ ================
Then have a second table for the optional course(s):
OptionalcourseId courseName etc etc etc
================ ========== === === ===
in essence the second table is the link table that holds all the optional choices.
Example:
courseId courseName IsCompulsory OptionalCoursesId
======== ========== ============ ================
1 science 0 NULL
2 IT 0 2
in the optionalCourses table:
OptionalcourseId courseName
================ ==========
2 SQL
2 C#
2 Java
5 Extreme Ironing
5 Native Julu dancing
Hopefully that makes sense
You have two options: you can model the data closer to reality, where some are single-course requirements, and others are X from Y courses requirements, or you could model all requirements as X from Y, where the single-course requirements are "1 from 1" requirements.
I would recommend something like this:
Course
---------------
CourseID
Description
...
Program
---------------
ProgramID
Description
...
CourseGroup
---------------
CourseGroupID
CourseID
ProgramCourseGroup
---------------
ProgramID
CourseGroupID
RequiredCourses
Course
and Program
are the two top-level tables. They define the simple list of all courses and programs, respectively, without any sort of relation between the two.
CourseGroup
defines a group of courses. This relates to Course
, but no other tables.
ProgramCourseGroup
relates course groups to programs. A program indicates that a particular group of courses is required, then RequiredCourses
indicates how many courses from that group must be taken in order to satisfy the requirement.
For instance, say you have a program called "Basket Weaving" that requires:
- Intro to baskets
- Basic weaving techniques
and two of the following four courses:
- Easter baskets
- Handbaskets
- Picnic baskets
- SCUBA Diving
Your data would look like:
Course
------------------------------------
CourseID Description
1 Intro to baskets
2 Basic weaving techniques
3 Easter baskets
4 Handbaskets
5 Picnic baskets
6 SCUBA Diving
Program
--------------------------
ProgramID Description
1 Basket Weaving
CourseGroup
--------------------------
CourseGroupID CourseID
1 1
2 2
3 3
3 4
3 5
3 6
ProgramCourseGroup
-----------------------------------------
ProgramID CourseGroupID RequiredCourses
1 1 1
1 2 1
1 3 2
I would define one table Courses
, one table Components
, one table Programmes
and link tables between Courses
and Components
and between Components
and Programmes
. The link table between Programmes
and Components
should also have a column indicating how many credits from the linked-in programme needs from the linked-in component.
For example, you could then have a Math component, in which you have courses like Algebra, Trigonometry, Calculus I and II etc - this is indicated with records in the ComponentCourses
table.
Then, you can have both a Physics
programme and a Chemistry
programme that require math classes. If Physics
require 6 credits math and Chemistry
require 4, you indicate that with the Credits
value in the ProgrammeComponents
table.
To find out if a certain set of courses fulfill the diploma requirements of a programme, just sum the courses' credit values over each component, and see if it matches the required values in the ProgrammeComponents
table.
Example:
Below is some example table data that defines two programmes, physics and chemistry, and their relations to three math courses and a mechanics course. The requirements of the students will be as follows:
- Physics students will have to choose at least 3 math credits out of the 4 available.
- Chemistry students will have to choose at least 2 math credits
- Physics students can optionally take mechanics.
As you see, this schema is really flexible as far as determining "baskets" of courses that you can "pick and choose" a minimum amount of (0 < requried credits in the component < total credits in component), entirely optional courses (required credits = 0) as well as mandatory courses (required credits = total credits in component).
Schema:
Courses Components
******* **********
CourseId | CourseName | Credits ComponentId | ComponentName
1 | Algebra | 1 1 | Math
2 | AP Algebra | 2 2 | Physics
3 | Trigonometry | 1
4 | Mechanics | 1
ComponentCourses Programmes
**************** **********
CourseId | ComponentId ProgramId | ProgramName
1 | 1 1 | Physics
2 | 1 2 | Chemistry
3 | 1
4 | 2
ProgrammeComponents
ProgrammeId | ComponentId | RequiredCredits
1 | 1 | 3
2 | 1 | 2
1 | 2 | 0
In principle one would expect to be able to create a constraint something like this to enforce the rule:
CHECK
(NOT EXISTS
(SELECT 1
FROM CourseEnrolement c, ProgramEnrolement p
WHERE c.StudentId = p.StudentId
AND c.ProgramId = p.ProgramId
GROUP BY p.StudentId, p.ProgramId, p.NumberOfCoursesRequired
HAVING COUNT(*) <> p.NumberOfCoursesRequired
))
Unfortunately, SQL makes it all but impossible to implement this, or at least makes it very difficult to update the database while the constraint is enforced. So if you really want to represent such a rule in the database then you may need a better model than SQL can offer. In practice such rules will often be enforced in application code instead.
I find it interesting that the accepted answer says, "There is no way to represent the 'X of Y' relationally" when that is essentially what the question was asking for. It seems to me that 'X of Y' can indeed be modelled (and largely enforced) using SQL and here's a suggested way:
Example scenario: students taking the course 'French' must choose two components (x) out of a total of three possible compnents (y).
CREATE TABLE Components
(
component_name VARCHAR(100) NOT NULL,
UNIQUE (component_name)
);
INSERT INTO Components (component_name) VALUES
('Oral'),
('Writing'),
('Vocab'),
('Databases');
Clearly, 'Databases' doesn't belong on a course on French so we need a tables for course designers to model courses [these tables have many relevant candidate keys so for clarity I'll define them at the 'bottom' of the CREATE TABLE
statement):
CREATE TABLE XofYCourses
(
course_name VARCHAR(100) NOT NULL,
x_components_choice_tally INTEGER NOT NULL
CHECK (x_components_choice_tally > 0),
y_components_tally INTEGER NOT NULL
CHECK (y_components_tally > 0),
CHECK (x_components_choice_tally < y_components_tally),
UNIQUE (course_name),
UNIQUE (course_name, y_components_tally),
UNIQUE (course_name, x_components_choice_tally)
);
INSERT INTO XofYCourses (course_name, y_components_tally,
x_components_choice_tally) VALUES
('French', 2, 3);
The above allows us to model the 'two out of three' attribute of the French course. Now we need a table to model what the three possible components of that course actually are:
CREATE TABLE XofYCourseComponents
(
course_name VARCHAR(100) NOT NULL,
y_components_tally INTEGER NOT NULL,
FOREIGN KEY (course_name, y_components_tally)
REFERENCES XofYCourses (course_name, y_components_tally),
component_sequence INTEGER NOT NULL
CHECK (component_sequence > 0),
component_name VARCHAR(100) NOT NULL
REFERENCES Components (component_name),
CHECK (component_sequence <= y_components_tally),
UNIQUE (course_name, component_sequence),
UNIQUE (course_name, component_name)
);
INSERT INTO XofYCourseComponents (course_name,
component_sequence, y_components_tally, component_name)
VALUES
('French', 1, 3, 'Oral'),
('French', 2, 3, 'Writing'),
('French', 3, 3, 'Vocab');
Now for enrolment. Billy want to do the French course...
CREATE TABLE Students
(
student_name VARCHAR(20) NOT NULL,
UNIQUE (student_name)
);
INSERT INTO Students (student_name) VALUES ('Billy');
...and chooses 'Oral' and 'Vocab':
CREATE TABLE XofYCourseComponentChoices
(
student_name VARCHAR(20) NOT NULL
REFERENCES Students (student_name),
course_name VARCHAR(100) NOT NULL,
x_components_choice_tally INTEGER NOT NULL,
FOREIGN KEY (course_name, x_components_choice_tally)
REFERENCES XofYCourses (course_name, x_components_choice_tally),
component_name VARCHAR(100) NOT NULL,
FOREIGN KEY (course_name, component_name)
REFERENCES XofYCourseComponents (course_name, component_name),
x_component_sequence INTEGER NOT NULL
CHECK (x_component_sequence > 0),
CHECK (x_component_sequence <= x_components_choice_tally),
UNIQUE (student_name, course_name, component_name),
UNIQUE (student_name, course_name, x_component_sequence)
);
INSERT INTO XofYCourseComponentChoices (student_name, course_name,
component_name, x_component_sequence, x_components_choice_tally)
VALUES
('Billy', 'French', 'Oral', 1, 2),
('Billy', 'French', 'Vocab', 2, 2);
The above structure works a good way of enforcing the maximum values i.e. no more than three components for the French course and no more than two choices for each student.
What it doesn't do, however, is to ensure exact amounts e.g. that Billy doesn't choose just one component. Standard SQL has solutions to this problem e.g. CHECK
constraints that support subqueries (e.g. to count that there are a total of two rows for Billy...) and DEFERRABLE
constraints (...but delay the count until the point when the transaction is being committed). Having a 'multiple assignment' feature would be even better. However, most SQL products don't have these features.
Does this lack of support for the full solution mean we don't do anything and just trust the application will refrain from writing invalid data? of course not!
A good interim approach is to revoke the privileges from the base tables and provide helper stored procedures e.g. one to enrol a student which takes their chosen course components as parameters: the count is done at after the INSERT
s and if it violates the data rules (e.g. less than two for French) then the transaction is rolled back and an error returned.