views:

162

answers:

8

In my database, I have a lot of courses that are compulsory. Some are elective. However, there are courses of a third kind: A list from which you have to choose X courses. The list (and the number X) is different for each study program. How would you represent this relationally?

+1  A: 

You could consider these courses as a "set", and then have a separate COURSES_SET table with the IDs of the sets. Your courses could have an optinal set_id that references the table of sets of courses. That's just one way to do it...

Example:


COURSES_SET

CourseSetID      Name
-----------      ---------------
1                Early Renaissance medical techniques
2                Jurassic theological certificate program
3                Mad Science

COURSES

CourseID     Name                   CourseSetID          CourseSequenceNumber
--------     --------------------   -----------          ---------------------
1001         The joys of leeches    1                    1

2011         How to keep your
             patient from dying     1                    2

1700         Is there a T-Rex?
             Arguements for and
             Against                2                    1

1301         Intro to Algorithms    (NULL)               (NULL)

3301         Cackling: An advanced  3                    3
             course

This model allows you to associate a course with a set (though you could also call it a "program") of courses, and also assign a sequence number to ensure that students take them in the correct order. If two courses could be taken at the same time (sequence not important between the two), you could assign them the same sequence number.

Then you could have a separate STUDY_PROGRAM table that is referenced by COURSES_SET so you know which study program a set of courses belongs to.

FrustratedWithFormsDesigner
+4  A: 

You need 3 tables here: StudyPrograms, Courses and Components. Components represents the Courses which comprise each StudyProgram and is a junction table between Courses and StudyPrograms.

Each Component record can contain a field indicating if the Course is a compulsory part of the StudyProgram. You can also include a field to indicate whether the Course is one of a list that can be chosen.

There is no way to represent the 'X of Y' relationally, you will need some logic in your stored procedures to ensure this business rule is followed (or possibly in your data access code layer, depending on how you want to organise the application).

David
@David: "There is no way to represent the 'X of Y' relationally". This statement is untrue. It *can* be represented and enforced by constraints in relational database terms. It just can't be done declaratively in the SQL language. It's the SQL bit that is the limiting factor and that generally means it would have to be done in procedural code.
dportas
So, there's no way to represent X of Y purely in terms of relationships between tables. To me, that makes the statement 'there is not way to represent X of Y relationally' correct - which is the answer to the question.
David
+1  A: 
|--------|         |-----------------|             |---------------|
|Program |         |CoursesByProgram |             |Courses        |
|------  |         |-----------------|             |---------------|
|Id PK   |---------|ProgramId   PK   |             |CourseId    PK |
|Name    |         |CourseId    PK   |-------------|Name           |
|--------|         |ListId FK NULL   |             |---------------|
                   |IsCompulsory  bit|             
                   |IsElective    bit|             
                   -------------------
                          |
                          |
                  |-----------------|
                  |List             |
                  |-----------------|
                  |Id        PK     |
                  |Name             |
                  ------------------
Gregoire
+1  A: 

I would do the following:

First create your Course table:

courseId  courseName IsCompulsory OptionalCoursesId
========  ========== ============ ================

Then have a second table for the optional course(s):

OptionalcourseId courseName etc etc etc
================ ========== === === ===

in essence the second table is the link table that holds all the optional choices.

Example:

courseId  courseName IsCompulsory OptionalCoursesId
========  ========== ============ ================
1         science    0            NULL
2         IT         0            2

in the optionalCourses table:

OptionalcourseId courseName 
================ ========== 
2                SQL       
2                C#
2                Java
5                Extreme Ironing
5                Native Julu dancing

Hopefully that makes sense

Darknight
+3  A: 

You have two options: you can model the data closer to reality, where some are single-course requirements, and others are X from Y courses requirements, or you could model all requirements as X from Y, where the single-course requirements are "1 from 1" requirements.

I would recommend something like this:

Course
---------------
CourseID
Description
...

Program
---------------
ProgramID
Description
...

CourseGroup
---------------
CourseGroupID
CourseID

ProgramCourseGroup
---------------
ProgramID
CourseGroupID
RequiredCourses

Course and Program are the two top-level tables. They define the simple list of all courses and programs, respectively, without any sort of relation between the two.

CourseGroup defines a group of courses. This relates to Course, but no other tables.

ProgramCourseGroup relates course groups to programs. A program indicates that a particular group of courses is required, then RequiredCourses indicates how many courses from that group must be taken in order to satisfy the requirement.

For instance, say you have a program called "Basket Weaving" that requires:

  • Intro to baskets
  • Basic weaving techniques

and two of the following four courses:

  • Easter baskets
  • Handbaskets
  • Picnic baskets
  • SCUBA Diving

Your data would look like:

Course
------------------------------------
CourseID    Description
1           Intro to baskets
2           Basic weaving techniques
3           Easter baskets
4           Handbaskets
5           Picnic baskets
6           SCUBA Diving

Program
--------------------------
ProgramID   Description
1           Basket Weaving

CourseGroup
--------------------------
CourseGroupID  CourseID
1              1
2              2
3              3
3              4
3              5
3              6

ProgramCourseGroup
-----------------------------------------
ProgramID  CourseGroupID  RequiredCourses
1          1              1
1          2              1
1          3              2
Adam Robinson
+1  A: 

I would define one table Courses, one table Components, one table Programmes and link tables between Courses and Components and between Components and Programmes. The link table between Programmes and Components should also have a column indicating how many credits from the linked-in programme needs from the linked-in component.

For example, you could then have a Math component, in which you have courses like Algebra, Trigonometry, Calculus I and II etc - this is indicated with records in the ComponentCourses table.
Then, you can have both a Physics programme and a Chemistry programme that require math classes. If Physics require 6 credits math and Chemistry require 4, you indicate that with the Credits value in the ProgrammeComponents table.

To find out if a certain set of courses fulfill the diploma requirements of a programme, just sum the courses' credit values over each component, and see if it matches the required values in the ProgrammeComponents table.

Example:

Below is some example table data that defines two programmes, physics and chemistry, and their relations to three math courses and a mechanics course. The requirements of the students will be as follows:

  • Physics students will have to choose at least 3 math credits out of the 4 available.
  • Chemistry students will have to choose at least 2 math credits
  • Physics students can optionally take mechanics.

As you see, this schema is really flexible as far as determining "baskets" of courses that you can "pick and choose" a minimum amount of (0 < requried credits in the component < total credits in component), entirely optional courses (required credits = 0) as well as mandatory courses (required credits = total credits in component).

Schema:

Courses                                Components
*******                                **********
CourseId | CourseName   | Credits      ComponentId | ComponentName
1        | Algebra      | 1            1           | Math
2        | AP Algebra   | 2            2           | Physics
3        | Trigonometry | 1
4        | Mechanics    | 1

ComponentCourses                Programmes
****************                **********
CourseId | ComponentId          ProgramId | ProgramName
1        | 1                    1         | Physics
2        | 1                    2         | Chemistry
3        | 1
4        | 2

ProgrammeComponents
ProgrammeId | ComponentId | RequiredCredits
1           | 1           | 3
2           | 1           | 2
1           | 2           | 0
Tomas Lycken
+2  A: 

In principle one would expect to be able to create a constraint something like this to enforce the rule:

CHECK
 (NOT EXISTS
  (SELECT 1
   FROM CourseEnrolement c, ProgramEnrolement p
   WHERE c.StudentId = p.StudentId
   AND c.ProgramId = p.ProgramId
   GROUP BY p.StudentId, p.ProgramId, p.NumberOfCoursesRequired
   HAVING COUNT(*) <> p.NumberOfCoursesRequired
 ))

Unfortunately, SQL makes it all but impossible to implement this, or at least makes it very difficult to update the database while the constraint is enforced. So if you really want to represent such a rule in the database then you may need a better model than SQL can offer. In practice such rules will often be enforced in application code instead.

dportas
SQL has the `DEFERRABLE` model: declare the constraint as `INITIALLY DEFERRED` or explicitly defer it in a transaction and the constraint is not checked until the transaction is committed. I understand there are better models (e.g. multiple assignment) but I don't get why you think "SQL makes it very difficult".
onedaywhen
Also, it seems to me that a stored procedure would be a better place to implement such a business rule, rather than replicating it in every application that will use the database.
onedaywhen
Right, but SQL forces you to do it procedurally with the disadvantages that implies. Not least that the optimiser or data management tools won't use the constraint. SQL DBMSs make it hard to implement even the most common and basic constraints - for example inclusion dependency and join dependency constraints or even foreign keys that are mandatory on "both sides" of the constraint. DEFERRABLE is just a way of saying the constraint won't be enforced some of the time and it = procedural code.
dportas
"dportas" -- ah, the penny drops. We've been here before (e.g. on your blog) so I know we're coming from the same place ;)
onedaywhen
+3  A: 

I find it interesting that the accepted answer says, "There is no way to represent the 'X of Y' relationally" when that is essentially what the question was asking for. It seems to me that 'X of Y' can indeed be modelled (and largely enforced) using SQL and here's a suggested way:

Example scenario: students taking the course 'French' must choose two components (x) out of a total of three possible compnents (y).

CREATE TABLE Components
(
 component_name VARCHAR(100) NOT NULL, 
 UNIQUE (component_name)
);

INSERT INTO Components (component_name) VALUES 
('Oral'), 
('Writing'), 
('Vocab'), 
('Databases');

Clearly, 'Databases' doesn't belong on a course on French so we need a tables for course designers to model courses [these tables have many relevant candidate keys so for clarity I'll define them at the 'bottom' of the CREATE TABLE statement):

CREATE TABLE XofYCourses
(
 course_name VARCHAR(100) NOT NULL, 
 x_components_choice_tally INTEGER NOT NULL 
    CHECK (x_components_choice_tally > 0), 
 y_components_tally INTEGER NOT NULL
    CHECK (y_components_tally > 0), 
 CHECK (x_components_choice_tally < y_components_tally),
 UNIQUE (course_name), 
 UNIQUE (course_name, y_components_tally), 
 UNIQUE (course_name, x_components_choice_tally)
);


INSERT INTO XofYCourses (course_name, y_components_tally, 
x_components_choice_tally) VALUES 
('French', 2, 3);

The above allows us to model the 'two out of three' attribute of the French course. Now we need a table to model what the three possible components of that course actually are:

CREATE TABLE XofYCourseComponents
(
 course_name VARCHAR(100) NOT NULL, 
 y_components_tally INTEGER NOT NULL, 
 FOREIGN KEY (course_name, y_components_tally)
    REFERENCES XofYCourses (course_name, y_components_tally), 
 component_sequence INTEGER NOT NULL
    CHECK (component_sequence > 0), 
 component_name VARCHAR(100) NOT NULL 
    REFERENCES Components (component_name), 
 CHECK (component_sequence <= y_components_tally), 
 UNIQUE (course_name, component_sequence), 
 UNIQUE (course_name, component_name) 
);

INSERT INTO XofYCourseComponents (course_name, 
component_sequence, y_components_tally, component_name) 
VALUES 
('French', 1, 3, 'Oral'), 
('French', 2, 3, 'Writing'), 
('French', 3, 3, 'Vocab');

Now for enrolment. Billy want to do the French course...

CREATE TABLE Students
(
 student_name VARCHAR(20) NOT NULL, 
 UNIQUE (student_name)
);

INSERT INTO Students (student_name) VALUES ('Billy');

...and chooses 'Oral' and 'Vocab':

CREATE TABLE XofYCourseComponentChoices
(
 student_name VARCHAR(20) NOT NULL
    REFERENCES Students (student_name), 
 course_name VARCHAR(100) NOT NULL, 
 x_components_choice_tally INTEGER NOT NULL, 
 FOREIGN KEY (course_name, x_components_choice_tally)
    REFERENCES XofYCourses (course_name, x_components_choice_tally), 
 component_name VARCHAR(100) NOT NULL, 
 FOREIGN KEY (course_name, component_name)
    REFERENCES XofYCourseComponents (course_name, component_name), 
 x_component_sequence INTEGER NOT NULL
    CHECK (x_component_sequence > 0), 
 CHECK (x_component_sequence <= x_components_choice_tally), 
 UNIQUE (student_name, course_name, component_name), 
 UNIQUE (student_name, course_name, x_component_sequence)
);

INSERT INTO XofYCourseComponentChoices (student_name, course_name, 
component_name, x_component_sequence, x_components_choice_tally)
VALUES
('Billy', 'French', 'Oral', 1, 2), 
('Billy', 'French', 'Vocab', 2, 2);

The above structure works a good way of enforcing the maximum values i.e. no more than three components for the French course and no more than two choices for each student.

What it doesn't do, however, is to ensure exact amounts e.g. that Billy doesn't choose just one component. Standard SQL has solutions to this problem e.g. CHECK constraints that support subqueries (e.g. to count that there are a total of two rows for Billy...) and DEFERRABLE constraints (...but delay the count until the point when the transaction is being committed). Having a 'multiple assignment' feature would be even better. However, most SQL products don't have these features.

Does this lack of support for the full solution mean we don't do anything and just trust the application will refrain from writing invalid data? of course not!

A good interim approach is to revoke the privileges from the base tables and provide helper stored procedures e.g. one to enrol a student which takes their chosen course components as parameters: the count is done at after the INSERTs and if it violates the data rules (e.g. less than two for French) then the transaction is rolled back and an error returned.

onedaywhen
Modelled, but not _enforced_ is how I interpret the answer. Which was useful.
Christian Jonassen
@Christian Jonassen: a data model needs to be enforced in SQL, preferably via constraints, otherwise it is worthless ;)
onedaywhen
+1. Excellent answer
dportas
+1. Somehow just reading the SQL without the explination tells a story on it's own. :)
wtfsven