views:

247

answers:

5

We're writing a records management product for schools and one of the requirements is the ability to manage course schedules. I haven't looked at the code for how we deal with this (I'm on a different project at the moment), but nonetheless I started wondering how best to handle one particular part of this requirement, namely how to handle the fact that each course can be held one or more days of the week, and how best to store this information in the database. To provide some context, a bare-bones Course table might contain the following columns:

Course          Example Data
------          ------------

DeptPrefix      ;MATH, ENG, CS, ...
Number          ;101, 300, 450, ...
Title           ;Algebra, Shakespeare, Advanced Data Structures, ...
Description     ;...
DaysOfWeek      ;Monday, Tuesday-Thursday, ...
StartTime       
EndTime

What I'm wondering is, what is the best way to handle the DaysOfWeek column in this (contrived) example? The problem I'm having with it is that is a multi-valued field: that is, you can have a course on any day of the week, and the same course can take be held on more than one day. I know certain databases natively support multi-value columns, but is there a "best practice" to handle this assuming the database doesn't natively support it?

I've come up with the following possible solutions so far, but I'm wondering if anyone has anything better:

Possible Solution #1: Treat DaysOfWeek as a bit field

This was the first thing that popped into my head (I'm not sure if that's a good thing or not...). In this solution, DaysOfWeek would be defined as a byte, and the first 7 bits would be used to represent the days of week (one bit for each day). A 1 bit would indicate that a class was held on corresponding day of the week.

Pros: Easy to implement (the application can deal with the bit manipulations), works with any database.

Cons: Harder to write queries that use the DaysOfWeek column (although you could deal with this at the application level, or create views and stored procedues in the database to simplify this), breaks relational database model.

Possible Solution #2: Store DaysOfWeek as a string of characters

This is essentially the same approach as using a bit field, but instead of dealing with raw bits, you assign a unique letter to each day of the week, and the DaysOfWeek column just stores a sequence of letters indicating what days a course is held on. For example, you might associate each weekday with a single-character code as follows:

Weekday      Letter
-------      ------

Sunday       S
Monday       M
Tuesday      T
Wednesday    W
Thursday     R
Friday       F
Saturday     U

In this case, a course held on Monday, Tuesday, and Friday would have have the value 'MTF' for DaysOfWeek, while a class held only on Wednesdays would have a DaysOfWeek value of 'W'.

Pros: Easier to deal with in queries (i.e. You could use INSTR, or its equivalent, to determine if a class is held on a given day). Works with any database that supports INSTR or an equivalent function (most, I would guess...). Also friendlier to look at, and easy to see at a glance what is happening in queries that use the DaysOfWeek column.

Cons: The only real "con" is that, like the bitfield approach, this breaks the relational model by storing a variable number of values in a single field.

Possible Solution #3: Use a Lookup Table (ugly)

Another possibility would be to create a new table that stores all the unique combinations of days of the week, and have the Course.DaysOfWeek column simply be a foreign key into this lookup table. However, this solution seems like the most inelegant one, and I only considered it because it seemed like the The Relational WayTM to do things.

Pros: It's the only solution that is "pure" from a relational database point of view.

Cons: It's inelegant and cumbersome. For example, how would you design the user interface for assigning corresponding weekdays to a given course around the lookup table? I doubt a user wants to deal with choices alongs the lines of "Sunday", "Sunday, Monday", "Sunday, Monday, Tuesday", "Sunday, Monday, Tuesday, Wednesday", and so on...

Other Ideas?

So, is there a more elegant way to handle multiple values in a single column? Or would one the proposed solutions suffice? For what it's worth, I think my second solution is probably the best of the three possible solutions that I outlined here, but I'd be curious to see if someone has a different opinion (or indeed a different approach altogether).

+3  A: 

A possible #4: Why does it need to be a single column? You could add 7 bit columns for each day of the week to the table. Writing SQL against it is simple, just test for a 1 in the column of your choice. And the app code reading from the database just hides this in a switch. I realize that this is not normal form and I usually spend quite a bit of time trying to undo such designs from previous programmers, but I somewhat doubt that we're going to add an eighth day to the week any time soon.

To comment on the other solutions, I would probably groan if I encountered the lookup table. My first inclination as well was the bit field with a few custom database functions to help you write natural queries against that field easily.

I'll be curious to read some of the other suggestions that people come up with.

Edit: I should add that #3 and the suggestion above are easier to add indexes to. I'm not sure how one could write a SQL query like "get me all classes on Thursday" for the #1 or #2 queries that wouldn't result in a table scan. But I may just be dim tonight.

Nicholas Piasecki
+1 I like the idea of separate columns for each weekday. Clean and simple. As for it not being in normal form, I'm doubting there is a way to handle this with a totally normalized design that doesn't end up being a total mess (i.e. like the lookup table), but maybe someone here knows better...
Mike Spross
I think this is the only reasonable solution. Lookups make sense if there could be some unexpected variety in the values. I somehow argue that the number of days a week will change anytime soon, so a fixed number of column for such a core notion is legitimate.
Uri
I've done this before. It works well.
Jim
How do you write a query like "How many courses on each day" or "list all courses with 3 sessions a week"? Doesn't this solution complicate reporting?
Vincent Ramdhanie
That's a great point, Vincent. Eventually every database design is a tradeoff. I don't think it affects reporting, but the session number thing, you would need some extra function or view there, perhaps.
Uri
Hmmm ... do a `select count` on the particular day and a `select blah where sum(mon, tues, wed, thurs, fri, sat, sun) = 3`. True, not clean, but I think possible enough.
Nicholas Piasecki
For #3: You wouldn't need to do a sum, as you could just do a grouped subquery... something like "SELECT class_id, COUNT(*) AS numdays from classdays GROUP BY class_id HAVING COUNT(*) >= 3" in the FROM clause, which you could then join to the classes table on class_id.
R. Bemrose
+1  A: 

Solution number 3 seems to be closest to what I would recommend. An extension on the idea of the look up table. Each course has one or more sessions. Create a session table with attributes: course_id, day, time, lecturer_id, room_id etc.

You now have the ability to assign a different lecturer or room to each session of each course assuming that you may want to store this data later.

The user interface issues are not relevant if you are considering the best database design. You can always create views for displaying the data, and for capturing the data your application can take care of the logic of capturing many sessions for each course and adding them to the database.

The meaning of the tables would be clearer which makes long term maintenance easier.

Vincent Ramdhanie
I wasn't a fan of the lookup table the way I designed it, but having a separate session table the way you described it would be nicer, and, like you said, would keep the database normalized.
Mike Spross
I am actually developing a solution for a school and this is close to what we are doing, only its a lot more complicated because we have to take into consideration that each session can be moved or cancelled etc and we want to track all of it.
Vincent Ramdhanie
+2  A: 

If you choose one or two, your table will not be in 1NF (first normal form) as it contains a multi-valued column.

Nicholas has an excellent idea, although I'd disagree that his idea breaks first normal form: The data is not actually repeating, as each day is being stored independently. The only problem with it is that you have to retrieve more columns.

R. Bemrose
+3  A: 

I would avoid the string option for the sense of purity: it adds an extra layer of encoding/decoding that you do not need. It may also mess you up in the case of internationalization.

Since the number of days in a week is 7, I would keep seven columns, perhaps boolean. This will also facilitate subsequent queries. This will also be useful if the tool is ever used in countries where the workweek starts on different days.

I would avoid the lookup because that would be over-normalization. Unless your set of lookup items is not obvious or could possibly change, it's overkill. In the case of days-of-the-week (unlike US states, for example), I would sleep soundly with the fixed set.

Considering the data domain, I don't think that a bitfield would achieve any significant space savings for you and would just make your code more complex.

Finally, a word of warning about the domain: a lot of schools do weird things with their schedules where they "swap days" to balance out an equal number of weekdays of each type over the semester despite holidays. I am not clear about your system, but perhaps the best approach would be to store a table of the actual dates in which the course is expected to take place. This way, if there are two Tuesdays in a week, the teacher can get paid for showing up twice, and the teacher for the Thursday that was canceled will not pay.

Uri
I like many of the proposed solutions here, because they range from very simple to more involved yet very flexible; this answer nicely sums up the two ends of the spectrum: The basic "a column for each day" vs. adding one or more extra tables to track more granular schedule information.
Mike Spross
+1  A: 

If performance is an issue I would recommend a cleaner varation of #3.

Link your course to a "schedule" table.

Which is in turn linked to a days_in_schedule table.

The days_in_schedule table has columns the schedule_name, and the date in_schedule_day. With a row for each valid day in that schedule.

You need some time opr some clever program to populate the table but once this is done the flexibility is worth it.

You can cope not only with "course on Fridays only", but also "first semester only", "lab closed for refurbishment in third semester" and "Canadian branch has differnet holiday schedule".

Other possible queries are "Whats the end date of 20 day course starting 1st April", which "schedules clash most". If you are really good at SQL you can ask "what possible days are open in course xxx to a student who is already booked for course yyy" -- which I have the feeling is the real puprose of your proposed system.

James Anderson