Question

I'm trying to write a simple employee scheduling application for about 10-20 people in my software development company. After some consideration I settled on writing a web app in Python, Ruby or PHP with a Postgres or MySQL database. While designing the database models I began to wonder which data structure would actually be best for this kind of application.

What it will look like

In the app, the month view would look similar to this:

 OCTOBER    1 2 3 4 5 6 7 8 9 ...
John Apple  M M A A N N O O O ...
Daisy Pear  O O O M M A A N N ...
Steve Cat   A A N N O O O M M ...
Maria Dog   N N O O O M M A A ...

where M stands for Morning shift, A for Afternoon shift, etc. (the letters could be replaced by codes).

What data structure or database design would be best for this? I was thinking about storing one string per user (at most 31 characters, one character per day), such as "MMAANNOOOAAMMNNAAOO...". A Month table would contain such a string for each employee.

What would you suggest?

+2  A: 

A quick answer first:

  • EmployeeID
  • Date
  • ShiftType

That said, the best database design largely depends on what you're going to do with the data. If all you need to do is store the records and display them in a table similar to your example, your approach (while not elegant) would work.

However, if you're going to retrieve the data or run reports, you're going to want something a little more structured than a string where each character represents the type of shift assignment.
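To make the three-column design above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `shift_assignment`, the column names, the helper functions and the sample rows are assumptions made for illustration; the shift letters follow the grid in the question.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory table with one row per employee per day (illustrative data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE shift_assignment (
            employee_id INTEGER NOT NULL,
            work_date   TEXT    NOT NULL,   -- ISO date, e.g. '2009-10-01'
            shift_type  TEXT    NOT NULL,   -- 'M', 'A', 'N' or 'O' as in the question
            PRIMARY KEY (employee_id, work_date)
        )
        """
    )
    # Hypothetical ids 1-4 stand in for the four employees in the question's grid.
    rows = [
        (1, "2009-10-01", "M"),
        (2, "2009-10-01", "O"),
        (3, "2009-10-01", "A"),
        (4, "2009-10-01", "N"),
        (1, "2009-10-03", "A"),
        (3, "2009-10-03", "N"),
    ]
    conn.executemany("INSERT INTO shift_assignment VALUES (?, ?, ?)", rows)
    return conn

def who_works(conn, work_date, shift_type):
    """Return the employee ids assigned to the given shift on the given date."""
    cur = conn.execute(
        "SELECT employee_id FROM shift_assignment "
        "WHERE work_date = ? AND shift_type = ? ORDER BY employee_id",
        (work_date, shift_type),
    )
    return [row[0] for row in cur]

conn = build_demo_db()
print(who_works(conn, "2009-10-01", "A"))
```

With one row per assignment, a question such as "who is on the afternoon shift on a given date" becomes a single filtered SELECT rather than string indexing across every employee's month.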

Larsenal
I considered the "normal" database design that comes to mind right away, but wouldn't it be overkill for something that should be this simple? I may want to do a report or two, but not much more.
Murzyn1
@Murzyn1 - Using your design, how much effort would it take to determine who was working last Tuesday afternoon? Or who is scheduled for next Friday morning? This database should be so simple to normalise that it is definitely not 'overkill'.
Kirk Broadhurst
Sorry, I am not exactly a database specialist. Low-level programming is closer to me (maybe that is why I'm trying to use concepts from low-level software design?). This project is mainly meant to teach myself some web app programming and to make my own and others' lives a bit easier when it comes to shifts :) Is there anything else that could be taken into consideration apart from the normalized database design presented here?
Murzyn1
A good gut check is to write out some sample queries for known scenarios. Kirk's examples are a good starting point. If it's just a small one-off project and you can easily write queries against your data model, then go with it. The downside of a normalized approach is that it's (arguably) not (as) easy to pull up a row of the database to get a snapshot of a person's schedule. If you wanted to manually edit a person's schedule by manipulating the raw data, your approach may be easier. However, if you want to address individual assignments, a normalized approach may be better.
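The two representations are not mutually exclusive. As a rough sketch (the function names are hypothetical, and the year 2009 is only an assumption, since the question just says October), normalized rows can be expanded from the compact string and collapsed back into it when a one-line snapshot of a person's month is wanted:

```python
from datetime import date

def string_to_rows(employee_id, year, month, codes):
    """Expand a per-month code string into (employee_id, iso_date, shift) rows."""
    return [
        (employee_id, date(year, month, day).isoformat(), code)
        for day, code in enumerate(codes, start=1)
    ]

def rows_to_string(rows):
    """Collapse rows for one employee and one month back into the compact string."""
    return "".join(code for _, _, code in sorted(rows, key=lambda r: r[1]))

john = "MMAANNOOO"  # first nine days of John Apple's row in the question
rows = string_to_rows(1, 2009, 10, john)
print(rows[2])               # the assignment for day 3
print(rows_to_string(rows))  # back to the compact form
```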
Larsenal
+1  A: 

I'd suggest a more normalized database, e.g. a table for persons and another that holds the shift information for each combination of person and date.

Lucero
+1  A: 

I would go with a three-table Kimball star (Date, Employee, Schedule), because sooner or later you will be asked to create (demanding) reports out of this. Who worked most nights? Who worked most weekends? Who never works weekends? Why am I always scheduled Friday afternoon? On which day of the week are certain employees most likely not to show up? Etc, etc...

Tables would be:

TABLE dimDate (KeyDate, FullDate, Year, DayOfWeek, DayNumberInWeek, IsHoliday,... more here)
You can pre-fill the dimDate table for 10 years or so; you may need to tweak the "IsHoliday" column from time to time.
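Pre-filling the date dimension can be scripted. The following is only a sketch: the YYYYMMDD integer format for KeyDate, the Monday=1 numbering for DayNumberInWeek, the 'Yes'/'No' values and the `dim_date_rows` helper are assumptions, and the Year column is included because the sample query in this answer filters on it.

```python
from datetime import date, timedelta

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")

def dim_date_rows(start, end, holidays=frozenset()):
    """Yield one dimDate-style row per calendar day from start to end inclusive."""
    day = start
    while day <= end:
        yield {
            "KeyDate": int(day.strftime("%Y%m%d")),   # assumed key format, e.g. 20091001
            "FullDate": day.isoformat(),
            "Year": day.year,
            "DayOfWeek": DAY_NAMES[day.weekday()],
            "DayNumberInWeek": day.isoweekday(),      # assumed convention: Monday=1 ... Sunday=7
            "IsHoliday": "Yes" if day in holidays else "No",
        }
        day += timedelta(days=1)

# Ten years of dates; the holiday set here is a single illustrative entry.
rows = list(dim_date_rows(date(2009, 1, 1), date(2018, 12, 31),
                          holidays={date(2009, 12, 25)}))
print(len(rows))
print(rows[0])
```

The generated dictionaries could then be bulk-inserted with whatever database driver or ORM the application uses.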

The employee table also changes (relatively) rarely.
TABLE dimEmployee (KeyEmployee, FirstName, LastName, Age, ... more here)

The schedule table is where you fill in the work schedule. I have also suggested "HoursOfWork" for each shift; this makes it easy to aggregate hours in reports, such as: "How many hours did John Doe work last year on holidays?"

TABLE factSchedule (
    KeySchedule,  -- surrogate PK
    KeyDate,      -- FK to dimDate table
    KeyEmployee,  -- FK to dimEmployee table
    Shift,        -- shift number (degenerate dimension)
    HoursOfWork   -- number of work hours in that shift
)

Instead of having the surrogate KeySchedule, you could also combine KeyDate, KeyEmployee and Shift into a composite primary key to make sure you cannot schedule the same person on the same shift on the same day. If you keep the surrogate key, check for such duplicates in the application layer.
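The composite-key variant can be tried out quickly. This sketch assumes SQLite through Python's sqlite3 module, guessed column types, and a hypothetical `schedule` helper; the database itself rejects a second row for the same date, employee and shift.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE factSchedule (
        KeyDate     INTEGER NOT NULL,
        KeyEmployee INTEGER NOT NULL,
        Shift       INTEGER NOT NULL,
        HoursOfWork REAL    NOT NULL,
        PRIMARY KEY (KeyDate, KeyEmployee, Shift)
    )
    """
)

def schedule(conn, key_date, key_employee, shift, hours):
    """Insert one assignment; return False if that exact slot is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO factSchedule VALUES (?, ?, ?, ?)",
                (key_date, key_employee, shift, hours),
            )
        return True
    except sqlite3.IntegrityError:
        return False

print(schedule(conn, 20091001, 1, 1, 8))  # first assignment for this slot
print(schedule(conn, 20091001, 1, 1, 8))  # same date, employee and shift again
print(schedule(conn, 20091001, 1, 2, 8))  # same day, different shift
```

Note that this key still allows one person to hold two different shifts on the same day; if that should be forbidden too, the key would have to be (KeyDate, KeyEmployee) only.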
When querying, join tables like:

SELECT SUM(s.HoursOfWork)
FROM factSchedule AS s
JOIN dimDate AS d ON s.KeyDate = d.KeyDate
JOIN dimEmployee AS e ON s.KeyEmployee = e.KeyEmployee
WHERE e.FirstName = 'John' AND e.LastName = 'Doe'
  AND d.Year = 2009 AND d.IsHoliday = 'Yes';
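For anyone who wants to run the query end to end, here is a small self-contained sketch using SQLite through Python's sqlite3 module. The column types, the trimmed-down column lists, the sample rows (including the second employee) and the `holiday_hours` helper are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dimDate (
        KeyDate INTEGER PRIMARY KEY, FullDate TEXT, Year INTEGER, IsHoliday TEXT
    );
    CREATE TABLE dimEmployee (
        KeyEmployee INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT
    );
    CREATE TABLE factSchedule (
        KeyDate     INTEGER REFERENCES dimDate (KeyDate),
        KeyEmployee INTEGER REFERENCES dimEmployee (KeyEmployee),
        Shift       INTEGER,
        HoursOfWork REAL,
        PRIMARY KEY (KeyDate, KeyEmployee, Shift)
    );
    INSERT INTO dimDate VALUES
        (20091224, '2009-12-24', 2009, 'No'),
        (20091225, '2009-12-25', 2009, 'Yes'),
        (20100101, '2010-01-01', 2010, 'Yes');
    INSERT INTO dimEmployee VALUES (1, 'John', 'Doe'), (2, 'Jane', 'Roe');
    INSERT INTO factSchedule VALUES
        (20091224, 1, 1, 8.0),
        (20091225, 1, 1, 8.0),
        (20091225, 1, 2, 4.0),
        (20091225, 2, 1, 8.0),
        (20100101, 1, 1, 8.0);
    """
)

def holiday_hours(conn, first_name, last_name, year):
    """Total hours the named employee worked on holidays in the given year."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(s.HoursOfWork), 0)
        FROM factSchedule AS s
        JOIN dimDate AS d ON s.KeyDate = d.KeyDate
        JOIN dimEmployee AS e ON s.KeyEmployee = e.KeyEmployee
        WHERE e.FirstName = ? AND e.LastName = ?
          AND d.Year = ? AND d.IsHoliday = 'Yes'
        """,
        (first_name, last_name, year),
    ).fetchone()
    return row[0]

print(holiday_hours(conn, "John", "Doe", 2009))
```

The COALESCE wrapper is an addition to the original query so that an employee with no holiday shifts yields 0 instead of NULL.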

If you are using MySQL, it is OK to use MyISAM as the storage engine. MyISAM does not enforce foreign keys (FK), so treat them as "logical only" and let the application layer take care of referential integrity.

Hope this helps.

Damir Sudarevic
Thanks, trying to implement something similar.
Murzyn1
I have published a model here: http://www.damirsystems.com/?p=466
Damir Sudarevic
Thank you very much for the advice. I started writing the web app in Django based on your design. It's very helpful. Thanks a lot again :)
Murzyn1