views:

3200

answers:

9

Does anyone have any good scenarios for teaching relational databases and SQL? All the examples I can find are either trivial or have improbable domain constraints (like full name being unique).

I'm especially trying to find some good examples for normalisation: tables that don't immediately fit 3NF and BCNF. At the moment I'm using a different problem for each level.

Of course, I'd also love good examples of badly designed databases, but it's a bit distracting until the fundamentals have been mastered.


Thanks, some nice examples. I've marked the student/class one as the answer as I think it's the best so far, but if anyone wants to contribute some more, please do.

+9  A: 

I seem to remember student / class as being a classic, you can put grades in there as well to make it a bit more complex.

  • Student can atted many classes

  • Classes have many students

  • For eachclass a student attends they can have a grade

Initially you can do this in one table and denormalise to three.

Jeremy French
+1 good, classic example
Michael Haren
Could go further and have academic years/terms for classes too
Russ Cam
Don't forget the faculty. And sections (the scheduled version of the concept of a course).
Robert C. Barth
Nice, and by limiting exams/classes to one per year, I can get some interesting dependencies.
Mark
+1  A: 

E-commerce/shopping cart design is good because most people understand the concept and you can push it in many different directions.

You can do simple things like cart, cart_items, users, orders, order_items, etc.

Then you can go deeper with user_addresses, user_emails, items, item_details, item_history, etc.

This can provide a lot of good debate because there are lots of judgment calls.

Michael Haren
Thanks, I like this one too, perhaps as a final problem.
Mark
A: 

There's always the library example as well (A library has many books, each book has an author and publisher which can be pushed into separate tables as you normalise)

Rowland Shaw
A: 

Re:

I'm especially trying to find some good examples for normalisation: tables that don't immediately fit 3NF and BCNF.

You can find normalized database schema samples here: http://www.microsoft.com/sqlserver/2005/en/us/express-starter-schemas.aspx. You can build them from the ground up to show your students the "way of normalization". Particularly look at the Contact Management schema. You can easily denormalize the schema and bring it back on 3NF or deeper.

MarlonRibunal
A: 

Itzik Ben Gan's new book, Microsoft SQL Server 2008: T-SQL Fundamentals, has a very basic example which you can see is derived from a simplified Northwind database.

Microsoft SQL Server 2008: T-SQL Fundamentals author support page

K. Brian Kelley
+3  A: 

off topic

Having taught database classes, I would suggest forgetting about design until the basics of querying are mastered. People understand the need for normalization much better once they understand how to get data out of a database. If you start with normalization and design, you will lose most of the students for the rest of the class. Design should be the very last module of a database course, but all the textbooks I ever reviewed started with it.

Even better make them query both good and bad database designs when learning querying and then they will really understand how painful bad design is when it is time to teach design.

HLGEM
I totally agree - you need to identify what the limitations are, and what drives you to a higher level of design. (+1)
Mark
+3  A: 

One concept that I will never forget is the whole "plural" versus "singular" thing with naming. A great mentor I had way back once told me that you should design your table names as plural and your column names as singular and that you never create temporal-specific names for column names. Temporal Names examples are NutsSold1998, NutsSold1999, NutsSold2000, etc. Never add a year or month or week number or time, etc to a column name.

Table Name Examples: Employees (not Employee) Parts (not Part) Students (Student)

Column Name Examples: EmployeeID (not EmployeesID or EmployeeIDS, etc) PartID (not PartsID or PartIDS, etc) StudentID (not StudentsID or StudentIDS, etc)

And to pay attention to the correct usage of ID, Code, Key, Number, etc... I was always taught to not use "Key" in the name of a Column unless it was an actual table Key (primary or foreign, but not necessarily alternate).. Most of the time appending "ID" will be a better choice than appending "Number" or "Code", but it all depends on the context.

And that comes with time and experience in designing tables, reading good materials such as the book Database Design For Mere Mortals and Data Modeling for Everyone. Also, spending lots of time looking at good designs and taking them apart. Its definitely a craft and you only get better with time and practice.

Optimal Solutions
A: 

Another good model is the invoice-item model, because the "best choice" depends on various factors:

  • number of write vs read operations;
  • performance;
  • are reporting functionalies needed?

Look at this data model:

Invoice

  • ID
  • date

InvoiceItem

  • invoiceID
  • description
  • amount

Application funcitonalities are:

  • create new invoice;
  • each evening create and send two reports: (a) total amount for each invoice, (b) total amount of the day.

Say you have an average of 5 items per invoice and 100 invoices per day, you end up doing, for each day:

  • 5 x 100 writes on InvoiceItem;
  • 100 writes on Invoice;
  • 5 x 100 + 100 = 600 reads from Invoice and InvoiceItem (report (a));
  • 5 x 100 + 100 = 600 reads from Invoice and InvoiceItem (report (b));

so total is = 1800 operations / day, assuming reads and writes have the same cost.

If you add "TotalAmount" attribute on entity "Invoice", situation is a bit different:

  • 5 x 100 writes on InvoiceItem;
  • 100 writes on Invoice (including TotalAmount);
  • 100 from Invoice only (report (a));
  • 100 from Invoice only (report (b));

with a total of 800 operations :)

Megadix
+1  A: 

Here's an example that dates back to my own University days - it was used both as a database design challenge and as an Object Oriented design challenge.

Not all information was revealed at once - part of the challenge was to see how to adapt a design to handle new requirements, and how proper normalisation made this easier.

Assume you have to design a database for a university/college situation and want to handle enrollments.

You have the Courses taught. Each course has a title and a regular timeslot each week.

Each Course has a Lecturer who presents the course.

Each Course has many Students who study the course.

Each Course has one or more Tutors who help students with their study. You don't need to track which tutors help which students.

Some courses have multiple regular timeslots.

Some courses have multiple Lecturers.

Lecturers and Tutors are paid, which means we need to track some information for tax purposes. The Tax department doesn't care what they were paid for - they expect us to have a single record per person.

On some courses, the Lecturer works as a Tutor as well, to get a close up view of how some students are handling the material.

Some Tutors are also Lecturers, on other courses.

To be a Tutor on a course, you have to have been a Student on that course at an earlier time.

Not every student will receive credit for passing the course - some are just auditing the course without a need for credit.

A student who fails a course may attend the course again later on. We need to keep records of every attempt.

Bevan