views:

123

answers:

4

Hi all - I'm designing a library management system for a department at my university and I wanted to enlist your eyes on my proposed schema. This post is primarily concerned with how we store multiple copies of each book; something about what I've designed rubs me the wrong way, and I'm hoping you all can point out better ways to tackle things.

For dealing with users checking out books, I've devised three tables: book, customer, and book_copy. The relationships between these tables are as follows:

  • Every book has many book_copies (to avoid duplicating the book's information while storing the fact that we have multiple copies of that book).
  • Every user has many book_copies (the other end of the relationship)

The tables themselves are designed like this:

------------------------------------------------
book
------------------------------------------------
+ id
+ title
+ author
+ isbn
+ etc.
------------------------------------------------

------------------------------------------------
customer
------------------------------------------------
+ id
+ first_name
+ first_name
+ email
+ address
+ city
+ state
+ zip
+ etc.
------------------------------------------------

------------------------------------------------
book_copy
------------------------------------------------
+ id
+ book_id (FK to book)
+ customer_id (FK to customer)
+ checked_out
+ due_date
+ etc.
------------------------------------------------

Something about this seems incorrect (or at least inefficient to me) - the perfectionist in me feels like I'm not normalizing this data correctly. What say ye? Is there a better, more effective way to design this schema?

Thanks!

+3  A: 

It's an OK schema. However, it doesn't model the possibility that a work can have several different presentations -- that is, a book can have many editions (and translations, and formats).

How you slice this -- the granularity you use -- depends, as always in data modeling, on on your usage. In other words, is it "true", for you, for your usage, that a German translation of Alice in Wonderland is "different" from teh English orioginal? (Yes, probably). Is a paperback version "different" from a hardcover?

The simple answer to this is to just use ISBN as the key -- letting the publishing industry makes these decisions for you. Then, anything with the same ISBN is equal and fungible.

You may also want to model something like "acceptable substitute", "this ISBN is an acceptable substitute for that one, becauise the only different is binding" or "this ISBN (Darwin's Origins 6th edition) is the sixth edition of that one (darwin;s original The Origin of Species)", or "this ISBN is a translation of that one" or even "this ISBN (the KJV Bible) is similar to that one (the NIV Bible)." This gets into subtle gradations.

The other, more fundamental problem, is that copies of the same book are conflated with checkouts of those copies. If you unfortunately ordered have 10 copies of, say, Herb Schildt's The Annotated ANSI C Standard, but they are mercifully not checked out because students at your uni read Pete Seebach's excellent review of that terrible book, what is the customer_id for those copies in book_copy?

You want (at least) tables for book (work, isbn); copy; user; and the relation user-checksout-copy.

tpdi
Just to put it in different words, tpdi is suggesting a many-to-many relationship table that links the users with the book_copy, separating the user_id from the copy table in a way that is cleaner.
JYelton
+1 For don't combine book_copy with checkout. Using ISBN may not work since most libraries also loan CDs, DVDs, periodicals, and perhaps even unpublished books are a possibility. But I agree being able to associate related titles may be a plus.
Charles
These are excellent observations; thank you for sharing them.
ABach
+1  A: 

What about a master table titles for the books, which, as tpdi suggested be keyed on ISBN numbers, which would have the title, author, and other details. A subtable book could then be the list of actual inventory, where you have a primary key for each copy (10 entries if you have 10 copies of The Annotated ANSI C Standard) and a foreign key to the titles table, which links those copies to their appropriate title and ISBN number. Finally, a checkout table would be your many-to-many relationship wherein the id of users is linked to the id of the book (rather than title).

The primary difference between how you've done it so far, and how I suggest, is that you've moved the customer_id out of the book table and employed a new checkout table. You could still have datetimes for each book, showing whether it is out or not, but to find out to whom it is lent, you must consult the checkout table.

JYelton
In re-reading tpdi's answer, the relation table user-checksout-copy is really what I am suggesting here, I just didn't realize it at first; so I'm just rephrasing tpdi's suggestion.
JYelton
Thank for helping to clarify tpdi's points.
ABach
+1  A: 

A couple of things.

I think the only obvious normalization error is the one tpdi pointed out: you shouldn't be combining the book_copy entity with the record of a checkout. Checkouts should be recorded in a separate table which cross-references a customer to a book_copy:


checkout

  • customer_id (FK: customer.id)
  • book_copy_id (FK: book_copy.id)
  • date_checked_out

You don't strictly need due_date in this table, since you can always calculate the due date by looking at the date_checked_out and adding however many days the title is allowed to be checked out for. This raises the topic of having various media types and varying checkout limits. For instance DVDs may have a 1 week limit while books have a 2 week limit. To track this information, you'd have another table:


checkout_limit

  • media_type
  • checkout_limit_days

And if you do have multiple types of media available, then of course you have to think about whether you want more tables (one for each media type) or want all media types in the same table (then you'd have to rename it from "book" to something generic). In the latter case you will have some redundancy since some types of media won't have attributes that other types have (e.g. books have ISBN but DVDs don't).

Charles
Also superb; if I could give both you and tpdi the green tick, I would. Thank you for responding.
ABach
A: 

I think what he is trying to do is to ease the way he will query for any available book copy.
Say there's 10 copies of Herb Schildt's The Annotated ANSI C Standard, there will be 10 records on the book_copy table.
Am I right? so when a student checks-out a book, in particular, copy #3, he can just put the customer_id of the student.
The other 9 records will still have an empty customer_id.
This will definitely make your queries easier, but it is not the correct practice
It is correct what the others has pointed out, you need some sort of transaction table, check-outs table to keep track of any books going out.

p01ntbl4nk