views:

76

answers:

3

So, yesterday I asked 2 questions that pivoted around the same idea: Reorganizing a database that A- wasn't normalized and B- was a mess by virtue of my ignorance. I spent the better part of the day organizing my thoughts, reading up and working through some tests. Today I think I have a much better idea of how my DB should look and act, but I wanted to make sure I understood the core ideas of proper SQL DB design and normalization processes.

Originally I had ONE table called "Files" that held data about a file (it's URL, date uploaded, user ID of whomever uploaded it etc.) as well as a column called "grades" that represented the grade level you might use that file for. (FYI: These files are lesson plans for schools) I realized I'd violated Rule #1 about Normalization- I was storing my "grades" like this "1,2" or "2,6" or "3,5,6" in one column. This caused major headaches when trying to parse that data if I wanted to see JUST 3rd grade lessons or JUST 5th grade lessons.

What was suggested to me, and what became evident later, was that I have 3 tables:

files (data about the files, url etc.) grades (a table of available grade levels. Likely 1-6 to start) files_grades (a junction table)

This makes sense,. I just want to make sure I understand what I'm doing before I do it. Let's say User A uploads File xyz and decides that it's good for grades 2 and 3.

I'd write ONE record to the "files" table with data about that file (kb size, url, description, name, primary key files_id). Let's say it gets id 345.

Because of the limited number of grade options, grades will likely be equivalent to their ID (i.e., Grade 1 is grades_id 1, Grade 2 is grades_id 2)

I'd then write TWO records to the "files_grade" junction table containing

files_grade_id, files_id, and grades_id i.e.

1,345,2

1,345,3

To represent the 2 grades that files_id 345 is good for. Then I wave my magic SELECT and JOIN wands and pull the data I need.

Does this make sense? Am I, again, misunderstanding the proper structure of a relational many-to-many database?

Problem 2 which just dawned on me: So, a Lesson can have Multiple "Grades". No problem, we just solved that (I hope!). But it could, in theory, have multiple "Schools" as well- Elementary, Middle, High. What do we do if a files entry has Grades 1,2 for Middle,High? This could very easily be solved by saying "One school per file, users!", but I like to throw this out there.

A: 

From the sounds of it, it sounds pretty good. Just one thing, though: you don't really need to have an ID for the bridge table (FILES_GRADES), and if you do, you need to increment the ID.

You would have a two-part primary key: grade_id and file_id, the files_grade_id just complicates things, and would make for a bad index, since you'd never use it in a select.

Slokun
Ah. That makes sense! Thank you.
GilloD
+1  A: 

I'd then write TWO records to the "files_grade" junction table containing

files_grade_id, files_id, and grades_id i.e.

files_grade_id here is redundant, because the combination of files_id and grades_id is already unique (thus can be set as the primary key).

But it could, in theory, have multiple "Schools" as well- Elementary, Middle, High. What do we do if a files entry has Grades 1,2 for Middle,High?

Depending on your requirement, you can perhaps store those as "continuations" of the previous grades, e.g. 1-6 elementary, 7-9 middle, 10-12 high. Then you can make do without the grades table completely (since you can just store these numbers in the files_grade table).

Johannes Sasongko
Here's the headache: This is aimed at Korean schools which are K-6. Then Middle 1-2 and High 1-4. Bleh. New idea: At point of capture,if school = highschool and grade = 2, grade = 10 or whatever. I need to think that over a little bit. I may just stick with the one-school thing.
GilloD
A: 

Since you first question is already been answered I will take a stab at the second one.

There are multiple ways to do this, but one possibility is to add another table for the "Schools" and include it as part of the junction table - renaming the junction table of course to match the new design. So, you could have:

School Table:

-------------------------
  SchoolId  |   School  
-------------------------
     1      | Elementary
     2      | Middle
     3      | High
-------------------------

Files_grades_school

------------------------------------
  FileId  |   GradeId  |  SchoolId 
------------------------------------
    345   |      1     |      1
    345   |      1     |      2

You will probably want to create multiple indexes based on your usage patterns.

Waleed Al-Balooshi
The problem was that there could be MULTIPLE school IDs. I think I'll restrict to one for now. Honestly, they're all the same skill-wise, the distinction is that Elementary lessons are tailored to a curriculum where Middle/High are free form
GilloD
You shouldn't add any extra complexity if you don't need it. So, if you don't believe that it is something that will be necessary than there is no need to add this extra level of complexity. But, be sure that customers agree that the school distinction will not affect the lesson plans, because changing this in the future will be much more difficult once you have data in your tables.
Waleed Al-Balooshi