views:

47

answers:

3

Lets say you have 4 types of assessments, Test, Quiz, MiniQuiz and FinalExam

and we store records in the database like so

studentid ----- assesType
1                test
2                quiz
3                quiz
4                quiz
5                miniquiz
6                miniquiz
7                final
8                final

is it faster and a better approach to assign numbers to each type lets say:

test = 1 quiz = 2 miniquiz = 3 final = 4

AND use this instead for record keeping.

studentid ----- assesType
1                1
2                2
3                2
4                2
5                3
6                3
7                4
8                5

What I really want to ask, is it worth it to do this? advantages etc ? since it becomes a bit harder to write code in the server-side language of choice when doing this.

Thank you =)

+2  A: 

Yes, it is much more efficient to use numbers, and that is the standard convention.

SQL and relational databases are built for this kind of storage.

For example:

select students.studentid, assessments.assesName AS assesType
from students
  inner join assessments on (students.assesType = assesments.assesType)
Senseful
+2  A: 

Yes, numeric comparisons are faster than string comparisons. Strings also take a lot more space, and the data duplication means that if you'll have to rename "miniquiz" to "microquiz" you'd have to update all the rows. Finally, and probably the most important of all, your database wouldn't be able to reject unacceptable strings: You said that there are four types of assessments, but your database will happily accept any string you pass.

In general, you would want to create another table, maybe calling it assesTypes, with just id and name fields, and keep in it all the acceptable types. Then in your main table, make the assesType field a foreign key that references the id attribute of the new assesTypes table. Example:

CREATE TABLE assesTypes (
   id    int,
   name  varchar(15),
   PRIMARY KEY (id)
) ENGINE=INNODB;

CREATE TABLE assessments (
   student_id    int,
   assesType     int,
   mark          int,
   PRIMARY KEY (student_id, assesType),
   FOREIGN KEY (assesType) REFERENCES assesTypes (id)
) ENGINE=INNODB;

Now we can populate our assesTypes table:

INSERT INTO assesTypes VALUES (1, 'Test');
INSERT INTO assesTypes VALUES (2, 'Quiz');
INSERT INTO assesTypes VALUES (3, 'MiniQuiz');
INSERT INTO assesTypes VALUES (4, 'FinalExam');

And now let's insert some assessment data into the assessments table:

INSERT INTO assessments VALUES (1, 1, 55);
INSERT INTO assessments VALUES (1, 2, 65);
INSERT INTO assessments VALUES (1, 3, 75);

That's all good. We can now INNER JOIN the assessments table with the assessTypes table like this:

SELECT a.student_id, at.name, a.mark
FROM   assessments a
JOIN   assesTypes at ON (at.id = a.assesType);

For this result:

+------------+----------+------+
| student_id | name     | mark |
+------------+----------+------+
|          1 | Test     |   55 |
|          1 | Quiz     |   65 |
|          1 | MiniQuiz |   75 |
+------------+----------+------+
3 rows in set (0.00 sec)

Now let's try to insert an invalid assesType in the assessments table:

INSERT INTO assessments VALUES (1, 5, 75);

We can't. MySQL will report:

ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint fails

Foreign keys aren't required to have a working relational database, but they are essential to avoid broken relationships and orphan rows (ie. referential integrity). The ability to enforce referential integrity at the database level is required for the C in ACID to stand.

Daniel Vassallo
Hmmm Thanks for the answer, can you please elaborate what you mean by "your database will happily accept any string you pass." and why this does not apply to numbers?
Akay
@Akay: Let me update my answer with an example.
Daniel Vassallo
Oh ok i get it.. if the column is set to varchar(15) .. it will accept anything that can be a 15 char string, including numbers, =) Thanks for the help hope thats what you meant.
Akay
@Akay: Yep you got it. The idea is to put constraints on what data the database should expect, in such a way that invalid data would never be present in your database. This is one of the most important features of SQL databases.
Daniel Vassallo
A: 

You should use the second approach. The main advantage though is not performance, but maintainability of your data.

By storing the assesType in a separate table and referring to that table by ID you are sure that you don't get different spelling for the different assesTypes, you can easily rename an assesType without having to update your entire database. You also get the benefit of keeping a separate maintenance of the assesTypes, you can't accidentally create a new assesType when assigning a student to an assesType. You also do get the luxury of having a quick way of asking what kind of assesTypes you have when creating a list in the GUI for the available assesTypes.

You may look at database normalization for more "rules" about how to structure your data.

Albin Sunnanbo