views:

105

answers:

7

As an example, I have a 3 tables:

School: ID int, Name varchar

Student: ID int, Name varchar

StudentInSchool: StudentID int, SchoolID int

Now the question is whether I should put a column ID int with a primary key on it in StudentInSchool table? If yes, why?

Will it be helpful in indexing?

Any help appreciated.

+7  A: 

Personally, I create composite PK (StudentID and SchoolID) on such junction tables. This also ensures uniqueness.

If, however, uniqueness is not required, you'll have to add an ID column to uniquely identify each row.

Generally speaking, addition of a separate ID column will not help much: very few queries (if any) will actually use this column. As for performance, you can create separate index for each column and you'll be just fine.

Anton Gogolev
+1. I take the same approach
Alex
Depends on the cardinality of the relation we want to describe using the junction table. Enforcing one-to-one -relation may backfire when we try to model the real world.
Aleksi
+1  A: 

Create a primary key on StudentID, SchoolID and a secondary index on SchoolID, or vice versa, depending on what search condition is used more often.

If your table is index organized (ORGANIZATION INDEX in Oracle, CLUSTERED in SQL Server, InnoDB in MySQL), then the secondary index will have a PRIMARY KEY as a leftmost part and, hence, all information can be fetched out of the index.

Quassnoi
A: 

The answer is, it depends. In most cases the answer is 'No': a compound primary key of (StudentID, SchoolID) will suffice.

But if that intersection table starts to acquire other related data (say, joining date, leaving date) and/or it becomes a parent of related tables (e.g. attendance record) then you may want or need to treat it as a regular table. In which case (StudentID, SchoolID) becomes a business key (i.e. still unique) and you add a synthetic (or surrogate) primary key of Id or whatever.

APC
A: 

In terms of pure data integrity: no. It's perfectly sufficient to define the primary key as (StudentID, SchoolID).

However, you don't say which RDBMS you are using. It may be that, for some of them, a single ID column would result in more efficient query plans.

In the case of SQL Server, a composite primary key of two integers is very efficient, and no further indexes should be required on the two columns.

Christian Hayter
A: 

In this example, unless the StudentInSchool table is going to have other attributes, e.g. timestamps for when the student was in that school to cope with moves, I wouldn't use it and I'd put the schoolID field in the Student table and define it as a foreign key there.

But if this is the design, then yes, you're not going to be losing anything by putting a primary key on the StudentInSchool table.

Unsliced
A: 

You could combine StudentID and SchoolID to one primary key.

There are some general rules which describe when to use indexes. When dealing with relatively small tables, indexes do not improve performance. In general indexes improve performance when they are created on fields used in table joins. Use indexes when most of your database queries retrieve relatively small datasets, because if your queries retrieve most of the data most of the time, the indexes will actually slow the data retrieval. Use indexes for columns that have many different values (there are not many repeated values within the column). Although indexes improve search performance, they slow the updates, and this might be something worth considering.

Source: SQL Indexes

R van Rijn
A: 

Ok I think there is something missing in the assignment, so I'll try with my poor knowledge of real world :o)

What are students? They go to school(s), they may study at more than one school (especially universities), they may even repat same school later, etc.

Is the junction table as-is (with PK over both ids) enough to model these relationships?

short answer: no

long answer: still no, but for subset of simple cases it is sufficient (is yours one of them?).

If you want to extend db later for all these cases, surrogate PK (your ID) will be required. I would put ID there if I have just a doubt it might be required (as there's not much to lose).

As stated in the first sentence - correct answer is: "We don't know" as requirements and context of application are missing.

MaR