Even if you could put the identity sequence across multiple tables, your comment table is not going to be able to reference both columns in a single foreign key.
The best way to do this, in terms of relational database design theory, would be to create two comment tables. But obviously, you want to avoid that, probably for code-reuse reasons.
The most straightforward pragmatic approach would be to put two foreign key columns on the comment table, and just make one null and the other not null for each comment.
Another approach, which might be the best compromise, is this. You refer in your question to an "entity ID". So make an Entity table! Then the authors and books and comments can all refer to that table.
Edited to add:
Philip Kelley, Ray, and (I think) Artic have all suggested modifying the comment table by adding an entity_id
, which can refer to either the book_id
or the author_id
, and a flag of some sort (char(1)
, tinyint
, and boolean
, respectively) that indicates which of these is being referred to.
This is not a good solution for many reasons, both pragmatic (including data integrity, reporting, efficiency) and theoretical.
The first and most obvious problem is the data integrity problem. A relational database system should always be responsible for maintaining the integrity of its own data, and there are natural and preferred ways that the DB is designed to do this. One of the most important of these mechanisms is the foreign key system. If the comment.entity_id
column is to reference both book.book_id
and author.author_id
, then a foreign key cannot be created for this column.
Sure, you could put a check in your DML (insert, update, delete) stored procedures to verify the references, but that would quickly turn into a big mess, as all DML operations on all three tables would be involved.
And that leads us to the efficiency problem. Whenever a query is run against the comment
table, it will require joins to either the author
or book
table or both. The query plan generation system will not have foreign keys available to optimize with, so its performance could very well be degraded.
Then there are problems with this scheme in reporting. Any report generating system is going to have trouble with this sort of system. Sure this won't be a problem for expert programmers, but any user ad-hoc reports are going to have to mock up the logic behind when the event_id
means this or that, and it could be a pretty bad deal. Maybe you won't ever use report generating tools on this database. But then again, nobody knows where a database is going to be ultimately used. Why not work with the system to allow for anything?
And that leads us to the theoretical problems.
In relational database theory, each row (a.k.a. "tuple") in each table ("relation variable") represents a proposition about the real world. Designing a table is to decide the form of that proposition. Let's look at a few examples of how this might work.
comment (comment_id int, comment_type char(1), entity_id int,
user_id int, comment_text nvarchar(max), comment_date datetime)
/* comment_id identifies a comment (comment_text) that a user (user_id)
has made about a book (entity_id if comment_type = 'B') or author
(entity_id if comment_type = 'A') at a particular date and
time (comment_date).*/
Here it is clear that the column (or "attribute") called entity_id
is doing double-duty. It doesn't really represent anything, except with reference to another column. This is workable, but unsatisfactory.
comment (comment_id int, book_id int, author_id int, user_id int,
comment_text nvarchar(max), comment_date datetime)
/* comment_id identifies a comment (comment_text) that a user (user_id)
has made about a book (book_id if not null) or author (author_id if
not null) at a particular date and time (comment_date). */
This buys us the foreign keys that are the biggest omission from the first version. But this still isn't terribly satisfactory, unless a single comment can refer to both a book and an author (which might be reasonable). Nullable columns are a warning sign that something is wrong with the design, and that may be the case here as well. A check constraint may be necessary to avoid a comment that refers to nothing at all, or to both a book and an author if that is not to be allowed.
From a theoretical perspective (and thus, my perspective :)) there is a clear best option:
book_comment (book_comment_id int, book_id int, user_id int,
comment_text nvarchar(max), comment_date datetime)
/* book_comment_id identifies a comment (comment_text) that a
user (user_id) has made about a book (book_id) at a particular
date and time (comment_date). */
author_comment (author_comment_id int, author_id int, user_id int,
comment_text nvarchar(max), comment_date datetime)
/* author_comment_id identifies a comment (comment_text) that a
user (user_id) has made about an author (author_id) at a particular
date and time (comment_date). */
This last option would provide the best efficiency, data integrity, and ease of reporting. And the only expense would be that the DML stored procedures would need to put the comments into the right tables, which is not a big deal, since they had to know what the comments were referring to anyway.
If your plan was to retrive all comments for a book or author at once, then you can easily create a view on top of these tables that reproduces the other designs, if that's what you want to do.
create view comments as
select
book_comment_id as comment_id,
book_id as entity_id,
comment_text,
'B' as comment_type
from book_comment
union
select
author_comment_id as comment_id,
author_id as entity_id,
comment_text,
'A' as comment_type
from author_comment