This is a question about tradeoffs.
Imagine a social network. Each user has a status message, that he can change anytime. Whenever he does change it, all his friends are notified through a wall (like in Facebook).
To make this work. We have 3 tables Users(id, name), FriendLists(userId, friendUserId), Notifications(?).
Now let's assume each user has approximately 50 friends in his friend list. I am faced with the dilemma - how to implement Notifications table.
1st option
CREATE TABLE Notifications
(
toUserId bigint NOT NULL,
[identity] bigint IDENTITY(1,1) NOT NULL,
fromUserId bigint NOT NULL,
data varchar(256) NOT NULL,
CONSTRAINT [PK_Notifications] PRIMARY KEY CLUSTERED (toUserId, [identity])
)
Send notifications:
-- Get all friends of @fromUserId.
WITH Friends AS
(SELECT FriendLists.friendUserId
FROM FriendLists
WHERE userId = @fromUserId)
-- Send updates to all friends.
SELECT
friendUserId as toUserId,
@fromUserId as fromUserId,
@data as data
INTO Notifications
FROM Friends
In this case, for each status change we create 50 records (assuming 50 friends). This is bad. However the good point is that to retrieve notifications for a specific user it is really fast, since we have a clustered index on the toUserId.
2nd option
CREATE TABLE Notifications
(
toUserId bigint NOT NULL,
[identity] bigint IDENTITY(1,1) NOT NULL,
fromUserId bigint NOT NULL,
data varchar(256) NOT NULL,
CONSTRAINT [PK_Notifications] PRIMARY KEY CLUSTERED ([identity])
)
CREATE NONCLUSTERED INDEX [IX_toUserId] ON Notifications (toUserId ASC)
Send notifications:
-- Get all friends of @fromUserId.
WITH Friends AS
(SELECT FriendLists.friendUserId
FROM FriendLists
WHERE userId = @fromUserId)
-- Send updates to all friends.
INSERT INTO Notifications(toUserId, fromUserId, data)
VALUES(friendUserId, @fromUserId, @data)
Here we only insert a single record per status update. This is good. The bad point is that the retrieval of the notifications is going to be slower, since records are not clustered by toUserId.
Getting notifications is same for both methods:
SELECT TOP(50) fromUserId, [identity], data
FROM Notifications
WHERE toUserId = @toUserId
So what is your take on this?