Assume we have a popular site and need to implement mail-like messaging between users. A typical solution is to use two tables:
Users (user_id)
Messages (message_id, sender_id (references user_id), receiver_id (references user_id), subject, body ).
This method has 2 significant limitations:
- All messages of all users are stored in one table, which puts that table under high load and decreases overall database performance.
- When someone needs to send a message to several users simultaneously, the message gets copied recipients_count times.
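To make the second limitation concrete, here is a minimal sketch of the two-table design using Python's built-in sqlite3 module. The column types, the in-memory database, and the send_message helper are my assumptions for illustration only; a real high-load site would use a different engine, but the duplication is the same.

```python
import sqlite3

# In-memory database, used here only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY);
CREATE TABLE messages (
    message_id  INTEGER PRIMARY KEY,
    sender_id   INTEGER NOT NULL REFERENCES users(user_id),
    receiver_id INTEGER NOT NULL REFERENCES users(user_id),
    subject     TEXT,
    body        TEXT
);
""")
conn.executemany("INSERT INTO users (user_id) VALUES (?)", [(1,), (2,), (3,), (4,)])


def send_message(conn, sender_id, receiver_ids, subject, body):
    """Hypothetical helper: stores one full row per recipient."""
    conn.executemany(
        "INSERT INTO messages (sender_id, receiver_id, subject, body) "
        "VALUES (?, ?, ?, ?)",
        [(sender_id, receiver_id, subject, body) for receiver_id in receiver_ids],
    )


# One logical message to three recipients becomes three rows with identical text.
send_message(conn, 1, [2, 3, 4], "Hello", "Same text for everyone")
row_count = conn.execute(
    "SELECT COUNT(*) FROM messages WHERE sender_id = 1"
).fetchone()[0]
distinct_bodies = conn.execute(
    "SELECT COUNT(DISTINCT body) FROM messages WHERE sender_id = 1"
).fetchone()[0]
```

Here row_count equals the number of recipients while distinct_bodies stays at one, which is the copying described above.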
The other solution uses 3 tables:
Users(user_id)
Sent_messages(sent_id, sender_id (references user_id), subject, body)
Received_messages(sent_id, receiver_id (references user_id), subject, body)
The subject and body of Received_messages are copied from the corresponding fields of Sent_messages.
This method has the following consequences:
- The database is denormalized, because information is copied from one table to another.
- A user can delete a sent or received message without removing it from the receiver's or sender's mailbox.
- Messages take approximately twice as much space.
- Each table carries approximately half the load.
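The three-table design can be sketched the same way, again with Python's sqlite3 module. The column types, the composite primary key on Received_messages, and the send_message helper are assumptions of mine, not part of the design as stated. I also left sent_id in Received_messages as a plain integer rather than a foreign key, so that deleting the sender's copy cannot touch the receivers' copies.

```python
import sqlite3

# In-memory database, used here only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY);
CREATE TABLE sent_messages (
    sent_id   INTEGER PRIMARY KEY,
    sender_id INTEGER NOT NULL REFERENCES users(user_id),
    subject   TEXT,
    body      TEXT
);
CREATE TABLE received_messages (
    sent_id     INTEGER NOT NULL,  -- assumption: not a foreign key
    receiver_id INTEGER NOT NULL REFERENCES users(user_id),
    subject     TEXT,
    body        TEXT,
    PRIMARY KEY (sent_id, receiver_id)  -- assumption: one copy per recipient
);
""")
conn.executemany("INSERT INTO users (user_id) VALUES (?)", [(1,), (2,), (3,)])


def send_message(conn, sender_id, receiver_ids, subject, body):
    """Hypothetical helper: one sender copy plus one copy per recipient."""
    cur = conn.execute(
        "INSERT INTO sent_messages (sender_id, subject, body) VALUES (?, ?, ?)",
        (sender_id, subject, body),
    )
    sent_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO received_messages (sent_id, receiver_id, subject, body) "
        "VALUES (?, ?, ?, ?)",
        [(sent_id, receiver_id, subject, body) for receiver_id in receiver_ids],
    )
    return sent_id


sent_id = send_message(conn, 1, [2, 3], "Hello", "Same text for everyone")

# The sender deletes their copy; the receivers' copies are untouched.
conn.execute("DELETE FROM sent_messages WHERE sent_id = ?", (sent_id,))
sent_remaining = conn.execute("SELECT COUNT(*) FROM sent_messages").fetchone()[0]
received_remaining = conn.execute(
    "SELECT COUNT(*) FROM received_messages WHERE sent_id = ?", (sent_id,)
).fetchone()[0]
```

After the delete, sent_remaining is zero while both received copies are still there, which is the independent-deletion behaviour listed above, paid for with the duplicated subject and body.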
So here go the questions:
- Which of the two designs considered is better for high load and scalability? (I think it's the second one.)
- Is there another database design that can handle high load? What is it? What are the limitations?
Thanks!
P.S. I understand that the site has to be very successful before it runs into these scalability issues, but I want to know what to do if I get there.
UPDATE
For the first versions I'll be using the design proposed by Daniel Vassallo. If everything goes well, the design will be changed to the second one in the future. Thanks to Evert for allaying my apprehension about it.