views:

214

answers:

5

Hi,

I'm developing a new site, and I have to make private messages for our users. I have already made this on other projects, but the design there just doesn't seem right (I can't have more than two people engaged in a message for instance). So what is the "right" approach to this? I'd like to offer my users the same functionality as Facebook (again, I have already done this but it feels dirty :)) So the system should support 2 or more users in a conversation and thread-like messages.

I was thinking and one solution would be to have two tables like so:

pm_messages: id | pm_messages_id |user_id | title | content | date_time

pm_recipients: id | pm_messages_id | user_id | has_seen | deleted

I would store the actual content in the "pm_messages" table, and I would store the recipients (including the original sender) in the "pm_recipients" table.

Is this the right direction or am I completely off with this? What bothers me here is that messages don't really get deleted until all of the recipients have deleted the message which leads to some awkward deleting logic.

+1  A: 

I wouldn't say the implementation you have proposed in your post is necessarily a bad one. Certainly, it's not the most concise or quickest, but it would appear to me the most understandable to humans. Also, the deletion logic shouldn't be very difficult to encapsulate.

One solution that I might suggest is to rather use a single table that stores each message, containing the field for the sender ID, and another field which is a list of recipient IDs. The problem, of course, is deciding on how to represent a list of IDs using one of the standard database types, given that there is usually no array/vector/list type. I would suggest you use the VARBINARY(max) type, if it's available, treating it as a bit vector (say, 4 bytes per recipient ID). Then you can just create a couple of functions to do the very simple bitwise encoding/decoding to/from an array/list.

Noldorin
There would be no clean way to track who has seen or/and delete the message this way. Unless, of course, I'm missing something here with your proposal?Also I like to have my tables normalized unless absolutely necessary ...
Jan Hancic
Well, you would just use another bit vector field for that (where each bit represents a Boolean flag) - it's simple enough to implement, really. My proposed solution may not be the prettiest, but it's certainly quick, if you're after that!
Noldorin
Dunno, it sounds like I would have to do this in the database to get the desired results and I'm not aware of MySQL supporting this.
Jan Hancic
(contd.) As I originally said, there's nothing wrong with your first approach... so go with that an just encapsulate the deletion logic if you like. Besides the issue of speed, it's almost down to aesthetics. My solution nonetheless seems valid... why the downvote, may I ask?
Noldorin
Do you have some links demonstrating your approach? I would like to try different things before I make up my mind. I just don't understand fully how your solution would work :)
Jan Hancic
According to http://dev.mysql.com/doc/refman/5.0/en/string-types.html, MySQL seems to support VARBINARY. Did you mean something else?
Noldorin
Sorry, I'm afraid I can't find any descriptions of it online. If you think of it as the binary equivalent of a comma separated list, perhaps it will make sense to you. Indeed, it's probably worth trying a few different solutions (and maybe benchmarking). They all have their pros and cons.
Noldorin
Using VARBINARY in this way is an intentional denormalization, which imo should only be done in cases of a performance issue that this approach would address.
gridzbi
dreidy, I see your point. This is why I recommend to the OP that he benchmarks the various methods and perhaps uses a more standard approach unless he needs the performance boost.
Noldorin
Performance is not an issue. And I think that this "more complex" approach isn't worth it. Thanks for the suggestion never the less, maybe it will come in handy somewhere else :)
Jan Hancic
No problem. And I would agree, the more standard table-based approach that takes advantage of database structure would be more suited if you're not worrying about performance.
Noldorin
Why would you want to do this? You'd loose out on referential integrity and worse, this premature optimization would be slower. Lets say you want a "New Messages" flag on the recipients profile. you'd have to dive into the array to find out which messages are for the recipient. Slow, slow slow!
Cory R. King
@Cory: I think you're considering this in a slightly narrow way. Your point about referential integrity is surely valid (as I've pointed out). There is however a clear performance benefit, and the solution here partly depends on how the data gets used... Still, the OP seems to know what he needs.
Noldorin
+3  A: 

If there can be multiple recipients, and they can send reply msgs, you are more dealing with some kind of chat application. You could store the "chat" sessions, or conversations in a seperate table with 1-n relation between conversation and participant, as well as 1-n relation between conversation and message (tables laid out below). But in the end, it's of course up to you. For regular message sending, a 1-n between message and recipient as you use will do.

table user:
- id (pk)
- name

table conversation (one entry per "chat/messaging" session)
- id (pk)
- started_by_user_id
- started_ts

table conversation_participant (keeps track of all recipients)
- id (pk)
- conversation_id
- user_id (refers to user.id)

table message
- id
- conversation_id (refers to conversation.id)
- sender (refers to user.id)
- msg
tehvan
Jan Hancic
conversation is like a chat session. It can contain many messages.
tehvan
You can do that with my design to, and it's one table less. Any other ups?
Jan Hancic
Cory R. King
A: 

What is the pm_messages_id column in the pm_messages table for?

Otherwise it makes sense... But I don't see why the deleting logic should be awkward. You could handle it one of two ways:

  1. After a user deletes it: delete if there are no more recipients.
  2. As a cron task or manually at a later time: There's no reason why this deletion needs to happen right away. They're just orphan records, and they can be found easily:

e.g.:

DELETE FROM pm_messages
RIGHT JOIN pm_recipients ON pm_messages.id = pm_recipients.pm_messages_id
Andrew Vit
Awkward in a sense that it isn't a simple DELETE FROM clause. And I like simple :)The pm_messages_id references the "parent" message so you can have threaded messages. A tree like structure if you will.
Jan Hancic
A: 

(Too long for a comment)

Tehvans method involves storing a list of participants for a particular conversation, whereas in your method the participants are stored per-message. I expect the reason for this is to allow deleting and read-flagging, the question is - why do it like this?

Forums don't normally require you to mark each part of the conversation as read, so instead store a last_read timestamp in the participants table, that way any messages created/modified after the timestamp can be highlighted as such.

In (nearly) all cases, deleting of forum messages is done by a the author/admins, and causes the message to be removed from view to all users.

gridzbi
A: 

One more use-case to consider (I know my users want this)... they all want a "sent items". Might want to consider that use case before you clean-house so to speak.

What bothers me here is that messages don't really get deleted until all of the recipients have deleted the message which leads to some awkward deleting logic.

Create a trigger that does this for you. All your application code needs to do is worry about setting your "deleted" column to true and have the UPDATE trigger nuke the whole deal when everybody flags the message as deleted.

Course, next thing you know, your users will want un-delete. Personally, I'd never delete the message from the database, just hide it from the user.

Cory R. King