views:

235

answers:

3

I'm designing a News Feed system using PHP/MySQL similar to facebook's.

I have asked a similar question before but now I've changed the design and I'm looking for feedback.

Example News:

User_A commented on User_B's new album.

  "Hey man nice pictures!"

User_B added a new Photo to [his/her] profile.

     [show photo thumbnail]

Initially, I implemented this using excessive columns for Obj1:Type1 | Obj2:Type2 | etc..

Now the design is set up using a couple special keywords, and actor/receiver relationships. My database uses a table of messages joined on a table containing userid,actionid,receiverid,receiverObjectTypeID,

Here's a condensed version of what it will look like once joined:

News_ID | User_ID |                  Message                   |     Timestamp

  2643       A       %a commented on %o's new %r.                  SomeTimestamp
  2644       B     %a added a new %r to [his/her] profile.         SomeTimestamp

%a = the User_ID of the person doing the action

%r = the receiving object

%o = the owner of the receiving object (for example the owner of the album) (NULL if %r is a user)

Questions:

  1. Is this a smart (efficient/scalable) way to move forward?

  2. How can I store a "Preview of the event"? For example, if I want to show the comment that User_A made to User_B (like above, and on facebook's news feed). I have considered using an encoded copy of only the relevant data.. for example JSON encoding the comment text or photo html.. but this seems fragile (the user may delete the photo while it's still in another users' feed)

  3. How can I show messages like: "User_B added 4 new photos to his profile." with thumbnails of the photos?

+1  A: 
  1. Yes it's the bast way to move forward with this. Those are messages that live for a very short time so you'll never have to update them back in time if you make changes to the HTML of the message you store etc. So you should be in good shape using this.

  2. Just use the plain HTML. It will be fast and there are no relations that you will have to enforce later, you'll never have to be able to update those or anything like that.

Edit: I actually misunderstood, I didn't know you intended those %s-things to be some special notation for what objects you are referencing. I would just place plain HTML notifications in that text.

arnorhs
Storing the html to link each object and preview each action (and link each image) seems brittle. If the user changes or removes their picture, the database stores the broken link
pws5068
My design above tries to approximate that as much as possible, while separating the information about Objects that may change with special replacements.. but there probably is a more traditional (better) way
pws5068
it may seem "brittle" (never seen that word before, interesting) but the fact is that the chances of that happening from when the activity happened and until the user sees the notification is minimal, also, why do you mention images? We're talking notifications, similar to FB's ? So that's a username/user ID, and your image is probably just requested from the ID/user, you wouldn't be referencing a file name anyways.
arnorhs
I mention photos for the case of a user adding images to a gallery, and I want to be able to preview those in the news feed. So on facebook, an event might say: "Paul added 10 new photos to his gallery" and it will show thumbnails for the first 4 or 5 in the feed
pws5068
+1  A: 

Hey,

When you are talking in regards to Facebook, you must think of sender and receiver.

Say when you want to see all messages/feeds to B? then you must fire a query right? I think your table seems fine, but also add column for receiver person's ID. May be you can keep that in separate table.

So that you can easily find all feeds for user B OR from user B OR from user A to user B

hope this might be helpful to you.

But ignore this if you only want to concentrate on feeds. then you want only the latest ones. What i thought is w.r.t. facebook where we can see anyone's profile with walls he got from diff. users.

thanks.

Paarth
+2  A: 

Having built something similar just recently, one thing I would suggest is to separate the idea of how to store the data from performance. In my case, the users need to be able to go back and look at news from any time period, so arnorhs' assumptions don't work (regardless, there's no reason to store HTML if you don't have to-- leave the formatting outside).

What I found was that I store the stuff in a couple of classes, ActivityType and Activity. ActivityType holds the format for the message (like your '%a commented on %o's new %r') and an indicator of whether it represents actual activity or a comment on someone else's activity (so I know which object to link to, the actor's activity or the actor of the activity commented upon) and Activity stores the actor, victim, primary key of the object, a primary key to the commented-upon object if it exists and the timestamp of when it occurred.

Which is all great and results in nicely-normalized data. Which slows to a crawl as soon as you have a half-dozen friends (performance is complicated by the fact the whole thing is location-based so I'm looking up the distance each that user is away from you). Everybody is looking for an excuse to play with NoSQL storage systems now, but this is actually a good one. You're going to have to de-normalize the hell out of the data to get decent performance from a relational database. And the stuff's hard to cache due to the various intersections of relationships. Think about storing the data in MySQL but getting it back out of a NoSQL storage system.

Tom
Thank you tom, I'm beginning to research NoSql based on your suggestions. Is there an article or other resource that I might find useful for understanding/implementing such a strategy? Also, will I need to restructure the existing relational database to accommodate NoSql retrieval?
pws5068
I've got a few bookmarks at http://delicious.com/yerfatma/nosql -- te last one in the list claims to be "NoSQL Required Reading". The database and the nosql store are two separate things. They don't need to know about each other at all. The idea is you get your data out of the db as few times as possible; once you have the data in PHP code, cache it in the nosql store so you can get it back from there until it gets invalidated.
Tom
This is much clearer now, I will continue my research. Thank you for your help
pws5068