views:

579

answers:

3

This may be a hairy question but. Say I have

Followers:
-user_id
-follower_id

Activities:
-id
-user_id
-activity_type
-node_id

Pulling a users activity is fairly easy. But what is the best way to get a followers activity? A subselect? It seems like it is incredibly slow as users get more and more followers. Any ideas to speed this up?

Also, on a more conceptual level. How does the grouping work. Is it all done with a single query? Or is all the activity data pulled in and then sorted and grouped on the PHP side?

Users X, Y and Z did Activity A User J did 3 of Activity B

+2  A: 

Subselects are often slower than JOINs, but it really depends on what exactly you're doing with them. To answer you main question, I would get follower data with a JOIN:

SELECT * FROM followers f
LEFT JOIN activities a ON f.follower_id=a.user_id
WHERE f.user_id=$followedPerson

That's assuming that the followers table represents a user with user_id, and someone who is following them with a follower_id that happens to be a user_id in the users table as well.

It won't ever be incredibly slow as long as you have an index on followers.user_id. However, the amount of data such a query could return could become larger than you really want to deal with. You need to determine what kinds of activity your application is going to want to show, and try to filter it accordingly so that you aren't making huge queries all the time but only using a tiny fraction of the returned results.

Pulling data out and grouping it PHP side is fine, but if you can avoid selecting it in the first place, you're better off. In this case, I would probably add an ORDER BY f.follower_id,activity_date DESC, assuming a date exists, and try to come up with some more filtering criteria for the activity table. Then I'd iterate through the rows in PHP, outputting data grouped by follower.

zombat
I think the notion of date is pretty crucial, but maybe it's just an omission of Josh.
Wookai
A: 

I dont know if I understood correctly what you need but I would try this select, if I'm right you should get all activity for all followers of #USERID#

SELECT a.* FROM Activities AS a 
INNER JOIN Followers AS f1 
ON a.user_id = f1.follower_id
WHERE f1.user_id = #USERID#
Gabriel Sosa
+1  A: 

An activity log has the potential for a very large number of records since it usually has a mix of the current user's activity and all their friends. If you are joining various tables and a user has 100s of friends that's potentially a lot of data being pulled out.

One approach is to denormalise the data and treat it as one big log where all entries that should appear on a user's activity log page to be stored in the activity log table against that user. For example if User A has two friends, User B and User C, when User A does something three activity log records are created:

record 1: "I did this" log for user A
record 2: "My friend did this" log for user B
record 3: "My friend did this" log for user C

You'll get duplicates, but it doesn't really matter. It's fast to select since it's from one table and indexed on just the user ID. And it's likely you'll housekeep an activity log table (i.e. delete entries over 1 month old).

The activity log table could be something like:

-id
-user_id  (user who's activity log this is)
-action_user_id  (user who took the action, or null if same as user_id)
-activity_type
-date

To select all recent activity logs for a single user is then easy:

SELECT * from activity_log WHERE user_id = ? ORDER by date DESC LIMIT 0,50

To make this approach really efficient you need to have enough information in the single activity log table to not need any further selects. For example you may store the raw log message, rather than build it on the fly.

simonrjones
What if I just added you as a friend and I want to see your last activities ? Do we have to retrospectively add a user's activities to a new follower ?
Wookai
good point. In the above example, you wouldn't see the user's last activities in your own activity stream. You'd only see new entries. Which I think is acceptable. You can always go to the user's public page to see their own activity log (if it is public to other users).
simonrjones
SO what you are saying, if I have 5,000 friends/followers and I make a post, I should create 5,000 entries into the DB for each action I do? I hope I misunderstood your answer otherwise that will be a horrible answer no disrespect.
jasondavis