views:

323

answers:

2

I am working on a social network type site in PHP, I have done this once before and the site outgrew my coding ability to keep up, this was a couple years back and now I am wanting to tackle this project again.

Basicly on my network there is a friend_friend mysql table that keeps track of who is who's friend, for every confirmed friend, there are 2 entries into the DB here is that table:

    CREATE TABLE IF NOT EXISTS `friend_friend` (
  `autoid` int(11) NOT NULL AUTO_INCREMENT,
  `userid` int(10) DEFAULT NULL,
  `friendid` int(10) DEFAULT NULL,
  `status` enum('1','0','3') NOT NULL DEFAULT '0',
  `submit_date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `alert_message` enum('yes','no') NOT NULL DEFAULT 'yes',
  PRIMARY KEY (`autoid`),
  KEY `userid` (`userid`),
  KEY `friendid` (`friendid`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=1657259 ;

I then have a user table with all users info called friend_reg_user

Then a table for bulletins that users post, the object is to only show bulletins from users who you are friends with. Here is bulletins table

CREATE TABLE IF NOT EXISTS `friend_bulletin` (

  `auto_id` int(11) NOT NULL AUTO_INCREMENT,
  `user_id` int(10) NOT NULL DEFAULT '0',
  `bulletin` text NOT NULL,
  `subject` varchar(255) NOT NULL DEFAULT '',
  `color` varchar(6) NOT NULL DEFAULT '000000',
  `submit_date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `status` enum('Active','In Active') NOT NULL DEFAULT 'Active',
  `spam` enum('0','1') NOT NULL DEFAULT '1',
  PRIMARY KEY (`auto_id`),
  KEY `user_id` (`user_id`),
  KEY `submit_date` (`submit_date`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=455144 ;

Ok so to do this I would either run a query on the friend_friend table to get all friends of a user and add them to a string like this 1,2,3,4,5,6 those would be friend ID numbers and then select from bulletin table where bulletin author ID is in my friend ID list

The second method is to use JOINS to get all this data at once.

My quest now finally, once the site gets very large, when there are millions of friends records and bulletins in the DB this all slows down, what are my options to speed things up? Is there a better way to do this? Also I am planning on changing bulletins to include more then just bulletins but do more of user actions like the big sites do now so it will show status updates and blogs and bulletins and all

A: 

Facebook uses memcached to store SQL databases as distributed hash tables. That's probably your best bet.

Mark P Neyer
sounds easier said than done :-)
Brian Ramsay
I have a friend who also owns a large network and he mentions using some sort of hash table for fast lookups but he is very secretive about his codes and methods, all I know is his stuff works great and he mentions hash as well, I can never find much info on how to do such a thing though
jasondavis
A: 

What you are looking to do can likely be done in a number of ways. You can have a summary rollup table that combines all of the associated data (friends in this instance) for a given member.

That is a pretty basic approach but it can become much more sophisticated.

Summary rollups act as a persistent caching mechanism. You'll have to keep this up to date by some method - a cron job, MapReduce, etc. You dont want to compute all that data every time you need it - instead, compute it at regular intervals so that it is ready quickly.

Memcache is a great tool for caching but that caches data that has to be computed at some point anyway. Unfortunately, Memcache is not persistent. That means that if the memcached servier or service dies, so does your data.

You can explore some advanced cutting edge technologies such as MongoDB, CouchDB, Project Voldemort and neo4j for some even more efficient tools.

Id also recommend looking at the source code for the open source PHP based social network Elgg at http://www.elgg.org/

Bryan Migliorisi