'Followers' and efficiency

+2 Q:

I am designing an app in which users 'follow' each other's activity, in the Twitter sense, but I am not very experienced with database and query design or efficiency. Are there best practices for managing this, pitfalls to avoid, etc.? I gather this can create a very large load on the db if not done properly (or maybe even then?).

If it makes a difference, it is likely that people will 'follow' only a relatively small number of people (but a person may have many followers). However, this is not certain, and I wouldn't want to count on it.

Any advice gratefully received. Thanks.

+2 A:

Pretty simple and easy to do with full normalisation. If you have a table of users, each with a unique ID, you would add a TABLE_FOLLOWERS table with the columns USERID and FOLLOWERID. Each row records one follower of one user, so every user has a one-to-many relationship to their followers (and, taken as a whole, the table is a many-to-many relationship between users).

Even with millions of associations, this will perform well and fast on a half-decent database server, as long as you are using a good database (i.e., not MS Access).
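A minimal sketch of the table layout described above, using Python's built-in sqlite3 module purely for illustration. The `users` table, the sample rows, the reverse index, and the helper names `followers_of` and `followed_by` are assumptions, not part of the answer; the composite primary key and the second index are one common way to keep lookups fast in both directions.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- One row per (followed user, follower) pair.
    CREATE TABLE table_followers (
        userid     INTEGER NOT NULL REFERENCES users(id),
        followerid INTEGER NOT NULL REFERENCES users(id),
        PRIMARY KEY (userid, followerid)
    );
    -- Reverse index (assumed): speeds up "whom does X follow?".
    CREATE INDEX idx_followers_by_follower
        ON table_followers (followerid, userid);
""")

conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(1, "alice"), (2, "bob"), (3, "carol")],
)
# bob and carol follow alice; carol also follows bob.
conn.executemany(
    "INSERT INTO table_followers (userid, followerid) VALUES (?, ?)",
    [(1, 2), (1, 3), (2, 3)],
)

def followers_of(user_id):
    """Return the ids of everyone following user_id."""
    rows = conn.execute(
        "SELECT followerid FROM table_followers "
        "WHERE userid = ? ORDER BY followerid",
        (user_id,),
    )
    return [row[0] for row in rows]

def followed_by(follower_id):
    """Return the ids of everyone that follower_id follows."""
    rows = conn.execute(
        "SELECT userid FROM table_followers "
        "WHERE followerid = ? ORDER BY userid",
        (follower_id,),
    )
    return [row[0] for row in rows]

print(followers_of(1))  # [2, 3]
print(followed_by(3))   # [1, 2]
```

The composite primary key also prevents the same follow relationship from being stored twice.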

Tom Gullen 2010-07-29 08:59:42

+1 A:

That depends on how many users you expect to need to support; how many followers you expect users to have; and what sort of funding/development-effort you expect to have access to should your answers to the previous questions prove optimistic.

For a small-scale project I would likely ignore the database and design the application as a simple object model, with User objects that each maintain a List[followers]. Keep it all in RAM for normal operation and use an ORM to persist to a database periodically (probably PostgreSQL or MySQL).
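As a rough sketch of what such an in-memory object model might look like: the class names `User` and `FollowGraph` and the `snapshot` method are hypothetical, and a set is used in place of a list so that duplicate follows are ignored. The ORM step is left out; `snapshot` only produces the rows that a periodic persistence job could write.

```python
from dataclasses import dataclass, field

@dataclass
class User:
    user_id: int
    name: str
    # Ids of the users who follow this user.
    followers: set = field(default_factory=set)

class FollowGraph:
    """Keeps every user and follower relationship in RAM."""

    def __init__(self):
        self.users = {}  # user_id -> User

    def add_user(self, user_id, name):
        self.users[user_id] = User(user_id, name)

    def follow(self, follower_id, followed_id):
        # Both users must already exist; a KeyError signals a bad id.
        self.users[follower_id]
        self.users[followed_id].followers.add(follower_id)

    def followers_of(self, user_id):
        return sorted(self.users[user_id].followers)

    def snapshot(self):
        """Flatten the graph to (userid, followerid) rows for persistence."""
        return sorted(
            (user.user_id, follower_id)
            for user in self.users.values()
            for follower_id in user.followers
        )

graph = FollowGraph()
graph.add_user(1, "alice")
graph.add_user(2, "bob")
graph.add_user(3, "carol")
graph.follow(2, 1)  # bob follows alice
graph.follow(3, 1)  # carol follows alice
graph.follow(3, 1)  # repeated follow is a no-op

print(graph.followers_of(1))  # [2, 3]
print(graph.snapshot())       # [(1, 2), (1, 3)]
```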

For a larger project I would not be using a relational database at all; but exactly what I would use would depend on the specific details of the project.

If you are only trying to spike the concept, go with the ORM approach, but keep in mind that it won't scale.

Recurse 2010-07-29 10:29:38

Would you mind pointing me in the direction of some introductory material on RAM object storage? What technologies are we talking about, in particular? Something like Redis?

Chris 2010-07-29 12:07:19

For a spike I actually mean maintaining simple data structures directly in RAM. Assuming 100,000 users with an average of 100 followers each, a simple ~100-byte object per user, and 4-byte references, you only require ~40MB for the follower graph and ~10MB for the user DB. Even with an indexing overhead of 3x, that is easily something you can fit in RAM and persist to a DB without too much difficulty.
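The estimate above can be checked with a few lines of arithmetic. This sketch uses decimal megabytes and reads "indexing overhead of 3x" as multiplying the combined total by three; both are assumptions about what was meant.

```python
users = 100_000
avg_followers = 100
bytes_per_user = 100   # approximate size of one user object
bytes_per_ref = 4      # one reference per follower relationship
index_overhead = 3     # assumed multiplier on the combined total

graph_bytes = users * avg_followers * bytes_per_ref
user_bytes = users * bytes_per_user
total_with_overhead = (graph_bytes + user_bytes) * index_overhead

def mb(n_bytes):
    """Convert bytes to decimal megabytes."""
    return n_bytes / 1_000_000

print(mb(graph_bytes))          # 40.0
print(mb(user_bytes))           # 10.0
print(mb(total_with_overhead))  # 150.0
```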

Recurse 2010-07-29 12:46:41

+1 A:

You should probably read http://highscalability.com/ and its articles on how this is managed by the big sites.

Christian 2010-07-29 10:33:04

+1 A:

The model is fairly simple. The problem is in the size of the Subscription table; if there are 1 million users, and each subscribes to 1000, then the Subscription table has 1 billion rows.
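To put the row count in perspective, here is a back-of-the-envelope sketch. The 4-byte id width is an assumption (enough for 1 million users), and the result covers only the two key columns, ignoring per-row and index overhead, which varies by database.

```python
users = 1_000_000
subscriptions_per_user = 1_000
bytes_per_id = 4  # assumed integer width for each of the two id columns

rows = users * subscriptions_per_user
raw_key_bytes = rows * 2 * bytes_per_id  # two ids per row

print(rows)                           # 1000000000
print(raw_key_bytes / 1_000_000_000)  # 8.0 (decimal GB, keys only)
```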

Damir Sudarevic 2010-07-29 12:11:15
