
Suppose user A follows 100 people; then I would need 100 join statements, which I think would be horrible for the database.

Or are there other ways to do this?

+6  A: 

Why would you need 100 Joins?

You would have a simple table "Follows" with your ID and the other person's ID in it...

Then you retrieve the "Tweets" by joining something like this:

SELECT TOP 100 tweet.* FROM tweet INNER JOIN Followers ON Followers.FollowerID = tweet.AuthorID WHERE Followers.MasterID = yourID

Now you just need decent caching, and make sure you use a non-locking query, and you have all the information... (Well, maybe add some user data into the mix.)
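
On SQL Server (which the "top 100" syntax suggests), one common way to get a non-locking read is the NOLOCK table hint; it allows dirty reads, so it trades consistency for throughput. A minimal sketch of the query above with the hint added, assuming a @yourID parameter and the tweet/Followers columns from the edit below:

    -- Sketch only: same join as above, read without taking shared locks (dirty reads possible).
    SELECT TOP 100 t.*
    FROM tweet AS t WITH (NOLOCK)
    INNER JOIN Followers AS f WITH (NOLOCK)
        ON f.FollowerID = t.AuthorID
    WHERE f.MasterID = @yourID
    ORDER BY t.ID DESC;  -- assumes the tweet ID roughly tracks recency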

Edit:

tweet

ID - tweet ID
AuthorID - ID of the poster

Followers

MasterID - (basically your ID)
FollowerID - (ID of the person following you)

The Followers table has a composite primary key based on MasterID and FollowerID. It should have 2 indexes - one on "MasterID, FollowerID" and one on "FollowerID, MasterID".
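
In DDL form, a sketch of that layout might look like the following (assuming SQL Server; the column types are guesses, since the answer doesn't give any):

    -- Sketch only: tables and indexes as described above; types are assumptions.
    CREATE TABLE tweet (
        ID       BIGINT IDENTITY(1,1) PRIMARY KEY,  -- tweet ID
        AuthorID BIGINT NOT NULL                    -- ID of the poster
        -- other tweet columns (text, timestamp, ...) omitted
    );

    CREATE TABLE Followers (
        MasterID   BIGINT NOT NULL,  -- basically your ID
        FollowerID BIGINT NOT NULL,  -- ID of the person following you
        CONSTRAINT PK_Followers PRIMARY KEY (MasterID, FollowerID)  -- composite key, covers (MasterID, FollowerID)
    );

    -- Second index, covering the reverse direction (FollowerID, MasterID):
    CREATE INDEX IX_Followers_Follower_Master ON Followers (FollowerID, MasterID);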

Heiko Hatzfeld
+1, it's worth noting that this is a classic N-to-M relation, and the standard solution is a link table like you describe.
Henk Holterman
What's the solution for an N*M*M relation then?
Shore
What's N-M-M? Please describe.

The above table can be used in both directions (see the sketch after these comments). Say column 1 is master and column 2 is follower, so a row 1-2 means user 2 follows user 1, and 2-1 means that user 1 follows user 2...
Heiko Hatzfeld
N-to-M-to-M
Shore
You "only" have a relation between 2 tables. So N-M-M would be implemented as N-M and M-M (which could also be called an extension table - assuming the 2 M's are always identical).
Heiko Hatzfeld
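
To make the "both directions" point concrete, a sketch against the Followers table above (with @userID as a placeholder):

    -- Sketch only: the same link table answers both questions.
    -- Who follows @userID?
    SELECT FollowerID FROM Followers WHERE MasterID = @userID;
    -- Whom does @userID follow?
    SELECT MasterID FROM Followers WHERE FollowerID = @userID;
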
+1  A: 

The real trick is to minimize your database usage (e.g., cache, cache, cache) and to understand usage patterns. In the specific case of Twitter, they use a bunch of different techniques, from queuing to an insane amount of in-memory caching to some really clever data-flow optimizations. Give Scaling Twitter: Making Twitter 10000 percent faster and the other associated articles a read. The answer to your question about how to implement "following" is to denormalize the data (precalculate and maintain join tables instead of performing joins on the fly) or not to use a database at all. <-- Make sure to read this!
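
As a hedged illustration of "precalculate and maintain join tables instead of performing joins on the fly" (the table and parameter names here are made up, not taken from the answer or the article): fan each tweet out into a per-reader timeline table at write time, so the read path becomes a single indexed lookup with no join.

    -- Sketch only: denormalized, precomputed home timeline (fan-out on write).
    CREATE TABLE Timeline (
        ReaderID BIGINT NOT NULL,  -- the user whose home timeline this row belongs to
        TweetID  BIGINT NOT NULL,
        CONSTRAINT PK_Timeline PRIMARY KEY (ReaderID, TweetID)
    );

    -- When @authorID posts tweet @tweetID, copy one row per follower:
    INSERT INTO Timeline (ReaderID, TweetID)
    SELECT FollowerID, @tweetID        -- everyone following @authorID, per the Followers schema above
    FROM Followers
    WHERE MasterID = @authorID;

    -- Reading the timeline is now a plain indexed scan, no join:
    SELECT TOP 100 TweetID
    FROM Timeline
    WHERE ReaderID = @userID
    ORDER BY TweetID DESC;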

D.Shawley
Interesting link but I'm -1ing this since it's *way too soon* to start worrying about denormalising for scalability reasons -- the OP doesn't even know what a normalised DB is! Do you really think performance is top priority here? Please put a disclaimer at the top saying "When your site has 1 million active users, you should maybe think about considering the following."
j_random_hacker
The OP did ask how an application like Twitter is implemented. I agree that it is overkill in this case. We have run into scaling problems with considerably fewer than one million users - 10,000 or so client requests per second can cause problems if your processing requires more than just fetching a few rows and returning them.
D.Shawley
Sure, 1 million users was just a wild guess. If you put your more realistic estimate in a disclaimer at the top I'll happily +1.
j_random_hacker