It seems that implementing a web app like Twitter or a Facebook wall requires one huge relational "feeds" table (plus a user table) and an excellent caching mechanism (can you recommend one?).

My main question is: how would you implement such a feature using a non-relational DB, e.g. a key/value store?

Obviously, I'd like to support the number of users that Twitter handles, both concurrently and in general.

Thanks

A: 

You can read how twitter did it over here: http://highscalability.com/blog/2010/2/19/twitters-plan-to-analyze-100-billion-tweets.html

Also read this: http://highscalability.com/scaling-twitter-making-twitter-10000-percent-faster

No data models but quite a lot of information about how ;)

WoLpH
A: 

Obviously, I'd like to support the number of users that Twitter handles, both concurrently and in general.

Sorry, but this requirement is far from realistic. Twitter has a huge server farm to shard its data and support its massive concurrency. Do you have as many servers as Twitter?

There is an architectural idea for implementing a Twitter clone with Redis: TwitterAlikeExample

Tobias P.
it's a theoretical question, so let's assume that I do have such a farm..
MrOhad
for the redis architecture: add sharding and replication as twitter does.
Tobias P.
A: 

Have a look at Kestrel, the message queue system that Twitter uses

http://github.com/robey/kestrel

http://www.google.com/search?q=kestrel+twitter

dwich
A: 

Using a distributed store would be a good idea, for example: 1. Cassandra 2. HBase

Why do you need Twitter-like concurrency right now? Do you have that many users already? I hope you get that kind of user base, but investing in it right now doesn't look like a good idea.

Zimbabao
it's a theoretical question. Let's say I am using Cassandra; what now? How would I design the keys/values ...
MrOhad
+2  A: 

I'd use Redis. Queue of keys per user + set of blobs retrieved by these keys.
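
A minimal sketch of that layout, assuming a fan-out-on-write design in which each new tweet's key is pushed onto every follower's queue. Plain Python containers stand in for Redis here so the example runs without a server: the dict plays the role of string keys (SET/MGET) and the bounded deque plays the role of a list maintained with LPUSH and LTRIM. The names `post`, `follow`, `timeline`, and `TIMELINE_LIMIT` are hypothetical and not taken from any real codebase.

```python
from collections import defaultdict, deque
from itertools import count

TIMELINE_LIMIT = 800  # hypothetical cap on how many keys each queue keeps

tweets = {}  # key -> blob; stand-in for Redis string keys (SET / MGET)
# user -> queue of tweet keys, newest first; stand-in for LPUSH + LTRIM
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LIMIT))
followers = defaultdict(set)  # user -> users who follow them
_next_id = count(1)


def follow(follower, followee):
    followers[followee].add(follower)


def post(author, text):
    """Store the blob once, then push only its key to each follower's queue."""
    key = f"tweet:{next(_next_id)}"
    tweets[key] = {"author": author, "text": text}
    for user in followers[author] | {author}:
        timelines[user].appendleft(key)
    return key


def timeline(user, n=20):
    """Read the newest n keys for the user, then fetch the blobs by key."""
    keys = list(timelines[user])[:n]
    return [tweets[k] for k in keys]


follow("bob", "alice")
post("alice", "hello")
post("bob", "hi alice")
print([t["text"] for t in timeline("bob")])
```

Reading a timeline is then one list read plus one multi-key fetch, at the cost of one queue write per follower on every post.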

vartec
A: 

I'll throw MongoDB into the list.

Schema is going to be pretty simple.

TWEETS
UserName (or a UserID if you want to normalize a bit)
TweetID (a unique number)
Timestamp
Tweet (text of tweet)

USER
UserID (optional)
UserName
Name, Email, personal info (web URL, etc.)
Password (hash)
Followers (repeating user ref)
Following (repeating user ref)
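
As a rough illustration, the two collections above could hold documents shaped like the dicts below. This is only a sketch: the sample values and the helper `home_timeline` are hypothetical, and plain Python lists stand in for the collections so the example runs without a database. With pymongo, the same read would be roughly `find({"UserName": {"$in": following}}).sort("Timestamp", -1).limit(limit)`, presumably backed by an index on UserName and Timestamp.

```python
from datetime import datetime, timezone

# Hypothetical sample documents following the USER layout above.
user_docs = {
    "alice": {"UserName": "alice", "Followers": ["bob"], "Following": []},
    "bob": {"UserName": "bob", "Followers": [], "Following": ["alice", "carol"]},
    "carol": {"UserName": "carol", "Followers": ["bob"], "Following": []},
}

# Hypothetical sample documents following the TWEETS layout above.
tweet_docs = [
    {"TweetID": 1, "UserName": "alice", "Tweet": "first",
     "Timestamp": datetime(2010, 3, 1, 12, 0, tzinfo=timezone.utc)},
    {"TweetID": 2, "UserName": "carol", "Tweet": "second",
     "Timestamp": datetime(2010, 3, 1, 12, 5, tzinfo=timezone.utc)},
    {"TweetID": 3, "UserName": "bob", "Tweet": "third",
     "Timestamp": datetime(2010, 3, 1, 12, 10, tzinfo=timezone.utc)},
]


def home_timeline(user_name, limit=20):
    """Pull model: collect tweets by followed users at read time, newest first."""
    following = set(user_docs[user_name]["Following"])
    matching = [t for t in tweet_docs if t["UserName"] in following]
    matching.sort(key=lambda t: t["Timestamp"], reverse=True)
    return matching[:limit]


print([t["Tweet"] for t in home_timeline("bob")])
```

This pull approach does its work on every read, so the cost grows with the number of accounts a user follows.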

boomhauer
ok, how do you query efficiently for all the tweets of the people that one is following?
MrOhad