ansaurus

Question

How to solve the "Digg" problem in MongoDB

Answer 1

+1 A:

One way to do this would be to add an array of "friends" to each post.

{
  date: Date(...),
  friends: ['me', 'you', 'thatguy'],
  ...
}
// Compound index: match on friends, then return newest posts first.
db.posts.ensureIndex({friends:1, date:-1})

Then you could easily display my page by doing this: db.posts.find({friends:'me'}).sort({date:-1})

This will work as long as each user has fewer than about 200,000 friends; you may need to special-case posts from users with more than that. One way would be to split the friend list into multiple chunks of 100,000 and make one post entry per chunk.
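The chunking idea could look roughly like the sketch below. It is plain JavaScript that runs without a database; the helper name `splitPostByFriendChunks` and the tiny chunk size in the demo are made up for illustration, and each returned document would still have to be inserted into `posts` separately.

```javascript
// Hypothetical helper (not part of MongoDB): split one post into several
// documents, each carrying at most `chunkSize` entries of the friends array.
function splitPostByFriendChunks(post, chunkSize) {
  const { friends, ...rest } = post;
  // A post with no friends is kept as a single document.
  if (friends.length === 0) {
    return [{ ...rest, friends: [] }];
  }
  const docs = [];
  for (let start = 0; start < friends.length; start += chunkSize) {
    // Copy every other field and attach only this slice of the friend list.
    docs.push({ ...rest, friends: friends.slice(start, start + chunkSize) });
  }
  return docs;
}

// Demo with a chunk size of 2 instead of 100,000 to keep the output small.
const post = { date: new Date(0), friends: ['me', 'you', 'thatguy'] };
const chunks = splitPostByFriendChunks(post, 2);
console.log(chunks.map(c => c.friends));
```

As long as the friend list contains no duplicates, each friend lands in exactly one chunk, so a query on `{friends: 'me'}` still returns the post only once.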

mstearn 2010-05-13 20:09:01

I am assuming that in your example "friends" is an array of references to other users, that the users are stored in a users collection, and that the "friends" array is merely a list of references to users in that collection. So if I wanted to find all posts by my friends, would I not have to do a "join"? (I apologize for using terms such as "join"; I am coming from a SQL background.)

2010-05-14 12:31:30

Friends should be an array of whatever you use as the _id in your user objects. This can be either a string or an ObjectID; both have advantages and disadvantages. I wouldn't do a join or join-like operation. Instead, just store the current user's _id in a cookie or in your session object. Then replace the string 'me' in my example query with the current user's _id and it will do what you want. By default, if you query an array field using a single value in the query, mongo will check whether the query value is in the array.
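To make the array-matching rule concrete, here is a small in-memory emulation in plain JavaScript. It is only a sketch of the behaviour described above, not MongoDB's implementation; the names `matchesFriend` and `feedFor` are invented, and it assumes string _ids (ObjectIDs would need a value comparison rather than `===`).

```javascript
// Emulates how a query like {friends: userId} treats the friends field:
// an array matches if it contains the value, a scalar matches if it equals it.
function matchesFriend(post, userId) {
  return Array.isArray(post.friends)
    ? post.friends.includes(userId)
    : post.friends === userId;
}

// Emulates find({friends: userId}).sort({date: -1}) over an in-memory array.
function feedFor(posts, userId) {
  return posts
    .filter(p => matchesFriend(p, userId))
    .sort((a, b) => b.date - a.date); // newest first
}

const samplePosts = [
  { title: 'old',   date: new Date(1000), friends: ['me', 'you'] },
  { title: 'other', date: new Date(2000), friends: ['thatguy'] },
  { title: 'new',   date: new Date(3000), friends: ['me'] },
];

// In a real application currentUserId would come from the session or cookie.
const currentUserId = 'me';
const feed = feedFor(samplePosts, currentUserId);
console.log(feed.map(p => p.title));
```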

mstearn 2010-05-14 21:12:37

Thanks! I will try this solution.

2010-05-16 00:12:56

Answer 2

+1 A:

There are many possible solutions with mongo. You can still store the diggs in a top-level table (a.k.a. collection) like a relational db, but additionally could store diggs as an array in the item collection or the user collection. Similarly the friend relation could be kept as an array in the user collection in either the forward or reverse direction.

Probably the most straightforward approach would be an array of diggs in the items, and an array of friends in the users. Then a simple indexed query to retrieve a user's friends is followed by an "in" query on the indexed items.diggs.userid field.

Mongo's own documentation of the $in operator actually uses this example.
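A rough sketch of that two-step lookup, in plain JavaScript over in-memory arrays so it runs without a server. The document shapes, the sample data, and the helper names (`friendsOf`, `buildDiggQuery`, `itemsDuggByFriends`) are assumptions made for illustration; only the `$in` operator and the dotted `diggs.userid` path come from the approach described above.

```javascript
// Assumed shapes: users keep a friends array, items keep an array of diggs.
const users = [
  { _id: 'me', friends: ['you', 'thatguy'] },
  { _id: 'you', friends: ['me'] },
];
const items = [
  { _id: 1, diggs: [{ userid: 'you' }] },
  { _id: 2, diggs: [{ userid: 'stranger' }] },
  { _id: 3, diggs: [{ userid: 'stranger' }, { userid: 'thatguy' }] },
];

// Step 1: look up the user's friend list (empty if the user is unknown).
function friendsOf(allUsers, userId) {
  const user = allUsers.find(u => u._id === userId);
  return user ? user.friends : [];
}

// Step 2: the query document one would pass to find() on the items collection.
function buildDiggQuery(friendIds) {
  return { 'diggs.userid': { $in: friendIds } };
}

// In-memory emulation of that query: keep items dugg by at least one friend.
function itemsDuggByFriends(allItems, friendIds) {
  const wanted = new Set(friendIds);
  return allItems.filter(item =>
    (item.diggs || []).some(d => wanted.has(d.userid)));
}

const friendIds = friendsOf(users, 'me');
const query = buildDiggQuery(friendIds);
const dugg = itemsDuggByFriends(items, friendIds);
console.log(JSON.stringify(query), dugg.map(i => i._id));
```

Against a real database the second step would presumably be `db.items.find(query)`, with an index on `diggs.userid`.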

Coady 2010-05-13 22:38:00

For Digg, the "in" query was the problem. The friends list came from memcached, and they ran an "in" query, which was the bottleneck. (In my experience with SQL, "in" queries are not well optimized.) In your experience, how has the $in operator performed with, say, about 50 values in the "in" set?

2010-05-14 12:40:44
