Hey Everyone,

Let me explain my problem, and hopefully someone can offer some good advice.

I am currently working on a web app that stores information and metadata for a large number of applications. For each application there could be anywhere from 10 to hundreds of comments, each tied to the application and an application version id. I am using MongoDB because I need speed and easy future scalability. I have read that comments should be embedded in their parent document for read performance reasons, but I'm not sure that this works in my case. I read in another post:

In general, if you need to work with a given data set on its own, make it a collection.
By: @kb

In my case, however, I don't need to work with the comments on their own. Let me explain further. I will have a table of apps (that can be filtered) and will dynamically load entries as you scroll or filter through the list. If I embed the comments within each application document, I am sending ALL the comments whenever I dynamically load an application entry into the table. However, I would like to do "lazy loading": I only want to load the comments when the user asks to see them (by clicking on the entry in the table).

As an example, my table might look like the following:

| app name | version | rating | etc. | view comments |
------------------------------------------------------
| app1     | v.1.0   | 4 star | etc. | click me!     |
| app2     | v.2.4.5 | 3 star | etc. | click me!     |
| ...

My question is: which would be more efficient? Are reads fast enough on MongoDB that it really doesn't matter that I am pulling all the comments with each application? If a user did not filter any of the applications and scrolled all the way to the bottom, they might load somewhere between 125k and 250k entries/applications.

+1  A: 

I would suggest thinking more specifically about your query: you can specify which parts of an object you'd like returned. This should allow you to avoid the overhead of fetching a bunch of embedded comments when you're only interested in displaying some specific bits of information about the application.

You can do something like: db.collection.find({ appName : 'Foo'}, {comments : 0 }); to retrieve the application object with appName Foo, but specifically exclude the comments field (more likely an array of objects) embedded within it.
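To cover the lazy-loading part of the question, the same projection idea can be split into two queries: one for the table rows (comments excluded) and one that runs only when a row is clicked (comments only). Here is a minimal sketch, assuming the embedded field is named comments and each application is identified by its _id. The helper names listRowsQuery and commentsQuery are hypothetical, and the functions only build the arguments you would pass to find():

```javascript
// Hypothetical helpers; they only build plain objects and do not talk to MongoDB.

// Arguments for the table query: every field except the embedded comments.
function listRowsQuery(filter) {
  return { filter: filter, projection: { comments: 0 } };
}

// Arguments for the on-click query: only the comments of one application.
// Assumes the application is identified by its _id.
function commentsQuery(appId) {
  return { filter: { _id: appId }, projection: { comments: 1 } };
}

// In the mongo shell these would be used roughly like:
//   var q = listRowsQuery({ appName: 'Foo' });
//   db.collection.find(q.filter, q.projection);
//   var c = commentsQuery(someId);   // someId: the _id of the clicked row
//   db.collection.find(c.filter, c.projection);
```

This keeps the comments embedded while still sending them to the client only on request.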

From the MongoDB docs:

Retrieving a Subset of Fields

By default on a find operation, the entire document/object is returned. However we may also request that only certain fields are returned. Note that the _id field is always returned automatically.

// select z from things where x=3
db.things.find( { x : 3 }, { z : 1 } );

You can also remove specific fields that you know will be large:

// get all posts about mongodb without comments
db.posts.find( { tags : 'mongodb' }, { comments : 0 } );

EDIT: Also remember the limit(n) function, which retrieves only n apps at a time. For instance, getting n=50 apps without their comments would be:

db.collection.find({}, {comments : 0 }).limit(50);
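
For the batches that follow as the user scrolls, limit(n) can be combined with skip(n). The sketch below is only an illustration: pageWindow is a hypothetical helper, the page size of 50 is taken from the example above, and sorting on _id is an assumption made to keep the order stable between requests.

```javascript
// Hypothetical helper: compute skip/limit values for a zero-based page index.
function pageWindow(pageIndex, pageSize) {
  return { skip: pageIndex * pageSize, limit: pageSize };
}

var w = pageWindow(2, 50); // third batch of 50 apps

// In the mongo shell this would be used roughly like:
//   db.collection.find({}, { comments : 0 })
//     .sort({ _id : 1 })
//     .skip(w.skip)
//     .limit(w.limit);
```

Note that skip() still has to walk past the skipped documents, so with hundreds of thousands of entries it may be worth filtering on the last seen sort key instead of using a large offset.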
nearlymonolith