MongoDB: What's the point of using MapReduce without parallelism?

+3 Q:

Quoting http://www.mongodb.org/display/DOCS/MapReduce#MapReduce-Parallelism

As of right now, MapReduce jobs on a single mongod process are single threaded. This is due to a design limitation in current JavaScript engines. We are looking into alternatives to solve this issue, but for now if you want to parallelize your MapReduce jobs, you will need to either use sharding or do the aggregation client-side in your code.

Without parallelism, what are the benefits of MapReduce compared to simpler or more traditional methods for queries and data aggregation?

To avoid confusion: the question is NOT "what are the benefits of a document-oriented DB over a traditional relational DB".

+2 A:

The main reason to use MapReduce over simpler or more traditional queries is that it can do things (i.e., aggregation) that simple queries cannot.

Once you need aggregation, there are two options using MongoDB: MapReduce and the group command. The group command is analogous to SQL's "group by" and is limited in that it has to return all its results in a single database response. That means group can only be used when you have less than 4MB of results. MapReduce, on the other hand, can do anything a "group by" can, but outputs results to a new collection so results can be as large as needed.
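To make the "group by" comparison concrete, here is a minimal single-threaded sketch in plain Python rather than MongoDB's own JavaScript API. The names map_reduce, map_order, reduce_totals and the orders data are all hypothetical, and the returned list only stands in for the output collection that MapReduce writes to.

```python
from collections import defaultdict

def map_reduce(docs, map_fn, reduce_fn):
    """Run map then reduce over docs in a single thread (illustrative only)."""
    grouped = defaultdict(list)
    for doc in docs:
        # map_fn may emit any number of (key, value) pairs per document
        for key, value in map_fn(doc):
            grouped[key].append(value)
    # Stand-in for the output collection: one result document per key
    return [
        {"_id": key, "value": reduce_fn(key, values)}
        for key, values in sorted(grouped.items())
    ]

# Hypothetical input documents
orders = [
    {"customer": "alice", "total": 10},
    {"customer": "bob", "total": 5},
    {"customer": "alice", "total": 7},
]

def map_order(doc):
    # Emit one pair per order, keyed by customer
    yield doc["customer"], doc["total"]

def reduce_totals(key, values):
    # Combine all values emitted for one key
    return sum(values)

totals = map_reduce(orders, map_order, reduce_totals)
```

This is the same result a "group by customer, sum(total)" would give. The difference described above is where the result goes: group has to fit it in one response, while MapReduce stores it as documents in a new collection.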

Also, parallelism is coming, so it's good to have some practice :)

kristina 2010-05-08 16:29:15

Is there a roadmap plan for parallelism in MapReduce? I'm trying to decide whether it's worth the wait.

2010-05-15 02:39:37

+3 A:

M/R is already parallel in MongoDB if you're running a sharded cluster. This is the main point of M/R anyway - to put the computation on the same node as the data.
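A rough sketch of that idea in plain Python, not MongoDB's actual implementation: each shard maps and reduces its own documents, and the partial results are then reduced once more to produce the final answer. The names run_on_shard, merge_shards, map_tag, reduce_count and the shard data are hypothetical, and the shards are processed one after another here purely to keep the example simple.

```python
from collections import defaultdict

def run_on_shard(docs, map_fn, reduce_fn):
    """Map and reduce only the documents that live on one shard."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}

def merge_shards(partials, reduce_fn):
    """Re-reduce the per-shard results into one final result per key."""
    grouped = defaultdict(list)
    for partial in partials:
        for key, value in partial.items():
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}

# Hypothetical documents, already split across two shards
shard_a = [{"tag": "db"}, {"tag": "js"}, {"tag": "db"}]
shard_b = [{"tag": "db"}, {"tag": "js"}]

def map_tag(doc):
    # Emit a count of 1 for each document's tag
    yield doc["tag"], 1

def reduce_count(key, values):
    # Must give the same answer when fed its own earlier outputs
    return sum(values)

# Assumption: in a real cluster this step runs on every shard at the same time
partials = [run_on_shard(shard, map_tag, reduce_count) for shard in (shard_a, shard_b)]
counts = merge_shards(partials, reduce_count)
```

Note that reduce_count is applied to its own earlier outputs in the merge step, so the reduce function has to be written so that re-reducing partial results still gives the right answer.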

mdirolf 2010-05-17 14:39:53

Am I right in assuming that the current way of taking advantage of a multi-core computer is to run multiple MongoDB instances on the same machine?

Zsolt Török 2010-10-01 14:57:18

+1 A:

Super-fast map/reduce is on the roadmap. It will not be in the 1.6 release (the summer release), so it will likely arrive late this year.

dm 2010-05-17 16:18:51

@dm

"is there a roadmap plan for parallelism for map reduce? trying to decide if it's worth the wait"

I am in the same situation. MongoDB is the kind of data store I need, but my app has to have aggregation, and very fast aggregation at that. Will we see this in 1.7/1.8?

topdog 2010-09-08 16:03:30
