tags:

views:

36

answers:

1

I have a mongodb collection that looks something like this:

db.scores.insert({"name": "Bob", value: 96.3, timeStamp:'2010-9-27 9:32:00'})
db.scores.insert({"name": "John", value: 98.3, timeStamp:'2010-9-27 9:28:00'})
db.scores.insert({"name": "Bob", value: 99.3, timeStamp:'2010-9-27 9:29:00'})
db.scores.insert({"name": "John", value: 97.3, timeStamp:'2010-9-27 9:31:00'})

How can I query this to give me each name once, with the most recent value:

{name: "John", value: 97.3}
{name: "Bob", value: 96.3}
+2  A: 

You're trying to do data aggregation, so you'll have to write a map-reduce query.

Mapping

The map function is basically the GROUP BY clause from SQL. In your case we'll group on the name, so we'll use the name as the key. The value will contain the other data we need, in this case the value and timestamp.

map = function () {
  emit(this.name, { timeStamp: this.timeStamp, value: this.value });
}

The map function will result in an array of values for each key. So all the documents with the name "Bob" are mapped to the key "Bob". For your sample data, the result may look like this:

{ _id: "Bob", values: [
  { timeStamp: "2010-9-27 9:32:00", value: 96.3 },
  { timeStamp: "2010-9-27 9:29:00", value: 99.3 }
]},
{ _id: "John", values: [
  { timeStamp: "2010-9-27 9:28:00", value: 98.3 },
  { timeStamp: "2010-9-27 9:31:00", value: 97.3 }
]},

Reducing

The reduce function is responsible for reducing the array of values for each key to a single value. In your case, we're only interested in the data with the highest timestamp:

reduce = function (key, values) {
  var maxData = { timeStamp: new Date(0) };

  values.forEach(function (data) {
    if (data.timeStamp > maxData.timeStamp) {
      maxData = data;
    }
  });

  return maxData;
}

Note that it is important that the return value of a reduce function has the same format as the values it accepts. This is due to the fact that the reduce function may be called multiple times for a single key, each time accepting only a segment of the total values.

Running the reduce function will result in the following:

{ _id: "Bob", value: { timeStamp: "2010-9-27 9:32:00", value: 96.3 } },
{ _id: "John", value: { timeStamp: "2010-9-27 9:31:00", value: 97.3 } }

Finalizing

A finalizer function may be used to extract only the data we need, which is the value.

finalize = function (key, value) {
  return value.value;
}

This will result in:

{ "_id" : "Bob", "value" : 96.3 },
{ "_id" : "John", "value" : 97.3 }

Running map-reduce

You can run a map-reduce with these functions as follows:

res = db.scores.mapReduce(map, reduce, { finalize: finalize });

The collection db[res.result] will contain the result of the query.

One last problem

The reduce function compares the timestamps as Date objects. Judging from your sample code, you have stored the timestamps as string, instead of Date objects or integer timestamps. So before you'll be able to compare dates, you will have to convert these strings into Date objects.

Niels van der Rest