Hi,

I have some documents that each have two array attributes: tag and lieu. Here is an example of what they look like:

{
  title: "doc1",
  tag: ["mountain", "sunny", "forest"],
  lieu: ["france", "luxembourg"]
},
{
  title: "doc2",
  tag: ["sunny", "lake"],
  lieu: ["france", "germany"]
},
{
  title: "doc3",
  tag: ["sunny"],
  lieu: ["belgium", "luxembourg", "france"]
}

How can I write a map/reduce view and query it so that I retrieve only the documents that match all of these criteria:

  • lieu: ["france", "luxembourg"]
  • tag: ["sunny"]

Returns: doc1 and doc3

I cannot figure out a format that map/reduce could return so that a single query would be enough. What I am doing now is this: I emit every lieu/tag as a key with the related document's id as the value, then reduce so that every key has an array of document ids. From my app I query this view and intersect the results on the app side (keeping only the docs that appear under all 3 keys: luxembourg, france and sunny), then query CouchDB again with these ids to retrieve the actual docs. I feel that's not the right or best way to do it.

I am using lists to do the intersection job, and it works quite well, but I still need to make another request to get the documents by their ids. Any idea what I could do differently to retrieve the documents directly?

Thank you!

A: 

This is going to be awkward. The basic idea is to build a view whose map function emits every possible combination of tags and countries as the key, with no reduce function. Looking up the key ["france","luxembourg"] then returns one row for every document that emitted that key, and those documents are exactly the ones in the intersection. Each row of a view without a reduce function carries the id of the emitting document, and adding include_docs=true to the query returns the document itself, so you only have to make one request.

This causes a lot of emits, but you can lower the number in two ways. First, sort the values both when emitting and when searching (automatically turn ["luxembourg","france"] into ["france","luxembourg"]), so that each combination is emitted in one order only. Second, take advantage of CouchDB's ability to query by key prefix: emitting ["belgium","france","luxembourg"] already lets you match searches for ["belgium"] and ["belgium","france"], so those shorter keys do not need to be emitted separately. With both tricks, a document with n values emits 2^(n-1) keys.
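As a rough sketch of the search side, the sorted search terms can be turned into a startkey/endkey range that covers every key beginning with them. The helper name prefixQuery and the view path are made up for illustration; startkey, endkey and include_docs are standard view query parameters, and the trailing {} works as an upper bound because objects collate after strings in CouchDB.

```javascript
// Hypothetical helper: builds the query string for a prefix search on the view.
function prefixQuery(terms) {
  // Sort the same way the map function does, so the search key lines up
  // with the emitted keys.
  var key = terms.slice().sort();
  var params = {
    // Lowest matching key: the search terms themselves.
    startkey: JSON.stringify(key),
    // {} collates after any string, so this bounds every key that
    // starts with the search terms.
    endkey: JSON.stringify(key.concat([{}])),
    // Ask CouchDB to attach the full document to each row.
    include_docs: "true"
  };
  return Object.keys(params).map(function (name) {
    return name + "=" + encodeURIComponent(params[name]);
  }).join("&");
}

// Assumed path: the database, design document and view names are invented.
var url = "/mydb/_design/places/_view/by_lieu?" +
  prefixQuery(["luxembourg", "france"]);
```

Note that with prefix matching the same document can come back in more than one row (a search for ["belgium"] matches every longer key starting with "belgium"), so the app would probably need to de-duplicate rows by id.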

In your example above, for the countries, you would only emit:

// doc 1
emit(["luxembourg"],null);
emit(["france","luxembourg"],null);

// doc 2
emit(["germany"],null);
emit(["france","germany"],null);

// doc 3
emit(["luxembourg"],null);
emit(["belgium","luxembourg"],null);
emit(["france","luxembourg"],null);
emit(["belgium","france","luxembourg"],null);
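The emit list above could be generated by a map function along these lines. This is only a sketch: keysToEmit is a hypothetical helper name, and it handles the lieu field alone. It relies on the observation that a sorted combination is a prefix of a longer one exactly when it leaves out the largest value, so only the combinations that contain the largest value need to be emitted.

```javascript
// Hypothetical helper: returns the keys to emit for one list of values.
function keysToEmit(values) {
  // Sort and drop duplicates so every combination has a single canonical order.
  var sorted = values.slice().sort().filter(function (v, i, all) {
    return i === 0 || v !== all[i - 1];
  });
  if (sorted.length === 0) return [];

  // Every emitted key ends with the largest value; shorter prefixes are
  // covered by prefix queries and are not emitted.
  var last = sorted[sorted.length - 1];
  var rest = sorted.slice(0, -1);

  var keys = [];
  // Each bitmask selects one subset of the remaining values.
  for (var mask = 0; mask < (1 << rest.length); mask++) {
    var key = [];
    for (var i = 0; i < rest.length; i++) {
      if (mask & (1 << i)) key.push(rest[i]);
    }
    key.push(last);
    keys.push(key);
  }
  return keys;
}

// Map function sketch; emit is provided by CouchDB when the view runs.
function map(doc) {
  if (!Array.isArray(doc.lieu)) return;
  keysToEmit(doc.lieu).forEach(function (key) {
    emit(key, null);
  });
}
```

In an actual design document the helper would have to live inside the map function, since CouchDB only evaluates that one function. The number of keys still doubles with every extra value, so this probably only stays practical for documents with a handful of values.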

Anyway, for complex queries like this one, consider looking into a CouchDB-Lucene combination.

Victor Nicollet