I am not sure how to model the following problem in CouchDB.

I have a logger web app that keeps track of how many items are in a warehouse. To simplify the problem, we just need to know the total number of items currently in the warehouse and how long each item stays in the warehouse before it ships. Let's say the warehouse only has shoes, but each shoe has a different id and needs to be tracked by id.

The MySQL schema looks like this:

    id       name        date-in       date-out
    1        shoe       08/0/2010      null
    2        shoe       07/20/2010     08/01/2010


The output will be:
    Number of shoes in warehouse: 1
    Average time in warehouse:    14 days

Thanks

+1  A: 

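If each shoe is a document with date_in and date_out fields, you count the shoes in the warehouse by adding 1 for every document whose date_out is null and 0 (no change) for every document whose date_out is set. In practice the map function emits that 1 or 0 and the reduce function sums the emitted values. That gives you the total count of shoes in the warehouse.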
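The counting scheme above can be sketched roughly as follows. This is only an illustration that runs outside CouchDB: the sample documents, the ISO date strings, and the names `mapInWarehouse` and `reduceCount` are assumptions, and `emit` is a small in-memory stand-in for the function CouchDB provides to map functions.

```javascript
// Hypothetical sample documents; field names follow the answer,
// the ISO date strings and values are made up for illustration.
const docs = [
  { _id: "1", name: "shoe", date_in: "2010-08-05", date_out: null },
  { _id: "2", name: "shoe", date_in: "2010-07-20", date_out: "2010-08-01" },
];

// Stand-in for CouchDB's emit(): collect emitted rows in memory.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map: 1 for a shoe still in the warehouse, 0 for one that has shipped.
function mapInWarehouse(doc) {
  emit(null, doc.date_out ? 0 : 1);
}

// Reduce: summing works for both reduce and rereduce calls,
// because a sum of partial sums is still the total.
function reduceCount(keys, values, rereduce) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

docs.forEach(mapInWarehouse);
const inWarehouse = reduceCount(null, rows.map((r) => r.value), false);
console.log("Number of shoes in warehouse:", inWarehouse);
```

Inside CouchDB itself, the built-in `_sum` reduce would do the same job as the hand-written `reduceCount`.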

To compute the average time, note that for each shipped shoe you know the time it spent in the warehouse (date_out minus date_in). The reduce function cannot simply keep a running average: CouchDB may reduce chunks of rows separately and then combine the partial results (rereduce), so the operation must be commutative and associative, and an average of averages is not. The easiest way is to reduce to a [sum, count] array, where sum accumulates the time for all shoes and count is the number of shoes counted. The client then divides sum / count to compute the final average.
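A minimal sketch of the [sum, count] idea, again runnable outside CouchDB. The documents, the choice of days as the unit, and the names `mapDays` and `reduceSumCount` are assumptions; the two "partial" calls only imitate how CouchDB might reduce chunks separately before a rereduce.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical documents with ISO date strings (an assumption).
const docs = [
  { _id: "1", date_in: "2010-08-05", date_out: null },
  { _id: "2", date_in: "2010-07-20", date_out: "2010-08-01" }, // 12 days
  { _id: "3", date_in: "2010-07-01", date_out: "2010-07-17" }, // 16 days
  { _id: "4", date_in: "2010-07-10", date_out: "2010-07-24" }, // 14 days
];

// Stand-in for CouchDB's emit().
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map: only shipped shoes have a known time in the warehouse.
function mapDays(doc) {
  if (doc.date_out) {
    emit(null, (Date.parse(doc.date_out) - Date.parse(doc.date_in)) / MS_PER_DAY);
  }
}

// Reduce to [sum, count]. On rereduce the inputs are [sum, count] pairs
// from earlier reduce calls; otherwise they are raw durations from map.
function reduceSumCount(keys, values, rereduce) {
  var sum = 0;
  var count = 0;
  for (var i = 0; i < values.length; i++) {
    if (rereduce) {
      sum += values[i][0];
      count += values[i][1];
    } else {
      sum += values[i];
      count += 1;
    }
  }
  return [sum, count];
}

docs.forEach(mapDays);
const days = rows.map((r) => r.value);

// Imitate CouchDB reducing two chunks separately, then combining them.
const partialA = reduceSumCount(null, days.slice(0, 2), false);
const partialB = reduceSumCount(null, days.slice(2), false);
const total = reduceSumCount(null, [partialA, partialB], true);

// The client does the final division.
const average = total[0] / total[1];
console.log("Average time in warehouse:", average, "days");
```

Splitting the rows differently gives the same [sum, count] pair, which is exactly the property a running average lacks.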

I think you could combine both of these into one big reduce if you want, perhaps building up a {"shoes in warehouse": 1, "average time in warehouse": [253, 15]} kind of object.
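One way the combined reduce might look, as a sketch only. Here the map already emits values in the shape of the accumulator, so the same merge serves reduce and rereduce. The key names `in_warehouse` and `time`, the sample documents, and the function names are hypothetical, and `emit` is an in-memory stand-in.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical documents with ISO date strings (an assumption).
const docs = [
  { _id: "1", date_in: "2010-08-05", date_out: null },
  { _id: "2", date_in: "2010-07-20", date_out: "2010-08-01" }, // 12 days
  { _id: "3", date_in: "2010-07-01", date_out: "2010-07-17" }, // 16 days
  { _id: "4", date_in: "2010-07-10", date_out: "2010-07-24" }, // 14 days
];

// Stand-in for CouchDB's emit().
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map: emit a value shaped like the final result.
// time is a [sum, count] pair covering shipped shoes only.
function mapCombined(doc) {
  if (doc.date_out) {
    var days = (Date.parse(doc.date_out) - Date.parse(doc.date_in)) / MS_PER_DAY;
    emit(null, { in_warehouse: 0, time: [days, 1] });
  } else {
    emit(null, { in_warehouse: 1, time: [0, 0] });
  }
}

// Reduce: field-wise addition. Map output and reduce output have the
// same shape, so the rereduce flag does not need special handling.
function reduceCombined(keys, values, rereduce) {
  var acc = { in_warehouse: 0, time: [0, 0] };
  for (var i = 0; i < values.length; i++) {
    acc.in_warehouse += values[i].in_warehouse;
    acc.time[0] += values[i].time[0];
    acc.time[1] += values[i].time[1];
  }
  return acc;
}

docs.forEach(mapCombined);
const values = rows.map((r) => r.value);
const combined = reduceCombined(null, values, false);
console.log(JSON.stringify(combined));
```

The client would still divide `time[0]` by `time[1]` to get the average.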

However, if you can accept two different views for this data, then there is a shortcut for the average. In the map, emit(null, time) where time is the time spent in the warehouse. In the reduce, set the entire reduce value to the string "_stats" (see Built-in reduce functions). The view output will be an object with the sum and count already computed, along with min, max, and sumsqr.
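A rough sketch of the `_stats` shortcut. The view definition is what would go into a design document; `statsStandIn` is NOT CouchDB's implementation, only a local stand-in that produces the same fields so the client-side division can be shown. The view variable name, the day unit, and the sample durations are assumptions, as is the idea that the view server's `Date.parse` accepts the stored date format.

```javascript
// Map function for the view: emit the days a shipped shoe spent in the warehouse.
const mapDays = function (doc) {
  if (doc.date_out) {
    emit(null, (Date.parse(doc.date_out) - Date.parse(doc.date_in)) / 86400000);
  }
};

// View definition: the whole reduce value is just the string "_stats".
const timeInWarehouseView = {
  map: mapDays.toString(),
  reduce: "_stats",
};

// Local stand-in producing the fields that _stats reports.
function statsStandIn(values) {
  var stats = { sum: 0, count: 0, min: Infinity, max: -Infinity, sumsqr: 0 };
  for (var i = 0; i < values.length; i++) {
    var v = values[i];
    stats.sum += v;
    stats.count += 1;
    stats.min = Math.min(stats.min, v);
    stats.max = Math.max(stats.max, v);
    stats.sumsqr += v * v;
  }
  return stats;
}

// Hypothetical emitted durations in days, standing in for the view's rows.
const stats = statsStandIn([12, 16, 14]);

// Client side: divide sum by count to get the average.
const averageDays = stats.sum / stats.count;
console.log("Average time in warehouse:", averageDays, "days");
```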

jhs
+2  A: 

jhs' answer is great, but I just wanted to add something:

To use the built-in reduce function for the average calculation (_stats in your case), you have to use two "separate" views. But if your map function is exactly the same, CouchDB will detect that and not generate a whole new index for the second view. This way you can have one map function feeding multiple reduce functions.

tisba
I am new to CouchDB and would really like to understand what you just said. Could you explain more, and maybe add a little example?
Mark K
I'll try to explain :) Every view in CouchDB consists of a map function and an optional reduce function. They are specified in so-called design documents (documents with the "_design/" prefix). Each design document can specify multiple views, which will be processed as a group. Now, if you specify two views inside the same design document with byte-identical map functions but different reduce functions, CouchDB will only run the map phase once.
tisba
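As a little example of the comment above, here is a sketch that builds such a design document. The design document id `_design/warehouse`, the view names `time_stats` and `shipped_count`, and the map body are all hypothetical; the point is only that both views reuse one byte-identical map source with different built-in reduces.

```javascript
// One map function, written once. It is never called here; only its
// source text is stored, since design documents hold functions as strings.
const mapDays = function (doc) {
  if (doc.date_out) {
    emit(doc.name, (Date.parse(doc.date_out) - Date.parse(doc.date_in)) / 86400000);
  }
};
const mapSource = mapDays.toString();

// Hypothetical design document: two views, same map, different reduce.
const designDoc = {
  _id: "_design/warehouse",
  views: {
    time_stats: { map: mapSource, reduce: "_stats" },
    shipped_count: { map: mapSource, reduce: "_count" },
  },
};

// Print the JSON body that would be saved to the database.
console.log(JSON.stringify(designDoc, null, 2));
```

Building both views from the same `mapSource` string guarantees the map functions stay byte-identical; editing one copy by hand (even whitespace) would make them differ.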