ansaurus

Question

Stopping a Reduce() operation mid way. Functional way of doing partial running sum

Answer 1

+3 A:

Try the following

let sumUntil list stopAfter = 
    let rec inner list sum = 
        if sum >= stopAfter then sum
        else 
            match list with
            | [] -> sum
            | h::t-> inner t (sum + h)
    inner list 0

F# interactive result

> sumUntil [1;2;3;4;5] 8;;
val it : int = 10

JaredPar 2010-06-28 06:07:36

In other words, not use reduce at all? I just keep thinking there must be someway in the lambda/function thats passed to a reduce that there should be a way to make some state changes and/or stop abort the processing

Chaitanya 2010-06-28 06:19:48

Right, `reduce` is no good for this. It has the wrong type signature, and it always processes the entire list.

Brian 2010-06-28 06:29:37

This is only returning the sum though. Not the number of elements that we had added up. But I guess it would be easy to change the inner recursive loop to take a counter and thread that counter through while incrementing it every time we call the inner recursive loop

Chaitanya 2010-06-28 11:16:21

Answer 2

+5 A:

I think that the 'most functional' way to do this is probably via lazy evaluation. If you're in a lazy language like Haskell, or in an eager language but using a lazy list data structure (like LazyList in the F# PowerPack), you can create e.g. a 'scan' of the running sums, and then leave it in the hands of the consumer of the list to decide how much she wants/needs to evaluate.

Or, you know, write a simple recursive function, like @JaredPar's answer. For some reason I often get a mental block on that, preventing me from noticing that "not everything has to be a fold, you can in fact write your own recursive functions" :)

Brian 2010-06-28 06:08:06

Indeed. I am in that block now ... I keep thinking there's gotta be a way to fold or partially fold this thing. I know there are other ways of doing it, but fold/reduce keeps beckoning me

Chaitanya 2010-06-28 07:12:14

Answer 3

+7 A:

Reduce is often used in combination with map. Google for example has developed a map-reduce framework for querying their databases and this map-reduce pattern is now used in several other projects (e.g. CouchDB, Hadoop, etc).

First, you need to map the input variables [2, 1, 3, 4, 5] to something like:

[(1, 2), (1, 1), (1, 3), (1, 4), (1, 5)]

In that case, x[0] will represent the number of the elements to get the sum x[1]. Of course, the number of elements is 1 at the beginning for each single element.

The next thing then, is to operate on those tuples:

reduce(
    lambda a, b: a if a[1] + b[1] > 8 else (a[0] + b[0], a[1] + b[1]),
    map(lambda x: (1, x), input))

This will return (3, 6), meaning the partial sum is 6 using 3 elements.

I hope you got the idea behind map-reduce-algorithms.

Regards,
Christoph

tux21b 2010-06-28 06:24:59

Oooohhhh.... niiice. I had read about map reduce, but I guess I didn't fully grok it. Very nicely done.

Chaitanya 2010-06-28 07:08:08

Here are two links which might interest you: Google's Map-Reduce paper (http://labs.google.com/papers/mapreduce.html) and a course Map Reduce in a Week (http://code.google.com/edu/submissions/mapreduce/listing.html).

tux21b 2010-06-28 07:21:26

And a Python framework (based on Erlang) for doing efficient map-reduce computing is Disco. With that you can use multiple cores / computers and work with (nearly) unlimited data sets... http://discoproject.org/

tux21b 2010-06-28 07:54:31

I'm not downvoting, but this can hardly be idiomatic FP..?Chaitanya has picked his golden hammer, and you're helping him/her use it to bash a square peg into a round hole.

eevar 2010-06-28 07:59:12

Hehe, well put! Anyway, map-reduce algorithms can be more maintainable and faster because they can scale well (of course not with the Python build-ins!). And for a short Python solution where performance doesn't matter at all, they might be shorter to write. So Map-Reduce isn't that bad, but it's surely not the fastest Python-only solution... no doubt on that :)

tux21b 2010-06-28 08:20:12

Nice description of map/reduce, but if the input contained a million values and we hit the exit condition after three of them that's a lot of empty work being done. When you hit the exit condition use an exception to exit the loop.

Duncan 2010-06-28 09:00:07

@eevar - Surely this is more idiomatic FP than using a loops with counter increments. JaredPar and Tomas have similar FP'ish solutions, but I like this because its shorter and also uses map/reduce :-p. So can you explain which part of it you find non FP - idiomatic.@Duncan - Yes for a relatively small program that doesn't do partitioning map reduce might be an overkill, and python's implementation is loops anyway. I wasn't concerned about python or F# particularly, but more about thinking "Functionally". I have some data, the how, what and when of running functions over it to manipulate it.

Chaitanya 2010-06-28 10:54:06

Answer 4

+2 A:

This is a function that implements that functional program:

>>> def limited_reduce(reducer, pred, lst):
...  i = 0
...  y = lst[0]
...  while pred(y) and i < len(lst):
...    i += 1
...    y = reducer(lst[i], y)
...  return (i, y)

or recursively:

>>> def limited_reduce(reducer, pred, lst):
...   def helper(i, accum, rest):
...     if not rest or not pred(accum): return (i, accum)
...     return helper(i+1, reducer(rest[0], accum), rest[1:])
...   return helper(0, lst[0], lst[1:])

There's probably a way to clean it up a bit, but you would use it like this:

>>>> limited_reduce(lambda x,y: x+y, lambda r: r < 6, [1,2,1,3,2])
(3, 7)

Stephen 2010-06-28 06:25:49

Good solution, +1 from me. But note that your `reduce` is `foldr` and requires a sequence, unlike the builtin `reduce`.

Philipp 2010-06-28 06:33:19

@Philipp : Thanks! Good point about the sequence. Now you've got me reading about `foldr` :)

Stephen 2010-06-28 06:49:57

Answer 5

A:

Here is a slight variation of Stephen's code, using foldl instead of foldr (I hope) and not requiring a sequence:

#!/usr/bin/env python

import operator
import functools

def limited_reduce(op, it, start, pred):
    if not pred(start):
        return 0, start
    for i, x in enumerate(it):
        y = op(start, x)
        if pred(y):
            start = y
        else:
            break
    return i, start

print limited_reduce(operator.add, xrange(1, 6), 0,
                     functools.partial(operator.gt, 8))

Philipp 2010-06-28 06:41:03

Answer 6

A:

The only way to get out of the builtin reduce part way through is to throw an exception. Fortunately it's not hard to get the desired result this way:

def interruptible_reduce(fn, *args):
    try:
        return reduce(fn, *args)
    except StopIteration, e:
        return e.args[0]

def reducefn(a, b):
    total = a[1] + b[1]
    if total > 8:
        raise StopIteration(a)
    return (a[0]+b[0], total)

input = [2, 1, 3, 4, 5]

>>> from itertools import imap
>>> interruptible_reduce(reducefn, imap(lambda x: (1,x), input))
(3, 6)

Duncan 2010-06-28 08:58:30

Answer 7

+6 A:

I agree with JaredPar that writing your own recursive function that behaves similarly to fold, but allows you to stop the computation earlier is the best approach. The way I would write it is a bit more general (so that you can use the function for any situation where you need folding that can stop earlier):

// Generalized 'fold' function that allws you to stop the execution earlier
// The function 'f' has a type 'State -> 'T -> Option<'State>
// By returning 'None' we can stop the execution (and return the 
// current state), by returning Some(newState), we continue folding
let rec foldStop f state input = 
  match input with
  | x::xs -> 
      match f state x with
      | None -> state
      | Some(newState) -> foldStop f newState xs
  | [] -> state

// Example that stops folding after state is larger than 10
foldStop (fun st n -> if st > 10 then None else Some(st + n)) 0 [ 1 .. 10 ]

This is a very general function and you can use it for all similar scenarios. The nice thing about writing it is that you will never need to write similar explicit recursion again (because you can just use foldStop once you have it).

Note that you can use foldStop to implement fold by always wrapping the result of the accumulation function in 'Some' (so it is more general):

let fold f state input = 
  foldStop (fun st n -> Some(f st n)) state input

Tomas Petricek 2010-06-28 10:27:44

But I want to return the final state when I stopped as well as place where I stopped as well. My F# isn't fluent enough, but this would require changing the state and the input function as follows :foldStop (fun (st,i) n -> if st > 10 then None else Some(st + n, i + 1)) (0,0) [ 1 .. 10 ]

Chaitanya 2010-06-28 11:42:38

@Chaitanya: Yes, that would require changing the code a little bit (or you would need to update the condition to stop on the next state). Alternatively, you could use `Choice` instead of `Option` (that allows you to return the state, but still break the computation by returning a speical case).

Tomas Petricek 2010-06-28 12:13:50

Answer 8

+2 A:

Another functional approch could be using a "continution"-based version of reduce/fold:

let rec foldC fn acc cont = function
    | []      -> acc
    | x :: xs -> fn x acc (fun acc -> foldC fn acc cont xs)

Call with 'id' (fun x -> x) as 'initial continuation':

foldC (fun x sum c -> 
           if (sum + x) > 8 
           then sum 
           else c (sum + x))
      0
      (fun x -> x) 
      [1; 2; 3; 4; 5]

And you will get your '6'.

Note that this version of foldC is not tail recursive - or otherwise recommended - thought...

Johan Kullbom 2010-06-28 13:02:41

Answer 9

+4 A:

Let's imagine Python had two functions called ireduce (similar to reduce but it would yield intermediate values; it's called scanl in some languages) and ilast (return last item in iterable). Then we could write this pure functional code:

from itertools import takewhile
from operator import add
a = [1, 2, 3, 4, 5]
print ilast(enumerate(takewhile(lambda x: x < 8, ireduce(add, a, 0))))
# (3, 6)

For a ireduce implementation see http://gnosis.cx/publish/programming/charming_python_b13.html

For ilast see http://stackoverflow.com/questions/2138873/cleanest-way-to-get-last-item-from-python-iterator

And how would this look in Haskell? (probably the prettiest functional language out there)

> last $ zip [0..] (takeWhile (< 8) (scanl (+) 0 a))
(3,6)

tokland 2010-06-28 14:09:55

Hmmm ... Haskell one of those languages I keep wanting to learn but never get around to it

Chaitanya 2010-06-29 00:36:39

Answer 10

+2 A:

I think this does what you are after, using functions built-in to the F# Seq module:

let answer =
    [1; 2; 3; 4; 5]
    |> Seq.scan (fun (count,sum) x -> (count+1, sum + x) ) (0,0)
    |> Seq.find (fun (_,x) -> x > 8)

The "scan" function is similar to "fold", but returns a sequence containing intermediate (and final) states, rather than just the final state. In this case, the state is a tuple containing a count and sum of items thus far processed, starting with (0,0). This gets computed and fed, one at a time, into the "find" function, which returns the first element which matches the supplied condition (v>8), in this case (4,10).

The only issue you'd need to handle with the above is the case where the "find" condition is never satisfied, in which case a KeyNotFoundException is thrown. You could use "tryFind" which returns an option value. However, I can't see a graceful way to return the last element computed if no earlier state matches the condition, short of pre-computing the length of the sequence:

let xs = [1; 2; 3; 4; 5]
let len = Seq.length xs
let answer =
    xs
    |> Seq.scan (fun (count,acc) v -> (count+1, v + acc) ) (0,0)
    |> Seq.find (fun (count,v) -> v > 99 || count = len)

James Hugard 2010-06-28 21:00:38

ansaurus

tags:

views:

answers:

Stopping a Reduce() operation mid way. Functional way of doing partial running sum

related questions