views:

328

answers:

10

Hi,

I have been doing some functional programming and had a question. Perhaps I might be missing something but is there any way to stop a "reduce()" function midway? Lets say when I reach a certain condition? The idea somehow seems anti functional. I haven't seen any such option in python or F#,

As an example, lets say I have a list such as [1,2,3,4,5]. I want to sum the elements in this list until the sum is not greater than some number (lets say 8), and return/mark/store/identify somehow, the number of elements I have actually added.

If we looked at python for example for I might try something like

reduce(lambda a,b : a if a + b > 8 else a + b, input)

This gives me the right answer 6, but how do I find that I had added 3 elements to get here. There is no counter as such. I can't do assignments inside lambdas. I think F# has the same situation.

I know I can use a for loop or use a function that can store state etc. But what would be the functional way of doing/thinking about this. Reduce() wants to run until the end, but somewhere along this line of processing, we either want to stop it (because we don't care about processing the rest of the elements) or at least make a note of the place where we stopped caring.

+3  A: 

Try the following

let sumUntil list stopAfter = 
    let rec inner list sum = 
        if sum >= stopAfter then sum
        else 
            match list with
            | [] -> sum
            | h::t-> inner t (sum + h)
    inner list 0    

F# interactive result

> sumUntil [1;2;3;4;5] 8;;
val it : int = 10
JaredPar
In other words, not use reduce at all? I just keep thinking there must be someway in the lambda/function thats passed to a reduce that there should be a way to make some state changes and/or stop abort the processing
Chaitanya
Right, `reduce` is no good for this. It has the wrong type signature, and it always processes the entire list.
Brian
This is only returning the sum though. Not the number of elements that we had added up. But I guess it would be easy to change the inner recursive loop to take a counter and thread that counter through while incrementing it every time we call the inner recursive loop
Chaitanya
+5  A: 

I think that the 'most functional' way to do this is probably via lazy evaluation. If you're in a lazy language like Haskell, or in an eager language but using a lazy list data structure (like LazyList in the F# PowerPack), you can create e.g. a 'scan' of the running sums, and then leave it in the hands of the consumer of the list to decide how much she wants/needs to evaluate.

Or, you know, write a simple recursive function, like @JaredPar's answer. For some reason I often get a mental block on that, preventing me from noticing that "not everything has to be a fold, you can in fact write your own recursive functions" :)

Brian
Indeed. I am in that block now ... I keep thinking there's gotta be a way to fold or partially fold this thing. I know there are other ways of doing it, but fold/reduce keeps beckoning me
Chaitanya
+7  A: 

Reduce is often used in combination with map. Google for example has developed a map-reduce framework for querying their databases and this map-reduce pattern is now used in several other projects (e.g. CouchDB, Hadoop, etc).

First, you need to map the input variables [2, 1, 3, 4, 5] to something like:

[(1, 2), (1, 1), (1, 3), (1, 4), (1, 5)]

In that case, x[0] will represent the number of the elements to get the sum x[1]. Of course, the number of elements is 1 at the beginning for each single element.

The next thing then, is to operate on those tuples:

reduce(
    lambda a, b: a if a[1] + b[1] > 8 else (a[0] + b[0], a[1] + b[1]),
    map(lambda x: (1, x), input))

This will return (3, 6), meaning the partial sum is 6 using 3 elements.

I hope you got the idea behind map-reduce-algorithms.

Regards,
Christoph

tux21b
Oooohhhh.... niiice. I had read about map reduce, but I guess I didn't fully grok it. Very nicely done.
Chaitanya
Here are two links which might interest you: Google's Map-Reduce paper (http://labs.google.com/papers/mapreduce.html) and a course Map Reduce in a Week (http://code.google.com/edu/submissions/mapreduce/listing.html).
tux21b
And a Python framework (based on Erlang) for doing efficient map-reduce computing is Disco. With that you can use multiple cores / computers and work with (nearly) unlimited data sets... http://discoproject.org/
tux21b
I'm not downvoting, but this can hardly be idiomatic FP..?Chaitanya has picked his golden hammer, and you're helping him/her use it to bash a square peg into a round hole.
eevar
Hehe, well put! Anyway, map-reduce algorithms can be more maintainable and faster because they can scale well (of course not with the Python build-ins!). And for a short Python solution where performance doesn't matter at all, they might be shorter to write. So Map-Reduce isn't that bad, but it's surely not the fastest Python-only solution... no doubt on that :)
tux21b
Nice description of map/reduce, but if the input contained a million values and we hit the exit condition after three of them that's a lot of empty work being done. When you hit the exit condition use an exception to exit the loop.
Duncan
@eevar - Surely this is more idiomatic FP than using a loops with counter increments. JaredPar and Tomas have similar FP'ish solutions, but I like this because its shorter and also uses map/reduce :-p. So can you explain which part of it you find non FP - idiomatic.@Duncan - Yes for a relatively small program that doesn't do partitioning map reduce might be an overkill, and python's implementation is loops anyway. I wasn't concerned about python or F# particularly, but more about thinking "Functionally". I have some data, the how, what and when of running functions over it to manipulate it.
Chaitanya
+2  A: 

This is a function that implements that functional program:

>>> def limited_reduce(reducer, pred, lst):
...  i = 0
...  y = lst[0]
...  while pred(y) and i < len(lst):
...    i += 1
...    y = reducer(lst[i], y)
...  return (i, y)

or recursively:

>>> def limited_reduce(reducer, pred, lst):
...   def helper(i, accum, rest):
...     if not rest or not pred(accum): return (i, accum)
...     return helper(i+1, reducer(rest[0], accum), rest[1:])
...   return helper(0, lst[0], lst[1:])

There's probably a way to clean it up a bit, but you would use it like this:

>>>> limited_reduce(lambda x,y: x+y, lambda r: r < 6, [1,2,1,3,2])
(3, 7)
Stephen
Good solution, +1 from me. But note that your `reduce` is `foldr` and requires a sequence, unlike the builtin `reduce`.
Philipp
@Philipp : Thanks! Good point about the sequence. Now you've got me reading about `foldr` :)
Stephen
A: 

Here is a slight variation of Stephen's code, using foldl instead of foldr (I hope) and not requiring a sequence:

#!/usr/bin/env python

import operator
import functools

def limited_reduce(op, it, start, pred):
    if not pred(start):
        return 0, start
    for i, x in enumerate(it):
        y = op(start, x)
        if pred(y):
            start = y
        else:
            break
    return i, start

print limited_reduce(operator.add, xrange(1, 6), 0,
                     functools.partial(operator.gt, 8))
Philipp
A: 

The only way to get out of the builtin reduce part way through is to throw an exception. Fortunately it's not hard to get the desired result this way:

def interruptible_reduce(fn, *args):
    try:
        return reduce(fn, *args)
    except StopIteration, e:
        return e.args[0]

def reducefn(a, b):
    total = a[1] + b[1]
    if total > 8:
        raise StopIteration(a)
    return (a[0]+b[0], total)

input = [2, 1, 3, 4, 5]

>>> from itertools import imap
>>> interruptible_reduce(reducefn, imap(lambda x: (1,x), input))
(3, 6)
Duncan
+6  A: 

I agree with JaredPar that writing your own recursive function that behaves similarly to fold, but allows you to stop the computation earlier is the best approach. The way I would write it is a bit more general (so that you can use the function for any situation where you need folding that can stop earlier):

// Generalized 'fold' function that allws you to stop the execution earlier
// The function 'f' has a type 'State -> 'T -> Option<'State>
// By returning 'None' we can stop the execution (and return the 
// current state), by returning Some(newState), we continue folding
let rec foldStop f state input = 
  match input with
  | x::xs -> 
      match f state x with
      | None -> state
      | Some(newState) -> foldStop f newState xs
  | [] -> state

// Example that stops folding after state is larger than 10
foldStop (fun st n -> if st > 10 then None else Some(st + n)) 0 [ 1 .. 10 ]

This is a very general function and you can use it for all similar scenarios. The nice thing about writing it is that you will never need to write similar explicit recursion again (because you can just use foldStop once you have it).

Note that you can use foldStop to implement fold by always wrapping the result of the accumulation function in 'Some' (so it is more general):

let fold f state input = 
  foldStop (fun st n -> Some(f st n)) state input
Tomas Petricek
But I want to return the final state when I stopped as well as place where I stopped as well. My F# isn't fluent enough, but this would require changing the state and the input function as follows :foldStop (fun (st,i) n -> if st > 10 then None else Some(st + n, i + 1)) (0,0) [ 1 .. 10 ]
Chaitanya
@Chaitanya: Yes, that would require changing the code a little bit (or you would need to update the condition to stop on the next state). Alternatively, you could use `Choice` instead of `Option` (that allows you to return the state, but still break the computation by returning a speical case).
Tomas Petricek
+2  A: 

Another functional approch could be using a "continution"-based version of reduce/fold:

let rec foldC fn acc cont = function
    | []      -> acc
    | x :: xs -> fn x acc (fun acc -> foldC fn acc cont xs) 

Call with 'id' (fun x -> x) as 'initial continuation':

foldC (fun x sum c -> 
           if (sum + x) > 8 
           then sum 
           else c (sum + x))
      0
      (fun x -> x) 
      [1; 2; 3; 4; 5]

And you will get your '6'.

Note that this version of foldC is not tail recursive - or otherwise recommended - thought...

Johan Kullbom
+4  A: 

Let's imagine Python had two functions called ireduce (similar to reduce but it would yield intermediate values; it's called scanl in some languages) and ilast (return last item in iterable). Then we could write this pure functional code:

from itertools import takewhile
from operator import add
a = [1, 2, 3, 4, 5]
print ilast(enumerate(takewhile(lambda x: x < 8, ireduce(add, a, 0))))
# (3, 6)

For a ireduce implementation see http://gnosis.cx/publish/programming/charming_python_b13.html

For ilast see http://stackoverflow.com/questions/2138873/cleanest-way-to-get-last-item-from-python-iterator

And how would this look in Haskell? (probably the prettiest functional language out there)

> last $ zip [0..] (takeWhile (< 8) (scanl (+) 0 a))
(3,6)
tokland
Hmmm ... Haskell one of those languages I keep wanting to learn but never get around to it
Chaitanya
+2  A: 

I think this does what you are after, using functions built-in to the F# Seq module:

let answer =
    [1; 2; 3; 4; 5]
    |> Seq.scan (fun (count,sum) x -> (count+1, sum + x) ) (0,0)
    |> Seq.find (fun (_,x) -> x > 8)

The "scan" function is similar to "fold", but returns a sequence containing intermediate (and final) states, rather than just the final state. In this case, the state is a tuple containing a count and sum of items thus far processed, starting with (0,0). This gets computed and fed, one at a time, into the "find" function, which returns the first element which matches the supplied condition (v>8), in this case (4,10).

The only issue you'd need to handle with the above is the case where the "find" condition is never satisfied, in which case a KeyNotFoundException is thrown. You could use "tryFind" which returns an option value. However, I can't see a graceful way to return the last element computed if no earlier state matches the condition, short of pre-computing the length of the sequence:

let xs = [1; 2; 3; 4; 5]
let len = Seq.length xs
let answer =
    xs
    |> Seq.scan (fun (count,acc) v -> (count+1, v + acc) ) (0,0)
    |> Seq.find (fun (count,v) -> v > 99 || count = len)
James Hugard