I am familiar with standard zipWith functions which operate on corresponding elements of two sequences, but in a functional language (or a language with some functional features), what is the most succinct way to conditionally select the pairs of elements to be zipped, based on a third sequence?

This curiosity arose while scratching out a few things in Excel.
With numbers in A1:A10, B1:B10, C1:C10, D1, E1 and F1, I'm using a formula like this:

{=AVERAGE(IF((D1<=(A1:A10))*((A1:A10)<=E1),B1:B10/C1:C10))}

Each half of the multiplication in the IF statement will produce an array of Boolean values, which are then multiplied (AND'ed) together. Those Booleans control which of the ten quotients will ultimately be averaged, so it's as though ten separate IF statements were being evaluated.

If, for example, only the second and third of the 10 values in A1:A10 satisfy the conditions (both >=D1 and <=E1), then the formula ends up evaluating thusly:

AVERAGE(FALSE,B2/C2,B3/C3,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE)

The AVERAGE function happens to ignore Boolean and text values, so we just get the average of the second and third quotients.

Can this be done as succinctly with Haskell? Erlang? LINQ or F#? Python? etc.

NOTE that for this particular example, the formula given above isn't entirely correct--it was abbreviated to get the basic point across. When none of the ten elements in A1:A10 satisfies the conditions, then ten FALSE values will be passed to AVERAGE, which will incorrectly evaluate to 0.
The formula should be written this way:

{=AVERAGE(IF(NOT(OR((D1<=(A1:A10))*((A1:A10)<=E1))),NA(),
             IF((D1<=(A1:A10))*((A1:A10)<=E1),B1:B10/C1:C10)))}

Where the NA() produces an error, indicating that the average is undefined.

Update:

Thanks for the answers. I realized that my first question was pretty trivial, in terms of applying a function on pairs of elements from the second and third lists when the corresponding element from the first list meets some particular criteria. I accepted Norman Ramsey's answer for that.

However, where I went to next was wondering whether the function could be applied to a tuple representing corresponding elements from an arbitrary number of lists--hence my question to Lebertram about the limits of zipWithN.

Apocalisp's info on applicative functors led me to info on python's unpacking of argument lists--applying a function to an arbitrary number of arguments.

For the specific example I gave above, averaging the quotients of elements of lists (where nums is the list of lists), it looks like python can do it like this:

from functools import reduce
from operator import truediv  # operator.div was removed in Python 3

def avg(a): return sum(a,0.0)/len(a)
avg([reduce(truediv,t[1:]) for t in zip(*nums) if d<=t[0] and t[0]<=e])

More generally, with a function f and a predicate p (along with avg) this becomes:

avg([f(t[1:]) for t in zip(*nums) if p(t[0])])
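
As a minimal sketch of the general form above, assuming Python 3 and made-up sample data (the names `conditional_avg`, `quotient` and the contents of `nums` are illustrative only), the first list acts as the key that the predicate tests and the remaining lists supply the arguments. Returning `None` for an empty selection is one possible stand-in for the NA() case described earlier:

```python
from functools import reduce
from operator import truediv


def avg(values):
    """Arithmetic mean; returns None when nothing was selected."""
    values = list(values)
    return sum(values, 0.0) / len(values) if values else None


def conditional_avg(f, p, nums):
    """Average f(rest) over the rows of zip(*nums) whose first element satisfies p."""
    return avg(f(t[1:]) for t in zip(*nums) if p(t[0]))


def quotient(args):
    """Left-to-right division of a tuple: (b, c, ...) -> b / c / ..."""
    return reduce(truediv, args)


# Hypothetical data: column A is the key, column B is divided by column C.
nums = [
    [1, 5, 7, 12],     # A
    [10, 20, 30, 40],  # B
    [2, 4, 5, 8],      # C
]
d, e = 4, 8

result = conditional_avg(quotient, lambda a: d <= a <= e, nums)
print(result)  # rows with A = 5 and A = 7 qualify: (20/4 + 30/5) / 2 = 5.5
```

Adding a fourth list to `nums` would extend the division chain without any change to `conditional_avg`.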
+1  A: 

Haskell:

average . map fromJust . filter isJust $ zipWith3 (\a b c -> if a >= d1 && a <= e1 then Just (b/c) else Nothing) as bs cs
  where average xs = let (sum,n) = foldl' (\(s,m) x -> (s+x,m+1)) (0,0) xs in sum / (fromIntegral n)
Lebertram
Ok cool. So the `average` function you define will ignore the `Nothing` values? What if the original formula happened to be `{=AVERAGE(IF((D1<=(A1:A10))*((A1:A10)<=E1),B1:B10/C1:C10/F1:F10))}` or even `{=AVERAGE(IF((D1<=(A1:A10))*((A1:A10)<=E1),B1:B10/C1:C10/F1:F10/G1:G10))}`? Are there `zipWith4`, `zipWith5`, `zipWithN` functions out there, or can the normal `zipWith`s be nested to arbitrary levels?
Dave
It won't; this will probably not even typecheck... You can change the first line to `average . map fromJust . filter isJust $ ...`, which will erase all `Nothing` values and unwrap the other values. zipWith functions are defined up to zipWith7, but you can always write your own for higher arities. Due to Haskell's strict type system it's impossible to create a general zipWith function.
Lebertram
+3  A: 

What you're looking for is Applicative Functors. Specifically the "zippy" applicative from the linked paper.

In Haskell notation, let's call your function f. Then with applicative programming, it would look as succinct as this:

f d e as bs cs = if' <$> ((&&) <$> (d <=) <*> (e >=))
                     <$> as <*> ((/) <$> bs <*> cs) <*> (repeat 0)
   where if' x y z = if x then y else z
         (<*>)     = zipWith ($)

The result of f is a list. Simply take the average. To generify a little:

f g p as bs cs = if' <$> p <$> as <*> (((Just .) . g) <$> bs <*> cs)
                                  <*> (repeat Nothing)

Here, p is a predicate, so you would call it with:

average $ fromMaybe 0 <$> f (/) ((&&) <$> (d <=) <*> (e >=)) as bs cs

... given the same definition of <*> as above.

Note: I haven't tested this code, so there might be missing parentheses and the like, but this gets the idea across.
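
For readers following along in Python (the language used in the question's update), here is a rough sketch of the same zippy idea. It is not a translation of the Haskell above. It assumes Python 3, where the built-in `map` accepts several iterables and stops at the shortest one; `lift_zip` and `if_` are hypothetical helper names, and the data is made up.

```python
from itertools import repeat


def lift_zip(f, *seqs):
    """Apply f to corresponding elements of any number of iterables."""
    return list(map(f, *seqs))


def if_(cond, then, otherwise):
    """Function form of a conditional, like if' in the Haskell above."""
    return then if cond else otherwise


# Made-up sample data standing in for as, bs, cs and the bounds d, e.
as_ = [1, 5, 7, 12]
bs = [10, 20, 30, 40]
cs = [2, 4, 5, 8]
d, e = 4, 8

in_range = lift_zip(lambda a: d <= a <= e, as_)
quotients = lift_zip(lambda b, c: b / c, bs, cs)
# repeat(None) is infinite; map stops when the finite inputs run out.
selected = lift_zip(if_, in_range, quotients, repeat(None))
print(selected)  # [None, 5.0, 6.0, None]
```

Because `lift_zip` takes `*seqs`, bringing in a fourth or fifth list only changes the lambda, not the helper.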

Apocalisp
wow, this is intense. I think I get it, in terms of how the zip is repeatedly applied to perform the transposition. This is actually the answer to the question that I should have asked initially, which is whether a `zipWith` function can be applied not only conditionally, but on an arbitrary number of lists / vectors. Thanks for the "applicative" hint--that's what got me to the solution I'm going with for now (see my update above).
Dave
+3  A: 

How to conditionally select elements in zip?

Zip first, select later.

In this case, I'm doing the selection with catMaybes, which is often useful in this setting. Getting the code to typecheck was a huge pain (fromIntegral must go in exactly the right spot), but here's the code I would write, relying on the optimizer as usual:

average as bs cs d1 e1 = avg $ catMaybes $ zipWith3 cdiv as bs cs
  where cdiv a b c = if a >= d1 && a <= e1 then Just (b/c) else Nothing
        avg l = sum l / fromIntegral (length l)

Function cdiv stands for "conditional division".

To get catMaybes you have to import Data.Maybe.

This code typechecks, but I haven't run it.
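
For comparison, the same "zip first, select later" shape can be sketched in Python 3. This is an illustration under assumptions, not a port checked against the Haskell: `cat_maybes` is a hypothetical stand-in for catMaybes that drops `None`, and the sample numbers are invented. `statistics.mean` raises `StatisticsError` on an empty selection, so the undefined case surfaces as an error rather than a number.

```python
from statistics import mean


def cat_maybes(items):
    """Keep only the values that are not None."""
    return [x for x in items if x is not None]


def average(as_, bs, cs, d1, e1):
    """Average b/c over the positions where d1 <= a <= e1."""
    def cdiv(a, b, c):
        # "Conditional division": a quotient where the key is in range, else None.
        return b / c if d1 <= a <= e1 else None

    # Zip first (one cdiv result per position), select later (drop the Nones).
    return mean(cat_maybes(cdiv(a, b, c) for a, b, c in zip(as_, bs, cs)))


# Invented sample data: only the keys 5 and 7 fall inside [4, 8].
example = average([1, 5, 7, 12], [10, 20, 30, 40], [2, 4, 5, 8], 4, 8)
print(example)  # (20/4 + 30/5) / 2 = 5.5
```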

Norman Ramsey
Thanks, yeah this is the sort of approach that I was thinking about when I wrote the question initially. But see the answer from Apocalisp and my update above.
Dave