views:

187

answers:

5

I have recently been teaching myself Haskell, and one of my exercises was to re-implement the filter function. However, of all the exercises I have performed, my answer for this one seems to me the most ugly and long. How could I improve it? Are there any Haskell tricks I don't yet know?

myfilter :: (a -> Bool) -> [a] -> [a]
myfilter f (x:xs) = if f x
    then x : myfilter f xs
    else myfilter f xs
myfilter _ [] = []

Thank You

+2  A: 

You could at least DRY it up a bit by pulling out that common myfilter f xs code:

myfilter :: (a -> Bool) -> [a] -> [a]
myfilter f (x:xs) = if f x
    then x : rest
    else rest
        where rest = myfilter f xs
myfilter _ [] = []
Mark Rushakoff
+1  A: 

In Haskell, most of the time you can (and should) use guards instead of if-then-else:

myfilter :: (a -> Bool) -> [a] -> [a]
myfilter f (x:xs)
   | f x       = x : myfilter f xs
   | otherwise = myfilter f xs
myfilter _ [] = []

This ends up being basically the same definition as used in the standard library.

sth
+2  A: 

For comparison, here's Wikipedia's implementation:

myfilter :: (a -> Bool) -> [a] -> [a]
myfilter _ []                 = []
myfilter f (x:xs) | f x       = x : myfilter f xs
                  | otherwise = myfilter f xs
Gabe
+10  A: 

The simplest way to neaten your implementation is to use guards. Instead of pattern = value, you can write write pattern | boolean = value; this will only match when boolean is true. Thus, we can get

filter1 :: (a -> Bool) -> [a] -> [a]
filter1 p (x:xs) | p x       = x : filter1 p xs
                 | otherwise = filter1 p xs
filter1 _ []                 = []

(Note that otherwise is just a synonym for True.) Now, we have filter p xs in two places, so we can move it out into a where clause; these are shared by everything which shares a common pattern, even if it has a different guard:

filter2 :: (a -> Bool) -> [a] -> [a]
filter2 p (x:xs) | p x       = x : xs'
                 | otherwise = xs'
  where xs' = filter2 p xs
filter2 _ []                 = []

(This implementation is the one used by GHCs Prelude.)

Now, neither of these are tail-recursive. This can be disadvantageous, but it does make the function lazy. If we want a tail-recursive version, we could write

filter3 :: (a -> Bool) -> [a] -> [a]
filter3 p xs = let filter3' p (x:xs) ys | p x       = next $! x:ys
                                        | otherwise = next $! ys
                     where next = filter3' p xs
                   filter3' _ []     ys             = reverse ys
               in filter3' p xs []

Note, however, that this would fail on infinite lists (though all the other implementations will work), thanks to the reverse, so we make it strict with $!. (I think I did this right—I could have forced the wrong variable. I think I got this one right, though.)

Those implementations all look like yours. There are, of course, others. One is based on foldr:

filter4 :: (a -> Bool) -> [a] -> [a]
filter4 p = let check x | p x       = (x :)
                        | otherwise = id
            in foldr check []

We take advantage of point-free style here; since xs would be the last argument to both filter4 and foldr check [], we can elide it, and similarly with the last argument of check.

You could also take advantage of the list monad:

import Control.Monad
filter5 :: MonadPlus m => (a -> Bool) -> m a -> m a
filter5 p xs = do x <- xs
                  guard $ p x
                  return x

The list monad represents nondeterminism. You pick an element x from xs, make sure that it satisfies p, and then return it if it does. All of these results are then collected together. But note that this is now more general; this works for any MonadPlus (a monad which is also a monoid; that is, which has an associative binary operation mplus++ for lists—and an identity element mzero[] for lists), such as [] or Maybe. For instance, filter5 even $ Just 1 == Nothing, and filter5 even $ Just 2 == Just 2.

We can also adapt the foldr-based version to get a different generic type signature:

import Control.Monad
import qualified Data.Foldable as F
import qualified Data.Monoid   as M
filter6 :: (F.Foldable f, MonadPlus m, M.Monoid (m a))
        => (a -> Bool) -> f a -> m a
filter6 p = let check x | p x       = return x
                        | otherwise = mzero
            in F.foldMap check

The Data.Foldable module provides the Foldable type class, which represents any structure which can be folded like a list (placing the result in a generic Monoid instead.) Our filter requires a MonadPlus constraint on the result as well so that we can write return x. The foldMap function requires a function which converts everything to elements of a Monoid, and then concatenates all of them together. The mismatch between the f a on the left and the m a on the right means you could, for instance, filter6 a Maybe and get back a list.

I'm sure that there are (many!) other implementations of filter, but these are the 6 that I could think of relatively quickly. Now, which of these do I actually like best? It's a tossup between the straightforward filter2 and the foldr-based filter4. And filter5 is nice for its generic type signature. (I don't think I've ever needed a type signature like filter6's.) The fact that filter2 is used by GHC is a plus, but GHC also uses some funky rewrite rules, so it's not obvious to me that it's superior without those. Personally, I would probably go with filter4 (or filter5 if I needed the genericity), but filter2 is definitely fine.

Antal S-Z
The original filter is not tail recursive, but tail corecursive (if I may make up a term). This is the way list processing functions in Haskell should be, since then they can work lazily and in constant space. I've found tail recursion to be fairly rare in Haskell, actually.
luqui
luqui: Oh, definitely. But if you have something that *is* strict, it can be nice to have the tail-recursive version—that's why `foldl'` is nice, but `foldl` is less useful.
Antal S-Z
Note that the recursion in `filter1` is mediated by a heap-allocated thunk. In the first case, `filter1` returns a heap-allocated cons cell, which when forced, will call `filter1`. So it's an indirect mutual recursion that runs in constant stack space.
Norman Ramsey
Norman: I think I follow that. Good call. I'm not entirely used to reasoning about laziness yet. (Also, thanks for catching the `filter`/`filter1` mistake.)
Antal S-Z
Typo in your first two examples, "f x" should be "p x".
Anton Hansson
Anton: Good catch. Fixed.
Antal S-Z
+4  A: 

How about a list comprehension?

myfilter f xs = [x | x <- xs, f x]
gr3dman
How about `myfilter = filter`?
KennyTM