ansaurus

Question

Iteration of a randomized algorithm in fixed space and linear time

Answer 1

+5 A:

Importing Control.Monad.State.Strict instead of Control.Monad.State yields a significant performance improvement. Not sure what you're looking for in terms of asymptotics, but this might get you there.

Additionally, you get a performance increase by swapping the iterateM and the mapM so that you don't keep traversing the list, you don't have to hold on to the head of the list, and you don't need to deepseq through the list, but just force the individual results. I.e.:

let end = flip evalState rnd $ mapM (iterateM iters randStep) start

If you do so, then you can change iterateM to be much more idiomatic as well:

iterateM 0 _ x = return x
iterateM n f !x = f x >>= iterateM (n-1) f

This of course requires the bang patterns language extension.
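For experimenting with the swapped version, here is a minimal self-contained sketch. It assumes the mtl package for Control.Monad.State.Strict, and it swaps in a toy linear congruential generator for the real PRNG so that nothing else is needed. `Gen`, `nextBit` and `walkAll` are hypothetical names for illustration, not part of the original program.

```haskell
{-# LANGUAGE BangPatterns #-}
import Control.Monad.State.Strict (State, evalState, state)
import Data.Word (Word64)

-- Toy generator state (assumption: a stand-in for StdGen or PureMT).
type Gen = Word64

-- Advance the toy LCG and return the top bit of the new state.
nextBit :: State Gen Bool
nextBit = state $ \g ->
  let !g' = g * 6364136223846793005 + 1442695040888963407
  in (g' > maxBound `div` 2, g')

-- One step of a one-dimensional random walk; ($!) forces the new position.
randStep :: Int -> State Gen Int
randStep x = do
  up <- nextBit
  return $! if up then x + 1 else x - 1

-- Apply a monadic step n times, forcing the intermediate value each time.
-- Assumes n >= 0.
iterateM :: Monad m => Int -> (a -> m a) -> a -> m a
iterateM 0 _ x = return x
iterateM n f !x = f x >>= iterateM (n - 1) f

-- mapM on the outside: each start value is iterated to completion
-- before the next one is touched.
walkAll :: Int -> Gen -> [Int] -> [Int]
walkAll iters seed start = evalState (mapM (iterateM iters randStep) start) seed
```

In GHCi, `walkAll 1000 42 [0, 0, 0]` returns three final positions, each even and within 1000 of the origin.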

sclv 2010-07-13 13:31:11

It seems to solve my problem. I'll test it this evening and will probably accept this answer.

jetxee 2010-07-13 13:47:42

Swapping iterateM and mapM is allowed for this example, but will not work for more complex randomized algorithms (I have GA in mind). But thanks for the idea.

jetxee 2010-07-13 14:05:23

Well, swapping is a mild performance improvement. Control.Monad.State.Strict is a much bigger one. In general, however, it's better to avoid DeepSeq and instead structure your functions so that they force evaluation to weak head normal form, and your data structures so that they are necessarily strict enough already.

sclv 2010-07-13 14:19:18

I agree. But if I want to compose a long sequence of non-deterministic computations of type `a -> m a`, some additional strictness turns out to be almost a requirement (see http://stackoverflow.com/questions/2236829/ ). As soon as there is something like `[a] -> m [a]`, `deepseq` shows up. I'd like to know a better way to write such code in Haskell.

jetxee 2010-07-13 14:32:06

You need a strict list type. `data StrictList a = Cons !a !(StrictList a) | Nil`. Then if the head is forced, the whole structure is forced. So if you know you always want your list to be used strictly, use a type that enforces it :-)
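A small sketch of how such a type might be used. The helper names `fromList`, `toList` and `mapS` are made up for illustration. Note that the strict fields force each element only to weak head normal form, which is enough for flat elements such as `Int` or `Double`.

```haskell
-- Both fields are strict: constructing a Cons cell evaluates the element
-- (to WHNF) and the rest of the list.
data StrictList a = Cons !a !(StrictList a) | Nil

-- Convert from an ordinary lazy list; the result is fully built when forced.
fromList :: [a] -> StrictList a
fromList = foldr Cons Nil

-- Convert back to an ordinary lazy list.
toList :: StrictList a -> [a]
toList Nil = []
toList (Cons x xs) = x : toList xs

-- Map over the strict list. Forcing the result to WHNF applies f to
-- every element, because Cons is strict in both fields.
mapS :: (a -> b) -> StrictList a -> StrictList b
mapS _ Nil = Nil
mapS f (Cons x xs) = Cons (f x) (mapS f xs)
```

With this, `mapS step xs `seq` ()` is enough to evaluate every element, so no `deepseq` pass over the list is needed.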

sclv 2010-07-13 14:47:53

Thank you. I'll definitely consider it. As far as I can tell Data.Sequence.Seq is strict too, right?

jetxee 2010-07-13 15:07:54

It's spine strict, but not element strict. Also, it has vastly different performance characteristics than lists. (Summary -- better asymptotics for many operations, significantly poorer constant factors.)
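The difference can be seen in a short sketch, assuming the containers package. `spineOnly` and `forceElems` are hypothetical names for illustration.

```haskell
import Data.Foldable (foldl', toList)
import qualified Data.Sequence as Seq

-- Only the spine is needed to compute the length, so the undefined
-- element is never evaluated.
spineOnly :: Int
spineOnly = Seq.length (Seq.fromList [1, undefined :: Int])

-- Force every element to WHNF, then return the sequence unchanged.
forceElems :: Seq.Seq a -> Seq.Seq a
forceElems s = foldl' (\() x -> x `seq` ()) () s `seq` s
```

Here `spineOnly` evaluates to 2 without touching the `undefined` element, whereas forcing `forceElems` on the same sequence would hit it.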

sclv 2010-07-13 16:33:02

Answer 2

+20 A:

Some things to consider:

Use the mersenne-random generator; it is often >100x faster than StdGen.

For raw all-out performance, write a custom State monad, like so:

import System.Random.Mersenne.Pure64

data R a = R !a {-# UNPACK #-}!PureMT

-- | The RMonad is just a specific instance of the State monad where the
--   state is just the PureMT PRNG state.
--
-- * Specialized to a known state type
--
newtype RMonad a = S { runState :: PureMT -> R a }

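-- GHC 7.10 and later require Functor and Applicative instances
-- for every Monad.
instance Functor RMonad where
    {-# INLINE fmap #-}
    fmap f m = S $ \s -> case runState m s of
                                R a s' -> R (f a) s'

instance Applicative RMonad where
    {-# INLINE pure #-}
    pure a = S $ \s -> R a s

    {-# INLINE (<*>) #-}
    mf <*> mx = S $ \s -> case runState mf s of
                                R f s' -> case runState mx s' of
                                    R x s'' -> R (f x) s''

instance Monad RMonad where
    {-# INLINE (>>=) #-}
    m >>= k  = S $ \s -> case runState m s of
                                R a s' -> runState (k a) s'

    {-# INLINE (>>) #-}
    m >>  k  = S $ \s -> case runState m s of
                                R _ s' -> runState k s'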

-- | Run function for the RMonad.
runRmonad :: RMonad a -> PureMT -> R a
runRmonad (S m) s = m s

evalRmonad :: RMonad a -> PureMT -> a
evalRmonad r s = case runRmonad r s of R x _ -> x

-- An example of random iteration step: one-dimensional random walk.
randStep :: (Num a) => a -> RMonad a
randStep x = S $ \s -> case randomInt s of
                    (n, s') | n < 0     -> R (x+1) s'
                            | otherwise -> R (x-1) s'

Like so: http://hpaste.org/fastcgi/hpaste.fcgi/view?id=27414#a27414

This runs in constant space (modulo the [Double] you build up), and is some 8x faster than your original.

The use of a specialized state monad with a local definition outperforms Control.Monad.State.Strict significantly as well.

Here's what the heap looks like, with the same parameters as you:

[Image: heap profile]

Note that it is about 10x faster, and uses 1/5th the space. The big red thing is your list of doubles being allocated.

Inspired by your question, I captured the PureMT pattern in a new package: monad-mersenne-random, and now your program becomes this:

Using monad-mersenne-random

The other change I made was to worker/wrapper transform iterateM, enabling it to be inlined:

 {-# INLINE iterateM #-}
 iterateM n f x = go n x
     where
         go 0 !x = return x
         go n !x = f x >>= go (n-1)
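To try the worker/wrapper version on its own, here is a minimal sketch that runs it in the Identity monad from base. The type signature and the `countUp` helper are additions for illustration, and the local names are changed to avoid shadowing. It assumes the step count is non-negative.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.Functor.Identity (runIdentity)

-- The wrapper is small and non-recursive, so GHC can inline it at each
-- call site and specialise the local worker `go` to the concrete monad.
{-# INLINE iterateM #-}
iterateM :: Monad m => Int -> (a -> m a) -> a -> m a
iterateM n f x = go n x
  where
    -- The bang pattern forces the accumulated value on every step.
    go 0 !acc = return acc
    go k !acc = f acc >>= go (k - 1)

-- Example use: add one n times, starting from zero.
countUp :: Int -> Int
countUp n = runIdentity (iterateM n (\x -> return (x + 1)) 0)
```

For example, `countUp 100000` evaluates to 100000.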

Overall, with K=500 and N=30k, this brings your code from:

Original: 62.0s
New: 0.28s

That is about 220x faster.

The heap is a bit better too, now that iterateM unboxes.

[Image: heap profile]

Don Stewart 2010-07-13 17:39:16

Fantastic results. Thank you, Don. I considered mersenne-random to be a premature optimization (and didn't try it), and assumed something was wrong with the way I use State or iterateM'. It turns out that the custom monad and mersenne-random-pure64 work very well after all. I'll consider using them. Just a couple of questions: is it essential to {-# UNPACK #-} PureMT and {-# INLINE #-} the monad implementation? I didn't notice a significant difference without them.

jetxee 2010-07-13 18:41:55

@jetxee it may not matter in this example, as the monad is not exported from the module anyway.

Don Stewart 2010-07-13 20:39:34

I updated the post with two changes: a new monad-mersenne-random package, and a worker/wrapper iterateM.

Don Stewart 2010-07-14 00:21:19

This is just amazing. Thank you, Don.

jetxee 2010-07-14 09:39:27

Answer 3

A:

This is probably a small point compared to the other answers, but is your ($!!) function correct?

You define

($!!) :: (NFData a) => (a -> b) -> a -> b
f $!! x = x `deepseq` f x

This will fully evaluate the argument, however the function result won't necessarily be evaluated at all. If you want the $!! operator to apply the function and fully evaluate the result, I think it should be:

($!!) :: (NFData b) => (a -> b) -> a -> b
f $!! x = let y = f x in y `deepseq` y
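The two definitions can be compared with a short sketch, assuming the deepseq package. The names `argForced`, `resultForced` and `throws` are hypothetical. Note that ``y `deepseq` y`` is how `force` is defined in Control.DeepSeq, so the second version is written with `force` here. It does nothing until the result itself is demanded.

```haskell
import Control.DeepSeq (NFData, deepseq, force)
import Control.Exception (SomeException, evaluate, try)

-- Fully evaluate the argument before applying the function.
argForced :: NFData a => (a -> b) -> a -> b
argForced f x = x `deepseq` f x

-- Fully evaluate the result, but only once the result is demanded to WHNF.
resultForced :: NFData b => (a -> b) -> a -> b
resultForced f x = force (f x)

-- Evaluate a value to WHNF and report whether doing so raised an exception.
throws :: a -> IO Bool
throws x = fmap check (try (evaluate x))
  where
    check :: Either SomeException b -> Bool
    check = either (const True) (const False)
```

With these, `throws (argForced (const (0 :: Int)) [1, undefined :: Int])` reports an exception even though `const` ignores its argument, while `throws (resultForced (\n -> [n, undefined]) (1 :: Int))` reports one only because evaluating the result to WHNF now walks the whole list.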

John 2010-07-13 21:10:47

That won't do anything except burn cycles. It says "Before you evaluate `f x`, make sure that you evaluate `f x`." See here: http://neilmitchell.blogspot.com/2008/05/bad-strictness.html

sclv 2010-07-13 22:03:39

@sclv, thanks for that link. I would agree that my suggestion is wrong because the questioner's version has similar semantics to $!. Being pedantic, is `deepseq x x` really the same as `seq x x`? I would think it says "Before you evaluate `x` to WHNF, fully evaluate `x`". This may (depending on `x`) do strictly more work than the `seq` version, which is undoubtedly wasteful. Whether it is useful is another matter.

John 2010-07-14 00:28:55

Yes, I am aware that with my definition of ($!!) the last function application may remain a thunk. But as long as this thunk is at most one level "deep", I suppose it is OK. The point is not to force the function application, but to force evaluation of the previous state.

jetxee 2010-07-14 09:32:13
