tags:

views:

208

answers:

2

Any pointers on how to solve efficiently the following function in Haskell, for large numbers (n > 108)

f(n) = max(n,f(n/2)+f(n/3)+f(n/4))

I've seen examples of memoization in Haskell to solve fibonacci numbers, which involved computing (lazily) all the fibonacci numbers up to the required n. But in this case, for a given n, we only need to compute very few intermediate results.

Thanks

+6  A: 

Not the most efficient way, but does memoize:

f = 0 : [ g n | n <- [1..] ]
    where g n = max n $ f!!(n `div` 2) + f!!(n `div` 3) + f!!(n `div` 4)

when requesting f !! 144, it is checked that f !! 143 exists, but its exact value is not calculated. It's still set as some unknown result of a calculation. The only exact values calculated are the ones needed.

So initially, as far as how much has been calculated, the program knows nothing.

f = .... 

When we make the request f !! 12, it starts doing some pattern matching:

f = 0 : g 1 : g 2 : g 3 : g 4 : g 5 : g 6 : g 7 : g 8 : g 9 : g 10 : g 11 : g 12 : ...

Now it starts calculating

f !! 12 = g 12 = max 12 $ f!!6 + f!!4 + f!!3

This recursively makes another demand on f, so we calculate

f !! 6 = g 6 = max 6 $ f !! 3 + f !! 2 + f !! 1
f !! 3 = g 3 = max 3 $ f !! 1 + f !! 1 + f !! 0
f !! 1 = g 1 = max 1 $ f !! 0 + f !! 0 + f !! 0
f !! 0 = 0

Now we can trickle back up some

f !! 1 = g 1 = max 1 $ 0 + 0 + 0 = 1

Which means the program now knows:

f = 0 : 1 : g 2 : g 3 : g 4 : g 5 : g 6 : g 7 : g 8 : g 9 : g 10 : g 11 : g 12 : ...

Continuing to trickle up:

f !! 3 = g 3 = max 3 $ 1 + 1 + 0 = 3

Which means the program now knows:

f = 0 : 1 : g 2 : 3 : g 4 : g 5 : g 6 : g 7 : g 8 : g 9 : g 10 : g 11 : g 12 : ...

Now we continue with our calculation of f!!6:

f !! 6 = g 6 = max 6 $ 3 + f !! 2 + 1
f !! 2 = g 2 = max 2 $ f !! 1 + f !! 0 + f !! 0 = max 2 $ 1 + 0 + 0 = 2
f !! 6 = g 6 = max 6 $ 3 + 2 + 1 = 6

Which means the program now knows:

f = 0 : 1 : 2 : 3 : g 4 : g 5 : 6 : g 7 : g 8 : g 9 : g 10 : g 11 : g 12 : ...

Now we continue with our calculation of f!!12:

f !! 12 = g 12 = max 12 $ 6 + f!!4 + 3
f !! 4 = g 4 = max 4 $ f !! 2 + f !! 1 + f !! 1 = max 4 $ 2 + 1 + 1 = 4
f !! 12 = g 12 = max 12 $ 6 + 4 + 3 = 13

Which means the program now knows:

f = 0 : 1 : 2 : 3 : 4 : g 5 : 6 : g 7 : g 8 : g 9 : g 10 : g 11 : 13 : ...

So the calculation is done fairly lazily. The program knows that some value for f !! 8 exists, that it's equal to g 8, but it has no idea what g 8 is.

rampion
Thanks for this. I'm still very new to Haskell, so there is plenty of stuff in your answer that I need to understand, but I will try it.
Angel de Vicente
+11  A: 

We can do this very efficiently by making a structure that we can index in sub-linear time.

But first,

{-# LANGUAGE BangPatterns #-}

import Data.Function (fix)

Lets define f, but make it use 'open recursion' rather than call itself directly.

f :: (Int -> Int) -> Int -> Int
f mf 0 = 0
f mf n = max n $ mf (div n 2) +
                 mf (div n 3) +
                 mf (div n 4)

You can get an unmemoized f by using fix f

This will let you test that f does what you mean for small values of f by calling, for example: fix f 123 = 144

We could memoize this by defining:

f_list :: [Int]
f_list = map (f faster_f) [0..]

faster_f :: Int -> Int
faster_f n = f_list !! n

That performs passably well, and replaces what was going to take O(n^3) time with something that memoizes the intermediate results.

But it still takes linear time just to index to find the memoized answer for mf. This means that results like:

*Main Data.List> faster_f 123801
248604

are tolerable, but the result doesn't scale much better than that. We can do better!

First lets define an infinite tree:

data Tree a = Tree (Tree a) a (Tree a)
instance Functor Tree where
    fmap f (Tree l m r) = Tree (fmap f l) (f m) (fmap f r)

And then we'll define a way to index into it, so we can find a node with index n in O(log n) time instead:

index :: Tree a -> Int -> a
index (Tree _ m _) 0 = m
index (Tree l _ r) n = case (n - 1) `divMod` 2 of
    (q,0) -> index l q
    (q,1) -> index r q

... and we may find a tree full of natural numbers to be convenient so we don't have to fiddle around with those indices:

nats :: Tree Int
nats = go 0 1
    where
        go !n !s = Tree (go l s') n (go r s')
            where
                l = n + s
                r = l + s
                s' = s * 2

Since we can index, you can just convert a tree into a list:

toList :: Tree a -> [a]
toList as = map (index as) [0..]

You can check the work so far by verifying that toList nats gives you [0..]

Now,

f_tree :: Tree Int
f_tree = fmap (f fastest_f) nats

fastest_f :: Int -> Int
fastest_f = index f_tree

works just like with list above, but instead of taking linear time to find each node, can chase it down in logarithmic time.

The result is considerably faster:

*Main> fastest_f 12380192300
67652175206

*Main> fastest_f 12793129379123
120695231674999

In fact it is so much faster that you can go through and replace Int with Integer above and get ridiculously large answers almost instantaneously

*Main> fastest_f' 1230891823091823018203123
93721573993600178112200489

*Main> fastest_f' 12308918230918230182031231231293810923
11097012733777002208302545289166620866358
Edward Kmett