I am writing an algorithm for finding the longest path over several turnpoints, given a list of coordinates (that describe a path). The dynamic programming algorithm works nicely in O(kn^2), where k is the number of turnpoints and n is the number of points. To cut the story short: the slowest part is the distance computation between two coordinates, and the algorithm recomputes it k times for the same pair of points. Memoization is not an option (too many points). It is possible to 'invert' the algorithm - but somehow the inverted algorithm is very slow in Haskell and eats too much memory.

It seems to me that the problem is the following: you are given an array of arrays of fixed size, plus one dynamically computed value per array - e.g. this would be the result of zipping the values with the lists:

arr = [ (2, [10,5,12]), (1, [2,8, 20]), (4, [3, 2, 10]) ]

I am trying to find a maximum over the elements of the list plus the fixed value:

[12, 9, 21]

What I am doing - something like:

foldl' getbest (replicate 3 0) arr
  where
    getbest acc (fixval, item) = zipWith comparator acc item
      where
        comparator orig new
            | new + fixval > orig = new + fixval
            | otherwise = orig

The problem is that a new 'acc' gets built with each call to 'getbest' - O(n^2) allocation over the whole run, which is a lot. Allocation is expensive, and this is probably the problem. Do you have any idea how to do such a thing efficiently?
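For reference, a self-contained version of the fold above (using `zipWith`, so the comparator actually receives two arguments rather than a pair) reproduces the expected result on the example data:

```haskell
import Data.List (foldl')

-- example data from the question
arr :: [(Int, [Int])]
arr = [(2, [10, 5, 12]), (1, [2, 8, 20]), (4, [3, 2, 10])]

best :: [(Int, [Int])] -> [Int]
best = foldl' getbest (replicate 3 0)
  where
    getbest acc (fixval, item) = zipWith comparator acc item
      where
        -- keep whichever is larger: the old maximum, or the new value plus fixval
        comparator orig new = max orig (new + fixval)

main :: IO ()
main = print (best arr)  -- prints [12,9,21]
```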

To make it clear: this is the actual code of the function:

dynamic2FreeFlight :: Int -> [ Coord ] -> [ Coord ]
dynamic2FreeFlight numpoints points = reverse $ (dsCoord bestPoint) : (snd $ (dsScore bestPoint) !! (numpoints - 2))
    where
        bestPoint :: DSPoint
        bestPoint = maximumBy (\x y -> (getFinalPointScore x) `compare` (getFinalPointScore y)) compresult

        getFinalPointScore :: DSPoint -> Double
        getFinalPointScore sc = fst $ (dsScore sc) !! (numpoints - 2)

        compresult :: [ DSPoint ]
        compresult = foldl' onestep [] points 

        onestep :: [ DSPoint ] -> Coord -> [ DSPoint ]
        onestep lst point = (DSPoint point (genmax lst)) : lst
            where
                genmax :: [ DSPoint ] -> [ (Double, [ Coord ]) ]
                genmax lst = map (maximumBy comparator) $ transpose prepared
                comparator a b = (fst a) `compare` (fst b)
                distances :: [ Double ]
                distances = map (distance point . dsCoord) lst
                prepared :: [ [ (Double, [ Coord ]) ] ]
                prepared 
                    | null lst = [ replicate (numpoints - 1) (0, []) ]
                    | otherwise = map prepare $ zip distances lst
                prepare :: (Double, DSPoint) -> [ (Double, [ Coord ]) ]
                prepare (dist, item) = (dist, [dsCoord item]) : map addme (take (numpoints - 2) (dsScore item))
                    where
                        addme (score, coords) = (score + dist, dsCoord item : coords)
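One direction the thread doesn't explore (a sketch, not something any answer proposes): keep the accumulator as a mutable unboxed vector and update it in place, so no fresh list is allocated per step. `bestMut` and the example data are made up for illustration; this assumes the `vector` package:

```haskell
import Control.Monad (forM_)
import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Unboxed.Mutable as M

-- Update the accumulator in place inside ST; U.create freezes it at the end.
bestMut :: Int -> [(Int, [Int])] -> U.Vector Int
bestMut width rows = U.create $ do
  acc <- M.replicate width 0              -- one allocation, reused throughout
  forM_ rows $ \(fixval, item) ->
    forM_ (zip [0 ..] item) $ \(i, x) ->
      M.modify acc (max (x + fixval)) i   -- in-place max update
  return acc
```

With the question's example, `bestMut 3 [(2,[10,5,12]), (1,[2,8,20]), (4,[3,2,10])]` should give `fromList [12,9,21]`.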
+1  A: 

I haven't checked the efficiency yet, but how about

map maximum $ transpose [ map (a+) bs | (a,bs) <- arr]

? Since the result is in terms of the sum anyway, the value and the list are added together first. Then we take the transpose of the list so that it's now column-major. Finally we compute the maximum of each column. (You'll need to import Data.List, BTW.)
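Spelled out on the list from the question (an assumed self-contained version, not from the answer itself):

```haskell
import Data.List (transpose)

arr :: [(Int, [Int])]
arr = [(2, [10, 5, 12]), (1, [2, 8, 20]), (4, [3, 2, 10])]

-- add each fixed value into its row, flip to column-major, take column maxima
best :: [Int]
best = map maximum $ transpose [ map (a +) bs | (a, bs) <- arr ]

main :: IO ()
main = print best  -- prints [12,9,21]
```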

KennyTM
I tried to use it this way - unfortunately, it didn't help :( It just eats a huge amount of memory.
ondra
A: 

You might try using Data.Vector:

import qualified Data.Vector as V

best :: V.Vector (V.Vector Int) -> V.Vector Int -> V.Vector Int
best = (V.foldl1' (V.zipWith max) .) . (V.zipWith . flip $ V.map . (+))

convert :: [[Int]] -> V.Vector (V.Vector Int)
convert = V.fromList . map V.fromList

arr = convert [[10, 5, 12], [2, 8, 20], [3, 2, 10]]
val = V.fromList [2, 1, 4] :: V.Vector Int

This works:

*Main> best arr val
fromList [12,9,21] :: Data.Vector.Vector
Travis Brown
Yes, do see the comments under my answer - there is some debate about the performance of Stream on this problem.
TomMD
A: 
best = foldl' (zipWith max) (repeat 0) . map (\(fv,xs) -> map (+fv) xs)

Like Kenny's, we add first. Like yours, we make a single traversal - except that by using zipWith max we do it more generally and succinctly. No serious benchmarks, but this should be pretty decent.
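A quick check (session assumed, not from the answer) that the infinite `repeat 0` seed is safe here: `zipWith` truncates to the shorter of its two lists, so the first step already cuts the accumulator down to the row length:

```haskell
import Data.List (foldl')

best :: [(Int, [Int])] -> [Int]
best = foldl' (zipWith max) (repeat 0) . map (\(fv, xs) -> map (+ fv) xs)

main :: IO ()
main = print (best [(2,[10,5,12]), (1,[2,8,20]), (4,[3,2,10])])  -- prints [12,9,21]
```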

sclv
As in my vector answer, you can use `foldl1'` and omit the `repeat 0` starting value: `best = foldl1' (zipWith max) . map (\(fv,xs) -> map (+fv) xs)`
Travis Brown
+3  A: 

Benchmarking Travis Brown's, sclv's, KennyTM's, and your answer using:

import Data.List
import Criterion.Main
import Criterion.Config
import qualified Data.Vector as V

-- Vector based solution (Travis Brown)
bestVector :: V.Vector (V.Vector Int) -> V.Vector Int -> V.Vector Int
bestVector = (V.foldl1' (V.zipWith max) .) . (V.zipWith . flip $ V.map . (+))

convertVector :: [[Int]] -> V.Vector (V.Vector Int)
convertVector = V.fromList . map V.fromList

arrVector = convertVector arr
valVector = V.fromList  val :: V.Vector Int

-- Shared arr and val
arr = [map (x*) [1, 2.. 2000] | x <- [1..1000]]
val = [1..1000]

-- SCLV solution
bestSCLV = foldl' (zipWith max) (repeat 0) . map (\(fv,xs) -> map (+fv) xs)

-- KennyTM Solution
bestKTM arr = map maximum $ transpose [ map (a+) bs | (a,bs) <- arr]

-- Original
getbest :: [Int] -> (Int, [Int]) -> [Int]
getbest acc (fixval, item) = map (uncurry comparator) $ zip acc item
 where
  comparator o n = max (n + fixval) o

someFuncOrig = foldl' getbest acc
  where acc = replicate 2000 0

-- top level functions
someFuncVector :: (V.Vector (V.Vector Int), V.Vector Int) -> V.Vector Int
someFuncVector = uncurry bestVector
someFuncSCLV = bestSCLV
someFuncKTM = bestKTM

main = do
  let vec = someFuncVector (arrVector, valVector) :: V.Vector Int
  print (someFuncOrig (zip val arr) == someFuncKTM (zip val arr)
        , someFuncKTM (zip val arr) == someFuncSCLV (zip val arr)
        , someFuncSCLV (zip val arr) == V.toList vec)
  defaultMain
        [ bench "someFuncVector" (whnf someFuncVector (arrVector, valVector))
        , bench "someFuncSCLV"   (nf someFuncSCLV (zip val arr))
        , bench "someFuncKTM"    (nf someFuncKTM (zip val arr))
        , bench "original"       (nf someFuncOrig (zip val arr))
        ]

Perhaps my benchmark is messed up somehow, but the results are rather disappointing.

Vector: 379.0164 ms (poor density too - what the heck?)
SCLV: 207.5399 ms
Kenny: 200.6028 ms
Original: 138.4270 ms

[tommd@Mavlo Test]$ ./t
(True,True,True)
warming up
estimating clock resolution...
mean is 13.65277 us (40001 iterations)
found 3378 outliers among 39999 samples (8.4%)
  1272 (3.2%) high mild
  2106 (5.3%) high severe
estimating cost of a clock call...
mean is 1.653858 us (58 iterations)
found 3 outliers among 58 samples (5.2%)
  2 (3.4%) high mild
  1 (1.7%) high severe

benchmarking someFuncVector
collecting 100 samples, 1 iterations each, in estimated 54.56119 s
bootstrapping with 100000 resamples
mean: 379.0164 ms, lb 357.0403 ms, ub 401.0113 ms, ci 0.950
std dev: 112.6714 ms, lb 101.8206 ms, ub 125.4846 ms, ci 0.950
variance introduced by outliers: 4.000%
variance is slightly inflated by outliers

benchmarking someFuncSCLV
collecting 100 samples, 1 iterations each, in estimated 20.92559 s
bootstrapping with 100000 resamples
mean: 207.5399 ms, lb 207.4099 ms, ub 207.8410 ms, ci 0.950
std dev: 955.1629 us, lb 507.1857 us, ub 1.937356 ms, ci 0.950
found 3 outliers among 100 samples (3.0%)
  2 (2.0%) high severe
variance introduced by outliers: 0.990%
variance is unaffected by outliers

benchmarking someFuncKTM
collecting 100 samples, 1 iterations each, in estimated 20.14799 s
bootstrapping with 100000 resamples
mean: 200.6028 ms, lb 200.5273 ms, ub 200.6994 ms, ci 0.950
std dev: 434.9564 us, lb 347.5326 us, ub 672.6736 us, ci 0.950
found 1 outliers among 100 samples (1.0%)
  1 (1.0%) high severe
variance introduced by outliers: 0.990%
variance is unaffected by outliers

benchmarking original
collecting 100 samples, 1 iterations each, in estimated 14.05241 s
bootstrapping with 100000 resamples
mean: 138.4270 ms, lb 138.2244 ms, ub 138.6568 ms, ci 0.950
std dev: 1.107366 ms, lb 930.6549 us, ub 1.381234 ms, ci 0.950
found 15 outliers among 100 samples (15.0%)
  7 (7.0%) low mild
  7 (7.0%) high mild
  1 (1.0%) high severe
variance introduced by outliers: 0.990%
variance is unaffected by outliers
TomMD
Switching to the stream fusion version of `Vector` hugely speeds up my code in this benchmark (for me it drops from 476.9359 ms to 73.31412 us (!)). All it takes is `import qualified Data.Vector.Fusion.Stream as V` and replacing `V.Vector` with `V.Stream`.
Travis Brown
Travis: I question the validity of your test. You probably left it evaluating `whnf`, so it really isn't doing any work - try `nf (V.toList . someFuncVector)` and wait long enough and you will see: `collecting 100 samples, 1 iterations each, in estimated 4284.720 s`. In other words, if we extrapolate (because I'm not waiting over an hour), it takes 42 seconds (not 73 us) for Stream to solve this problem.
TomMD
You're right -- sorry about that. I should have known it was too good to be true.
Travis Brown
You're right -- your cleaned-up version of the original is clearly the best. Not sure why, but I suspect that there are some fusions that the compiler can't do automatically.
sclv
Pushing the addition inside the max, in particular, really looks like it pays off. By the way, ramping up the unfolding threshold helps, but it helps the original as well.
sclv
sclv: Agreed. I see the preprocessing of `(\(c,xs) -> map (+c) xs)` really costs and doesn't seem to fuse. I even tried importing `Data.List.Stream` for better fusion and it helped, but not too much.
TomMD