tags:

views:

152

answers:

1

I have a structure called Patch that represents a 2D array of data.

newtype Size = (Int, Int)
data Patch = Patch Size Strict.ByteString

I want to construct a larger Patch from a set of smaller Patches and their assigned positions. (The Patches do not overlap.) The function looks like this:

newtype Position = (Int, Int)

combinePatches :: [(Position, Patch)] -> Patch
combinePatches plan = undefined

I see two sub-problems. First, I must define a function to translate 2D array copies into a set of 1D array copies. Second, I must construct the final Patch from all those copies.

Note that the final Patch will be around 4 MB of data. This is why I want to avoid a naive approach.

I'm fairly confident that I could do this horribly inefficiently, but I would like some advice on how to efficiently manipulate large 2D arrays in Haskell. I have been looking at the "vector" library, but I have never used it before.

Thanks for your time.

+2  A: 

If the spec is really just a one-time creation of a new Patch from a set of previous ones and their positions, then this is a straightforward single-pass algorithm. Conceptually, I'd think of it as two steps -- first, combine the existing patches into a data structure with reasonable lookup for any give position. Next, write your new structure lazily by querying the compound structure. This should be roughly O(n log(m)) -- n being the size of the new array you're writing, and m being the number of patches.

This is conceptually much simpler if you use the Vector library instead of a raw ByteString. But it is simpler still if you simply use Data.Array.Unboxed. If you need arrays that can interop with C, then use Data.Array.Storable instead.

If you ditch purity, at least locally, and work with an ST array, you should be able to trivially do this in O(n) time. Of course, the constant factors will still be worse than using fast copying of chunks of memory at a time, but there's no way to keep that code from looking low-level.

sclv
Or use Data.Vector.Storable.Vector/MVector (and freeze from ST).
Don Stewart
Right -- you should be able to use slice and copy to pull it off in something that looks slightly cleaner than idiomatic C. But my preference, unless I really needed the extra performance, would be to write something with the Array libs that handled thinking about indexes for me, with a certain constant cost.
sclv