I'm trying to write Conway's Game of Life in F# using Accelerator v2, but for some odd reason my output isn't square even though all my arrays are square. Everything except a rectangular area in the top left of the matrix appears to be set to false. I have no idea how this could be happening, since all my operations should treat the entire array equally. Any ideas?

open Microsoft.ParallelArrays
open System.Windows.Forms
open System.Drawing
type IPA = IntParallelArray
type BPA = BoolParallelArray
type PAops = ParallelArrays
let RNG = new System.Random()
let size = 1024
let arrinit i = Array2D.init size size (fun x y -> i)
let target = new DX9Target()
let threearr = new IPA(arrinit 3)
let twoarr =   new IPA(arrinit 2)
let onearr =   new IPA(arrinit 1)
let zeroarr =  new IPA(arrinit 0)
let shifts = [|-1;-1|]::[|-1;0|]::[|-1;1|]::[|0;-1|]::[|0;1|]::[|1;-1|]::[|1;0|]::[|1;1|]::[]
let progress (arr:BPA) = let sums = shifts //adds up whether a neighbor is on or not
                                    |> List.fold (fun (state:IPA) t ->PAops.Add(PAops.Cond(PAops.Rotate(arr,t),onearr,zeroarr),state)) zeroarr
                         PAops.Or(PAops.CompareEqual(sums,threearr),PAops.And(PAops.CompareEqual(sums,twoarr),arr)) //rule for life
let initrandom () = Array2D.init size size (fun x y -> if RNG.NextDouble() > 0.5 then true else false)

type meform () as self= 
    inherit Form()
    let mutable array = new BoolParallelArray(initrandom())
    let timer = new System.Timers.Timer(1.0) //redrawing timer
    do base.DoubleBuffered <- true
    do base.Size <- Size(size,size)
    do timer.Elapsed.Add(fun _ -> self.Invalidate())
    do timer.Start()
    let draw (t:Graphics) = 
        array <- array |> progress
        let bmap = new System.Drawing.Bitmap(size,size)
        target.ToArray2D array
        |> Array2D.iteri (fun x y t ->
                 if not t then bmap.SetPixel(x,y,Color.Black))
        t.DrawImageUnscaled(bmap,0,0)

    do self.Paint.Add(fun t -> draw t.Graphics)

do Application.Run(new meform())
+3  A: 

As Robert mentioned, I wrote an article that shows how to implement Game of Life in F# using Accelerator v2, so you can take a look at that for a working version. I remember having a similar problem, but I don't recall exactly in which scenario.

Anyway, if you're using DX9Target, the problem may be that this target isn't meant to support integer operations (emulating integer arithmetic precisely on a GPU is just not possible with DX9). I believe this is also the reason I ended up using FloatParallelArray in my implementation. Could you try the X64MulticoreTarget to see whether that works?

EDIT: I did some further investigation and (unless I'm missing something important) it appears to be a bug in the CompareEqual method. Here is a much simpler example that shows the issue:

open Microsoft.ParallelArrays 

let target = new DX9Target() 
let zeros = new IntParallelArray(Array2D.create 4 4 0) 
let trues = target.ToArray2D(ParallelArrays.CompareEqual(zeros, zeros))

trues |> Array2D.iter (printfn "%A")

The expected result would be true printed 16 times, but if you run it, it prints true only 4 times and then prints false 12 times. I'll ask someone from the Accelerator team and post an answer here. In the meantime, you can do the same thing as I did in my example: simulate boolean operations using FPA and avoid using BPA and CompareEqual.
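To see where the false values fall, it can help to print the result as a grid instead of one value per line. The helper below is a small sketch in plain F#. `render` is an illustrative name, not part of Accelerator, and the `sample` array is a hand-made stand-in for the `trues` array from the repro (assuming the 4 true values come first in row-major order).

```fsharp
// Render a bool[,] as one string per row: "T" for true, "." for false.
// `render` is a hypothetical helper, not an Accelerator function.
let render (cells: bool[,]) =
    [ for x in 0 .. Array2D.length1 cells - 1 ->
        String.init (Array2D.length2 cells) (fun y -> if cells.[x, y] then "T" else ".") ]

// Hand-made stand-in for the repro output: only the first row is true.
let sample = Array2D.init 4 4 (fun x _ -> x = 0)

render sample |> List.iter (printfn "%s")
```

Passing the real `trues` array to `render` in place of `sample` shows which cells the comparison got wrong.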

EDIT 2: Here is a reply from the Accelerator team members:

This is related to the lack of precise integer calculations on DX9 GPUs. Because of numerical jitter, a Boolean comparison of an integer with itself is not always computed as exactly equal. (...)

So, in summary, you cannot really rely on BPA on the DX9 target. The only option is to do what I suggested: simulate booleans using FPA (and possibly compare the number within some small delta-neighborhood to avoid the jitter caused by GPUs). This should, however, work with the X64MulticoreTarget. If you can find a minimal repro that shows in which situations the library crashes, that would be really useful!
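The idea of simulating booleans with floats and a small tolerance can be sketched on the CPU with plain F# arrays. This is not Accelerator code: `eps`, `approxEq`, `offsets` and `stepFloat` are illustrative names, the tolerance of 0.25 is an arbitrary assumption, and the grid is assumed to wrap around at the edges, as a rotation would.

```fsharp
// CPU-only sketch: cells are floats (1.0 = alive, 0.0 = dead), and every
// comparison uses a tolerance instead of exact equality.
let eps = 0.25 // assumed tolerance, chosen only for illustration

// 1.0 when a is within eps of b, otherwise 0.0
let approxEq (a: float) (b: float) = if abs (a - b) < eps then 1.0 else 0.0

// The eight neighbour offsets, matching the question's `shifts` list
let offsets =
    [ for dx in -1 .. 1 do
        for dy in -1 .. 1 do
            if (dx, dy) <> (0, 0) then yield (dx, dy) ]

// One generation of Life on a wrap-around grid
let stepFloat (grid: float[,]) =
    let h = Array2D.length1 grid
    let w = Array2D.length2 grid
    Array2D.init h w (fun x y ->
        let sum =
            offsets
            |> List.sumBy (fun (dx, dy) -> grid.[(x + dx + h) % h, (y + dy + w) % w])
        // Snap the possibly noisy cell value back to exactly 0.0 or 1.0
        let alive = approxEq grid.[x, y] 1.0
        // Born with 3 neighbours, survives with 2: OR as max, AND as product
        max (approxEq sum 3.0) (approxEq sum 2.0 * alive))
```

Because every output cell is snapped to exactly 0.0 or 1.0, small errors in the input do not accumulate from one generation to the next.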

Tomas Petricek
Using the x64 target just causes a crash. FloatParallelArrays have the same problem, using only the left quarter of the square.
jpalmer
It looks like there really is a bug; see the edited answer.
Tomas Petricek
Added answer from the Accelerator team - this is indeed a bug (or more specifically, a technical limitation that cannot be solved for current GPU technologies).
Tomas Petricek
+1  A: 

About precision issues: DX9-class GPUs do not have dedicated integer hardware, so integer streams are interpreted as floating-point streams (with the lack of precision you've encountered).
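As a general illustration of what is lost when integers travel through floats, the CPU-side snippet below shows the limit of IEEE single precision. It only demonstrates that a 32-bit float cannot hold every 32-bit integer. It does not reproduce the GPU jitter itself, and whether DX9 hardware follows IEEE behaviour is not assumed here.

```fsharp
// float32 has a 24-bit significand, so integers are exact only up to 2^24.
let exactLimit = 16777216             // 2^24
let a = float32 exactLimit            // exactly representable
let b = float32 (exactLimit + 1)      // not representable, rounds to 2^24

// Two distinct integers collapse to the same float32 value
printfn "%b" (a = b)

// Small integers such as Life's neighbour counts are exact on a CPU
printfn "%b" (float32 3 = float32 (1 + 2))
```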

DX10-class GPUs do support precise 32-bit integers with all C bitwise operations. But this does not necessarily mean that they have true 32-bit integer ALUs. For instance, on the current DX10 NVIDIA generation, integer math is done with 24-bit integer units, so 32-bit integer operations are emulated. The next-generation DX11 NVIDIA hardware will bring true 32-bit integer units.

Stringer Bell