views:

153

answers:

3

How would you make the folowing code functional with the same speed? In general, as an input I have a list of objects containing position coordinates and other stuff and I need to create a 2D array consisting those objects.

let m = Matrix.Generic.create 6 6 []
let pos = [(1.3,4.3); (5.6,5.4); (1.5,4.8)]  
pos |> List.iter (fun (pz,py) ->
  let z, y = int pz, int py
  m.[z,y] <- (pz,py) :: m.[z,y]
)

It could be probably done in this way:

let pos = [(1.3,4.3); (5.6,5.4); (1.5,4.8)]
Matrix.generic.init 6 6 (fun z y ->
  pos |> List.fold (fun state (pz,py) ->
    let iz, iy = int pz, int py
    if iz = z && iy = y then (pz,py) :: state else state
  ) []
)

But I guess it would be much slower because it loops through the whole matrix times the list versus the former list iteration...

PS: the code might be wrong as I do not have F# on this computer to check it.

+3  A: 

It depends on the definition of "functional". I would say that a "functional" function means that it always returns the same result for the same parameters and that it doesn't modify any global state (or the value of parameters if they are mutable). I think this is a sensible definition for F#, but it also means that there is nothing "dis-functional" with using mutation locally.

In my point of view, the following function is "functional", because it creates and returns a new matrix instead of modifying an existing one, but of course, the implementation of the function uses mutation.

let performStep m =
  let res = Matrix.Generic.create 6 6 [] 
  let pos = [(1.3,4.3); (5.6,5.4); (1.5,4.8)]   
  for pz, py in pos do
    let z, y = int pz, int py 
    res.[z,y] <- (pz,py) :: m.[z,y] 
  res

Mutation-free version: Now, if you wanted to make the implementation fully functional, then I would start by creating a matrix that contains Some(pz, py) in the places where you want to add the new list element to the element of the matrix and None in all other places. I guess this could be done by initializing a sparse matrix. Something like this:

let sp = pos |> List.map (fun (pz, py) -> int pz, int py, (pz, py)) 
let elementsToAdd = Matrix.Generic.initSparse 6 6 sp

Then you should be able to combine the original matrix m with the newly created elementsToAdd. This can be certainly done using init (however, having something like map2 would be maybe nicer):

let res = Matrix.init 6 6 (fun i j -> 
  match elementsToAdd.[i, j], m.[i, j] with
  | Some(n), res -> n::res
  | _, res -> res )

There is still quite likely some mutation hidden in the F# library functions (such as init and initSparse), but at least it shows one way to implement the operation using more primitive operations.

EDIT: This will work only if you need to add at most single element to each matrix cell. If you wanted to add multiple elements, you'd have to group them first (e.g. using Seq.groupBy)

Tomas Petricek
`initSparse` appears to require that the type being added supports numeric operations: `initSparse 4 4 [1,1,1]` works fine for me, but `initSparse 4 4 [1,1,"test"]` doesn't. I don't get a stack overflow, though...
kvb
Also, I don't think your `initSparse` example would have worked properly in this case anyway since there are multiple points in the mapped sequence having coordinates (1,4).
kvb
@kvb: good point - I think I would have to add somethink like `Seq.groupBy` before the call to `initSparse`.
Tomas Petricek
The issue with `initSparse` is probably something weird with my installation of F# (because it fails even for primitive types like `float`).
Tomas Petricek
A: 

Why do you want to do it functionally? The Matrix type is designed to be mutated, so the way you're doing it now looks good to me.

If you really want to do it functionally, though, here's what I'd do:

let pos = [(1.3,4.3); (5.6,5.4); (1.5,4.8)]  
let addValue m k v = 
  if Map.containsKey k m then
    Map.add k (v::m.[k]) m
  else
    Map.add k [v] m
let map = 
  pos 
  |> List.map (fun (x,y) -> (int x, int y),(x,y)) 
  |> List.fold (fun m (p,q) -> addValue m p q) Map.empty
let m = Matrix.Generic.init 6 6 (fun x y -> if (Map.containsKey (x,y) map) then map.[x,y] else [])

This runs through the list once, creating an immutable map from indices to lists of points. Then, we initialize each entry in the matrix, doing a single map lookup for each entry. This should take total time O(M + N log N) where M and N are the number of entries in your matrix and list respectively. I believe that your original solution using mutation takes O(M+N) time and your revised solution takes O(M*N) time.

kvb
Well, I try to avoid mutability so that the code would be simple to paralelize in the future and to avoid bugs. I also wanted to know how functional programers would solve it :) because I often run into similar problems where I have to mutate. The code you have writen looks prety complicated at the first sight.
Oldrich Svec
@Oldrich - if you're really trying to avoid mutability, then Matrix is the wrong datatype; you could create an immutable analog based on something like a list of lists instead. However, in some cases, a mutable solution really is cleanest (which is one reason that F# supports mutation). I think that your original solution is quite clean and understandable, so I wouldn't change it.
kvb
Sorry, I'm an F# newbie and I can't quite understand the part where you use the addValue function in the List.fold operation. It seems like you are associating the value q with the key p, but later you are using the two coordinates as keys in Map.containsKey. What am I overlooking?
kahoon
@kahoon - addValue takes a map from keys to lists of values, a key, and a value, and it returns a new map with that key and value added. In the fold function, `p` is a pair of ints which is being used as a key and `q` is a pair of floats being put into the list of values; later we access the map using an explicit pair made up of the `x` and `y` coordinates, which may be where the confusion arises. If it eases understanding, you could replace `p` everywhere by `(x,y)`.
kvb
@kvb - Thank you, now I understand.
kahoon
"The Matrix type is designed to be mutated". Actually, element-wise mutation is extremely slow.
Jon Harrop
+1  A: 

You can do something like this:

[1.3, 4.3; 5.6, 5.4; 1.5, 4.8]
|> Seq.groupBy (fun (pz, py) -> int pz, int py)
|> Seq.map (fun ((pz, py), ps) -> pz, py, ps)
|> Matrix.Generic.initSparse 6 6

But in your question you said:

How would you make the folowing code functional with the same speed?

And in a later comment you said:

Well, I try to avoid mutability so that the code would be simple to paralelize in the future

I am afraid this is a triumph of hope over reality. Functional code generally has poor absolute performance and scales badly when parallelized. Given the huge amount of allocation this code is doing, you're not likely to see any performance gain from parallelism at all.

Jon Harrop
Thank you for your comment. Could you please point me to a book or article where I can learn how to write a code which is easily parallelizable, preferably in F#? How do I know how much allocation there is in the code?
Oldrich Svec
Note that initSparse doesn't work with non-numeric types.
kvb
@Oldrich: I'm writing a book on it, so I cannot reccomend a book yet! If you can stomach it, the research papers on Cilk are a gold mine. In F#, you must also strive to avoid allocation. I know that from experience. In this case, sequence elements, tuples and sparse matrix elements are all heap allocated. I lectured about this recently: http://skillsmatter.com/podcast/agile-testing/jon-harrop-qr-decomposition
Jon Harrop
@kvb: Ugh. Probably use a hash table instead of a matrix then...
Jon Harrop