views: 133
answers: 3

I'm trying to write a small F# linear algebra library (for applications with small matrices, so memory isn't an issue), and I was wondering which data structure has the best performance characteristics in terms of element lookup time, since I'll need fast lookups to define matrix operations.

+2  A: 

The most efficient representation is probably going to be a mutable array (a two-dimensional array should work pretty well). Lookup by index is O(1), so that's as efficient as it can get. Even though the array is mutable, you can still use it to build an immutable (functional) matrix type - all you need to do is avoid mutating the array. The compiler will not check this, but the data structure can be purely functional from the user's perspective.

Something like this could be a good starting point:

// Constructor takes the data of the matrix
type Matrix(data:float[,]) =
  // Expose data only for private purposes (cannot be mutated by the user!)
  member private x.Data = data
  // All operations create a new array as the result
  static member (+) (a:Matrix, b:Matrix) =
    // TODO: Check that the arrays have the same size
    let res =
      Array2D.init (a.Data.GetLength(0)) (a.Data.GetLength(1))
        (fun i j -> a.Data.[i, j] + b.Data.[i, j])
    Matrix(res)
Tomas Petricek
As I understand it, in .NET multi-dimensional arrays are significantly slower than one-dimensional arrays, which is something to keep in mind.
kvb
@kvb: Interesting. I didn't know that - do you have any reference? Does that apply to both rectangular 2D arrays and "arrays of arrays"?
Tomas Petricek
@Tomas - I think that jagged arrays (that is, arrays of arrays) are reasonably performant. However, true multi-dimensional arrays are not well optimized. I don't have a great reference off-hand, but see e.g. http://stackoverflow.com/questions/468832/why-are-multi-dimensional-arrays-in-net-slower-than-normal-arrays.
kvb
+8  A: 

I'm a little unclear what's being asked.

Arrays are of course O(1), and so I expect they're the right answer. (Brian's rule of thumb: if you want something to be fast, then the answer is the same in every language - use an array of structs.)

If you need something sparser, there are the .NET Dictionary and HashSet classes (which use hashes) and the F# Map and Set types (which use trees/comparison). Dictionary is probably the next best try.
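For the sparse case, a minimal sketch of the Dictionary-backed idea might look like the following (the `SparseMatrix` type and its members are mine, for illustration - not something from the answer): only non-zero entries are stored, and missing keys read back as 0.0.

```fsharp
open System.Collections.Generic

// Hypothetical sparse matrix: only non-zero entries are stored.
// Lookup is O(1) on average via hashing; absent keys read as 0.0.
type SparseMatrix(rows:int, cols:int) =
  let data = Dictionary<int * int, float>()
  member x.Item
    with get (i, j) =
      match data.TryGetValue((i, j)) with
      | true, v -> v
      | false, _ -> 0.0
    and set (i, j) v =
      data.[(i, j)] <- v
  member x.NonZeroCount = data.Count

// Usage: one stored entry, everything else implicitly zero
let m = SparseMatrix(1000, 1000)
m.[3, 4] <- 2.5
printfn "%f %f %d" m.[3, 4] m.[0, 0] m.NonZeroCount
```

Whether this beats a dense array depends entirely on density and access pattern, which is Brian's point below.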

But of course I expect that either this depends a whole lot on the specifics (density, locality/access patterns, ...) or it doesn't matter at all (other factors overwhelm it).

At the end of the day, as with every performance question: measure.
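A quick way to act on "measure" is System.Diagnostics.Stopwatch. A rough sketch (the size `n` and the helper `time` are arbitrary choices of mine) comparing a 2D array against a manually indexed 1D array:

```fsharp
open System.Diagnostics

let n = 500
// Same logical contents in both layouts: element (i, j) = i + j
let a2d = Array2D.init n n (fun i j -> float (i + j))
let a1d = Array.init (n * n) (fun k -> float (k / n + k % n))

// Run f, print its result and elapsed time
let time name f =
  let sw = Stopwatch.StartNew()
  let r = f ()
  sw.Stop()
  printfn "%s: sum=%f (%d ms)" name r sw.ElapsedMilliseconds

time "2D array" (fun () ->
  let mutable s = 0.0
  for i in 0 .. n - 1 do
    for j in 0 .. n - 1 do
      s <- s + a2d.[i, j]
  s)

time "1D array" (fun () ->
  let mutable s = 0.0
  for i in 0 .. n - 1 do
    for j in 0 .. n - 1 do
      s <- s + a1d.[i * n + j]
  s)
```

For numbers this small a single run is noisy; repeat the loops (or use a benchmarking library) before drawing conclusions.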

Brian
I upvoted because of *Brian's rule of thumb*. By the time I reached the end I wanted to upvote **measure**. Excellent answer.
Muhammad Alkarouri
I agree; the three fundamentals to efficient code are: benchmark, benchmark, benchmark.
TechNeilogy
+5  A: 

If "small" is 2- or 3-dimensional then structs. For slightly larger "small", use a reference type with explicit components. If the number of elements is more than about 30 then use a single array and do i + n*j yourself. Avoid .NET's 2D arrays because they are several times slower than necessary. Really avoid F#'s Matrix type for element-wise operations because it does something insane like dynamic dispatch (an order of magnitude slower). Arrays of arrays are good but indexing yourself permits more JIT optimizations of the indexing.
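The "single array with i + n*j" layout can be sketched as below (the `FlatMatrix` name and its members are mine, for illustration): row-major storage in one float[], with the index computed by hand.

```fsharp
// Row-major storage in a single float[]:
// element (i, j) lives at index i * cols + j.
// Avoids the overhead of .NET's true multi-dimensional arrays.
type FlatMatrix(rows:int, cols:int, data:float[]) =
  new (rows, cols) = FlatMatrix(rows, cols, Array.zeroCreate (rows * cols))
  member x.Rows = rows
  member x.Cols = cols
  member x.Item
    with get (i, j) = data.[i * cols + j]
    and set (i, j) v = data.[i * cols + j] <- v

// Usage
let m = FlatMatrix(2, 3)
m.[1, 2] <- 7.0
printfn "%f" m.[1, 2]
```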

Jon Harrop
Short and to the point. Great answers all around though, I'll be coming back to this for reference :)
Alexander