tags:
views: 412
answers: 2

I have noticed that MATLAB does some matrix operations really fast. For example, adding 5 to all elements of an n*n array happens almost instantly, even if the matrix is large, because you don't have to loop over every element yourself. Doing the same in Java with a for loop takes forever if the matrix is large.

I have two questions: first, are there efficient built-in classes in Java for doing matrix operations? Second, how can I code something to update all elements of a big matrix in Java more efficiently?
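For concreteness, here is a minimal pure-Java sketch of the operation described above, done two ways: over a plain double[][] and over a flat double[] used as a backing array (one contiguous block of memory). The size and variable names are arbitrary and only for illustration.

    public class ScalarAdd {
        public static void main(String[] args) {
            int n = 2000;

            // Straightforward 2D-array version: roughly the same O(n^2) elementwise
            // work MATLAB performs internally when you write A + 5
            double[][] a = new double[n][n];
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    a[i][j] += 5.0;
                }
            }

            // Flat 1D array treated as an n-by-n matrix: one contiguous block,
            // no per-row indirection
            double[] b = new double[n * n];
            for (int k = 0; k < b.length; k++) {
                b[k] += 5.0;
            }
        }
    }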

+3  A: 

Colt may be the fastest.

"Colt provides a set of Open Source Libraries for High Performance Scientific and Technical Computing in Java. " "For example, IBM Watson's Ninja project showed that Java can indeed perform BLAS matrix computations up to 90% as fast as optimized Fortran."

JAMA!

"JAMA is a basic linear algebra package for Java. It provides user-level classes for constructing and manipulating real, dense matrices."

Or the Efficient Java Matrix Library

"Efficient Java Matrix Library (EJML) is a linear algebra library for manipulating dense matrices. Its design goals are; 1) to be as computationally efficient as possible for both small and large matrices, and 2) to be accessible to both novices and experts."

John Ellinwood
Thanks. Do you know how they achieve the efficiency? I mean, what's going on in the code? Because if I were to write something like MATLAB or JAMA, I couldn't think of a way to update the whole matrix except running super slow loops.
anon
From the Colt page, they note that it's a combination of both algorithm choice and data structure. They provide the source code in their distribution. You could check it out and see.
John Ellinwood
I'm only speculating, but they should be somehow making use of packed instructions (also called SIMD) to achieve some parallelism. For example: http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions#Example
Amro
Check out this blog entry, "Cache Locality For Performance": http://blogs.microsoft.co.il/blogs/pavely/archive/2009/06/22/cache-locality-for-performance.aspx
Mikhail
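To make the cache-locality point from the comments above concrete, here is a small loop-order sketch over a plain double[][] (SIMD is not directly expressible in plain Java, so only the memory-access pattern is shown). The matrix size is arbitrary:

    public class LoopOrder {
        static final int N = 2000;
        static final double[][] M = new double[N][N];

        public static void main(String[] args) {
            // Row-major traversal walks each row's memory sequentially: cache friendly
            long t0 = System.nanoTime();
            for (int i = 0; i < N; i++)
                for (int j = 0; j < N; j++)
                    M[i][j] += 5.0;
            long rowMajor = System.nanoTime() - t0;

            // Column-major traversal jumps to a different row array on every access: cache hostile
            long t1 = System.nanoTime();
            for (int j = 0; j < N; j++)
                for (int i = 0; i < N; i++)
                    M[i][j] += 5.0;
            long colMajor = System.nanoTime() - t1;

            System.out.printf("row-major: %.1f ms, column-major: %.1f ms%n",
                    rowMajor / 1e6, colMajor / 1e6);
        }
    }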
+3  A: 

Just stumbled onto this posting and thought I would throw my two cents in. I am the author of EJML, and I am also working on a performance and stability benchmark for Java libraries. While several issues go into determining how fast an algorithm is, Mikhail is correct that caching is a very important issue for the performance of large matrices. For smaller matrices the library's overhead becomes more important.

Due to overhead in array access, pure Java libraries are slower than highly optimized C libraries, even if the algorithms are exactly the same. Some libraries get around this issue by making calls to native code. You might want to check out

http://code.google.com/p/matrix-toolkits-java/

which does exactly that. There will be some overhead in copying memory from Java to the native library, but for large matrices this is insignificant.
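To see why the copy overhead is negligible for large matrices, note that copying the data is O(n^2) while, say, a matrix multiply is O(n^3). The pure-Java sketch below uses a naive multiply purely as a stand-in for the real work a native back end would do; it is not how matrix-toolkits-java is actually implemented, and the size is arbitrary:

    import java.util.Arrays;
    import java.util.Random;

    public class CopyVersusCompute {
        public static void main(String[] args) {
            int n = 500;
            double[] a = new double[n * n];
            double[] b = new double[n * n];
            double[] c = new double[n * n];
            Random rnd = new Random(42);
            for (int k = 0; k < a.length; k++) { a[k] = rnd.nextDouble(); b[k] = rnd.nextDouble(); }

            // Cost of "shipping" the data: one O(n^2) copy, standing in for the Java-to-native copy
            long t0 = System.nanoTime();
            double[] copy = Arrays.copyOf(a, a.length);
            long copyNs = System.nanoTime() - t0;

            // Cost of the actual work: a naive O(n^3) row-major matrix multiply
            long t1 = System.nanoTime();
            for (int i = 0; i < n; i++)
                for (int k = 0; k < n; k++) {
                    double aik = a[i * n + k];
                    for (int j = 0; j < n; j++)
                        c[i * n + j] += aik * b[k * n + j];
                }
            long multNs = System.nanoTime() - t1;

            // Printing copy[0] + c[0] keeps the JIT from discarding the work
            System.out.printf("copy: %.2f ms, multiply: %.2f ms (ignore %f)%n",
                    copyNs / 1e6, multNs / 1e6, copy[0] + c[0]);
        }
    }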

For a benchmark of pure Java performance (the one that I'm working on), check out:

http://code.google.com/p/java-matrix-benchmark/

Another benchmark is here:

http://www.ujmp.org/java-matrix/benchmark/

Either of these benchmarks should give you a good idea of performance for large matrices.

Peter Abeles
I fixed the links for you, and now that you have more than 10 Rep you shouldn't run into any more problems posting more than 1 link in your answers. =)
gnovice