blas

BLAS Library Benchmark

Is there a benchmark that compares the different BLAS (Basic Linear Algebra Subprograms) libraries? I am especially interested in sparse matrix multiplication for single- and multi-core systems. ...

What is a good free (open source) BLAS/LAPACK library for .NET (C#)?

Dear all, I have a project written in C# where I need to do various linear algebraic operations on matrices (like LU-factorization). Since the program is mainly a prototype created to confirm a theory, a C# implementation will suffice (compared to a possibly speedier C++ one), but I would still like a good BLAS or LAPACK library availab...

finding matrix through optimisation

I am looking for an algorithm to solve the following problem: I have two sets of vectors, and I want to find the matrix that best approximates the transformation from the input vectors to the output vectors. The vectors are 3x1, so the matrix is 3x3. This is the general problem. My particular problem is I have a set of RGB colors, and another se...
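One common reading of "best approximates" is least squares, i.e. choosing the matrix M that minimizes the sum of squared errors between M*x_i and y_i. That reading is an assumption, since the question does not fix an error measure. Under it, M satisfies the normal equations M * (sum of x x^T) = (sum of y x^T). The sketch below solves them in plain Python; `fit_linear_map` and `solve` are hypothetical helper names, not part of any library.

```python
def solve(a, b):
    """Solve the square system a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]  # augmented matrix [a | b]
    for col in range(n):
        # Pick the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: need more independent input vectors")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_linear_map(inputs, outputs):
    """Least-squares matrix M (as a list of rows) such that M * x_i is close to y_i.

    Assumption: squared error is the right measure; the question does not say so.
    """
    n = len(inputs[0])
    # xxt = sum over samples of x * x^T (symmetric n x n).
    xxt = [[sum(x[r] * x[c] for x in inputs) for c in range(n)] for r in range(n)]
    # yxt = sum over samples of y * x^T.
    yxt = [[sum(y[r] * x[c] for x, y in zip(inputs, outputs)) for c in range(n)]
           for r in range(len(outputs[0]))]
    # Each row m of M satisfies xxt * m = (matching row of yxt), because xxt is symmetric.
    return [solve(xxt, row) for row in yxt]
```

For the RGB case each input and output vector would have three components, and at least three linearly independent input colors are needed for the system to be solvable.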

Does Blitz++ use BLAS routines when it is possible and appropriate?

I know that Blitz++ gets its performance gain from extensive use of expression templates and template metaprograms. But at some point you can't get more out of your code by using these techniques - you have to multiply and sum some floats up. At this point you can get a final performance kick by using the highly optimized (especially fo...

How do I use the BLAS library provided by MATLAB?

I have noticed that MATLAB provides the BLAS and LAPACK headers among others: $ ls ${MATLAB_DIR}/extern/include/ blas.h engine.h lapack.h mat.h mclmcr.h mex.h mwutil.h blascompat32.h fintrf.h libmatlbm.mlib matrix.h mclmcrrt.h mwdebug.h tmwtypes.h emlrt.h ...

Prefetch for Intel Core 2 Duo

Has anyone had experience using prefetch instructions for the Core 2 Duo processor? I've been using the (standard?) prefetch set (prefetchnta, prefetcht1, etc.) with success for a series of P4 machines, but when running the code on a Core 2 Duo it seems that the prefetcht(i) instructions do nothing, and that the prefetchnta instruction i...

matrix multiplication for integral types using BLAS

Is there an equivalent of dgemm (from BLAS) for integral types? I only know of dgemm, sgemm for double precision / single precision matrices, but would like to have it for matrices that are of integral type such as int (or short int...). Note: I'm not looking for a solution that involves converting to float/double, and am looking for a ...

What is the Big-O of linear regression?

How large a system is it reasonable to attempt to do a linear regression on? Specifically: I have a system with ~300K sample points and ~1200 linear terms. Is this computationally feasible? ...
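As a rough sense of scale, the sketch below counts operations for one common approach, solving the normal equations. That method is an assumption, since the question does not name one. With n samples and p terms, forming X^T X takes on the order of n*p^2 multiply-adds and factoring it takes on the order of p^3; constant factors are ignored. The function name `normal_equations_cost` is hypothetical.

```python
def normal_equations_cost(n_samples, n_terms, bytes_per_value=8):
    """Order-of-magnitude cost of least squares via the normal equations.

    Assumption: dense double-precision data and the normal-equations method;
    constant factors are dropped, so treat the results as rough estimates.
    """
    gram_ops = n_samples * n_terms * n_terms        # forming X^T X: ~ n * p^2
    solve_ops = n_terms ** 3                        # factoring X^T X: ~ p^3
    design_bytes = n_samples * n_terms * bytes_per_value  # storing X itself
    return gram_ops, solve_ops, design_bytes


gram_ops, solve_ops, design_bytes = normal_equations_cost(300_000, 1200)
print(f"X^T X: ~{gram_ops:.2e} multiply-adds")
print(f"solve: ~{solve_ops:.2e} multiply-adds")
print(f"X in memory: ~{design_bytes / 1e9:.2f} GB")
```

For the sizes in the question this gives roughly 4.3e11 multiply-adds to form X^T X, about 1.7e9 for the factorization, and about 2.9 GB to hold X in double precision, so the cost is dominated by the n*p^2 term.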

prcomp error in R

I am using R. I want to run prcomp on a matrix. The code works fine with one installation of R on a Linux box but breaks on another identical (or so I thought) installation of R on a different Linux box. The code is dataf = read.table("~/data/testdata.txt") pca = prcomp(dataf) The error message on the bad instance is > dataf = read.tab...

CMake and BLAS for a C program

I'm trying to use CMake to build a program relying on BLAS. I'm detecting BLAS using: include (${CMAKE_ROOT}/Modules/FindBLAS.cmake) The problem is, FindBLAS requires a Fortran compiler and complains with -- Looking for BLAS... - NOT found (Fortran not enabled) As BLAS is already installed on my machine (ATLAS BLAS), and gfortran is...

BLAS and CUBLAS

I'm wondering about Nvidia's CUBLAS Library. Does anybody have experience with it? For example, if I write a C program using BLAS, will I be able to replace the calls to BLAS with calls to CUBLAS? Or, even better, implement a mechanism that lets the user choose at runtime? What about if I use the BLAS Library provided by Boost with C++? ...

Linking LAPACK/BLAS libraries

Background: I am working on a project written in a mix of C and Fortran 77 and now need to link the LAPACK/BLAS libraries to the project (all in a Linux environment). The LAPACK in question is version 3.2.1 (including BLAS) from netlib.org. The libraries were compiled using the top level Makefile (make lapacklib and make blaslib). Probl...

mystified by qr.Q(): what is an orthonormal matrix in "compact" form?

R has a qr() function, which performs QR decomposition using either LINPACK or LAPACK (in my experience, the latter is 5% faster). The main object returned is a matrix "qr" that contains, in its upper triangle, the matrix R (i.e. R=qr[upper.tri(qr)]). So far so good. The lower triangular part of qr contains Q "in compact form". One can extra...

What does BLAS DGEMV error code -6 mean?

I have a program that runs through R but uses the BLAS routines. It runs through correctly about 8 times but then throws an error: BLAS/LAPACK routine 'DGEMV ' gave error code -6 What does this error code mean? ...

Installing C++ Armadillo library on Mac OS X

I am trying to use the C++ Armadillo library (armadillo-0.9.10) on a Mac Pro. I follow the manual installation instructions in the README.txt file. I have modified the config.hpp file to indicate that I have LAPACK and BLAS installed. I then try to compile the examples. I successfully compile and run example1.cpp, but when I try to run...

Bignum, Linear Algebra and Digital Signal Processing on iPhone OS (iOS 4)

I think I've found some gems in the iPhone OS (iOS 4). I found that there are 128-bit, 256-bit, 512-bit and 1024-bit integer data types, provided by the Accelerate Framework. There are also Apple's implementation of Basic Linear Algebra Subprograms (BLAS), Apple's implementation of LAPACK (Linear Algebra PACKage), and Digital Signal Proce...

Multiplying three matrices in BLAS with the middle one being diagonal

A is an MxK matrix, B is a vector of size K, and C is a KxN matrix. What set of BLAS operators should I use to compute the matrix below? M = A*diag(B)*C One way to implement this would be using three for loops like below for (int i=0; i<M; ++i) for (int j=0; j<N; ++j) for (int k=0; k<K; ++k) M(i,j) = A(i,k)*B(...
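One way to avoid a hand-written triple loop is to fold the diagonal into one of the factors first: scale row k of C by B[k] (equivalently, column k of A), then do a single ordinary matrix product. In BLAS terms that would correspond to K calls to a scaling routine such as dscal followed by one dgemm. The plain-Python sketch below only illustrates the two steps; the function name `diag_scaled_product` is hypothetical and no BLAS library is called.

```python
def diag_scaled_product(a, b, c):
    """Return A * diag(b) * C for A (M x K), b (length K) and C (K x N).

    Matrices are lists of rows. This is an illustrative sketch, not a
    tuned implementation.
    """
    k_dim = len(b)
    n_dim = len(c[0])
    # Step 1: diag(b) * C scales row k of C by b[k] (the dscal-like step).
    scaled_c = [[b[k] * value for value in c[k]] for k in range(k_dim)]
    # Step 2: one ordinary matrix product A * scaled_c (the dgemm-like step).
    return [
        [sum(a_row[k] * scaled_c[k][j] for k in range(k_dim)) for j in range(n_dim)]
        for a_row in a
    ]
```

Scaling costs only K*N (or M*K) multiplications, so nearly all the work stays in the single matrix product, which is the part an optimized BLAS handles well.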

Transpose in BLAS or do it myself first?

Hi there. I'm putting together some scientific code in Fortran 77, and I am having a debate on what would be faster. Basically, I have an MxN matrix, let's call it A. M is larger than N. Later on in the code, I need to multiply transpose(A) by a bunch of vectors. My question is, would it be faster to take A, transpose it on my o...

cblas_dgemm - works ONLY if (beta) is power-of-two

Hi, I am totally stumped. I have a fairly large recursive program written in C that calls cblas_dgemm(). The result is verified independently by a program that works correctly. C = alpha*A*B + beta*C On repeated tests using random matrices and all possible combinations of parameters, the program gives the correct answer ONLY if abs(beta) ...

cblas_dgemm - correct parameters : incorrect error message

I am trying to compute: C = 1*(A*B') + 0*C using cblas_dgemm(). As far as I can tell, the parameters are correct. The error message itself does not make sense: "ldb must be >= MAX(K,1): ldb=3 K=3 Parameter 11 to routine cblas_dgemm was incorrect" But, ldb = k = 3! Here is the detailed output of all three matrices and the parameters. ...