I am using Numeric Library Bindings for Boost UBlas to solve a simple linear system. The following works fine, except it is limited to handling matrices A(m x m) for relatively small 'm'.

In practice I have a much larger matrix with dimension m = 10^6 (up to 10^7).
Is there an existing C++ approach for solving Ax=b that uses memory efficiently?

#include <iostream>
#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/ublas/io.hpp>
#include <boost/numeric/bindings/traits/ublas_matrix.hpp>
#include <boost/numeric/bindings/lapack/gesv.hpp>
#include <boost/numeric/bindings/traits/ublas_vector2.hpp>

// compilable with this command:
// g++ -I/home/foolb/.boost/include/boost-1_38 -I/home/foolb/.boostnumbind/include/boost-numeric-bindings solve_Axb_byhand.cc -o solve_Axb_byhand -llapack


namespace ublas = boost::numeric::ublas;
namespace lapack = boost::numeric::bindings::lapack;


int main()
{
    ublas::matrix<float,ublas::column_major> A(3,3);
    ublas::vector<float> b(3);


    for (unsigned i = 0; i < A.size1(); i++)
        for (unsigned j = 0; j < A.size2(); j++)
        {
            std::cout << "enter element " << i << " " << j << std::endl;
            std::cin >> A(i,j);
        }

    std::cout << A << std::endl;

    b(0) = 21; b(1) = 1; b(2) = 17;

    lapack::gesv(A,b);

    std::cout << b << std::endl;


    return 0;
}
+6  A: 

Assuming your huge matrices are sparse, which I hope they are at that size, have a look at the PARDISO project, which is a sparse linear solver. It is what you'll need to handle matrices as big as you describe: it stores only the non-zero values, and it is much faster than solving the same system with dense matrices.
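To make the memory argument concrete, here is a minimal sketch (illustration only, not PARDISO's actual interface; the three-array layout is just the generic compressed-sparse-row convention such solvers consume) of storing a tridiagonal m x m matrix, the same pattern used in the UMFPACK answer further down:

#include <cstdio>
#include <vector>

// Illustration only: compressed sparse row (CSR) storage of a tridiagonal
// m x m matrix. A dense copy would hold m*m doubles; CSR holds about 3*m.
int main()
{
    const std::size_t m = 1000000;              // 10^6 unknowns
    std::vector<int>    row_ptr(m + 1);         // start of each row in col_idx/val
    std::vector<int>    col_idx;                // column index of each nonzero
    std::vector<double> val;                    // value of each nonzero
    col_idx.reserve(3 * m);
    val.reserve(3 * m);

    for (std::size_t i = 0; i < m; ++i) {
        row_ptr[i] = static_cast<int>(val.size());
        if (i > 0)     { col_idx.push_back(static_cast<int>(i - 1)); val.push_back(-1.0); }
        col_idx.push_back(static_cast<int>(i));  val.push_back(3.0);
        if (i + 1 < m) { col_idx.push_back(static_cast<int>(i + 1)); val.push_back(-1.0); }
    }
    row_ptr[m] = static_cast<int>(val.size());

    std::printf("nonzeros stored: %zu (a dense matrix would store %zu entries)\n",
                val.size(), m * m);
    return 0;
}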

DeusAduro
Not to mention the O(m^3) time complexity of the naive solution! Even the clever one Knuth talks about is O(m^2.7ish)... If these matrices aren't sparse you need a cluster and a first-class numerical analyst...
dmckee
+1 for the sparse matrix idea. I found numerous libraries and their comparisons in a PARDISO paper comparing various sparse matrix libraries: ftp://ftp.numerical.rl.ac.uk/pub/reports/ghsRAL200505.pdf This can be used to find other recognised sparse matrix libraries.
ralu
+3  A: 

Not sure about C++ implementations, but there are several things you can do if memory is an issue depending on the type of matrix you're dealing with:

  1. If your matrix is sparse or banded, you can use a sparse or bandwidth solver. These don't store zero elements outside the band (see the tridiagonal sketch after this list).
  2. You can use a wavefront solver, which stores the matrix on disk and only brings in the matrix wavefront for decomposition.
  3. You can avoid solving the matrix altogether and use iterative methods.
  4. You can try Monte Carlo methods of solution.
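
To illustrate option 1 with the simplest possible band (a sketch under the assumption that the matrix is tridiagonal, not a general banded LAPACK call): the Thomas algorithm stores only the three diagonals, so memory and time are both O(m) instead of O(m^2) and O(m^3).

#include <iostream>
#include <vector>

// Thomas algorithm for a tridiagonal system.
// a = sub-diagonal (a[0] unused), d = main diagonal, c = super-diagonal (c[m-1] unused),
// b = right-hand side. Arguments are taken by value so the caller's data is untouched.
std::vector<double> solve_tridiagonal(std::vector<double> a, std::vector<double> d,
                                      std::vector<double> c, std::vector<double> b)
{
    const std::size_t m = d.size();
    for (std::size_t i = 1; i < m; ++i) {       // forward elimination
        const double w = a[i] / d[i - 1];
        d[i] -= w * c[i - 1];
        b[i] -= w * b[i - 1];
    }
    std::vector<double> x(m);
    x[m - 1] = b[m - 1] / d[m - 1];
    for (std::size_t i = m - 1; i-- > 0; )      // back substitution
        x[i] = (b[i] - c[i] * x[i + 1]) / d[i];
    return x;
}

int main()
{
    // Same 3 / -1 tridiagonal pattern used in the UMFPACK answer further down, m = 5 for brevity.
    const std::size_t m = 5;
    std::vector<double> a(m, -1.0), d(m, 3.0), c(m, -1.0), b(m, 0.0);
    b[0] = 21; b[1] = 1; b[2] = 17;
    const std::vector<double> x = solve_tridiagonal(a, d, c, b);
    for (double xi : x) std::cout << xi << ' ';
    std::cout << '\n';
    return 0;
}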
duffymo
@duffymo: thanks. I have looked at iterative-method implementations in C++ but they still require storing the matrix: http://freenet-homepage.de/guwi17/ublas/examples/ If I am wrong, do you know of any memory-efficient iterative implementation in C++?
neversaint
Correct, foolishbrat. I should have remembered that. I'd investigate parallel algorithms, because the problem of partitioning the work out to N processors and knitting it back together to get the result is germane to the problem of moving it temporarily out to disk.
duffymo
+5  A: 

I assume that your matrix is dense. If it is sparse, you can find numerous specialised algorithms as already mentioned by DeusAduro and duffymo.

If you don't have a (large enough) cluster at your disposal, you want to look at out-of-core algorithms. ScaLAPACK has a few out-of-core solvers as part of its prototype package, see the documentation here and Google for more details. Searching the web for "out-of-core LU / (matrix) solvers / packages" will give you links to a wealth of further algorithms and tools. I am not an expert on those.

For this problem, most people would use a cluster, however. The package you will find on almost any cluster is ScaLAPACK, again. In addition, there are usually numerous other packages on the typical cluster, so you can pick and choose what suits your problem (examples here and here).

Before you start coding, you probably want to quickly check how long it will take to solve your problem. A typical solver takes about O(3*N^3) flops (N is dimension of matrix). If N = 100000, you are hence looking at 3000000 Gflops. Assuming that your in-memory solver does 10 Gflops/s per core, you are looking at 3 1/2 days on a single core. As the algorithms scale well, increasing the number of cores should reduce the time close to linearly. On top of that comes the I/O.
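
For what it's worth, a tiny sketch of that back-of-the-envelope calculation (the 3*N^3 flop count and 10 Gflops/s per core are the assumptions stated above):

#include <cstdio>

int main()
{
    const double N = 1e5;                    // matrix dimension used in the estimate above
    const double flops = 3.0 * N * N * N;    // ~3*N^3 flops for a dense solve (complex case)
    const double gflops_per_core = 10.0;     // assumed sustained throughput per core
    const double seconds = flops / (gflops_per_core * 1e9);
    std::printf("%.0f Gflops total, about %.1f days on one core\n",
                flops / 1e9, seconds / 86400.0);   // ~3000000 Gflops, ~3.5 days
    return 0;
}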

stephan
Caveat: the above O(3*N^3) assumes that you use complex numbers. For real numbers, divide everything by 6, i.e. somewhere around O(0.5 * N^3).
stephan
+3  A: 

Have a look at the list of freely available software for the solution of linear algebra problems, compiled by Jack Dongarra and Hatem Ltaief.

I think that for the problem size you're looking at, you probably need an iterative algorithm. If you don't want to store the matrix A in a sparse format, you can use a matrix-free implementation. Iterative algorithms typically do not need to access individual entries of the matrix A; they only need to compute matrix-vector products Av (and sometimes A^T v, the product of the transposed matrix with a vector). So if the library is well designed, it should be enough to pass it a class that knows how to compute matrix-vector products.
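
A minimal matrix-free sketch of that idea (plain conjugate gradients, assuming A is symmetric positive definite; the only thing the solver is given is a function that applies A to a vector, so A itself is never stored):

#include <cmath>
#include <cstdio>
#include <functional>
#include <vector>

using Vec = std::vector<double>;

double dot(const Vec& a, const Vec& b)
{
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Matrix-free conjugate gradients: 'apply_A' computes y = A*x; A is never formed.
Vec conjugate_gradient(const std::function<void(const Vec&, Vec&)>& apply_A,
                       const Vec& b, double tol = 1e-10, int max_iter = 1000)
{
    const std::size_t n = b.size();
    Vec x(n, 0.0), r = b, p = b, Ap(n);
    double rr = dot(r, r);
    for (int k = 0; k < max_iter && std::sqrt(rr) > tol; ++k) {
        apply_A(p, Ap);
        const double alpha = rr / dot(p, Ap);
        for (std::size_t i = 0; i < n; ++i) { x[i] += alpha * p[i]; r[i] -= alpha * Ap[i]; }
        const double rr_new = dot(r, r);
        for (std::size_t i = 0; i < n; ++i) p[i] = r[i] + (rr_new / rr) * p[i];
        rr = rr_new;
    }
    return x;
}

int main()
{
    // The same tridiagonal operator (3 on the diagonal, -1 off) used elsewhere on this page,
    // applied on the fly: only b, x and a few work vectors of length n are ever in memory.
    const std::size_t n = 1000000;
    auto apply_A = [](const Vec& v, Vec& out) {
        const std::size_t n = v.size();
        for (std::size_t i = 0; i < n; ++i) {
            out[i] = 3.0 * v[i];
            if (i > 0)     out[i] -= v[i - 1];
            if (i + 1 < n) out[i] -= v[i + 1];
        }
    };
    Vec b(n, 1.0);
    Vec x = conjugate_gradient(apply_A, b);
    std::printf("x[0] = %g, x[n/2] = %g\n", x[0], x[n / 2]);
    return 0;
}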

Jitse Niesen
+9  A: 

Short answer: don't use Boost's LAPACK bindings; they were designed for dense matrices, not sparse matrices. Use UMFPACK instead.

Long answer: UMFPACK is one of the best libraries for solving Ax=b when A is large and sparse.

Below is sample code (based on umfpack_simple.c) that generates a simple A and b and solves Ax = b.

#include <stdlib.h>
#include <stdio.h>
#include "umfpack.h"

int    *Ap; 
int    *Ai;
double *Ax; 
double *b; 
double *x; 

/* Generates a sparse matrix problem: 
   A is n x n tridiagonal matrix
   A(i,i-1) = -1;
   A(i,i) = 3; 
   A(i,i+1) = -1; 
*/
void generate_sparse_matrix_problem(int n){
  int i;  /* row index */ 
  int nz; /* nonzero index */
  int nnz = 2 + 3*(n-2) + 2; /* number of nonzeros*/
  int *Ti; /* row indices */ 
  int *Tj; /* col indices */ 
  double *Tx; /* values */ 

  /* Allocate memory for triplet form */
  Ti = malloc(sizeof(int)*nnz);
  Tj = malloc(sizeof(int)*nnz);
  Tx = malloc(sizeof(double)*nnz);

  /* Allocate memory for compressed sparse column form */
  Ap = malloc(sizeof(int)*(n+1));
  Ai = malloc(sizeof(int)*nnz);
  Ax = malloc(sizeof(double)*nnz);

  /* Allocate memory for rhs and solution vector */
  x = malloc(sizeof(double)*n);
  b = malloc(sizeof(double)*n);

  /* Construct the matrix A*/
  nz = 0;
  for (i = 0; i < n; i++){
    if (i > 0){
      Ti[nz] = i;
      Tj[nz] = i-1;
      Tx[nz] = -1;
      nz++;
    }

    Ti[nz] = i;
    Tj[nz] = i;
    Tx[nz] = 3;
    nz++;

    if (i < n-1){
      Ti[nz] = i;
      Tj[nz] = i+1;
      Tx[nz] = -1;
      nz++;
    }
    b[i] = 0;
  }
  b[0] = 21; b[1] = 1; b[2] = 17;
  /* Convert Triplet to Compressed Sparse Column format */
  (void) umfpack_di_triplet_to_col(n,n,nnz,Ti,Tj,Tx,Ap,Ai,Ax,NULL);

  /* free triplet format */ 
  free(Ti); free(Tj); free(Tx);
}


int main (void)
{
    double *null = (double *) NULL ;
    int i, n;
    void *Symbolic, *Numeric ;
    n = 500000;
    generate_sparse_matrix_problem(n);
    (void) umfpack_di_symbolic (n, n, Ap, Ai, Ax, &Symbolic, null, null);
    (void) umfpack_di_numeric (Ap, Ai, Ax, Symbolic, &Numeric, null, null);
    umfpack_di_free_symbolic (&Symbolic);
    (void) umfpack_di_solve (UMFPACK_A, Ap, Ai, Ax, x, b, Numeric, null, null);
    umfpack_di_free_numeric (&Numeric);
    for (i = 0 ; i < 10 ; i++) printf ("x [%d] = %g\n", i, x [i]);
    free(b); free(x); free(Ax); free(Ai); free(Ap);
    return (0);
}

The function generate_sparse_matrix_problem creates the matrix A and the right-hand side b. The matrix is first constructed in triplet form. The vectors Ti, Tj, and Tx fully describe A. Triplet form is easy to create but efficient sparse matrix methods require Compressed Sparse Column format. Conversion is performed with umfpack_di_triplet_to_col.

A symbolic factorization is performed with umfpack_di_symbolic. A sparse LU decomposition of A is performed with umfpack_di_numeric. The lower and upper triangular solves are performed with umfpack_di_solve.

With n as 500,000, on my machine, the entire program takes about a second to run. Valgrind reports that 369,239,649 bytes (just a little over 352 MB) were allocated.

Note this page discusses Boost's support for sparse matrices in Triplet (Coordinate) and Compressed format. If you like, you can write routines to convert these boost objects to the simple arrays UMFPACK requires as input.
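
If you take that route, here is one possible sketch (untested; it assumes a row-major ublas::compressed_matrix and simply reuses the umfpack_di_triplet_to_col call from the code above) of how to flatten a Boost uBLAS sparse matrix into the arrays UMFPACK expects:

#include <vector>
#include <boost/numeric/ublas/matrix_sparse.hpp>
#include "umfpack.h"

namespace ublas = boost::numeric::ublas;

// Sketch: copy the nonzeros of a ublas compressed_matrix into triplet arrays,
// then let UMFPACK convert them to the compressed-sparse-column form it needs.
void ublas_to_umfpack(const ublas::compressed_matrix<double>& A,
                      std::vector<int>& Ap, std::vector<int>& Ai,
                      std::vector<double>& Ax)
{
    const int n = static_cast<int>(A.size1());
    std::vector<int> Ti, Tj;     // triplet row / column indices
    std::vector<double> Tx;      // triplet values
    for (auto it1 = A.begin1(); it1 != A.end1(); ++it1)
        for (auto it2 = it1.begin(); it2 != it1.end(); ++it2) {
            Ti.push_back(static_cast<int>(it2.index1()));
            Tj.push_back(static_cast<int>(it2.index2()));
            Tx.push_back(*it2);
        }
    const int nnz = static_cast<int>(Tx.size());
    Ap.resize(n + 1);
    Ai.resize(nnz);
    Ax.resize(nnz);
    umfpack_di_triplet_to_col(n, n, nnz, Ti.data(), Tj.data(), Tx.data(),
                              Ap.data(), Ai.data(), Ax.data(), nullptr);
}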

codehippo
+1 for school pride :)
ccook
+1  A: 

As the accepted answer suggests, there is UMFPACK. But if you are using Boost you can still use the compressed (sparse) matrices in Boost uBLAS and have UMFPACK solve the system. There is a binding which makes this easy:

http://mathema.tician.de/software/boost-numeric-bindings

It's about two years dated, but it's just a binding (along with a few others).

see related question: http://stackoverflow.com/questions/3989094/umfpack-and-boosts-ublas-sparse-matrix
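
For reference, roughly what that looks like in code (a sketch based on the usage shown in the linked question; the header path, the umf_solve convenience function, and the exact compressed_matrix template arguments are assumptions to verify against the bindings version you install):

#include <iostream>
#include <boost/numeric/ublas/matrix_sparse.hpp>
#include <boost/numeric/ublas/vector.hpp>
#include <boost/numeric/ublas/io.hpp>
#include <boost/numeric/bindings/umfpack/umfpack.hpp>

namespace ublas = boost::numeric::ublas;
namespace umf = boost::numeric::bindings::umfpack;

int main()
{
    const int n = 5;
    // UMFPACK wants compressed-column storage with int indices (assumed template arguments).
    ublas::compressed_matrix<double, ublas::column_major, 0,
                             ublas::unbounded_array<int>,
                             ublas::unbounded_array<double> > A(n, n);
    ublas::vector<double> b(n), x(n);

    // Same 3 / -1 tridiagonal test problem as in the accepted answer.
    for (int i = 0; i < n; ++i) {
        if (i > 0)     A(i, i - 1) = -1;
        A(i, i) = 3;
        if (i < n - 1) A(i, i + 1) = -1;
        b(i) = 0;
    }
    b(0) = 21; b(1) = 1; b(2) = 17;

    umf::umf_solve(A, x, b);   // symbolic + numeric factorisation + solve in one call

    std::cout << x << std::endl;
    return 0;
}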

ccook