views:

220

answers:

2

Assume you have an NxM matrix A of full rank, where M>N. If we denote the columns by C_i (with dimensions Nx1), then we can write the matrix as

A = [C_1, C_2, ..., C_M]

How can you obtain the first linearly independent columns of the original matrix A, so that you can construct a new NxN matrix B that is an invertible matrix with a non-zero determinant.

B = [C_i1, C_i2, ..., C_iN]

How can you find the indices {i1, i2, ..., iN} either in matlab or python numpy? Can this be done using singular value decomposition? Code snippets will be very welcome.

EDIT: To make this more concrete, consider the following python code

from numpy import *
from numpy.linalg.linalg import det

M = [[3, 0, 0, 0, 0],
     [0, 0, 1, 0, 0],
     [0, 0, 0, 0, 1], 
     [0, 2, 0, 0, 0]]
M = array(M)

I = [0,1,2,4]
assert(abs(det(M[:,I])) > 1e-8)

So given a matrix M, one would need to find the indices of a set of N linearly independent column vectors.

+1  A: 

My first thought would be to try each possible combination of N out of the M columns. That could be done like this (in Python):

import itertools
import numpy.linalg

# 'singular' returns whether a matrix is singular.
# You could use something more efficient than the determinant
# (I'm not sure what options there are in NumPy)
singular = lambda m: numpy.linalg.det(m) == 0

def independent_square(A):
    N,M = A.shape
    for colset in itertools.combinations(xrange(M), N):
        B = A[:,colset]
        if not singular(B):
            return B

If you want the column indices instead of the resulting square matrix, just replace return B with return colset. Or you could get both with return colset,B.

I don't know of any way SVD would help here. In fact, I can't think of any purely mathematical operation that would convert A to B (or even any that would figure out the MxN column selection matrix Q such that B=A.Q), other than by trial and error. But if you want to find out whether one exists, math.stackexchange.com would be a good place to ask.

If all you need is a way to do it computationally, the above code should suffice.

David Zaslavsky
Whoever downvoted, please justify why you think this procedure is wrong.
David Zaslavsky
I didn't downvote it but it's really expensive. Determinate is expensive to calculate: row reduction is much better if numpy offers it (I don't know). A better algorithm would be inductive: assume n-1 columns are linearly independent. Then continue trying the next unused column until one that is also linearly independent is found. repeat. Remember, even if you probably wont have to search through all of them, (and they're not created unless you do) there are M!/(N!(M - N)!) combinations that you're willing to search through there.
aaronasterling
I don't know if Numpy offers row reduction either... the determinant was just the first way I could think of to determine linear independence of the columns, since I couldn't find a `rank` function (well there is one, but it doesn't do what you'd think). I guess I should edit to say that any procedure for finding whether the columns are linearly independent would work there.
David Zaslavsky
You wanted to know why this is a poor solution? The problem with this solution is it may be incredibly inefficient, depending on the values of M and N. For example, with N = 100 and M = 50, there are over 1e29 possible combinations of columns. A brute force search through the columns is just crazy. Given that there is a very good solution in a QR factorization, use of an inefficient algorithm like a brute force search is silly. Worse, I believe your original response suggested a determinant to determine singularity. det is a terrible thing to do to any matrix.
woodchips
I wanted to know why it was _wrong_, not _poor_. Besides, the QR method doesn't work in Python, because NumPy/SciPy's `qr` method doesn't give the `E` matrix from your solution.
David Zaslavsky
Nobody ever said that it was mathematically wrong. I suggested it was terrible advice to offer as a way to solve this problem, and gave a good reason why. If there is no pivoted QR in Python, then there are still alternatives. Nothing stopped you from suggesting a variation of Gram-Schmidt, which is not that unlike a QR. So write a pivoted form of GS, which merely says that you subtract off that component that points in the direction of the previous columns found so far, then pivot on the column with the largest remaining magnitude. (This WOULD be trivial to write.)
woodchips
+5  A: 

Easy, peasy in MATLAB. Use QR, specifically, the pivoted QR.

M = [3 0 0 0 0;
     0 0 1 0 0;
     0 0 0 0 1; 
     0 2 0 0 0]

[Q,R,E] = qr(M)
Q =
     1     0     0     0
     0     0     1     0
     0     0     0     1
     0     1     0     0

R =
     3     0     0     0     0
     0     2     0     0     0
     0     0     1     0     0
     0     0     0     1     0

E =
     1     0     0     0     0
     0     1     0     0     0
     0     0     1     0     0
     0     0     0     0     1
     0     0     0     1     0

The first 4 columns of E designate the columns of M to be used, i.e., columns [1,2,3,5]. If you want the columns of M, just form the product M*E.

M*E
ans =
     3     0     0     0     0
     0     0     1     0     0
     0     0     0     1     0
     0     2     0     0     0

By the way, Using det to determine if a matrix is singular is the absolutely, positively, definitely worst way you can do that.

Use rank instead.

Essentially, you should virtually NEVER use det in MATLAB unless you understand why it is such a bad thing to do, and you choose to use it despite that fact.

woodchips
Just a note, as the OP was interested in python solutions as well, see here for scipy's scipy.linalg.qr: http://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.qr.html
Joe Kington