views:

242

answers:

3

I’m trying to implement a Blind Source Separation (BSS) algorithm and I’m running into trouble determining the efficacy of the algorithm.

I’m trying to create test cases where I work backwards and start with a signal vector s, which is normally unknown, and then create a mixing matrix A, which I use to transform s to create the observation vector x, which is what is normally observed by things like sensory equipment. Therefore I have a model that looks like

x = A * s.

I then put x into the BSS algorithm and get s’, which is a reconstruction of the signal vector.

Now this is where I have a lot of problems; how can I compare s’ and s, and get a quantitative measure of how similar the two vectors are? The algorithm I am studying can only reconstruct the signal vector up to a negative sign (therefore it is possible for s’ to be similar to -s, or stated in another way, for s' to be similar in "shape" to s but flipped) and cannot guarantee the preservation of the amplitudes of the signals. So I want to compare the “shapes” of the signals to each other, while also anticipating the fact that while their "shapes" might be similar, they might be flipped.

Just to clarify, when I say signal, I really mean a matrix that could be something like 50 x 10000 (50 different channels with 10000+ data points taken over time). Another problem that arises from the BSS algorithm is that the ordering of the channels is not guaranteed to be preserved. So given s'1, s'2, s'3, ... , s'N, which would be the different channels of s', might be reconstructed (again, perhaps flipped and of different amplitudes compared to the original channels in s), but the ordering is not guaranteed to be preserved. So s'1 might correlate instead to s23, and s'2 to s5, and so on and so forth.

So I was wondering if there's a fast and efficient way of comparing the similarity between two different matrices, which are assumed to be composed of vectors that should correlate to one another, although not in the same order, sign, or amplitude.

What would be the best way to solve this? Appreciate the help!

A: 

You're suffering from combinatoric explosion. Ordering issue aside: I'd suggest a base line removal and then use cross-corrlation to find the best (s) and least (-s) correlation signals. The 'flipping' of the signal is thus resolved.

2D cross correlation is offered in many image processing libraries which could be used to determined quickly which columns are ordered properly - those with high (s) and low (-s) correlations. The columns in between need re-ordering.

If you view your matrix as an image, there are many image processing algorithms that can measure the similarity of two images that are invariant across various transforms (shear, scale,...). However your column mixing is a odd nonlinear transform, which doesn't fit the image processing paradigm well.

Paul
I don't think 2D cross-correlation will work at all given the order of rows and the sign is mixed. There should be no relevant structural information to use.
kigurai
+1  A: 

A simple "greedy" method, avoiding the combinatorial explosion

For reconstructed signals q_i and ground truth s_j, let NxN matrix

M[i,j] = abs(corr(q_i,s_j))

with "corr()" some correlation function giving results in [-1,1], e.g. pearson moment-product (MATLAB: corrcoef() ) or spearman rank-corrrelation.

Step 1) Find (i,j) = argmax(l,m) M[l,m]. This is a pair of reconstructed index i,j. Push it on a list. This is the best matching pair of what's currently eligible.

Step 2) Blank out row i and column j with NaN's or something.

Step 3) If you haven't done N of them yet, (matrix is not all NaN's) go to Step 1.

(equivalently you can delete a row & column, remembering the original indices).

For all the pairs in the list, average the original M[i,j]. This is the average correlation (absolute valued) of the pairs you found with the greedy algorithm .

Matt Kennel
Sorry, but I don't quite understand what you're doing at Step 1. Is this MATLAB syntax? (I'm not that fluent in MATLAB) - could someone clarify? Thanks!
oort
A: 

If you don't know which channels are correlated, I assume you want to separate them out and compare them individually, and that this is what the question is about. (I don't know of an algorithm that can do it all in some exceptionally efficient way -- I'll just address the problem of the scale factor)

If there is an arbitrary (nonzero?) scale factor involved, you can take the dot product of the two vectors, and divide by the product of their norms. The result will be in the range of [-1,1], and you can choose the match with the largest absolute value.

comingstorm