ansaurus

Question

Answer 1

A:

(a) I don't fully understand what you're trying to accomplish, but if you want to compare if both matrices have the same value at the same index it's sufficient to use elementwise matrix multiplication (which should be implemented for sparse as well):

matrix3 = element_prod (matrix1, matrix2);

That way you'll get for each index:

0 (false) * 1 (true) = 0 (false)
0*0 = 0
1*1 = 1

So resulting matrix3 will have your solution in one line :)

count0 2009-08-27 14:53:34

That is an interesting suggestion, but I need < to order the matrices. Essentially, what I need is a way to compute XOR on these matrices that uses the sparse matrix iterators and performs the XOR in the order j*maxJ+i so I can stop computing when I find a TRUE in the XOR knowing which matrix is "less".

scandido 2009-08-28 15:36:36

Answer 2

A:

It seems to me we're talking about implementing bitwise,elementwise operators on boost::sparse_matrix, since comparing if one vector (or matrix) is smaller than another without using any standard vector norms demands special operators (or special mappings/norms).

To my knowledge boost does not provide special operators for binary matrices (not to speak of sparse binary matrices). There are unlikely any straightforward solutions to this using BLAS level matrix/vector algebra. Binary matrices have an own place in the linear algebra field, so there are tricks and theorems but i doubt those are easier than your solution.

Your question could be reformulated as: How do i sort efficiently astronomically large numbers represented by a 2d-bitmap (n=100 so 100x100 elements would give you a number like 2^10000).

Good question !

count0 2009-08-28 17:52:57

Yes, there is obviously not a magic bullet. The fact that the number of true entries in the matrix is sparse, so we can get a solution that runs in acceptable time (via the algorithm I mentioned). The problem is that the code for getting the next iterator for both matrices simultaneously is not necessarily clean, and I wonder if there is a good way to iterate through two sparse 2D matrices together.

scandido 2009-08-28 20:40:59

Answer 3

+1 A:

I like this question, by the way.

Let me pseudocode out what I think you're asking

declare list of sparse matrices ListA
declare map MatMAp with a sparse Matrix type mapping to a double, along with a
`StrictWeakMatrixOrderer` function which takes two sparse matrices.

Insert ListA into MatMap.

The Question: How do I write a StrictWeakMatrixOrderer efficiently?

This is an approach. I'm inventing this on the fly....

Define a function flatten() and precompute the flattened matrices, storing the flattened vectors in a vector(or another container with a random indexing upper bound). flatten() could be as simple as concatenating each row(or column) with the previous one(which can be done in linear time if you have a constant-time function to grab a row/column).

This yields a set of vectors with size on the order of 10^6. This is a tradeoff - saving this information instead of on-the-fly computing it. This is useful if you're going to be doing a lot of compares as you go along.

Remember, zeros contain information - dropping them will possibly yield two vectors equal to each other, whereas their generating matrix may not be equal.

Then, we have transformed the algorithm question from "order matrices" into "order vectors". I've never heard of a distance metric for matrices, but I've heard of distance metrics for vectors.

You could use a "sum of differences" ordering aka Hamming distance. (foreach element that's different, add 1). That will be a O(n) algorithm:

for i = 0 to max.
  if(a[i] != b[i])
     distance++

return distance

The Hamming distance satisfies these conditions

d(a,b) = d(b,a)
d(a,a) = 0
d(x, z) <= d(x, y) + d(y, z)

Now to do some off-the-cuff analysis....

10^6 elements in a matrix(or its corrosponding vector).
O(n) distance metric.
- But it's O(n) compares. If each array access has O(m) time, then you would have an O(n*(n+n)) = O(n^2) metric. So you have to have < O(n) access. It turns out that std::vector [] operator provides "amortized constant time access to arbitrary elements" according to SGI's STL site.
Providing you have sufficient memory to store k*2*10^6, where k is the number of matrices you are managing, this is a working solution that uses lots of memory in exchange for being linear.

Paul Nathan 2009-08-28 19:56:06

The key type is the sparse matrix and that maps to a double. (The application is related to a probability distribution over the sparse matrices.) Otherwise, your assessment is correct. Regarding your question, no it does not matter so long as it implements a strict weak ordering and only identical matrices are equivalent.

scandido 2009-08-28 20:45:57

How big is a matrix? is the widest side 10^3? 10^5? What % of the matrix is zeros?

Paul Nathan 2009-08-28 21:04:50

The size of the matrix varies, but typically it is from 10^2 to 10^3. The sparseness of the matrix varies quite a bit as the matrix represents a state of a diffusion process.

scandido 2009-08-28 21:37:41

This is a viable alternative, but in my particular application less desirable than the initial solution I proposed because of the memory usage. Thank you for the suggestion, someone else approaching this problem may benefit from your answer quite a bit.

scandido 2009-08-30 18:01:29

ansaurus

tags:

views:

answers:

iterating through TWO sparse matrices

related questions