ansaurus

Question

Capturing Non-Zero Elements, Counts and Indexes of Sparse Matrix

Answer 1

A:

Ap is easy: simply start with zeroes and increment each time you meet a nonzero number. I don't see you trying to write anything into @Ap, so it's no surprise it doesn't end up as you wish.

Ai and Ax are trickier: you want a columnwise ordering while you're scanning rowwise. You won't be able to do anything in-place since you don't know yet how many elements the columns will yield, so you can't know in advance the elements' position.

Obviously, it would be a hell lot easier if you could just alter the requirement to have a rowwise ordering instead. Failing that, you could get complex and collect (i, j, x) triplets. While collecting, they'd naturally be ordered by (i, j). Post-collection, you'd just want to sort them by (j, i).

JB 2009-08-25 09:38:20

Answer 2

A:

The code you provided works on a row-by-row basis. To get results sequential by columns you have to accumulate your values into separate arrays, one for each column:

# will look like ([], [], [] ...), one [] for each column.
my @columns;

while (<MATRIX>) {
    my @row = split qr'\s+';
    for (my $col = 0; $col < @row; $col++) {

        # push each non-zero value into its column
        push @{$columns[$col]}, $row[$col] if $row[$col] > 0;

    }
}

# now you only need to flatten it to get the desired kind of output:
use List::Flatten;
@non_zero = flat @columns;

See also List::Flatten.

Inshallah 2009-08-25 09:50:47

Answer 3

+1 A:

This is what you are looking for, I guess:

#!/usr/bin/perl

use strict;
use warnings;

use Data::Dumper::Simple;

my @matrix;

# Populate @matrix
while (<>) {
    push @matrix, [ split /\s+/ ];
}

my $columns = @{ $matrix[0] };
my $rows    = @matrix;

my ( @Ap, @Ai, @Ax );
my $ap = 0;

for ( my $j = 0 ; $j <= $rows ; $j++ ) {
    for ( my $i = 0 ; $i <= $columns ; $i++ ) {
        if ( $matrix[$i]->[$j] ) {
            $ap++;
            push @Ai, $i;
            push @Ax, $matrix[$i]->[$j];
        }
    }
    push @Ap, $ap;
}

print Dumper @Ap;
print Dumper @Ai;
print Dumper @Ax;

Alan Haggai Alavi 2009-08-25 10:09:08

@AHA: Thanks. But your approach still slurp the whole matrix into @matrix. In practice matrix dimension is 10^6. I think my RAM can't afford it.

neversaint 2009-08-25 10:52:55

@foolishbrat: Oh I see.@Inshalla: Sorry, I did not understand.

Alan Haggai Alavi 2009-08-25 11:20:48

Answer 4

+1 A:

A common strategy for storing sparse data is to drop the values you don't care about (the zeroes) and to store the row and column indexes with each value that you do care about, thus preserving their positional information:

[VALUE, ROW, COLUMN]

In your case, you can economize further since all of your needs can be met by processing the data column-by-column, which means we don't have to repeat COLUMN for every value.

use strict;
use warnings;
use Data::Dumper;

my ($r, $c, @dataC, @Ap, @Ai, @Ax, $cumul);

# Read data row by row, storing non-zero values by column.
#    $dataC[COLUMN] = [
#        [VALUE, ROW],
#        [VALUE, ROW],
#        etc.
#    ]
$r = -1;
while (<DATA>) {
    chomp;
    $r ++;
    $c = -1;
    for my $v ( split '\s+', $_ ){
        $c ++;
        push @{$dataC[$c]}, [$v, $r] if $v;
    }
}

# Iterate through the data column by column
# to compute the three result arrays.
$cumul = 0;
@Ap = ($cumul);
$c = -1;
for my $column (@dataC){
    $c ++;
    $cumul += @$column;
    push @Ap, $cumul;
    for my $value (@$column){
        push @Ax, $value->[0];
        push @Ai, $value->[1];
    }
}

__DATA__
2   3   0   0   0
3   0   4   0   6
0  -1  -3   2   0
0   0   1   0   0
0   4   2   0   1

FM 2009-08-25 13:00:48

Answer 5

+1 A:

Updated based on FM's comment. If you do not want to store any of the original data:

#!/usr/bin/perl

use strict;
use warnings;

my %matrix_info;

while ( <DATA> ) {
    chomp;
    last unless /[0-9]/;
    my @v = map {0 + $_ } split;
    for (my $i = 0; $i < @v; ++$i) {
        if ( $v[$i] ) {
            push @{ $matrix_info{$i}->{indices} }, $. - 1;
            push @{ $matrix_info{$i}->{nonzero} }, $v[$i];
        }
    }
}

my @cum_count = (0);
my @row_indices;
my @nonzero;

for my $i ( sort {$a <=> $b } keys %matrix_info ) {
    my $mi = $matrix_info{$i};
    push @nonzero, @{ $mi->{nonzero} };

    my @i = @{ $mi->{indices} };

    push @cum_count, $cum_count[-1] + @i;
    push @row_indices, @i;
}

print(
    "\@Ap = [@cum_count]\n",
    "\@Ai = [@row_indices]\n",
    "\@Ax = [@nonzero]\n",
);

__DATA__
2   3   0   0   0
3   0   4   0   6
0  -1  -3   2   0
0   0   1   0   0
0   4   2   0   1

Output:

C:\Temp> m
@Ap = [0 2 5 9 10 12]
@Ai = [0 1 0 2 4 1 2 3 4 2 1 4]
@Ax = [2 3 3 -1 4 4 -3 1 2 2 6 1]

Sinan Ünür 2009-08-25 14:22:41

Perhaps I'm misunderstanding something, but your first answer gives the impression that `@cum_count` is like the OP's cumulative count (`@Ap`), but it is not. Your counts are row-wise tallies; he wants column-wise. That said, the tallies the OP needs are available in `%row_indices` (for example, `scalar @{$row_indices{0}}`), so the answer is still a good one.

FM 2009-08-25 15:05:52

@FM Ah! Good catch!

Sinan Ünür 2009-08-25 15:57:08

ansaurus

tags:

views:

answers:

Capturing Non-Zero Elements, Counts and Indexes of Sparse Matrix

related questions