ansaurus

Question

How can I partition a Perl array into equal sized chunks?

Answer 1

+1 A:

Try this:

$VAR = [map $_ % 3 == 0 ? ([ $array[$_], $array[$_ + 1], $array[$_ + 2] ]) 
                        : (),
            0..$#array];
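
Note that the expression above pads the last chunk with undef when the array length is not a multiple of 3. A minimal sketch of the same index-based idea using array slices, which leaves the last chunk short instead (the variable names `$size`, `@starts` and `$chunks` are illustrative, and `min` comes from the core List::Util module):

```perl
use strict;
use warnings;
use List::Util qw(min);

my @array = ( 1 .. 8 );
my $size  = 3;

# Chunk start indices: 0, 3, 6, ...
my @starts = grep { $_ % $size == 0 } 0 .. $#array;

# Slice from each start index up to the end of the chunk,
# clamped to the last valid index of @array.
my $chunks = [ map { [ @array[ $_ .. min( $_ + $size - 1, $#array ) ] ] } @starts ];

# $chunks is now [ [1, 2, 3], [4, 5, 6], [7, 8] ] and @array is unchanged.
```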

Adrian Pronk 2009-09-29 06:38:12

I'm not sure whether to +1 it for cuteness or -1 it for sheer hackiness :) Unvoted it stays.

DVK 2009-09-29 08:25:53

-1, because I would 100% make a mistake somewhere in there :)

Karel Bílek 2009-09-30 00:00:59

Answer 2

+3 A:

Or this:

my $VAR;
while( my @list = splice( @array, 0, 3 ) ) {
    push @$VAR, \@list;
}
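
One thing to keep in mind: splice removes the elements from @array, so the original array is empty once the loop finishes. If the array is still needed afterwards, a sketch along these lines splices a copy instead (`@copy` and `$chunks` are names chosen for the example):

```perl
use strict;
use warnings;

my @array = ( 1 .. 7 );

# Splice a copy so that @array itself is left untouched.
my @copy   = @array;
my $chunks = [];
while ( my @list = splice( @copy, 0, 3 ) ) {
    push @$chunks, [@list];
}

# $chunks is [ [1, 2, 3], [4, 5, 6], [7] ]; @array still has 7 elements.
```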

Tore Busch 2009-09-29 07:47:36

Answer 3

+4 A:

Another answer (a variation on Tore's, using splice but avoiding the while loop in favor of the more Perl-y map):

my $result = [ map { [splice(@array, 0, 3)] } (1 .. int((scalar(@array) + 2) / 3)) ];
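
The map has to run once per chunk, which is the array length divided by the chunk size, rounded up. A small sketch of that arithmetic, using a hypothetical helper named `chunk_count` that is not part of the answer:

```perl
use strict;
use warnings;

# Hypothetical helper: how many chunks of $size are needed for $len items,
# i.e. $len / $size rounded up, done with integer arithmetic.
sub chunk_count {
    my ( $len, $size ) = @_;
    return int( ( $len + $size - 1 ) / $size );
}

my @array = ( 'a' .. 'h' );                      # 8 elements
my $count = chunk_count( scalar(@array), 3 );    # 3 chunks: 3 + 3 + 2

# The range 1 .. $count is built before the first splice runs,
# so shrinking @array inside the block does not affect the iteration count.
my $result = [ map { [ splice( @array, 0, 3 ) ] } 1 .. $count ];
```

As with the other splice-based answers, @array is empty afterwards.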

DVK 2009-09-29 08:24:39

I wouldn't call it more Perl-y just because it uses `map()` - it's much more cluttered and harder to grok. The most "Perl-y" solution is `natatime()` because it's from CPAN.

Chris Lutz 2009-09-30 03:26:23

Hmm... I can't say I greatly disagree with you re: possibly harder to grok. But having been a professional Perl developer for many years, I have encountered enough bad-to-horrible junk on CPAN that I don't necessarily consider "uses something from CPAN" to be a Good Housekeeping Seal of Approval for a Perl solution. Mind you, List::MoreUtils, from my cursory examination today, appears to be a very neat and useful module, so it is definitely not included in the gripe above :)

DVK 2009-09-30 12:32:52

@DVK - When I say "because it's from CPAN," I'm lovingly poking fun at the trends of my favorite language, not offering it as the be-all end-all of solutions. We really need to find a way to express sarcasm on the internets.

Chris Lutz 2009-09-30 17:50:49

Sorry. After 2 sleepless nights, my sarcasm module is not loading.

DVK 2009-09-30 18:06:52

Answer 4

+8 A:

my @VAR;
push @VAR, [ splice @array, 0, 3 ] while @array;

Or you could use natatime from List::MoreUtils:

use List::MoreUtils qw(natatime);

my @VAR;
{
  my $iter = natatime 3, @array;
  while( my @tmp = $iter->() ){
    push @VAR, \@tmp;
  }
}
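
For anyone without List::MoreUtils installed, the idea of an n-at-a-time iterator can be sketched in plain Perl with a closure. The name `make_chunk_iter` is made up for this example, and this is only an illustration of the concept, not how List::MoreUtils implements natatime:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of List::MoreUtils: returns a closure that
# hands back up to $n elements per call, and an empty list when exhausted.
sub make_chunk_iter {
    my ( $n, @items ) = @_;    # @items is a private copy of the input
    return sub { return splice( @items, 0, $n ) };
}

my @array = ( 1 .. 8 );
my @chunks;

my $iter = make_chunk_iter( 3, @array );
while ( my @tmp = $iter->() ) {
    push @chunks, [@tmp];
}

# @chunks is ( [1, 2, 3], [4, 5, 6], [7, 8] ); @array is unchanged.
```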

Brad Gilbert 2009-09-29 14:18:00

@Brad - +1 for List::MoreUtils - it's a great gem all-around even outside of this answer.

DVK 2009-09-30 12:31:16

Also, please note that, at least as of 2/2009, there was a memory leak in the XS version of natatime (no leak in the PP version). See http://www.perlmonks.org/?node_id=742364

DVK 2009-09-30 12:41:05

Answer 5

+2 A:

I really like List::MoreUtils and use it frequently. However, I have never liked the natatime function. It doesn't produce output that can be used with a for loop or map or grep.

I like to chain map/grep/apply operations in my code. Once you understand how these functions work, they can be very expressive and very powerful.

But it is easy to make a function to work like natatime that returns a list of array refs.

use Carp qw(croak);

sub group_by ($@) {
    my $n     = shift;
    my @array = @_;

    croak "group_by count argument must be a non-zero positive integer"
        unless $n > 0 and int($n) == $n;

    my @groups;
    push @groups, [ splice @array, 0, $n ] while @array;

    return @groups;
}

Now you can do things like this:

my @grouped = map [ reverse @$_ ],
              group_by 3, @array;
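
As a worked illustration of that kind of chaining, here is a self-contained sketch with made-up data. It repeats the group_by definition so it runs on its own; the grep step that keeps only full-sized groups is an addition for the example:

```perl
use strict;
use warnings;
use Carp qw(croak);

sub group_by ($@) {
    my $n     = shift;
    my @array = @_;

    croak "group_by count argument must be a positive integer"
        unless $n > 0 and int($n) == $n;

    my @groups;
    push @groups, [ splice @array, 0, $n ] while @array;
    return @groups;
}

my @array = ( 1 .. 7 );

# Read bottom to top: group, reverse each group, keep only groups of 3.
my @full = grep { @$_ == 3 }
           map  { [ reverse @$_ ] }
           group_by 3, @array;

# @full is ( [3, 2, 1], [6, 5, 4] ); the short group [7] was filtered out.
```

Because group_by works on a copy of its arguments, @array is left intact.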

** Update re Chris Lutz's suggestions **

Chris, I can see merit in your suggested addition of a code ref to the interface. That way a map-like behavior is built in.

# equivalent to my map/group_by above
group_by { [ reverse @_ ] } 3, @array;

This is nice and concise. But to keep the nice {} code ref semantics, we have to put the count argument 3 in a hard-to-see spot.
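
For reference, a rough sketch of how such a block-first interface could be declared. It is given the hypothetical name `group_map` here to keep it apart from group_by, and it assumes the code ref is mandatory, since the `(&$@)` prototype is what allows the bare block:

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical block-first variant. The (&$@) prototype lets callers pass a
# bare { ... } block as the first argument, the way map and grep do.
sub group_map (&$@) {
    my ( $code, $n, @array ) = @_;

    croak "group_map count argument must be a positive integer"
        unless $n > 0 and int($n) == $n;

    # Each group is passed to the block as @_; whatever it returns is collected.
    my @results;
    push @results, $code->( splice @array, 0, $n ) while @array;
    return @results;
}

my @array    = ( 1 .. 7 );
my @reversed = group_map { [ reverse @_ ] } 3, @array;

# @reversed is ( [3, 2, 1], [6, 5, 4], [7] ).
```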

I think I like things better as I wrote it originally.

A chained map isn't that much more verbose than what we get with the extended API. With the original approach a grep or other similar function can be used without having to reimplement it.

For example, if the code ref is added to the API, then you have to do:

my @result = group_by { $_[0] =~ /foo/ ? [@_] : () } 3, @array;

to get the equivalent of:

my @result = grep $_->[0] =~ /foo/,
             group_by 3, @array;

Since I suggested this for the sake of easy chaining, I like the original better.

Of course, it would be easy to allow either form:

use Carp qw(croak);
use Scalar::Util qw(reftype);

sub _copy_to_ref { [ @_ ] }

sub group_by ($@) {
    my $code = \&_copy_to_ref;
    my $n = shift;

    # reftype returns undef for a plain number, so check ref first
    if( ref $n and reftype($n) eq 'CODE' ) {
        $code = $n;
        $n = shift;
    }

    my @array = @_;

    croak "group_by count argument must be a non-zero positive integer"
        unless $n > 0 and int($n) == $n;

    my @groups;
    push @groups, $code->(splice @array, 0, $n) while @array;

    return @groups;
}

Now either form should work (untested). I'm not sure whether I like the original API or this one with the built-in map capabilities better.

Thoughts anyone?

** Updated again **

Chris is correct to point out that the optional code ref version would force users to do:

group_by sub { foo }, 3, @array;

Which is not so nice, and violates expectations. Since there is no way to have a flexible prototype (that I know of), that puts the kibosh on the extended API, and I'd stick with the original.

On a side note, I started with an anonymous sub in the alternate API, but I changed it to a named sub because I was subtly bothered by how the code looked. No real good reason, just an intuitive reaction. I don't know if it matters either way.

daotoad 2009-09-30 00:59:15

Why not have `group_by` take a code reference as the first argument, so we can determine what to do with our group? Usage: `group_by { [ @_ ] } 3, @array;`

Chris Lutz 2009-09-30 03:43:47

Ideal syntax would be `group_by 3 { [ @_ ] } @array;` but of course we'd need to explicitly declare the anonymous `sub` for Perl not to whine.

Chris Lutz 2009-09-30 03:54:48

The only problem with the second version that uses an optional code reference is that the `map { code } @list` syntax only works if the subroutine is prototyped to have the first argument be a code reference. As written, you would need to explicitly specify that the code block was a `sub` (or declare the sub somewhere else and pass a reference to it). Also, I wouldn't have bothered writing a named subroutine for `_copy_to_ref()` and just said `my $code = sub { [ @_ ] };` but that's just me. It might be more efficient to do it your way.

Chris Lutz 2009-09-30 17:39:28
