views:

537

answers:

6

I am learning Perl at my work and enjoying it. I usually do my work in Python but boss wants Perl.

Most of the concepts in Python and Perl match nicely: Python dictionary=Perl hash; Python tuple=Perl list; Python list=Perl array; etc.

Question: Is there a Perl version of the Python form of an Iterator / Generator?

An example: A Classic Python way to generate the Fibonacci numbers is:

#!/usr/bin/python

def fibonacci(mag):
     a, b = 0, 1
     while a<=10**mag:
         yield a
         a, b = b, a+b

for number in fibonacci(15):  
     print "%17d" % number

Iterators are also useful if you want to generate a subsection of a much larger list as needed. Perl 'lists' seem more static - more like a Python tuple. In Perl, can foreach be dynamic or is only based on a static list?

The Python form of Iterator is a form that I have gotten used to, and I do not find it documented in Perl... Other than writing this in loops or recursively or generating a huge static list, how do I (for ex) write the Fibonacci subroutine it in Perl? Is there a Perl yield that I am missing?

Specifically -- how do I write this:

#!/usr/bin/perl
use warnings; use strict; # yes -- i use those!

sub fibonacci {
   # What goes here other than returning an array or list? 
}

foreach my $number (fibonacci(15)) { print $number . "\n"; }

Thanks in advance to being kind to the newbie...

+4  A: 

There's a good practical example here and a PDF article here... but I'm too rusty in Perl to try to implement your challenge directly (as you'll see, both the example and the approach in the PDF use a less direct approach).

Alex Martelli
+1 both excellent articles.
drewk
+6  A: 

There is a similar method to produce a Iterator / Generator, but it is not a "first class citizen" as it is on Python.

In Perl, if you do not see what you want (after a MANDATORY trip to CPAN FIRST!), you can roll your own that is similar to a Python iterator based on Perl closures and an anonymous subroutine.

Consider:

use strict; use warnings;

sub fibo {
    my ($an, $bn)=(1,0);
    my $mag=(shift || 1);
    my $limit=10**$mag;
    my $i=0;

    return sub {
        ($an, $bn)=($bn, $an+$bn);      
        return undef if ($an >=$limit || wantarray );
        return $an;
    }
}

my $num;
my $iter=fibo(15);
while (defined($num=$iter->()) ) { printf "%17d\n", $num; }

The sub fibo maintains a Perl closure that allows persistent variables to be maintained. You can do the same by having a module, similar to C / C++. Inside fibo an anonymous subroutine does the work of returning the next data item.

To quote from the Perl Bible "You will be miserable until you learn the difference between scalar and list context" -- p 69 (A highly recommended book btw...)

In this case, the annon sub only returns a single value. The only looping mechanism that I know of in Perl that can work in scalar context is while; The others try to fill the list before proceeding I think. Therefor, if you called the anon sub in list context, it will dutifully return the next fibonacci number, unlike Python's for iterators, and the loop would terminate. That is why I put the return undef if .... wantarray because it does not work in list context as written.

There are ways to fix that. Indeed, you can write subroutines that act like map foreach etc but it is not as straightforward as Python's yield. You will need an additional function to use inside a foreach loop. The tradeoff is the Perl approach has tremendous power and flexibility.

You can read more about Perl iterators in Mark Jason Dominus' excellent book "Higher Order Perl" Chapter 4 is all about Interators brian d foy also has an excellent article on Interators in the Perl Review.

drewk
Why not calculate `10**$mag` once in `fibo` instead of repeating the calculation every time the iterator is called?
cjm
@cjm: you are right, that would be better... I'll edit it tomorrow.
drewk
+1. Thanks! This **exactly** produces the output of my Python script - faster I think - and your narrative was very helpful. Can you show code on how to use `fibo` and `iter` in a `foreach` form? While your code works and text seems correct, you have only received 2 votes as I write this. Am I missing something? Is there a reason that this should not be the accepted answer?
carrot-top
+16  A: 

The concept of an iterator is a little different in Perl. You basically want to return a one-use subroutine "closed" over the persistent variables.

use bigint;
use strict;
use warnings;

sub fibonacci {
    my $limit = 10**( shift || 0 );
    my ( $a, $b ) = ( 0, 1 );
    return sub { 
        return if $a > $limit;
        ( my $r, $a, $b ) = ( $a, $b, $a + $b );
        return $r;
    };
}
my $fit = fibonacci( 15 );
my $n = 0;
while ( defined( my $f = $fit->())) { 
     print "F($n): $f\n";
     $n++;
}

And if you don't like the while loop, then here is two shots at some syntactic sugar, which basically accomplish a each item loop.:

sub iterate ($$) {
    my $iter   = shift;
    my $action = shift;
    while ( defined( my $nextval = $iter->())) { 
        local *_ = \$nextval;
        $action->( $_ );
    }
    return;
}

iterate fibonacci( 15 ) => sub { print "$_\n"; };

sub iter (&$) { 
    my $action = shift;
    my $iter   = shift;
    while ( defined( my $nextval = $iter->())) { 
        local *_ = \$nextval;
        $action->( $_ );
    }
    return;
}

iter { print "$_\n" } fibonacci( 15 );
Axeman
++ : I really learned a lot from this answer.
Zaid
A good answer, but your limit is wrong. If you look at the Python it is 10**limit, not just 15 steps...
drewk
@drewk: fixed it. I omitted the exponential in order to test it and forgot to put it back in.
Axeman
why the `(*)` glob context?
Eric Strom
@Eric Strom: I think because you need a second argument and over the years I have found that `'*'` is the one that Perl complains the least about. It won't say "Expected glob or bareword but got x" and yet if I don't put a second argument it will complain, and if I pass some form of subroutine, it wont' complain.
Axeman
I would usually use `($)` context in those cases, that way it avoids any bugs like passing a bareword that you expect to be a function call but is misspelled and passed in silently as a string
Eric Strom
Thank you, and I learned from your text. The program seems broken, however, in two respects: 1) recall that the Fibonacci numbers are `0 1 1 2 3 5 8 ...` and you generate `1 1 2 3 5 ..` skipping `0`; your limit is incorrect since it is just a simple counter. The limit I was looking for is breaking at values > 10**15 since that is the point on my machine where I start getting sci notation. The limit is an fix, but the skipping `0` is not intuitive. If you look at the Python, the generator stats with a,b=0,1 but that does not work on yours. I think because of no `yield` which diff than `return`
carrot-top
@carrot-top: I was a math major on the 4.0 list. I don't recall ever using a fib series beginning at 0, but it's an easy enough change.
Axeman
@carrot-top: added `use bigint` to fix the overflow problem, made F_0 = 0.
Axeman
@carrot-top: Okay. I fixed it so that the parameter limits, `$a`. There was a little conceptual problem. Normally, when I see a number passed to a generator of a series, it is the maximum term, not a limit on the range. Of course with `bigint` you don't have a magnitude problem. In testing this, I generated F_8678 1,814 digits long.
Axeman
+10  A: 

The excellent Higher-Order Perl book (available for free at the specified link) contains a lot of information on related topics, and in particular has a whole chapter on iterators. By "higher order" the author implies using Perl's abilities as a functional language with first-class functions to implement all kinds of cool stuff. It really is a very good book - I read most of it, and the chapters on iterators and streams are terrific. I highly recommend to at least skim through it if you plan to write Perl code.

Eli Bendersky
+16  A: 

For an even more flexible solution than Python's generators, I have written the module List::Gen on CPAN which provides random access lazy generator arrays:

use List::Gen;

my $fib; $fib = cache gen {$_ < 2  ? $_ : $$fib[$_ - 1] + $$fib[$_ - 2]};

say "@$fib[0 .. 15]";  #  0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610

Since generators pretend to be arrays, they can mix seamlessly with normal perl code. There is also an object oriented approach:

my $fib; $fib = cache gen {$_ < 2 ? $_ : $fib->get($_ - 1) + $fib->get($_ - 2)};

say join ' ' => $fib->slice(0 .. 15);

In each case, the generator is lazy, calculating nothing upon creation, and then calculating only those values required to satisfy the slices. The recursive definition of the Fibonacci sequence calls itself many times, so the cache function is used to make sure each value is only calculated once.

You can also use generators as iterators:

while (my $num = $fib->next) {
    last if $num > 10**15;
    print "$_\n";
}

$fib->next can also be written $fib->(). Since the generator is still random access, you can $fib->reset() or $fib->index = 10;

Let me know if you have any questions.

Update:

I have released a new version of the module (0.80) that makes it easier to use iterative algorithms in generators. Here is an example that closely mirrors the OP's example:

use List::Gen '*';

sub fibonacci {
    my $limit   = 10**shift;
    my ($x, $y) = (0, 1);

    While {$_ < $limit} gather {
        ($x, $y) = ($y, take($x) + $y)
    }
}

say for @{fibonacci 15};

if you use bigint; before or at the top of the sub, you can of course:

say for @{fibonacci 400}; # or more
Eric Strom
Not entirely true that they "mix seamlessly" as "arrays", because foreach will still not work with the laziness. You could override `@{}`, but that would still require you to generate the entire list right then.
Axeman
@Axeman => with a syntax like this `for (@$fib) { ... }` or `for (@{ gen {...} }) {...}` the behavior IS lazy. The generator itself is infinite, but perl won't care, the for loop's iterator will pull items sequentially out of the generator.
Eric Strom
`perl -e 'use List::Gen; for (@{;gen {$_**2} 0, 1_000_000_000}) { last if $_ > 100; print "$_\n"}'` returns 10 values and exits without any other values being created, and without any startup delay
Eric Strom
@Eric Strom: so you just TIE an array, and send back maxint for FETCHSIZE?
Axeman
I'm actually not right now, so that's probably a bug I need to fix. for an infinite array its going to send back 9**9**9, i should have it send back 2**31-1 with the array syntax. But that is only with the argument-less `gen`, you can always specify the size of the range as shown in my second comment. I should have a new version up today or tomorrow that translates 9**9**9 to maxint with the array syntax. But yes, when you dereference a tied array reference inside a foreach, perl will sequentially ask the tied array for its values, and never allocate space for the whole array
Eric Strom
unfortunately, for everything to work right, you have to do the dereference directly in the foreach, you cant pass a whole tied array back. it will still generate values lazily, but perl will unnecessarily allocate space for the array, which for an array of maxint size, isnt going to end well :)
Eric Strom
+1 This seems like a great candidate and great module! Your interpretation of my limit is different. The Python script is meant to go to an indeterminate point where the Fibonacci result (as a stand in for other indeterminate things) is < 10**limit, not 15 steps. This is a very very simple approach and the limit was easy to fix. Thanks! +1!!
carrot-top
@Eric Strom: Your module looks good. I was working at lazy lists for Perl 5. But I have perfectionist streaks where I can't see the good enough sometimes. Perl's asking for the array size only initially, made me give up on for-loops to process a lazy list of initially unknown size but limited elements. The `last` works fine in this case, but it's the extra part that I just could not put up with at the time.
Axeman
@Axeman, I just released a new version of `List::Gen` that fixes the infinite array in a foreach loop bug, and many other additions, including a lazy grep that works properly in a foreach (without the `last if ...` check), let me know what you think
Eric Strom
+2  A: 

In this case, memoization can be used.

use strict;
use warnings;

use Memoize;
memoize('fib');

foreach my $i (1..15) {
  print "$i -> ",fib($i),"\n";
}

sub fib {
  my $n = shift;
  return $n if $n < 2;
  fib($n-1) + fib($n-2);
}
Alexandr Ciornii
Yeah, but that's a recursive function, not an iterator.
Oesor
Oesor: Question was how to implement Fibonacci numbers in general
Alexandr Ciornii
I apreciate the intro memoize. Oesor is right: I know how to do Fibonacci numbers recursively. I wanted to know how to do an interator
carrot-top