views: 288
answers: 7

Is there any way to have a subroutine send data back while still processing? For instance (this example is used simply to illustrate): a subroutine reads a file. While it is reading through the file, if some condition is met, then "return" that line and keep processing. I know there are those who will answer "why would you want to do that?" and "why don't you just ...?", but I really would like to know if this is possible. Thank you so much in advance.

A: 

If you really want to do this, you can by using threads. One option would be to spawn a separate thread that reads the file and, when it finds a certain line, places it in an array shared between the threads. The other thread could then take the lines, as they are found, and process them. Here is an example that reads a file, looks for an 'X' in each line, and performs an action when one is found.

use strict;
use warnings;
use threads;
use threads::shared;

my @ary : shared;

my $thr = threads->create('file_reader');

while (1) {
    my $value;
    {
        lock(@ary);
        if (@ary) {
            $value = shift @ary;
            print "Found a line to process:  $value\n";
        }
        else {
            print "no more lines to process...\n";
        }
    }

    sleep(1);
    # process $value
}

sub file_reader {

    # File input
    open(my $input, '<', 'test.txt') or die "Cannot open test.txt: $!";
    while (my $line = <$input>) {
        chomp($line);

        print "reading $line\n";

        if ($line =~ /X/) {
            print "pushing $line\n";
            lock(@ary);
            push @ary, $line;
        }
        sleep(4);
    }
    close($input);
}

Use this as the test.txt file:

line 1
line 2X
line 3
line 4X
line 5
line 6
line 7X
line 8
line 9
line 10
line 11
line 12X
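
One caveat with the example above: the main while(1) loop never exits. As a minimal, untested variation, the consumer loop could stop once the reader thread has finished and the shared array has been drained ($thr->is_running() is part of the standard threads API):

# Variation on the consumer loop above: exit once the reader thread
# is done and the shared array is empty, then reap the thread.
while (1) {
    my $value;
    {
        lock(@ary);
        $value = shift @ary if @ary;
    }
    if (defined $value) {
        print "Found a line to process:  $value\n";
        # process $value
    }
    elsif (!$thr->is_running()) {
        last;    # reader has finished and nothing is left to process
    }
    sleep(1);
}
$thr->join();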
Narthring
+4  A: 

Some languages offer this sort of feature using "generators" or "coroutines", but Perl does not. The generator page linked above has examples in Python, C#, and Ruby (among others).

Greg Hewgill
Sort of. Generators stop processing when they return a result, and start again when the next result is requested. There is nothing about them that will create a second thread of execution.
mobrule
@mobrule: The question didn't mention threads, but just "keep processing". Generators allow the local context of the generator to persist between calls, which can be considered the same thing (at a conceptual level).
Greg Hewgill
@Greg Hewgill - True enough. It is unclear what the OP means by "keep processing".
mobrule
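Perl can approximate the persistent local context that Greg Hewgill describes by using a closure; here is a tiny, hypothetical sketch (a counter, not tied to the original question) showing state that survives between calls:

# A closure acting like a simple generator: its lexical state persists
# between calls, so each call resumes where the previous one left off.
sub make_counter {
    my $n = 0;
    return sub { return $n++ };
}

my $gen = make_counter();
print $gen->(), "\n";   # 0
print $gen->(), "\n";   # 1
print $gen->(), "\n";   # 2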
+2  A: 

The easiest way to do this in Perl is probably with an iterator-type solution. For example, here we have a subroutine which forms a closure over a filehandle:

open my $fh, '<', 'some_file.txt' or die $!;
my $iter = sub { 
    while( my $line = <$fh> ) { 
        return $line if $line =~ /foo/;
    }

    return;
};

The sub iterates over the lines until it finds one matching the pattern /foo/ and then returns it, or else returns nothing (undef in scalar context). Because the filehandle $fh is defined outside the scope of the sub, it remains resident in memory between calls. Most importantly, its state, including the current seek position in the file, is retained. So each call to the subroutine resumes reading the file where it last left off.

To use the iterator:

while( defined( my $next_line = $iter->() ) ) { 
    # do something with each line here
}
friedo
This seems like a complicated version of `while( <$fh> ) { next unless /foo/; ... }`. Notice that your iterator subroutine finished processing and returned. It didn't keep processing.
brian d foy
@brian: In this simple example yes, your code is shorter and simpler, but closures give much more flexibility. It's not too hard to refactor the above so that sequence-transforming functions (for example filters like the above, but also mutators and generators) can be chained together or joined a la LINQ. The effect is similar to map etc. but with the distinction that evaluation can be "paused" at any time.
j_random_hacker
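To sketch the kind of chaining described above (the helper names file_iter and filter are invented for illustration, not an established API):

use strict;
use warnings;

# Returns a closure that yields the next line of the file on each call.
sub file_iter {
    my ($filename) = @_;
    open my $fh, '<', $filename or die $!;
    return sub { scalar <$fh> };
}

# Wraps any iterator, returning only the elements that satisfy $test.
sub filter {
    my ($iter, $test) = @_;
    return sub {
        while (defined(my $item = $iter->())) {
            return $item if $test->($item);
        }
        return;
    };
}

# Chain them: a lazy pipeline that pauses between results.
my $foo_lines = filter(file_iter('some_file.txt'), sub { $_[0] =~ /foo/ });
while (defined(my $line = $foo_lines->())) {
    print $line;
}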
I'm not arguing against closures at all. You'll find that many of my CPAN modules do just what you say, and I talk about the things you mention in several chapters of Mastering Perl. However, they are still just subroutines that you call and whose return value you use. There is no special magic beyond that.
brian d foy
+7  A: 

A common way to implement this type of functionality is with a callback function:

{
    open my $log, '>', 'logfile' or die $!;
    sub log_line {print $log @_}
}

sub process_file {
    my ($filename, $callback) = @_;
    open my $file, '<', $filename or die $!;
    local $_;
    while (<$file>) {
        if (/some condition/) {
             $callback->($_)
        }
        # whatever other processing you need ....
    }
}

process_file 'myfile.txt', \&log_line;

or without even naming the callback:

process_file 'myfile.txt', sub {print STDERR @_};
Eric Strom
@Eric: What's with the closure at the beginning?
Zaid
This isn't returning data to the higher level. It's just a subroutine calling another subroutine.
brian d foy
It's just there for good measure; since `$log` is only used by `log_line`, there's no reason for it to have file scope.
Eric Strom
The closure at the beginning makes `$log` private to `log_line`.
brian d foy
@brian d foy => If the callback routine is created in that higher level, then yes, it is sending data to the higher level. The callback could easily be closed around lexicals in the higher level.
Eric Strom
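As a small illustration of that point (a sketch, not part of the original answer): the callback can close over a lexical in the calling code, so matched lines accumulate there while process_file keeps reading.

my @matched;
process_file 'myfile.txt', sub { push @matched, @_ };

# After process_file returns, @matched holds every line that met the
# condition; the callback was handing data back into the caller's scope
# as the file was being read.
print scalar(@matched), " matching lines collected\n";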
This did the trick. Thank you all for responding
Perl QuestionAsker
No, it's not really sending information up the call stack. It's passing stuff sideways.
brian d foy
@Perl QuestionAsker: did the trick for what? If you had a use case in mind, add it to the question. There's a difference between asking if a language has a particular feature and asking how to accomplish a concrete task.
brian d foy
Sorry for the confusion. It was a can-it-be-done question, that is all. How it came about is that I have an application that (among other things) watches a live logfile and acts on certain events in the log. While a simple `open(FILE, 'logfile'); while(<FILE>){ if (/some_cond/){ close(FILE) } }` would work OK, I was curious if the function that reads the file could be run in the background, so to speak - and if an event is detected, send the event to the main portion of the script for further processing, but continue reading the log file, watching for new entries ... clear as mud?
Perl QuestionAsker
In other words, if the reader code is in a sub-routine, and I am essentially doing a 'tail -f' on the file, it would never return - but I would need it to return the event, but continue reading
Perl QuestionAsker
It seems possible using this method.
Perl QuestionAsker
i.e. - am I missing something, or would that work?
Perl QuestionAsker
or better yet (sorry).... } elsif (@_ =~ /some_other_cond/){ do_something_else(@_) }});
Perl QuestionAsker
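For the tail -f scenario described in these comments, a rough, untested sketch built on the callback approach might look like the following (the /some_cond/ patterns and do_something_else are placeholders taken from the comments above):

use strict;
use warnings;

# Follow a growing log file forever, dispatching events through callbacks.
sub watch_log {
    my ($filename, %handlers) = @_;
    open my $fh, '<', $filename or die "Cannot open $filename: $!";
    seek $fh, 0, 2;      # start at end of file, like tail -f

    while (1) {
        while (my $line = <$fh>) {
            if    ($line =~ /some_cond/)       { $handlers{event}->($line) }
            elsif ($line =~ /some_other_cond/) { $handlers{other}->($line) }
        }
        sleep 1;         # wait for new data to be appended
        seek $fh, 0, 1;  # clear the EOF flag so the next read tries again
    }
}

watch_log('logfile',
    event => sub { print "event: $_[0]" },
    other => sub { do_something_else(@_) },   # do_something_else is a placeholder
);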
So, you're really asking how to accomplish a task instead of accomplishing it in a particular way.
brian d foy
I am not sure what you mean. It was simply a question of can it be done this way, that is all. I am always learning, and trying to learn not only the 'best' ways to do things, but ALL the ways to do something. I never know when a way that seems silly in one instance would work great in another.
Perl QuestionAsker
A: 

If your language supports closures, you may be able to do something like this:

By the way, the function would not keep processing the file; it would run only when you call it, so it may not be what you need.

(This is JavaScript-like pseudo-code.)

function fileReader (filename) {
    var file = open(filename);

    return function () {
        while (line = file.read()) {
            if (condition) {
                return line;
            }
        }
        return null;
    };
}

a = fileReader("myfile");
line1 = a();
line2 = a();
line3 = a();
Francisco Soto
Care to comment, downvoter?
Francisco Soto
@Francisco Soto: Perhaps it's because the question is tagged `perl` and your answer provides a javascript implementation.
dreamlax
It says pseudo-code; I don't know the syntax to do that in Perl. (Haven't used Perl in so many years.)
Francisco Soto
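For reference, a literal Perl translation of that pseudo-code (essentially the same closure-over-a-filehandle idea as friedo's answer) might look like this:

# Returns a closure that, on each call, resumes reading the file and
# returns the next line matching the condition, or undef at end of file.
sub file_reader {
    my ($filename) = @_;
    open my $fh, '<', $filename or die $!;

    return sub {
        while (defined(my $line = <$fh>)) {
            return $line if $line =~ /X/;   # stand-in for "condition"
        }
        return undef;
    };
}

my $next  = file_reader('myfile');
my $line1 = $next->();
my $line2 = $next->();
my $line3 = $next->();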
A: 

What about a recursive sub? Re-opening an existing filehandle does not reset the input line number, so it carries on from where it left off.

Here is an example where the process_file subroutine prints out the blank-line-separated ("\n\n") paragraphs that contain foo.

sub process_file {

    my ($fileHandle) = @_;
    my $paragraph = '';

    while ( defined(my $line = <$fileHandle>) ) {

        last if $line =~ /^\s*$/;    # a blank line ends the current paragraph
        $paragraph .= $line;
    }

    print $paragraph if $paragraph =~ /foo/;
    goto &process_file unless eof($fileHandle);
       # goto optimizes the tail recursion and prevents a stack overflow
}

open my $fileHandle, '<', 'file.txt' or die $!;
process_file($fileHandle);
Zaid
If the file is large, this is going to very quickly eat up your call stack. Correct the last line of `process_file` with `goto &process_file` to eliminate growing the stack during the tail recursion.
Eric Strom
@Eric: Implemented
Zaid
@Zaid => I made the change and added some comments
Eric Strom
@Eric: Thanks! The documentation on `goto` is a bit too vague for my liking. Now I understand your previous comment about removing the argument list...
Zaid
+3  A: 

The Coro module looks like it would be useful for this problem, though I have no idea how it works and no idea whether it does what it advertises.
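
For what it is worth, here is a rough, untested sketch of how Coro might apply, using a Coro::Channel to pass matching lines back while the reader coroutine keeps going (check the Coro documentation before relying on any of this):

use strict;
use warnings;
use Coro;
use Coro::Channel;

my $channel = Coro::Channel->new;     # queue between coroutines

# Reader coroutine: scans the file and hands matching lines to the channel.
async {
    open my $fh, '<', 'test.txt' or die $!;
    while (my $line = <$fh>) {
        $channel->put($line) if $line =~ /X/;
        cede;                         # yield so the consumer can run
    }
    $channel->shutdown;               # signal that no more data is coming
};

# Main code receives lines as the reader finds them.
while (my $line = $channel->get) {
    print "got: $line";
}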

mobrule