I know that the BEGIN block is compiled and executed before the main body of a Perl program. If you're not sure of that just try running the command perl -cw over this:

#!/ms/dist/perl5/bin/perl5.8

use strict;
use warnings;

BEGIN {
    print "Hello from the BEGIN block\n";
}

END {
    print "Hello from the END block\n";
}

I have been taught that early compilation and execution of a BEGIN block lets a programmer ensure that any needed resources are available before the main program is executed.

And so I have been using BEGIN blocks to make sure that things like DB connections have been established and are available for use by the main program. Similarly, I use END blocks to ensure that all resources are closed, deleted, terminated, etc. before the program terminates.

After a discussion this morning, I am wondering if this is the wrong way to look at BEGIN and END blocks.

What is the intended role of a BEGIN block in Perl?

Update 1: I just found out why the DBI connect didn't work. I was given this little Perl program:

use strict;
use warnings;

my $x = 12;

BEGIN {
    $x = 14;
}

print "$x\n";

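When executed, it prints 12. The BEGIN block runs at compile time and assigns 14, but the initializer in my $x = 12 only runs later, at run time, and overwrites that value.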

Update 2: Thanks to Eric Strom's comment below this new version makes it clearer:

use strict;
use warnings;

my $x = 12;
my $y;

BEGIN {
    $x = 14;
    print "x => $x\n";
    $y = 16;
    print "y => $y\n";
}

print "x => $x\n";
print "y => $y\n";

and the output is

x => 14
y => 16
x => 12
y => 16

Once again, thanks Eric!

+13  A: 

While BEGIN and END blocks can be used as you describe, the typical usage is to make changes that affect the subsequent compilation.

For example, the use Module qw/a b c/; statement actually means:

BEGIN {
   require Module;
   Module->import(qw/a b c/);
}

Similarly, the subroutine declaration sub name {...} behaves roughly like:

BEGIN {
   *name = sub {...};
}

Since these blocks are run at compile time, all lines that are compiled after a block has run will use the new definitions that the BEGIN blocks made. This is how you can call subroutines without parentheses, or how various modules "change the way the world works".
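As a minimal sketch of that point, the following spells out by hand what use List::Util qw(sum max); would do. List::Util is a core module; the variable names are only illustrative. Because the import has already happened by the time the later lines are compiled, the parser knows sum and max are subroutines and accepts the calls without parentheses.

```perl
use strict;
use warnings;

# Hand-written equivalent of: use List::Util qw(sum max);
BEGIN {
    require List::Util;
    List::Util->import(qw(sum max));
}

# The import ran at compile time, so the parser already knows that
# sum and max are subroutines when it reaches these lines.
my $total   = sum 1, 2, 3, 4;
my $largest = max 3, 9, 2;

print "total => $total\n";
print "largest => $largest\n";
```

If the BEGIN wrapper is removed, the require and import only happen at run time, and under use strict the paren-less calls should no longer compile.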

END blocks can be used to clean up changes that the BEGIN blocks have made, but it is more common to use objects with a DESTROY method.
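A small sketch of the DESTROY approach, assuming a hypothetical Resource::Guard class invented for this illustration: the cleanup runs as soon as the last reference to the object goes away, rather than at program exit.

```perl
use strict;
use warnings;

package Resource::Guard;    # hypothetical class, for illustration only

sub new {
    my ($class, %args) = @_;
    return bless { name => $args{name}, log => $args{log} }, $class;
}

# Called automatically when the last reference to the object disappears.
sub DESTROY {
    my ($self) = @_;
    push @{ $self->{log} }, "released $self->{name}";
}

package main;

my @log;
{
    my $guard = Resource::Guard->new(name => 'scratch file', log => \@log);
    push @log, "using scratch file";
}   # $guard goes out of scope here, so DESTROY runs before the next line

print "$_\n" for @log;
```

This prints "using scratch file" followed by "released scratch file". An END block would instead delay the cleanup until the whole program is finishing.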

If the state that you are trying to clean up is a DBI connection, doing that in an END block is fine. I would not create the connection in a BEGIN block, though, for a couple of reasons. Usually there is no need for the connection to be available at compile time. And performing actions like connecting to a database at compile time will drastically slow down any editor you use that has syntax checking, because the check runs perl -c, which still executes BEGIN blocks.

Eric Strom
Thanks Eric. I wasn't sure about adding a caveat to my question about using END blocks in a non-OO program.
Rob Wells
+4  A: 

Have you tried swapping out the BEGIN{} block for an INIT{} block? That's the standard approach for things like mod_perl, which use the "compile-once, run-many" model, as you need to initialize things anew on each separate run, not just once during the compile.

But I have to ask why it's all in a special block anyway. Why don't you just make some sort of prepare_db_connection() function, and then call it as you need to when the program starts up?
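One way such a function might look is sketched below. This is an assumption-laden illustration, not a real DBI program: the handle is faked with a plain hash so the example runs without a database, and a real version would presumably call DBI->connect where the comment indicates.

```perl
use strict;
use warnings;

my $dbh;    # stand-in for a real database handle

# Connects on first use, at run time, so perl -c never triggers it.
sub prepare_db_connection {
    # ASSUMPTION: a real program would call DBI->connect(...) here.
    # This sketch only fakes a handle so that it is self-contained.
    $dbh ||= { connected => 1, opened_at => time };
    return $dbh;
}

END {
    # Release the resource only if it was actually acquired.
    if ($dbh) {
        $dbh->{connected} = 0;    # a real handle would call ->disconnect
        print "connection closed\n";
    }
}

my $handle = prepare_db_connection();
print "connected\n" if $handle->{connected};
```

Calling prepare_db_connection() again returns the same handle, so code anywhere in the program can ask for the connection without caring whether it has been opened yet.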

Something that won't work in a BEGIN{} will also have the same problem if it's main-line code in a module file that gets used. That's another possible reason to use an INIT{} block.

I've also seen deadly-embrace problems of mutual recursion that have to be unravelled using something like a require instead of use, or an INIT{} instead of a BEGIN{}. But that's pretty rare.
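To illustrate the timing difference between require and use, here is a small sketch. It uses the core module Text::Wrap purely as an example and assumes nothing else has loaded it beforehand. The BEGIN block inspects %INC at compile time, before the require statement has had a chance to execute.

```perl
use strict;
use warnings;

my $seen_at_compile_time;
my $seen_at_run_time;

BEGIN {
    # Compile time: the require statement below has not executed yet.
    $seen_at_compile_time = exists $INC{'Text/Wrap.pm'} ? 'yes' : 'no';
}

# Unlike use, require is an ordinary statement that executes at run time.
require Text::Wrap;
$seen_at_run_time = exists $INC{'Text/Wrap.pm'} ? 'yes' : 'no';

print "loaded at compile time? $seen_at_compile_time\n";
print "loaded at run time?     $seen_at_run_time\n";
```

Replacing the require with use Text::Wrap; would still report "no" from this BEGIN block, because the block is compiled and run before the use line is reached. Moving the use line above the block would make it report "yes".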

Consider this program:

% cat sto-INIT-eg
#!/usr/bin/perl -l
print               "    PRINT: main running";
die                 "    DIE:   main dying\n";
die                 "DIE XXX /* NOTREACHED */";
END         { print "1st END:   done running"    }
CHECK       { print "1st CHECK: done compiling"  }
INIT        { print "1st INIT:  started running" }
END         { print "2nd END:   done running"    }
BEGIN       { print "1st BEGIN: still compiling" }
INIT        { print "2nd INIT:  started running" }
BEGIN       { print "2nd BEGIN: still compiling" }
CHECK       { print "2nd CHECK: done compiling"  }
END         { print "3rd END:   done running"    }

When compiled only, it produces:

% perl -c sto-INIT-eg 
1st BEGIN: still compiling
2nd BEGIN: still compiling
2nd CHECK: done compiling
1st CHECK: done compiling
sto-INIT-eg syntax OK

While when compiled and executed, it produces this:

% perl sto-INIT-eg 
1st BEGIN: still compiling
2nd BEGIN: still compiling
2nd CHECK: done compiling
1st CHECK: done compiling
1st INIT:  started running
2nd INIT:  started running
    PRINT: main running
    DIE:   main dying
3rd END:   done running
2nd END:   done running
1st END:   done running

And the shell reports an exit of 255, per the die.

You should be able to arrange to have the connection happen when you need it to, even if a BEGIN{} proves too early.

Hm, just remembered. There's no chance you're doing something with DATA in a BEGIN{}, is there? That's not set up till the interpreter runs; it's not open to the compiler.

tchrist