Leaving aside the bytecode backend that tchrist already covered and talking only about the C backend: all perlcc does is translate the optree of your compiled perl program into a C program, which it then compiles. When run, that C program reconstructs the optree in memory and basically executes it like perl usually would. The point of that is really just to cut down on the compile time of regular perl code.
That optree of your program is then available in the PL_main_root global variable. We already have a module called B::Deparse, which is able to consume optrees and turn them into source code that's roughly equivalent to the original code that the optree was compiled from. It happens to have a compile method that returns a coderef that'll, when executed, print the deparse result of PL_main_root.
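To see that compile method in isolation, here is a minimal sketch in plain Perl, with no perlcc or gdb involved. The variable name $dump_main is just an illustrative choice; the call itself is the same one used in the gdb session.

```perl
use strict;
use warnings;
use B::Deparse;

# compile() only builds a closure; nothing is deparsed or printed yet.
my $dump_main = B::Deparse->compile;

print "hello\n";

# Calling the closure deparses the main program's optree (what
# PL_main_root points at) and prints the resulting source to STDOUT.
$dump_main->();
```

Running this should print hello followed by a deparsed rendition of the script itself. The exact text depends on your perl and B::Deparse versions.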
Also there's the C function Perl_eval_pv, which you can use to evaluate Perl snippets from C space.
$ echo 'print 42, "\\n"' > foo.pl
$ perl foo.pl
42
$ perlcc foo.pl
$ ./a.out
42
$ gdb a.out
...
(gdb) b perl_run
Breakpoint 1 at 0x4570e5: file perl.c, line 2213.
(gdb) r
...
Breakpoint 1, perl_run (my_perl=0xa11010) at perl.c:2213
(gdb) p Perl_eval_pv (my_perl, "use B::Deparse; B::Deparse->compile->()", 1)
print 42, "\n";
$1 = (SV *) 0xe47b10
Of course the usual B::Deparse caveats apply, but this will certainly be handy for reverse engineering. Actually reconstructing the original source code won't be possible in most cases, even if it worked for the above example.
The exact gdb magic you'll have to do to get B::Deparse to give you something sensible also depends largely on your perl. I'm using a perl with ithreads, and therefore multiplicity. That's why I'm passing around the my_perl variable. Other perls might not need that. Also, if anyone stripped the binary compiled by perlcc, things will get a bit harder, but the same technique will still work.
You can also use that to deparse any optree you can somehow get ahold of at any time during program execution. Have a look at B::Deparse's compile sub and do something similar, except provide it with a B object for whatever optree you want dumped instead of B::main_root.
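As a rough illustration of working with an optree other than the main one, the sketch below wraps a sub in a B object and deparses it. It uses B::Deparse's documented coderef2text method rather than a hand-modified copy of compile, so treat it as a stand-in for the approach described above. The sub add is a made-up example.

```perl
use strict;
use warnings;
use B ();
use B::Deparse;

# Hypothetical sub whose optree we want to inspect.
sub add { my ($x, $y) = @_; return $x + $y }

# B::svref_2object wraps a Perl reference in a B object. For a coderef
# that is a B::CV, and its ROOT method returns the root op of the optree.
my $cv = B::svref_2object(\&add);
printf "root op: %s\n", $cv->ROOT->name;

# coderef2text deparses that sub's optree back into source text.
my $deparse = B::Deparse->new;
print "sub add ", $deparse->coderef2text(\&add), "\n";
```

From gdb, the same snippet could be handed to Perl_eval_pv as a string, as long as the sub you are after is reachable by name or reference inside the running interpreter.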
The same thing applies to the mentioned bytecode backend of perlcc. I'm not entirely sure about the optimized C backend called CC.