ansaurus

Question

How can I get the syntax tree from a coderef in Perl?

Answer 1

A:

Perl 5 does not let you manipulate the bytecode on the fly like that, but you can create anonymous functions. If I understand your example correctly, and I doubt I do, you already have two functions that are being referenced by $f1 and $c, and you want to create a new reference $f that holds the results of the first two multiplied by each other. This is simple:

my $f = sub { $f1->(@_) * $c->(@_[1 .. 9]) };

$f->(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

Note the use of the arrow operator rather than the & to dereference the coderefs. This style is much more common (and in my opinion more readable).
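To make the snippet above runnable, here is a minimal sketch. The bodies of $f1 and $c are hypothetical stand-ins (the question does not show them); only the composition line comes from the answer.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the question's functions (assumptions).
my $f1 = sub { $_[0] + 1 };                            # first argument plus one
my $c  = sub { my $sum = 0; $sum += $_ for @_; $sum }; # sum of all arguments

# The composition from the answer: a new anonymous sub that calls both.
my $f = sub { $f1->(@_) * $c->(@_[1 .. 9]) };

my $result = $f->(1 .. 10);
print "$result\n";

# The two dereferencing styles call the same code.
my $arrow_style     = $f1->(1);
my $ampersand_style = &$f1(1);
print "same\n" if $arrow_style == $ampersand_style;
```

With these stand-ins, $f1 sees all ten arguments and $c sees the second through tenth.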

Chas. Owens 2010-10-05 00:23:42

Actually, I want the reverse - given `f`, determine what is `f1`.

jpalecek 2010-10-05 00:58:44

Answer 2

+7 A:

For introspection of optrees the B family of modules is usually used.

Given a code reference $cv, first create a B object for it:

my $b_cv = B::svref_2object($cv);

Now you can call the various methods documented in B on that to retrieve various things from the optree.
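As a rough illustration of what those methods give you, the sketch below walks the ops of a small sub in execution order. The example sub is made up for the demonstration, and following only `next` ignores the branches that conditionals and loops introduce, so treat it as a starting point rather than a complete walker.

```perl
use strict;
use warnings;
use B ();

# Example sub invented for this sketch.
my $cv   = sub { my ($x) = @_; return $x * 2 };
my $b_cv = B::svref_2object($cv);    # a B::CV object

# ROOT is the top of the tree; START is the first op to execute.
my $root_name = $b_cv->ROOT->name;

# Follow the execution chain until we reach a B::NULL op (address 0).
my @names;
for (my $op = $b_cv->START; $$op; $op = $op->next) {
    push @names, $op->name;
}

print "root: $root_name\n";
print "ops:  @names\n";
```

The exact op list varies between perl versions because of peephole optimizations, so it is best not to hard-code it. B::Concise prints a fuller view of the same tree.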

Using only optree introspection you can already achieve amazing things. See DBIx::Perlish for a pretty advanced example of this.

There is also a B::Generate module, which allows building new optrees that do whatever you want, or manipulating existing ones. However, B::Generate isn't as mature as one would hope: there are a lot of missing features and quite a few bugs.

Actual optree creation and manipulation is usually best done using perl's C api, as documented in perlapi, perlguts, and perlhack, among others. You'll probably have to learn some XS as well, to expose the optree manipulation functions you wrote back to perl space, but that's the easy part really.

Building optrees (not necessarily based on other existing optrees that are being introspected) seems to have become somewhat popular recently, especially since Syntax Plugins were added to the core in perl 5.12.0. You can find various examples like Scope::Escape::Sugar on CPAN.

However, dealing with perl's optrees is still somewhat fiddly and not exactly beginner-friendly. It shouldn't be necessary for anything but the most arcane things. Something like using B::Deparse->new->coderef2text($cv), then maybe editing the resulting source very slightly before evaluating it again, is really as far as I would want to go with optree introspection from pure-perl space.
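A minimal sketch of that deparse, edit, and re-evaluate round trip follows. The example sub and the substitution are invented for illustration. Note that variables the original sub closed over are not carried across the string eval, so this only works cleanly for subs without such dependencies.

```perl
use strict;
use warnings;
use B::Deparse ();

# Example sub invented for this sketch.
my $cv = sub { my ($x) = @_; return $x * 2 };

# coderef2text returns the body as source text, braces included.
my $source = B::Deparse->new->coderef2text($cv);
print "$source\n";

# A deliberately small edit: change the constant factor from 2 to 3.
(my $tweaked = $source) =~ s/\*\s*2\b/* 3/;

# Prefix "sub" to turn the block back into an anonymous sub.
my $new_cv = eval "sub $tweaked" or die "recompile failed: $@";
print $new_cv->(7), "\n";
```

Text substitution on deparsed source is fragile, since the exact formatting depends on the perl and B::Deparse version in use.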

You might want to step back a bit and explain the actual problem you're trying to solve. Maybe there's a much simpler solution that doesn't involve messing with optrees at all.

rafl 2010-10-05 00:32:08

See edit for motivation.

jpalecek 2010-10-05 01:12:46

+1 nice answer, and `Scope::Escape::*` looks very interesting. Any other good ones you recommend?

Eric Strom 2010-10-05 01:16:35

Thank you. Though, unfortunately, that didn't help me much in understanding your real problem, and that's entirely my fault - your clarifications seem good for someone with the right background. So, lacking any suggestions on how to better approach your issue, I'd be happy to help you introspect whatever code you're faced with. But for that, you'd have to show actual code.

rafl 2010-10-05 01:18:59

I can't think of any other syntax plugin users on CPAN right now. However, optree munging in general is relatively common. You might find Parse::Perl, and many of the `B::Hooks::OP::Check` dependants interesting. A prior attempt to do what syntax plugins now provide is Devel::Declare. You'll also find a lot of interesting modules providing mostly new syntax, but also new semantics, based on that.

rafl 2010-10-05 01:23:26

@Eric Strom: Another example module might be [`Text::Xslate`](http://search.cpan.org/dist/Text-Xslate/). I believe this compiles a template straight down to opcode.

draegtun 2010-10-05 09:03:55

Answer 3

+1 A:

Given your restated question -- I think what you should do here, instead of trying to munge coderefs, is to delay having a coderef as long as possible.

1. Create an object representing an instance of your computation.
2. Write the methods on this object needed to evaluate the value of the computation. No codegen, just do it the long slow way. This is just to give you a baseline of code for the next steps that's easily tested and hopefully easily understood.
3. Write tests to ensure the correctness of what you did in Step 2. (Swap this before Step 2 if you're that kind of person.)
4. Implement what you're asking about in this question, by writing methods to transform a computation object into a new one that represents a more-optimized form of the same computation. Use your tests to ensure that computations still give the right result after optimization.
5. Write code that takes a computation object, and generates a sub (whether by string eval or using B) that carries out that computation. Use your tests to ensure that computations still give the right result after they've been compiled.

Optional step to insert anywhere between 2 and 5:

Write some syntactic sugar (probably using overload, but other tools are possible too) to let you construct "computation objects" using nice-looking expressions that resemble the computation itself, instead of lots and lots of object constructors.
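The steps above might look roughly like the following. This is only a sketch under assumptions: the class name Expr, its fields, and the choice of constant folding as the one optimization are all hypothetical, and only + and * are supported.

```perl
use strict;
use warnings;

package Expr;    # hypothetical class name

# Optional step: sugar so expressions can be written with + and *.
use overload
    '+' => sub { Expr->new(op => '+', args => [ $_[0], Expr->lift($_[1]) ]) },
    '*' => sub { Expr->new(op => '*', args => [ $_[0], Expr->lift($_[1]) ]) };

# Step 1: objects representing a computation.
sub new   { my ($class, %f) = @_; return bless {%f}, $class }
sub const { return Expr->new(op => 'const', value => $_[1]) }
sub var   { return Expr->new(op => 'var',   index => $_[1]) }
sub lift  { my ($class, $v) = @_; return ref $v ? $v : $class->const($v) }

# Step 2: evaluate the long, slow way by walking the tree.
sub evaluate {
    my ($self, @in) = @_;
    my $op = $self->{op};
    return $self->{value}        if $op eq 'const';
    return $in[ $self->{index} ] if $op eq 'var';
    my ($l, $r) = map { $_->evaluate(@in) } @{ $self->{args} };
    return $op eq '+' ? $l + $r : $l * $r;
}

# Step 4: return a new, simpler tree (here: fold constant subtrees).
sub optimize {
    my ($self) = @_;
    return $self unless $self->{args};
    my ($l, $r) = map { $_->optimize } @{ $self->{args} };
    my $node = Expr->new(op => $self->{op}, args => [ $l, $r ]);
    return $node unless $l->{op} eq 'const' && $r->{op} eq 'const';
    return Expr->const($node->evaluate);
}

# Step 5: generate Perl source and compile it with string eval.
sub to_source {
    my ($self) = @_;
    my $op = $self->{op};
    return $self->{value}                    if $op eq 'const';
    return sprintf('$_[%d]', $self->{index}) if $op eq 'var';
    my ($l, $r) = map { $_->to_source } @{ $self->{args} };
    return "($l $op $r)";
}

sub compile {
    my ($self) = @_;
    my $code = eval 'sub { ' . $self->to_source . ' }'
        or die "compile failed: $@";
    return $code;
}

package main;

my ($x, $y) = (Expr->var(0), Expr->var(1));
my $expr     = $x * (Expr->const(2) + 3) + $y;
my $fast     = $expr->optimize;
my $compiled = $fast->compile;

# Step 3: the slow evaluator is the baseline the other forms are checked against.
print $expr->evaluate(4, 1), "\n";
print $fast->to_source, "\n";
print $compiled->(4, 1), "\n";
```

Because the tree stays a plain data structure until the last step, recovering a sub-computation is a matter of reading `args`, with no optree introspection needed.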

hobbs 2010-10-05 06:59:07
