What syntax, if any, can take a reference to a builtin like shift?

$shift_ref = $your_magic_syntax_here;

The same way you could for a user-defined sub:

sub test { ... }

$test_ref = \&test;

I've tried the following, none of which work:

\&shift
\&CORE::shift
\&{'shift'}
\&{'CORE::shift'}

Your answer can include XS if needed, but I'd prefer not.

Clarification: I am looking for a general-purpose solution that can obtain a fully functional code reference from any builtin. This coderef could then be passed to any higher-order function, just like a reference to a user-defined sub. The consensus so far seems to be that this is not possible. Does anyone care to disagree?

+3  A: 

No, you can't. What is the underlying problem you are trying to solve? There may be some way to do whatever that is.

Re the added part of the question "Your answer can include XS if needed, but I'd prefer not.", calling builtins from XS is really hard, since the builtins are set up to assume they are running as part of a compiled optree and have some global variables set. Usually it's much easier to call some underlying function that the builtin itself uses, though there isn't always such a function, so you see things like:

buffer = sv_2mortal(newSVpvf("(caller(%d))[3]", (int) frame));
caller = eval_pv(SvPV_nolen(buffer), 1);

(doing a string eval from XS rather than go through the hoops required to directly call pp_caller).

ysth
I cannot tell you how relieved I am that you *can't* do this...
Telemachus
no particular problem, more an academic question than anything else. @Telemachus => why not? it would allow the builtins to be passed to currying functions or any other higher order function without a wrapper that has to deal with conforming to the builtin's prototype, thereby limiting its functionality
Eric Strom
Note that if somebody made a good case for it, it's probably not totally out of the question to add a new C function to the perlapi for a given builtin. Also, I guess a patch would be in order, too.
tsee
@tsee: absolutely
ysth
A: 

No. See this.

newacct
Eric Strom
+1  A: 

You could do this if you overrode the builtin first (which would give you the coderef of your override):

use strict;
use warnings;

BEGIN {
    *CORE::GLOBAL::die = sub { warn "patched die: '$_[0]'"; exit 3 };
}

print "ref to patched die: " . \&CORE::GLOBAL::die . "\n";
die "ack, I am slain";

gives the output:

ref to patched die: CODE(0x1801060)
patched die: 'ack, I am slain' at patch.pl line 5.

BTW: I would appreciate if anyone can explain why the override needs to be done as *CORE::GLOBAL::die rather than *CORE::die. I can't find any references for this. Additionally, why must the override be done in a BEGIN block? The die() call is done at runtime, so why can't the override be done at runtime just prior?

Ether
Isn't there a nicer syntax for overriding built-ins? Or will that not let you take a reference to the overridden version?
Chris Lutz
@Chris: as far as I know, patching the symbol table is the only way.
Ether
this is cleaner than an outright wrapper function, but your patched builtin would still need to contain some sort of switch statement to massage the argument list based on type, since as far as I know it's not possible to bypass a builtin's prototype
Eric Strom
by the way, the "use strict 'refs';" line is redundant since the pragma will fall out of scope at the end of the BEGIN block
Eric Strom
@Eric Good point. `no strict 'refs'` is actually totally unnecessary because of how the override is done; it would only be needed for `*{'CORE::GLOBAL::die'} = ...`
Ether
perlsub (http://perldoc.perl.org/perlsub.html#Overriding-Built-in-Functions) has a section that suggests a nicer way to override built-ins, but not quite to the depth that you override it (which may or may not be a good thing).
Chris Lutz
The GLOBAL is required to scare people off.
ysth
@Eric Strom: erk. overriding a built-in so you can take a reference to it seems like a very poor idea to me.
ysth
@ysth: I think this entire conversation is academic - we're discussing the "how", without any regard to "why". :)
Ether
+1  A: 

I was playing around with general-purpose solutions to this one, and came up with the following dirty hack using eval. It basically uses the prototype to pull apart @_ and then calls the builtin. This has only been lightly tested, and it uses the string form of eval, so some may say it's already broken :-)

use 5.10.0;
use strict;
use warnings;

sub builtin {
    my ($sub, $my, $id) = ($_[0], '');
    my $proto = prototype $sub         //
                prototype "CORE::$sub" //
                $_[1]                  //
                ($sub =~ /map|grep/ ? '&@' : '@;_');
    for ($proto =~ /(\\?.)/g) { $id++;
        if (/(?|(\$|&)|.(.))/) {
            $my  .= "my \$_$id = shift;";
            $sub .= " $1\$_$id,";
        } elsif (/([@%])/) {
            $my  .= "my $1_$id = splice \@_, 0, \@_;";
            $sub .= " $1_$id,";
        } elsif (/_/) {
            $my  .= "my \$_$id = \@_ ? shift : \$_;";
            $sub .= " \$_$id,"
        }
    }
    eval "sub ($proto) {$my $sub}"
        or die "prototype ($proto) failed for '$_[0]', ".
               "try passing a prototype string as \$_[1]"
}

my $shift = builtin 'shift';
my @a = 1..10;
say $shift->(\@a);
say "@a";

my $uc = builtin 'uc';
local $_ = 'goodbye';
say $uc->('hello '), &$uc;

my $time = builtin 'time';
say &$time;

my $map = builtin 'map';
my $reverse = builtin 'reverse';
say $map->(sub{"$_, "}, $reverse->(@a));

my %h = (a=>1, b=>2);
my $keys = builtin 'keys';
say $keys->(\%h);

# which prints
# 1
# 2 3 4 5 6 7 8 9 10
# HELLO GOODBYE
# 1256088298
# 10, 9, 8, 7, 6, 5, 4, 3, 2, 
# ab

Revised per the comments below and refactored.

Eric Strom
You can't call via a code reference in a way that honors the prototype, so it needn't be set (unless you are going to use the reference to override a builtin).
ysth
yep, it's in there for that exact reason
Eric Strom
Might I recommend the `//` (defined-or) operator instead of the `||` operator, if you're using Perl 5.10? `||` will return the wrong result for `prototype "CORE::time"`: since `time()` takes no arguments, its prototype is the empty string, which evaluates false.
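To make the difference concrete, here is a small sketch, assuming Perl 5.10 or later. The variable names and the '@' fallback are only illustrative and are not taken from the answer's code.

```perl
use 5.010;
use strict;
use warnings;

# time takes no arguments, so its prototype is the empty string:
# defined, but false in boolean context.
my $proto = prototype "CORE::time";

my $with_or         = $proto || '@';   # '' is false, so the fallback wins
my $with_defined_or = $proto // '@';   # '' is defined, so it is kept

say "||: '$with_or'";
say "//: '$with_defined_or'";
```

With `||` the fallback silently replaces a perfectly valid empty prototype; with `//` only an undefined prototype triggers the fallback.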
Chris Lutz
@Chris => thanks for catching the bug with 'time'. I haven't gotten into the habit of using // since the apple machines at work only have 5.8...
Eric Strom
+1 because it's a neat hack, even though it doesn't really give you a reference to the built-in functions themselves. It might be a good idea to memoize the results, so that `builtin 'grep' for 0 .. 100` doesn't create 100 anonymous subroutines, but it's still a pretty neat trick. I would make a temporary array and try to `split()` the prototype into it so we don't have to keep track of the index manually, but I'm not sure it really matters.
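A rough sketch of the memoizing idea, assuming Perl 5.10 or later. Here `make_wrapper` is a hypothetical stand-in for the generator in the answer (it only handles builtins that take a plain list), and `cached_builtin`, `%cache` and `$generated` are illustrative names.

```perl
use 5.010;
use strict;
use warnings;

my %cache;          # builtin name => generated coderef
my $generated = 0;  # counts how many wrappers were actually built

# Hypothetical stand-in for the real generator: wraps a builtin
# that accepts a plain list, using a string eval.
sub make_wrapper {
    my ($name) = @_;
    $generated++;
    my $code = eval "sub { $name \@_ }"
        or die "cannot wrap '$name': $@";
    return $code;
}

# Build each wrapper once and hand back the cached coderef afterwards.
sub cached_builtin {
    my ($name) = @_;
    return $cache{$name} //= make_wrapper($name);
}

my @refs = map { cached_builtin('reverse') } 0 .. 100;
say "wrappers generated: $generated";
say join ' ', $refs[0]->(1, 2, 3);
```

All 101 entries in `@refs` are the same coderef, so the string eval runs only once per builtin name.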
Chris Lutz
A: 

You can wrap shift with something that you can reference, but you have to use a prototype to use it, since shift is special.

sub my_shift (\@) { my $ll = shift; return shift @$ll }

The problem is that prototypes only take effect at compile time, on direct calls by name. When perl calls some arbitrary sub through a reference held in a scalar, it cannot know that it needs to take a reference to the array before calling the subroutine.

my @list = (1,2,3,4);

sub my_shift (\@) { my $ll = shift; return shift @$ll }

my $a = shift @list;
my $my_shift_ref = \&my_shift;
my $b = (&{$my_shift_ref}  (\@list) ); # see below

print "a=$a, b=$b\n";

for (my $i = 0; $i <= $#list; ++$i) { print "\$list[$i] = ",$list[$i],"\n"; }

If this is called with just @list instead of \@list, perl barfs, because it can't automagically take the reference the way shift does.
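The three cases can be put side by side in a short sketch. The variable names here are illustrative; the wrapper is the same `my_shift` as above, and the failing call is trapped with eval so the script keeps running.

```perl
use strict;
use warnings;

sub my_shift (\@) { my $array_ref = shift; return shift @$array_ref }

my @list = (1, 2, 3, 4);

# Direct call by name: the (\@) prototype makes perl pass \@list for us.
my $first = my_shift(@list);

# Call through a reference: the prototype is ignored, so pass \@list ourselves.
my $my_shift_ref = \&my_shift;
my $second = $my_shift_ref->(\@list);

# Passing the bare array flattens it; under strict refs the wrapper then
# dies when it tries to dereference the first element as an array.
my $ok = eval { $my_shift_ref->(@list); 1 };

print "first=$first, second=$second, list=@list\n";
print "bare \@list call ", ($ok ? "succeeded\n" : "failed: $@");
```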

See also: Tom Christiansen's article (http://www.perl.com/language/misc/fmproto.html).

Of course, for builtins that aren't special like shift, you can always do

sub my_fork { return fork; }

and then &my_fork all you want.
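As a sketch of how such plain wrappers behave with a higher-order function, here is a small example. `my_uc`, `my_length` and `apply_to_all` are made-up names for illustration only.

```perl
use strict;
use warnings;

# Thin wrappers around builtins that take an ordinary scalar argument.
sub my_uc     { return uc $_[0] }
sub my_length { return length $_[0] }

# A hypothetical higher-order function: apply a coderef to each item.
sub apply_to_all {
    my ($code, @items) = @_;
    return map { $code->($_) } @items;
}

my @upper   = apply_to_all(\&my_uc,     qw(foo bar));
my @lengths = apply_to_all(\&my_length, qw(a bb ccc));

print "@upper\n";    # FOO BAR
print "@lengths\n";  # 1 2 3
```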

ts4z
A: 

As I understand it, you want a coderef that will be called on some data, and it might point either to one of your own functions or to a builtin.

If I'm right, just put the builtin in a closure:

#!/usr/bin/perl -w
use strict;

my $coderef = \&test;
$coderef->( "Test %u\n", 1 );

$coderef = sub { printf @_ };
$coderef->( "Test %u\n", 2 );

exit;

sub test {
    print join(' ', map { "[$_]" } @_) . "\n";
}

Doing it with shift is also possible, but remember that shift without an explicit array operates on a different array depending on where it is called.
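A short sketch of that difference, assuming the code runs at file scope of a script. The variable names and the values placed in @ARGV are illustrative.

```perl
use strict;
use warnings;

# Inside a sub, a bare shift works on @_ ...
my $from_args = sub { shift };

# ... so it returns the first argument and leaves the caller's array alone.
my @data = (1, 2, 3);
my $got = $from_args->(@data);
print "got=$got, data=@data\n";   # got=1, data=1 2 3

# At file scope, outside any sub, a bare shift works on @ARGV instead.
@ARGV = ('first-arg', 'second-arg');
my $from_argv = shift;
print "from_argv=$from_argv, remaining=@ARGV\n";
```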

depesz
A: 

The only way I can get it to work is to make a reference to sub{shift}.

perl -e '@a=(1..3); $f=sub{shift}; print($f->(@a), "\n");'

This is functionally equivalent to:

perl -e '@a=(1..3); print(shift(@a), "\n");'

Which could be just perl -e 'print 1, "\n"' but then we wouldn't be talking about a builtin.

For your information I'm surprised that one cannot reference a builtin, and now that it's been made clear to me I can't help but think of it as a deficiency in Perl.

Update Eric correctly points out that $f=sub{shift}; $f->(@a) leaves @a unchanged. It should be more like:

perl -e '@a=(1..3); $f=sub{shift @{+shift}}; print($f->(\@a), "\n");'

Thanks Eric.

dlamblin
shift is unfortunately trickier than that due to its prototype (take a look at what @a is after the print); your first line wants to be something like "perl -e '@a=(1..3); $f=sub{shift @{+shift}}; print($f->(\@a), "\n");'". I fully agree with you, though, that for the sake of completeness it would be nice to see this fixed in subsequent perls.
Eric Strom
+1  A: 

If you want to see what it takes to fake it in production quality code, look at the code for autodie. The meat is in Fatal. Helps if you're a mad pirate Jedi Australian.

Schwern