
Question

Answer 1

+16 A:

Let's start easy with the Spaceship Operator.

$a = 5 <=> 7;  # $a is set to -1
$a = 7 <=> 5;  # $a is set to 1
$a = 6 <=> 6;  # $a is set to 0

Sec 2008-10-02 12:09:33

That's hardly esoteric!

Leon Timmermans 2008-10-02 12:16:32

@Leon: C/C++ doesn't do a 3-value return for numbers. If memory serves, string compare functions are the only 3-value return that I know of in the whole STL. AFAIK Python doesn't have a 3-value numeric compare. Java doesn't have a number-specific 3-value compare either.

J.J. 2008-10-02 14:53:00

It's worth mentioning what's so useful about -1/0/1 comparison operators, since not everyone might know: you can chain them together with the or-operator to do primary/secondary/etc. sorts. So `($a->lname cmp $b->lname) || ($a->fname cmp $b->fname)` sorts people by their last names, but if two people have the same last name then they will be ordered by their first name.

hobbs 2010-07-26 11:13:22
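
A minimal sketch of the chained comparison described above. The records and field names are made up for illustration, and plain hashes are used instead of objects:

```perl
use strict;
use warnings;

# Hypothetical sample records; the field names are assumptions.
my @people = (
    { lname => 'Smith', fname => 'Zoe'   },
    { lname => 'Jones', fname => 'Mark'  },
    { lname => 'Smith', fname => 'Alice' },
);

# cmp returns 0 for equal last names, so || falls through to first names.
my @sorted = sort {
    ($a->{lname} cmp $b->{lname}) || ($a->{fname} cmp $b->{fname})
} @people;

my @names = map { "$_->{lname}, $_->{fname}" } @sorted;
print "$_\n" for @names;
```

The two Smiths stay together and are ordered by first name.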

@J.J. Python does have a 3-value compare, cmp(): `>>> print (cmp(5,7), cmp(6,6), cmp(7,5))` prints `(-1, 0, 1)`

bukzor 2010-07-29 00:53:32

Answer 2

+14 A:

My vote would go for the (?{}) and (??{}) groups in Perl's regular expressions. The first executes Perl code, ignoring the return value, the second executes code, using the return value as a regular expression.

Leon Timmermans 2008-10-02 12:19:21

perl invented so many regexp extensions that other programs now often use pcre (perl compatible regex) instead of the original regex language.

Sec 2008-10-02 12:36:47

Read the little blurb here: http://perldoc.perl.org/perlfaq6.html#Are-Perl-regexes-DFAs-or-NFAs%3f--Are-they-POSIX-compliant%3f :-D

J.J. 2008-10-02 15:30:12

Perl really has (as far as I know) led the pack when it comes to regexps.

Brad Gilbert 2008-10-02 21:53:00

This, as far as I'm aware, is still experimental, and may not work the same way in future Perls. Not to say that it isn't useful, but a slightly safer and just as useable version can be found in the s/// command's /e flag: `s/(pattern)/reverse($1);/ge;` # reverses all `patterns`.

Chris Lutz 2009-02-09 23:22:38
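
A small sketch of the /e flag mentioned above, with a made-up input string. The explicit scalar() is an addition here so that reverse flips characters regardless of context:

```perl
use strict;
use warnings;

my $text = "hello world";

# /e evaluates the replacement as Perl code; scalar() makes reverse
# flip the characters of the string rather than reverse a list.
(my $flipped = $text) =~ s/(\w+)/scalar reverse($1)/ge;

print "$flipped\n";
```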

Answer 3

+5 A:

There is also $[, the variable that decides at which index arrays start. The default is 0, so arrays start at index 0. By setting

$[=1;

You can make Perl behave more like AWK (or Fortran) if you really want to.

Sec 2008-10-02 12:21:46

Although to quote from the perlvar documentation: "Its use is highly discouraged.". Not many people expect the starting subscript of an array to change.

pjf 2008-10-02 13:17:27

I would only use this feature in a one-liner, if ever.

Brad Gilbert 2008-10-02 22:05:11

be warned: if you do this in a CPAN module, chromatic will find you and spray-paint your car. (this is a joke; chromatic has not been consulted regarding it.)

davidnicol 2008-10-03 21:08:48

Answer 4

+33 A:

The operators ++ and unary - don't only work on numbers, but also on strings.

$_ = "a";
print -$_;

prints -a

print ++$_;

prints b

$_ = 'z';
print ++$_;

prints aa

Leon Timmermans 2008-10-02 12:26:31
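
A short sketch of how the magical string increment carries over, using sample strings of the kind listed in perlop:

```perl
use strict;
use warnings;

my @before = ('a', 'z', 'Az', 'zz', 'a9');

# Copy each string first: ++ modifies its operand in place.
my @after = map { my $s = $_; ++$s } @before;

print "$before[$_] -> $after[$_]\n" for 0 .. $#before;
```

The carry works within letter case and digit ranges, which is why "Az" becomes "Ba" and "a9" becomes "b0".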

To quote perlop: "The auto-decrement operator is not magical." So `--` doesn't work on strings.

moritz 2008-10-02 12:56:18

"aa" doesn't seem to be the natural element following "z". I would expect the next highest ascii value, which is "{".

Ether 2009-02-23 23:08:59

Don't ask a programmer what comes after "z"; ask a human. This feature is great for numbering items in a long list.

Barry Brown 2009-03-18 23:44:59

When new to Perl I implemented this feature myself with the exact z to aa behavior, then showed it to a co-worker who laughed at me and said "let me show you something". I cried a bit but learned something.

Copas 2009-05-30 03:07:25

I wish I had this in C#, great feature

Andrija 2009-06-12 10:51:35

@Ether - If you want that, use numbers and autoconvert them to ASCII characters with `chr()`. Or, write a small class and overload the operators to do it for you.

Chris Lutz 2009-08-23 22:05:33

Answer 5

+38 A:

The flip-flop operator is useful for skipping the first iteration when looping through the records (usually lines) returned by a file handle, without using a flag variable:

while(<$fh>)
{
  next if 1..1; # skip first record
  ...
}

Run perldoc perlop and search for "flip-flop" for more information and examples.

John Siracusa 2008-10-02 12:41:44

Actually that's taken from Awk, where you can do flip-flop between two patterns by writing pattern1, pattern2

Bruno De Fraine 2008-10-02 12:50:13

To clarify, the "hidden" aspect of this is that if either operand to scalar '..' is a constant the value is implicitly compared to the input line number ($.)

Michael Carman 2008-10-02 13:41:10
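
Besides line numbers, the flip-flop also takes patterns on both sides. A minimal sketch, with made-up marker lines and an in-memory list standing in for a file:

```perl
use strict;
use warnings;

my @lines = ('header', 'BEGIN', 'a', 'b', 'END', 'footer');
my @inside;

for (@lines) {
    # True from the line matching the left pattern through the
    # line matching the right pattern, inclusive.
    push @inside, $_ if /^BEGIN$/ .. /^END$/;
}

print "@inside\n";
```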

Answer 6

+8 A:

A bit obscure is the tilde-tilde "operator" which forces scalar context.

print ~~ localtime;

is the same as

print scalar localtime;

and different from

print localtime;

Sec 2008-10-02 12:42:14

This is especially obscure because perl5.10.0 also introduces the "smart match operator" `~~`, which can do regex matches, can look if an item is contained in an array and so on.

moritz 2008-10-02 12:52:46

That's not obscure, that's obfuscated (and useful for golf and JAPHs).

Michael Carman 2008-10-02 13:18:24

This is not correct! ~~ is not safe on references! It stringifies them.

Leon Timmermans 2008-10-02 15:04:21

Well, yes. Stringification is what happens to references when forced into scalar context. How does that make "~~ forces scalar context" incorrect?

Dave Sherohman 2008-10-02 16:00:49

@Nomad Dervish: Scalar context /= stringification. e.g. "$n = @a" is scalar context. "$s = qq'@a'" is stringification. With regard to references, "$ref1 = $ref2" is scalar context, but does not stringify.

Michael Carman 2008-10-02 17:09:09

Answer 7

+22 A:

Binary "x" is the repetition operator:

print '-' x 80;     # print row of dashes

It also works with lists:

print for (1, 4, 9) x 3; # print 149149149

Bruno De Fraine 2008-10-02 12:45:51

This is one reason why Perl has been so popular with hackers. `perl -e 'print 0x000 x 25';`

J.J. 2008-10-02 15:28:39

It also works with lists: print for (1, 4, 9) x 3

dland 2008-10-02 16:47:34

My favorite use for this is generating placeholders for the last part of an SQL INSERT statement: @p = ('?') x $n; $p = join(", ", @p); $sql = "INSERT ... VALUES ($p)";

skiphoppy 2008-10-03 16:20:21
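
The string, list, and placeholder uses above can be sketched together. The table and column names are invented for the example:

```perl
use strict;
use warnings;

my $rule     = '-' x 10;          # string repetition
my @repeated = (1, 4, 9) x 2;     # list repetition

# One '?' placeholder per column; table "t" and its columns are assumptions.
my $n            = 3;
my $placeholders = join ', ', ('?') x $n;
my $sql          = "INSERT INTO t (a, b, c) VALUES ($placeholders)";

print "$rule\n@repeated\n$sql\n";
```

Note the parentheses around the left operand: without them, x repeats a string instead of a list.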

Answer 8

+18 A:

Based on the way the "-n" and "-p" switches are implemented in Perl 5, you can write a seemingly incorrect program including }{:

ls |perl -lne 'print $_; }{ print "$. Files"'

which is converted internally to this code:

LINE: while (defined($_ = <ARGV>)) {
    print $_; }{ print "$. Files";
}

Sec 2008-10-02 12:48:32

Now, this is just silly :-)

Leonardo Herrera 2008-11-22 05:54:25

Sometimes known as the "eskimo greeting", or just "eskimo" ...

martin clayton 2010-02-08 22:55:00

golfing says hello ;)

knittl 2010-07-15 20:08:26

Answer 9

+34 A:

There are many non-obvious features in Perl.

For example, did you know that there can be a space after a sigil?

 $ perl -wle 'my $x = 3; print $ x'
 3

Or that you can give subs numeric names if you use symbolic references?

$ perl -lwe '*4 = sub { print "yes" }; 4->()' 
yes

There's also the "bool" quasi-operator, which returns 1 for true expressions and the empty string for false:

$ perl -wle 'print !!4'
1
$ perl -wle 'print !!"0 but true"'
1
$ perl -wle 'print !!0'
(empty line)

Other interesting stuff: with use overload you can overload string literals and numbers (and for example make them BigInts or whatever).

Many of these things are actually documented somewhere, or follow logically from the documented features, but nonetheless some are not very well known.

Update: Another nice one. Below the q{...} quoting constructs were mentioned, but did you know that you can use letters as delimiters?

$ perl -Mstrict  -wle 'print q bJet another perl hacker.b'
Jet another perl hacker.

Likewise you can write regular expressions:

m xabcx
# same as m/abc/

moritz 2008-10-02 12:50:58

“Did you know that there can be a space after a sigil?” I am utterly flabbergasted. Wow.

Aristotle Pagaltzis 2008-10-02 19:00:26

Cool! !!$undef_var doesn't create a warning.

Axeman 2008-10-02 20:40:09

I think your example of using letters to delimit strings should be "_Just_ another perl hacker" rather than "Jet another perl hacker" =P

Chris Lutz 2009-08-23 21:58:37

The worst part is that you can use other things as delimiters, too. Even closing brackets. The following are valid: s}regex}replacement}xsmg; q]string literal];

Ryan Thompson 2009-10-27 22:03:13

LOL @ `4->()`...

j_random_hacker 2010-02-16 07:58:47

Answer 10

+11 A:

The null filehandle diamond operator <> has its place in building command line tools. It acts like <FH> to read from a handle, except that it magically selects whichever is found first: command line filenames or STDIN. Taken from perlop:

while (<>) {
...   # code for each line
}

spoulson 2008-10-02 13:06:18

It also follows the UNIX semantics of using "-" to mean "read from stdin". So you could do `perl myscript.pl file1.txt - file2.txt`, and perl would process the first file, then stdin, then the second file.

Ryan Thompson 2009-10-27 22:28:50

Answer 11

+11 A:

Special code blocks such as BEGIN, CHECK and END. They come from Awk, but work differently in Perl, because it is not record-based.

The BEGIN block can be used to specify some code for the parsing phase; it is also executed when you do the syntax-and-variable-check perl -c. For example, to load in configuration variables:

BEGIN {
    eval {
        require 'config.local.pl';
    };
    if ($@) {
        require 'config.default.pl';
    }
}

Bruno De Fraine 2008-10-02 13:16:24
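
To see that BEGIN really runs during parsing, here is a small sketch in which the BEGIN block appears later in the file than an ordinary statement yet runs first. The variable name is just for illustration:

```perl
use strict;
use warnings;

our @order;

push @order, 'runtime statement';

# Runs as soon as the parser has seen it, before any runtime code.
BEGIN { push @order, 'BEGIN block' }

print join(' -> ', @order), "\n";
```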

You forgot INIT.

Axeman 2008-10-03 06:00:30

Answer 12

+34 A:

One of my favourite features in Perl is using the boolean || operator to select between a set of choices.

 $x = $a || $b;

 # $x = $a, if $a is true.
 # $x = $b, otherwise

This means one can write:

 $x = $a || $b || $c || 0;

to take the first true value from $a, $b, and $c, or a default of 0 otherwise.

In Perl 5.10, there's also the // operator, which returns the left hand side if it's defined, and the right hand side otherwise. The following selects the first defined value from $a, $b, $c, or 0 otherwise:

 $x = $a // $b // $c // 0;

These can also be used with their short-hand forms, which are very useful for providing defaults:

 $x ||= 0;   # If $x was false, it now has a value of 0.

 $x //= 0;   # If $x was undefined, it now has a value of zero.

Cheerio,

Paul

pjf 2008-10-02 13:23:29
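
The difference between the two operators shows up when a value is defined but false. A minimal sketch, with a made-up config hash:

```perl
use strict;
use warnings;
use 5.010;   # // needs Perl 5.10 or later

my %config = (retries => 0);   # hypothetical settings

my $with_or  = $config{retries} || 3;    # 0 is false, so the default wins
my $with_dor = $config{retries} // 3;    # 0 is defined, so it is kept
my $missing  = $config{timeout} // 30;   # undef, so the default is used

say "$with_or $with_dor $missing";
```

So || is the wrong tool whenever 0 or the empty string is a legitimate value.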

These operators are a godsend for reducing boilerplate code

spoulson 2008-10-02 13:26:13

This is such a common idiom that it hardly qualifies as a "hidden" feature.

Michael Carman 2008-10-02 13:28:18

shame the pretty printer thinks // is a comment :)

John Ferguson 2008-10-02 14:31:48

Question: is there a "use feature" to use these new operators, or are they enabled by default? I am still learning Perl 5.10's features.

J.J. 2008-10-02 15:34:28

// is in there by default, no special tweaks needed. You can also backport it into 5.8.x with the dor-patch... see the authors/id/H/HM/HMBRAND/ directory on any CPAN mirror. FreeBSD 6.x and beyond does this for you in their perl package.

dland 2008-10-02 16:46:33

When || or // is combined with do { }, you can encapsulate a more complex assignment, i.e. $x = $a || do { my $z; 3 or 4 lines of derivation; $z };

RET 2008-10-03 01:40:02

@J.J. to clarify, the `feature` pragma guards access to new *keywords* that might otherwise step on user-defined subs -- like `say`, `state`, and `given`/`when`. Since that issue doesn't apply to symbolic operators like `//` and `~~` they're accessible even without `feature` (but you might want to throw in a `use 5.010` or similar declaration on code that uses them, so as to produce a more *useful* error message if someone tries to run that code on older perls).

hobbs 2010-07-26 11:05:56

Answer 13

+10 A:

The m// operator has some obscure special cases:

If you use ? as the delimiter it only matches once unless you call reset.
If you use ' as the delimiter the pattern is not interpolated.
If the pattern is empty it uses the pattern from the last successful match.

Michael Carman 2008-10-02 13:25:59

These are more like hidden gotchas than hidden features! I don't know anyone who likes them. A thread on p5p some time back discussed the usefulness of a putative m/$foo/r flag, where /r would mean no interpolation (the letter isn't important) since no-one can ever remember the single quotes thing.

dland 2008-10-02 16:43:03

@dland: Agreed; I'd call these hidden *mis*features and would never use them in production code.

Michael Carman 2008-10-02 16:59:56

I can't imagine a Perl programmer being unable to remember (or even guess) that single quotes stand for no interpolation. Their usage with this semantics is so nearly universal in the language that I'd rather _expect_ this to be so...

sundar 2008-10-03 17:53:37

and if the pattern is empty and the last successful match was compiled with the /o modifier, from then on it will be stuck on that pattern.

davidnicol 2008-10-03 21:11:41

I think the empty pattern behaviour has been deprecated. Primarily because a pattern like m/$foo/ becomes a nasty bug when $foo is empty.

Matthew S 2010-05-18 05:12:21

Answer 14

+34 A:

As Perl has almost all "esoteric" parts from the other lists, I'll tell you the one thing that Perl can't:

~~The one thing Perl can't do is have bare arbitrary URLs in your code, because the // operator is used for regular expressions.~~

Just in case it wasn't obvious to you what features Perl offers, here's a selective list of the maybe not totally obvious entries:

Duff's Device - in Perl

Portability and Standardness - There are likely more computers with Perl than with a C compiler

A file/path manipulation class - File::Find works on even more operating systems than .Net does

Quotes for whitespace delimited lists and strings - Perl allows you to choose almost arbitrary quotes for your list and string delimiters

Aliasable namespaces - Perl has these through glob assignments:

*My::Namespace:: = \%Your::Namespace::;

Static initializers - Perl can run code in almost every phase of compilation and object instantiation, from BEGIN (code parse) to CHECK (after code parse) to import (at module import) to new (object instantiation) to DESTROY (object destruction) to END (program exit)

Functions are First Class citizens - just like in Perl

Block scope and closure - Perl has both

Calling methods and accessors indirectly through a variable - Perl does that too:

my $method = 'foo';
my $obj = My::Class->new();
$obj->$method( 'baz' ); # calls $obj->foo( 'baz' )

Defining methods through code - Perl allows that too:

*foo = sub { print "Hello world" };

Pervasive online documentation - Perl documentation is online and likely on your system too

Magic methods that get called whenever you call a "nonexisting" function - Perl implements that in the AUTOLOAD function

Symbolic references - you are well advised to stay away from these. They will eat your children. But of course, Perl allows you to offer your children to blood-thirsty demons.

One line value swapping - Perl allows list assignment

Ability to replace even core functions with your own functionality

use subs 'unlink'; 
sub unlink { print 'No.' }

or

BEGIN{
    *CORE::GLOBAL::unlink = sub {print 'no'}
};

unlink($_) for @ARGV

Corion 2008-10-02 13:27:11
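
Two of the items above, defining a method through code and calling it through a variable, can be combined in one sketch. The class and method names are hypothetical:

```perl
use strict;
use warnings;

{
    package My::Greeter;   # hypothetical class for the example
    sub new { my ($class) = @_; return bless {}, $class }
}

{
    no warnings 'once';
    # Install a method at runtime by assigning a code ref to a glob.
    *My::Greeter::hello = sub {
        my ($self, $name) = @_;
        return "Hello, $name";
    };
}

my $method   = 'hello';
my $obj      = My::Greeter->new;
my $greeting = $obj->$method('world');   # same as $obj->hello('world')

print "$greeting\n";
```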

I'm a fan of Perl's documentation compared to other languages, but I still think that for Regexes and references it could be rationalised a whole lot. e.g. the best primer for regexes is not Perlre, but Perlop

John Ferguson 2008-10-02 14:43:52

John: Have you read perlrequick and perlretut?

Michael Carman 2008-10-02 15:02:45

"The one thing Perl can't do is have bare arbitrary URLs in your code, because the // operator is used for regular expressions." - this is utter nonsense.

2008-10-12 08:58:00

Thanks for your insight. I've looked at some ways to have a bare http://... URL in Perl code without using a source filter, and didn't find a way. Maybe you can show how this is possible? // is used for regular expressions in Perl versions up to 5.8.x. In 5.10 it's repurposed for defined-or assignment.

Corion 2008-10-13 14:23:54

bare URL: why would you expect http://example.com/ to be a single token, let alone a string, in *any* language? (besides one delimited solely by whitespace and parenthetheth :)

Hugh Allen 2008-10-21 09:48:15

Why/where would you *want* bare URLs in your code? I can't think of an example.

castaway 2009-07-08 06:57:08

Nobody would want that, it's just a Java meme. "http://foo.com" is the label http: and then "foo.com" in a comment. Some people find this interesting because... they are dumb.

jrockway 2009-10-10 18:44:16

Answer 15

+18 A:

This is a meta-answer, but the Perl Tips archives contain all sorts of interesting tricks that can be done with Perl. The archive of previous tips is on-line for browsing, and can be subscribed to via mailing list or atom feed.

Some of my favourite tips include building executables with PAR, using autodie to throw exceptions automatically, and the use of the switch and smart-match constructs in Perl 5.10.

Disclosure: I'm one of the authors and maintainers of Perl Tips, so I obviously think very highly of them. ;)

pjf 2008-10-02 13:30:52

It's probably one of the best documented languages out there, and it set the pattern for tools to search documentation. So the list in this question is probably not as needed as for other languages.

Axeman 2008-10-02 21:40:54

autodie looks very nice.

j_random_hacker 2010-02-16 08:33:09

Answer 16

+32 A:

Autovivification. AFAIK no other language has it.

J.J. 2008-10-02 13:48:56
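
For anyone who hasn't met it, a minimal sketch of autovivification, including the side effect that sometimes surprises people. The hash names and keys are invented:

```perl
use strict;
use warnings;

my %tree;
# The intermediate hashes spring into existence on assignment.
$tree{fruit}{apple}{count} = 3;
my $is_hash = (ref($tree{fruit}{apple}) eq 'HASH');

my %seen;
# Merely looking through a missing key creates it as well.
my $found       = exists $seen{alice}{email};   # false
my $side_effect = exists $seen{alice};          # true: created by the lookup

print "vivified by read\n" if $side_effect && !$found;
```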

I had no idea that Python, et al, didn't support this.

skiphoppy 2008-10-03 16:17:42

ECMAscript autovivs.

davidnicol 2008-10-03 21:01:07

@davidnicol: Really? Can you provide a link? My quick search on google didn't return anything. For those that don't know ECMAscript is the correct name for Javascript. http://en.wikipedia.org/wiki/ECMAScript

J.J. 2008-10-04 09:25:52

I thought I would miss this more when I moved to Python, but I think it's a blessing in disguise there.

Gregg Lind 2009-03-05 22:56:09

And there is a module to disable autovivification

Alexandr Ciornii 2009-06-29 10:06:20

@Gregg Lind - Given that Python automatically creates variables whenever you first assign to them, autovivification would create monstrous problems out of a single typo.

Chris Lutz 2009-08-23 22:04:07

I find myself relieved that this feature has been confined to perl.

Omnifarious 2009-10-31 08:31:52

Answer 17

+12 A:

while(/\G(\b\w*\b)/g) {
     print "$1\n";
}

the \G anchor. It's hot.

J.J. 2008-10-02 14:25:00
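
One common use of \G is a small tokenizer, where each match has to start exactly where the previous one stopped. A rough sketch, with made-up token classes and the /c flag so that a failed match keeps the position:

```perl
use strict;
use warnings;

my $input = 'abc123def';
my @tokens;

pos($input) = 0;   # start matching at the beginning
while (pos($input) < length $input) {
    # \G anchors at pos(); /c keeps pos() when a match fails.
    if    ($input =~ /\G([a-z]+)/gc) { push @tokens, "WORD:$1" }
    elsif ($input =~ /\G(\d+)/gc)    { push @tokens, "NUM:$1"  }
    else                             { last }   # unknown character
}

print "@tokens\n";
```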

...and it indicates the position of the end of the previous match.

Dave Sherohman 2008-10-02 16:05:42

But you have to call your regex in scalar context.

davidnicol 2008-10-03 21:03:20

@davidnicol: The above code works. Can you clarify what you mean?

J.J. 2008-10-04 09:27:36

Answer 18

+21 A:

New Block Operations

I'd say the ability to expand the language, creating pseudo block operations is one.

You declare the prototype for a sub indicating that it takes a code reference first:

sub do_stuff_with_a_hash (&\%) {
    my ( $block_of_code, $hash_ref ) = @_;
    while ( my ( $k, $v ) = each %$hash_ref ) { 
        $block_of_code->( $k, $v );
    }
}

You can then call it in the body like so

use feature 'say';
use Data::Dumper;

do_stuff_with_a_hash {
    local $Data::Dumper::Terse = 1;
    my ( $k, $v ) = @_;
    say qq(Hey, the key   is "$k"!);
    say sprintf qq(Hey, the value is "%s"!), Dumper( $v );
} %stuff_for;

(Data::Dumper::Dumper is another semi-hidden gem.) Notice how you don't need the sub keyword in front of the block, or the comma before the hash. It ends up looking a lot like: map { } @list
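
A self-contained sketch of the same (&\%) prototype idea that returns values instead of printing. The sub name each_pair and the sample hash are hypothetical:

```perl
use strict;
use warnings;

# & lets the caller pass a bare block; \% passes the hash by reference.
sub each_pair (&\%) {
    my ( $code, $hash_ref ) = @_;
    my @results;
    for my $key ( sort keys %$hash_ref ) {
        push @results, $code->( $key, $hash_ref->{$key} );
    }
    return @results;
}

my %price_of = ( apple => 3, pear => 5 );

my @lines = each_pair {
    my ( $k, $v ) = @_;
    "$k costs $v";
} %price_of;

print "$_\n" for @lines;
```

The sub has to be declared before the call is compiled, otherwise the prototype is not applied.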

Source Filters

Also, there are source filters. Where Perl will pass you the code so you can manipulate it. Both this, and the block operations, are pretty much don't-try-this-at-home type of things.

I have done some neat things with source filters, for example like creating a very simple language to check the time, allowing short Perl one-liners for some decision making:

perl -MLib::DB -MLib::TL -e 'run_expensive_database_delete() if $hour_of_day < AM_7';

Lib::TL would just scan for both the "variables" and the constants, create them and substitute them as needed.

Again, source filters can be messy, but are powerful. But they can mess debuggers up something terrible--and even warnings can be printed with the wrong line numbers. I stopped using Damian's Switch because the debugger would lose all ability to tell me where I really was. But I've found that you can minimize the damage by modifying small sections of code, keeping them on the same line.

Signal Hooks

It's often enough done, but it's not all that obvious. Here's a die handler that piggy backs on the old one.

my $old_die_handler = $SIG{__DIE__};
$SIG{__DIE__}       
    = sub { say q(Hey! I'm DYIN' over here!); goto &$old_die_handler if $old_die_handler; }
    ;

That means whenever some other module in the code wants to die, they gotta come to you (unless someone else does a destructive overwrite on $SIG{__DIE__}). And you can be notified that somebody thinks something is an error.

Of course, for enough things you can just use an END { } block, if all you want to do is clean up.

`overload::constant`

You can inspect literals of a certain type in packages that include your module. For example, if you use this in your import sub:

overload::constant 
    integer => sub { 
        my $lit = shift;
        return $lit > 2_000_000_000 ? Math::BigInt->new( $lit ) : $lit 
    };

it will mean that every integer greater than 2 billion in the calling packages will get changed to a Math::BigInt object. (See overload::constant).

Grouped Integer Literals

While we're at it. Perl allows you to break up large numbers into groups of three digits and still get a parsable integer out of it. Note 2_000_000_000 above for 2 billion.

Axeman 2008-10-02 14:31:18

When using $SIG{__DIE__} handlers, it's strongly recommended that you inspect $^S to see if your program is actually dying, or just throwing an exception which is going to be caught. Usually you don't want to interfere with the latter.

pjf 2008-10-02 22:25:24

Yep. I do that generally. And that's a good point.

Axeman 2008-10-03 05:58:26

The new block is very instructive! I was thinking it was a language semantic! Many thanks.

ZeroCool 2010-01-03 19:40:17

Answer 19

+1 A:

Axeman reminded me of how easy it is to wrap some of the built-in functions.

Before Perl 5.10 Perl didn't have a pretty print(say) like Python.

So in your local program you could do something like:

sub print {
     print @_, "\n";
}

or add in some debug.

sub print {
    exists $ENV{DEVELOPER} ?
    print Dumper(@_) :
    print @_;
}

J.J. 2008-10-02 15:04:15

It's also very easy to accidentally change the context! Your print subroutine (using . to concatenate) will print the number of items to be printed, rather than the items themselves. Using `print @_, "\n"` (note the comma) will preserve the context.

pjf 2008-10-02 22:22:29

:-D tks for the clarification. I will edit accordingly. :-D teach me to write code without running it :-P

J.J. 2008-10-03 01:55:30

Er...except I don't think you can override print this way. Most other builtins I think you can, but not print. :|

Robert P 2009-04-03 22:57:04

This is true - `print` cannot be overridden.

Chris Lutz 2009-08-25 00:16:02

@Chris: It's true print cannot be overridden in this manner. But not that it cannot be overridden absolutely. Using some sub modules of B and some tricks you can find on http://perlmonks.com you can override it.

J.J. 2009-09-09 17:23:34

Answer 20

+25 A:

It's simple to quote almost any kind of strange string in Perl.

my $url = q{http://my.url.com/any/arbitrary/path/in/the/url.html};

In fact, the various quoting mechanisms in Perl are quite interesting. The Perl regex-like quoting mechanisms allow you to quote anything, specifying the delimiters. You can use almost any special character like #, /, or open/close characters like (), [], or {}. Examples:

my $var  = q#some string where the pound is the final escape.#;
my $var2 = q{A more pleasant way of escaping.};
my $var3 = q(Others prefer parens as the quote mechanism.);

Quoting mechanisms:

q : literal quote; the only character that needs to be escaped is the end character.

qq : an interpreted quote; processes variables and escape characters. Great for strings that you need to quote:

my $var4 = qq{This "$mechanism" is broken.  Please inform "$user" at "$email" about it.};

qx : Works like qq, but then executes it as a system command, non interactively. Returns all the text generated from the standard out. (Redirection, if supported in the OS, also comes out) Also done with back quotes (the ` character).

my $output  = qx{type "$path"};      # get just the output
my $moreout = qx{type "$path" 2>&1}; # get stuff on stderr too

qr : Interprets like qq, but then compiles it as a regular expression. Works with the various options on the regex as well. You can now pass the regex around as a variable:

sub MyRegexCheck {
    my ($string, $regex) = @_;
    if ($string)
    {
       return ($string =~ $regex);
    }
    return; # returns 'null' or 'empty' in every context
}

my $regex = qr{http://[\w]+\.com/([\w]+/)+};
@results = MyRegexCheck(q{http://myurl.com/subpath1/subpath2/}, $regex);

qw : A very, very useful quote operator. Turns a quoted set of whitespace separated words into a list. Great for filling in data in a unit test.


   my @allowed = qw(A B C D E F G H I J K L M N O P Q R S T U V W X Y Z { });
   my @badwords = qw(WORD1 word2 word3 word4);
   my @numbers = qw(one two three four 5 six seven); # works with numbers too
   my @list = ('string with space', qw(eight nine), "a $var"); # works in other lists
   my $arrayref = [ qw(and it works in arrays too) ];

It's great to use them whenever it makes things clearer. For qx, qq, and q, I most likely use the {} delimiters. The most common habit of people using qw is usually the () delimiters, but sometimes you also see qw//.

Robert P 2008-10-02 16:43:04
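
Since qr objects are ordinary values, they can also be interpolated into larger patterns and keep their own flags. A small sketch with invented sample data:

```perl
use strict;
use warnings;

my $word = qr/[a-z]+/i;               # the /i flag travels with the object
my $pair = qr/^($word)=($word)\z/;    # built from the smaller pattern

my ( $key, $value ) = 'Color=Blue' =~ $pair;

print "$key is $value\n";
```

The outer pattern has no /i of its own, yet the capital letters still match because the flag is embedded in $word.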

I sometimes use qw"" so that syntax highlighters will highlight it correctly.

Brad Gilbert 2008-10-02 22:47:01

Works for me in SlickEdit. :)

Robert P 2009-01-15 00:35:40

@Brad: Use a better editor! Works for me on vim.

fengshaun 2010-06-09 06:12:11

@fengshaun, The editors I generally use **do** highlight these correctly. I was referring, in part to the syntax highlighter on StackOverflow.

Brad Gilbert 2010-06-13 22:40:10

Answer 21

+24 A:

The quoteword operator is one of my favourite things. Compare:

my @list = ('abc', 'def', 'ghi', 'jkl');

and

my @list = qw(abc def ghi jkl);

Much less noise, easier on the eye. Another really nice thing about Perl, that one really misses when writing SQL, is that a trailing comma is legal:

print 1, 2, 3, ;

That looks odd, but not if you indent the code another way:

print
    results_of_foo(),
    results_of_xyzzy(),
    results_of_quux(),
    ;

Adding an additional argument to the function call does not require you to fiddle around with commas on previous or trailing lines. The single line change has no impact on its surrounding lines.

This makes it very pleasant to work with variadic functions. This is perhaps one of the most under-rated features of Perl.

dland 2008-10-02 16:54:51

An interesting corner case of Perl's syntax is that the following is valid: for $_ qw(a list of stuff) {...}

ephemient 2008-10-03 03:21:26

You can even abuse glob syntax for quoting words, as long as you don't use special characters such as *?. So you can write `for (<a list of stuff>) { ... }`

moritz 2008-10-07 12:24:47

@ephemient: nearly. That only works with lexicals: for my $x qw(a b c) {...} For instance: for $_ qw(a b c) {print} # prints nothing

dland 2008-10-08 07:43:55

why add that extra lexical when you can enjoy perl's favourite default? for (qw/a b c d/) { print; }

fengshaun 2010-06-09 06:08:37

@fengshaun: it depends on how large the block is. A named lexical helps document the narrative. I tend to only use an implicit $_ as a statement modifier: print for qw(a b c);

dland 2010-07-01 14:53:07

Answer 22

+20 A:

Taint checking. With taint checking enabled, perl will die (or warn, with -t) if you try to pass tainted data (roughly speaking, data from outside the program) to an unsafe function (opening a file, running an external command, etc.). It is very helpful when writing setuid scripts or CGIs or anything where the script has greater privileges than the person feeding it data.

Magic goto. goto &sub does an optimized tail call.

The debugger.

use strict and use warnings. These can save you from a bunch of typos.

Glomek 2008-10-02 17:00:49

Why don't other languages have this feature? Used properly, it makes Perl web scripts an order of magnitude more secure.

Matthew Lock 2009-12-03 06:25:37

+1 for 'taint' checking. teehee!

temp2290 2010-01-29 16:46:04

Answer 23

+24 A:

The "for" statement can be used the same way "with" is used in Pascal:

for ($item)
{
    s/&nbsp;/ /g;
    s/<.*?>/ /g;
    $_ = join(" ", split(" ", $_));
}

You can apply a sequence of s/// operations, etc. to the same variable without having to repeat the variable name.

2008-10-02 17:11:11
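
The reason this works is that for aliases $_ to the variable, so the substitutions change the original. A runnable sketch with a made-up input string:

```perl
use strict;
use warnings;

my $item = "  <b>Hello</b>   world ";

for ($item) {       # $_ is an alias for $item inside the block
    s/<.*?>/ /g;    # replace tags with a space
    s/^\s+//;       # trim leading whitespace
    s/\s+$//;       # trim trailing whitespace
    s/\s+/ /g;      # collapse runs of whitespace
}

print "[$item]\n";
```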

And "map" does the same trick as well... map { .... } $item; One advantage of using "for" over "map" would be that you could use next to break out.

draegtun 2008-10-02 19:39:45

Also, for has the item being manipulated listed before the code doing the manipulating, leading to better readability.

Robert P 2009-04-03 22:59:31

Answer 24

+10 A:

rename("$_.part", $_) for "data.txt";

renames data.txt.part to data.txt without having to repeat myself.

2008-10-02 17:12:40

Answer 25

+5 A:

sub load_file
{
    local(@ARGV, $/) = shift;
    <>;
}

and a version that returns an array as appropriate:

sub load_file
{
    local @ARGV = shift;
    local $/ = wantarray? $/: undef;
    <>;
}

2008-10-02 17:15:17
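
What makes the above work is that local resets $/ to undef (slurp mode) and <> reads whatever is in @ARGV. A more spelled-out sketch of the scalar version, using a temporary file so it can run on its own; the sub name slurp_file is made up:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

sub slurp_file {
    my ($path) = @_;
    local @ARGV = ($path);   # <> reads the files named in @ARGV
    local $/;                # undef: read the whole file in one go
    return scalar(<>);
}

# Create a throwaway file to read back.
my ( $fh, $path ) = tempfile( UNLINK => 1 );
print {$fh} "line one\nline two\n";
close $fh;

my $content = slurp_file($path);
print $content;
```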

Answer 26

+37 A:

Add support for compressed files via magic ARGV:

s{ 
    ^            # make sure to get whole filename
    ( 
      [^'] +     # at least one non-quote
      \.         # extension dot
      (?:        # now either suffix
          gz
        | Z 
       )
    )
    \z           # through the end
}{gzcat '$1' |}xs for @ARGV;

(quotes around $1 necessary to handle filenames with shell metacharacters in)

Now the <> feature will decompress any @ARGV files that end with ".gz" or ".Z":

while (<>) {
    print;
}

2008-10-02 17:23:05

This is cool on many different levels...

Leonardo Herrera 2008-11-22 05:51:49

I don't think you need to escape the `|` in the replacement.

Chris Lutz 2009-08-23 21:56:51

I'm staring at this and I can't figure out how it works. At what point is `zcat |` parsed as a command to pipe through?

Ether 2010-06-03 03:44:55

@Ether => detecting pipes is a feature of the two argument open, which the diamond operator uses as it opens each file in `@ARGV`

Eric Strom 2010-06-05 06:29:15

Answer 27

+23 A:

Not really hidden, but many every day Perl programmers don't know about CPAN. This especially applies to people who aren't full time programmers or don't program in Perl full time.

mpeters 2008-10-02 17:25:18

Answer 28

+24 A:

The ability to parse data directly pasted into a DATA block. No need to save to a test file to be opened in the program or similar. For example:

my @lines = <DATA>;
for (@lines) {
    print if /bad/;
}

__DATA__
some good data
some bad data
more good data 
more good data

2008-10-02 17:57:38

This is very ugly.

Hai 2010-01-06 21:12:19

And very useful in little tests!

fengshaun 2010-06-09 06:13:07

@peter mortensen how would you have multiple blocks? And how do you end a block?

Toad 2010-10-01 17:55:26

@Toad: it is allan's answer (see the revision list). It is better to address that user. Or, as that user has left Stack Overflow, maybe address no one in particular (so a real Perl expert can straighten it out later).

Peter Mortensen 2010-10-01 18:25:38

Answer 29

+5 A:

Safe compartments.

With the Safe module you can build your own sandbox-style environment using nothing but perl. You would then be able to load perl scripts into the sandbox.
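
A minimal sketch of what that looks like, using only Safe->new and reval from the core Safe module. The code strings are made up for the demonstration, and the default operator mask is assumed:

```perl
use strict;
use warnings;
use Safe;

# A compartment with the default (restrictive) operator mask.
my $compartment = Safe->new;

# Plain arithmetic is allowed inside the compartment.
my $sum = $compartment->reval('2 + 3');
print "sum: $sum\n";

# Operations such as system() are trapped: reval returns undef and
# leaves the reason in $@.
my $result = $compartment->reval('system("echo should not run")');
my $error  = $@;
print "blocked: $error" if $error;
```

The Safe documentation itself warns that a compartment is not a complete security boundary, so treat this as a convenience rather than a guarantee.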

Best regards,

melo 2008-10-02 18:03:06

Answer 30

+1 A:

@Corion - Bare URLs in Perl? Of course you can, even in interpolated strings. The only time it would matter is in a string that you were actually USING as a regular expression.

2008-10-02 19:48:08

It comes from a joke where, in C++, you could embed, raw and without quotes or comments, a URL in your program: `http://www.example.com` (the `http:` is a label, and the `//` makes the rest a comment). This is what everyone is referring to.

Chris Lutz 2009-08-25 00:17:26

Answer 31

+5 A:

Core IO::Handle module. Most important thing for me is that it allows autoflush on filehandles. Example:

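use IO::Handle;
open my $log, '>>', 'app.log' or die $!;   # 'app.log' is a placeholder name
$log->autoflush(1);                        # flush after every write to $log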

Alexandr Ciornii 2008-10-02 19:53:54

Answer 32

+1 A:

Showing progress in the script by printing on the same line:

$| = 1; # enable autoflush: flush after every print to the selected handle

for $i(1..100) {
    print "Progress $i %\r"
}

Taveren 2008-10-03 10:25:16

Answer 33

+16 A:

map - not only because it makes one's code more expressive, but because it gave me an impulse to read a little bit more about this "functional programming".
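
For anyone who hasn't used it, a small sketch of the two most common shapes (the word list is just sample data):

```perl
use strict;
use warnings;

my @words = qw(apple banana cherry);

# Transform every element: one expression instead of a loop with push.
my @lengths = map { length($_) } @words;

# map can also build a hash by returning a key/value pair per element.
my %length_of = map { $_ => length($_) } @words;

print "@lengths\n";
print "banana has $length_of{banana} letters\n";
```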

brunorc 2008-10-03 15:04:33

Answer 34

+3 A:

How about the ability to use

my @symbols = map { +{ 'key' => $_ } } @things;

to generate an array of hashrefs from an array -- the + in front of the hashref disambiguates the block so the interpreter knows that it's a hashref and not a code block. Awesome.

(Thanks to Dave Doyle for explaining this to me at the last Toronto Perlmongers meeting.)

talexb 2008-10-03 16:43:33

Answer 35

+7 A:

I don't know how esoteric it is, but one of my favorites is the hash slice. I use it for all kinds of things. For example to merge two hashes:

my %number_for = (one => 1, two => 2, three => 3);
my %your_numbers = (two => 2, four => 4, six => 6);
@number_for{keys %your_numbers} = values %your_numbers;
print sort values %number_for; # 12346

2008-10-03 21:09:09

%number_for = ( %number_for, %your_numbers );

ijw 2009-11-18 13:36:45

Answer 36

+8 A:

tie, the variable tying interface.
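
A minimal sketch of a tied scalar, assuming a made-up CountingScalar class that counts how often the variable is read. TIESCALAR, FETCH and STORE are the method names the tie interface expects; everything else here is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical class for illustration: a scalar that counts its reads.
package CountingScalar;

sub TIESCALAR { my ($class, $value) = @_; return bless { value => $value, reads => 0 }, $class }
sub FETCH     { my ($self) = @_; $self->{reads}++; return $self->{value} }
sub STORE     { my ($self, $value) = @_; $self->{value} = $value }

package main;

# tie returns the underlying object, kept here for inspection.
my $object = tie my $answer, 'CountingScalar', 42;

my $copy  = $answer;     # calls FETCH
$answer   = 43;          # calls STORE
my $again = $answer;     # calls FETCH again

print "value: $again, reads: $object->{reads}\n";
```

The calling code only ever sees an ordinary-looking scalar, which is what makes modules like Tie::File feel so natural to use.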

davidnicol 2008-10-03 21:15:27

Tie::File saved my day once !

mhd 2009-03-03 11:58:58

Answer 37

+12 A:

The continue clause on loops. It is executed at the end of every iteration, even those that are cut short with next (but not when the loop is exited with last).

while( <> ){
  print "top of loop\n";
  chomp;

  next if /next/i;
  last if /last/i;

  print "bottom of loop\n";
}continue{
  print "continue\n";
}

2008-10-04 02:29:35

Answer 38

+3 A:

All right. Here is another. Dynamic Scoping. It was talked about a little in a different post, but I didn't see it here on the hidden features.

Dynamic scoping, like autovivification, is supported by very few languages. Perl and Common Lisp are the only two I know of that use dynamic scoping.
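
A small sketch of the difference in practice. The sub and variable names are invented for the example; the point is that a localized value is visible to every sub called during its lifetime, while a lexical one is not:

```perl
use strict;
use warnings;

our $level = "global";               # package variable, eligible for local

sub show { return "level=$level" }   # sees whatever $level holds at call time

sub with_local {
    local $level = "localized";      # temporary value, visible to callees
    return show();
}

sub with_my {
    my $level = "my copy";           # new lexical; show() never sees it
    return show();
}

print with_local(), "\n";
print with_my(), "\n";
print show(), "\n";                  # the localized value is gone again
```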

J.J. 2008-10-05 15:12:40

Ohh ya, "local" designates a dynamically scoped variable, while "my" designates a variable in a static scope.

J.J. 2008-10-05 15:14:54

Answer 39

+3 A:

My favorite semi-hidden feature of Perl is the eof function. Here's an example pretty much directly from perldoc -f eof that shows how you can use it to reset the file name and $. (the current line number) easily across multiple files loaded up at the command line:

while (<>) {
  print "$ARGV:$.\t$_";
} 
continue {
  close ARGV if eof
}

Telemachus 2008-10-10 02:11:14

Answer 40

+2 A:

I'm a bit late to the party, but a vote for the built-in tied-hash function dbmopen() -- it's helped me a lot. It's not exactly a database, but if you need to save data to disk it takes away a lot of the problems and Just Works. It helped me get started when I didn't have a database, didn't understand Storable.pm, but I knew I wanted to progress beyond reading and writing to text files.

AmbroseChapel 2008-10-11 23:07:37

Answer 41

+7 A:

The "desperation mode" of Perl's loop control constructs which causes them to look up the stack to find a matching label allows some curious behaviors which Test::More takes advantage of, for better or worse.

SKIP: {
    skip() if $something;

    print "Never printed";
}

sub skip {
    no warnings "exiting";
    last SKIP;
}

There's the little known .pmc file. "use Foo" will look for Foo.pmc in @INC before Foo.pm. This was intended to allow compiled bytecode to be loaded first, but Module::Compile takes advantage of this to cache source filtered modules for faster load times and easier debugging.

The ability to turn warnings into errors.

local $SIG{__WARN__} = sub { die @_ };
$num = "two";
$sum = 1 + $num;
print "Never reached";

That's what I can think of off the top of my head that hasn't been mentioned.

Schwern 2008-10-15 15:13:09

Answer 42

+4 A:

($x, $y) = ($y, $x) is what made me want to learn Perl.

The list constructor 1..99 or 'a'..'zz' is also very nice.
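
Both in one short sketch (the values are arbitrary):

```perl
use strict;
use warnings;

my ($x, $y) = (1, 2);
($x, $y) = ($y, $x);             # swap without a temporary variable
print "x=$x y=$y\n";

my @numbers = (1 .. 5);          # numeric range
my @codes   = ('a' .. 'zz');     # string range: a..z, then aa..zz

print "@numbers\n";
print scalar(@codes), " codes, from $codes[0] to $codes[-1]\n";
```

The string form uses Perl's magical string increment, so it walks through 'z' to 'aa' the same way $s++ would.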

2008-10-15 17:22:31

Answer 43

+3 A:

You can replace the delimiter in regexes and strings with just about anything else. This is particularly useful for avoiding "leaning toothpick syndrome", exemplified here:

$url =~ /http:\/\/www\.stackoverflow\.com\//;

You can eliminate most of the back-whacking by changing the delimiter. /bar/ is shorthand for m/bar/ which is the same as m!bar!.

$url =~ m!http://www\.stackoverflow\.com/!;

You can even use balanced delimiters like {} and []. I personally love these. q{foo} is the same as 'foo'.

$code = q{
    if( this is awesome ) {
        print "Look ma, no escaping!";
    }
};

To confuse your friends (and your syntax highlighter) try this:

$string = qq'You owe me $1,000 dollars!';

Guillaume Gervais 2008-10-28 12:48:30

You should explicitly mention that, when using `{}` (and friends) as quote delimiters, Perl _will_ balance the delimiters.

Chris Lutz 2009-08-25 00:05:25

Answer 44

+3 A:

Use lvalues to make your code really confusing:

my $foo = undef ;
sub bar :lvalue { $foo }   # before Perl 5.16 an explicit return yielded a copy

# Then later

bar = 5 ;
print bar ;

2008-11-19 15:58:50

Answer 45

+3 A:

Very late to the party, but: attributes.

Attributes essentially let you define arbitrary code to be associated with the declaration of a variable or subroutine. The best way to use these is with Attribute::Handlers; this makes it easy to define attributes (in terms of, what else, attributes!).

I did a presentation on using them to declaratively assemble a pluggable class and its plugins at YAPC::2006, online here. This is a pretty unique feature.

Joe McMahon 2008-11-21 20:27:52

Answer 46

A:

One more...

Perl cache:

my $processed_input = $records || process_inputs("$records_file");

On Elpeleg Open Source, Perl CMS http://www.web-app.net/

2009-01-26 07:36:20

Answer 47

+4 A:

This one isn't particularly useful, but it's extremely esoteric. I stumbled on this while digging around in the Perl parser.

Before there was POD, perl4 had a trick to allow you to embed the man page, as nroff, straight into your program so it wouldn't get lost. perl4 used a program called wrapman (see Pink Camel page 319 for some details) to cleverly embed an nroff man page into your script.

It worked by telling nroff to ignore all the code, and then put the meat of the man page after an __END__ tag which tells Perl to stop processing code. Looked something like this:

#!/usr/bin/perl
'di';
'ig00';

...Perl code goes here, ignored by nroff...

.00;        # finish .ig

'di         \" finish the diversion
.nr nl 0-1  \" fake up transition to first page
.nr % 0     \" start at page 1
'; __END__

...man page goes here, ignored by Perl...

The details of the roff magic escape me, but you'll notice that the roff commands are strings or numbers in void context. Normally a constant in void context produces a warning. There are special exceptions in op.c to allow void context strings which start with certain roff commands.

              /* perl4's way of mixing documentation and code
                 (before the invention of POD) was based on a
                 trick to mix nroff and perl code. The trick was
                 built upon these three nroff macros being used in
                 void context. The pink camel has the details in
                 the script wrapman near page 319. */
                const char * const maybe_macro = SvPVX_const(sv);
                if (strnEQ(maybe_macro, "di", 2) ||
                    strnEQ(maybe_macro, "ds", 2) ||
                    strnEQ(maybe_macro, "ig", 2))
                        useless = NULL;

This means that 'di'; doesn't produce a warning, but neither does 'die'; 'did you get that thing I sentcha?'; or 'ignore this line';.

In addition, there are exceptions for the numeric constants 0 and 1 which allows the bare .00;. The code claims this was for more general purposes.

            /* the constants 0 and 1 are permitted as they are
               conventionally used as dummies in constructs like
                    1 while some_condition_with_side_effects;  */
            else if (SvNIOK(sv) && (SvNV(sv) == 0.0 || SvNV(sv) == 1.0))
                useless = NULL;

And what do you know, 2 while condition does warn!

Schwern 2009-02-09 23:00:07

Answer 48

+3 A:

I personally love the /e modifier to the s/// operation:

while(<>) {
  s/\b(\w{1,4})\b/scalar reverse($1)/ge; # reverses every word of 1 to 4 letters
  print;
}

Input:

This is a test of regular expressions
^D

Output (I think):

sihT si a tset fo regular expressions

Chris Lutz 2009-02-09 23:28:12

Answer 49

+1 A:

You might think you can do this to save memory:

@is_month{qw(jan feb mar apr may jun jul aug sep oct nov dec)} = undef;

print "It's a month" if exists $is_month{lc $mon};

but it doesn't do that. Perl still assigns a different scalar value to each key. Devel::Peek shows this. PVHV is the hash. Elt is a key and the SV that follows is its value. Note that each SV has a different memory address indicating they're not being shared.

Dump \%is_month, 12;

SV = RV(0x81c1bc) at 0x81c1b0
  REFCNT = 1
  FLAGS = (TEMP,ROK)
  RV = 0x812480
  SV = PVHV(0x80917c) at 0x812480
    REFCNT = 2
    FLAGS = (SHAREKEYS)
    ARRAY = 0x206f20  (0:8, 1:4, 2:4)
    hash quality = 101.2%
    KEYS = 12
    FILL = 8
    MAX = 15
    RITER = -1
    EITER = 0x0
    Elt "feb" HASH = 0xeb0d8580
    SV = NULL(0x0) at 0x804b40
      REFCNT = 1
      FLAGS = ()
    Elt "may" HASH = 0xf2290c53
    SV = NULL(0x0) at 0x812420
      REFCNT = 1
      FLAGS = ()

An undef scalar takes as much memory as an integer scalar, so you might as well just assign them all to 1 and avoid the trap of forgetting to check with exists.

my %is_month = map { $_ => 1 } qw(jan feb mar apr may jun jul aug sep oct nov dec);

print "It's a month" if $is_month{lc $mon};

2009-02-13 18:57:47

This doesn't save memory, and it generates a nice trap for the unsuspecting programmer. Perl still assigns an undef scalar value to each key and undef doesn't take less memory than 1. Use Devel::Peek to see.

Schwern 2009-02-23 06:32:39

You might be right that the "undef" construct doesn't save memory. However, in my opinion, it's better than your solution for several reasons: 1. the "undef" method tells the reader that the value isn't used; 2. the "1" initializer is more complicated for no good reason; 3. requiring "exists" is no more of a trap than many other things in Perl.

2010-01-01 21:21:33

Also, note that the "1" method *does* use more RAM than "undef"! Try creating a program that initializes a million elements this way and then look at the memory footprint using ps. You'll see that the "1" method uses more memory. I think it's true that the data structures are the same size, but the initializer uses more memory.

2010-01-01 21:22:53

Answer 50

+5 A:

use diagnostics;

If you are starting to work with Perl and have never done so before, this module will save you tons of time and hassle. For almost every basic error message you can get, this module will give you a lengthy explanation as to why your code is breaking, including some helpful hints as to how to fix it. For example:

use strict;
use diagnostics;

$var = "foo";

gives you this helpful message:

Global symbol "$var" requires explicit package name at - line 4.
Execution of - aborted due to compilation errors (#1)
    (F) You've said "use strict vars", which indicates that all variables
    must either be lexically scoped (using "my"), declared beforehand using
    "our", or explicitly qualified to say which package the global variable
    is in (using "::").

Uncaught exception from user code:
        Global symbol "$var" requires explicit package name at - line 4.
Execution of - aborted due to compilation errors.
 at - line 5

And with this code:

use diagnostics;
use strict;

sub myname {
    print { " Some Error " };
};

you get this large, helpful chunk of text:

syntax error at - line 5, near "};"
Execution of - aborted due to compilation errors (#1)
(F) Probably means you had a syntax error.  Common reasons include:

    A keyword is misspelled.
    A semicolon is missing.
    A comma is missing.
    An opening or closing parenthesis is missing.
    An opening or closing brace is missing.
    A closing quote is missing.

Often there will be another error message associated with the syntax
error giving more information.  (Sometimes it helps to turn on -w.)
The error message itself often tells you where it was in the line when
it decided to give up.  Sometimes the actual error is several tokens
before this, because Perl is good at understanding random input.
Occasionally the line number may be misleading, and once in a blue moon
the only way to figure out what's triggering the error is to call
perl -c repeatedly, chopping away half the program each time to see
if the error went away.  Sort of the cybernetic version of 20 questions.

Uncaught exception from user code:
    syntax error at - line 5, near "};"
Execution of - aborted due to compilation errors.
at - line 7

From there you can go about deducing what might be wrong with your program (in this case, print is formatted entirely wrong). diagnostics knows about a large number of errors. Now, while this would not be a good thing to use in production, it can serve as a great learning aid for those who are new to Perl.

Robert P 2009-03-26 17:13:49

Answer 51

+4 A:

You can use @{[...]} to get an interpolated result of complex perl expressions

$a = 3;
$b = 4;

print "$a * $b = @{[$a * $b]}";

prints: 3 * 4 = 12

2009-05-31 02:39:22

Answer 52

+8 A:

The goatse operator*:

$_ = "foo bar";
my $count =()= /[aeiou]/g; #3

or

sub foo {
    return @_;
}

$count =()= foo(qw/a b c d/); #4

It works because list assignment in scalar context yields the number of elements in the list being assigned.

* Note, not really an operator

Chas. Owens 2009-05-31 03:13:52

That is the most (well, least) beautiful "operator" ever.

Chris Lutz 2009-08-24 23:59:44

tchrist 2010-10-30 20:54:39

Answer 53

+4 A:

The input record separator can be set to a reference to a number to read fixed length records:

$/ = \3; print $_,"\n" while <>; # output three chars on each line
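
A self-contained sketch of the same idea, reading from an in-memory filehandle so it needs no input file (the sample string is arbitrary):

```perl
use strict;
use warnings;

my $data = "abcdefgh";
open(my $fh, '<', \$data) or die "Cannot open in-memory handle: $!";

my @records;
{
    local $/ = \3;       # read fixed-size records of 3 characters
    @records = <$fh>;
}
close $fh;

print join("|", @records), "\n";
```

The last record is simply shorter when the input length isn't a multiple of the record size, and localizing $/ keeps the change from leaking into other reads.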

2009-06-22 12:39:06

Answer 54

+1 A:

The following are just as short but more meaningful than "~~" since they indicate what is returned, and there's no confusion with the smart match operator:

print "".localtime;   # Request a string

print 0+@array;       # Request a number

2009-06-29 17:15:51

Answer 55

+2 A:

Quantum::Superpositions

use Quantum::Superpositions;

if ($x == any($a, $b, $c)) { ...  }

Dario 2009-07-09 18:29:22

Answer 56

+1 A:

The Schwartzian Transform is a technique that allows you to efficiently sort by a computed, secondary index. Let's say that you wanted to sort a list of strings by their md5 sum. The comments below are best read backwards (that's the order I always end up writing these anyways):

use Digest::MD5 qw(md5_hex);

my @strings = ('one', 'two', 'three', 'four');

my @md5sorted_strings =
    map { $_->[0] }               # 4) map back to the original value
    sort { $a->[1] cmp $b->[1] }  # 3) sort by the computed key
    map { [$_, md5_hex($_)] }     # 2) create a list of [value, key] pairs
    @strings;                     # 1) take strings

This way, you only have to do the expensive md5 computation N times, rather than N log N times.

2009-08-13 06:33:44

Answer 57

A:

$0 is the name of the perl script being executed. It can be used to get the context from which a module is being run.

# MyUsefulRoutines.pl

sub doSomethingUseful {
  my @args = @_;
  # ...
}

if ($0 =~ /MyUsefulRoutines\.pl$/) {
  # someone is running  perl MyUsefulRoutines.pl [args]  from the command line
  &doSomethingUseful (@ARGV);
} else {
  # someone is calling  require "MyUsefulRoutines.pl"  from another script
  1;
}

This idiom is helpful for turning a standalone script with some useful subroutines into a library that can be imported into other scripts. Python has similar functionality with the __name__ == "__main__" idiom.

mobrule 2009-09-04 17:30:11

Answer 58

+1 A:

The expression defined &DB::DB returns true if the program is running from within the debugger.
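
A minimal sketch of how that check might be used, for example to skip something noisy while stepping through code with perl -d (the variable name is made up):

```perl
use strict;
use warnings;

# DB::DB is only defined when the script was started under the
# debugger (perl -d), so this is false in a normal run.
my $under_debugger = defined &DB::DB ? 1 : 0;

print $under_debugger
    ? "running under the debugger\n"
    : "running normally\n";
```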

Kiffin 2009-10-10 18:37:15

Answer 59

+2 A:

One useful composite operator for conditionally adding strings or lists into other lists is the x!! operator:

 print 'the meaning of ', join ' ' =>  
     'life,'                x!! $self->alive,
     'the universe,'        x!! ($location ~~ Universe),
     ('and', 'everything.') x!! 42; # this is added as a list

this operator allows for a reversed syntax similar to

 do_something() if test();
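
A runnable sketch of the same idea, building a command line from optional pieces. The flags and the rsync options are only placeholders for the demonstration:

```perl
use strict;
use warnings;

my ($verbose, $dry_run) = (1, 0);   # hypothetical flags for the demo

# !! turns any value into 1 or "" (which x treats as 0), so a
# parenthesized list is repeated either once or not at all.
my @args = (
    'rsync',
    ('-v')                 x!! $verbose,
    ('--dry-run')          x!! $dry_run,
    ('--exclude', '*.tmp') x!! 1,
);

print "@args\n";
```

Note the parentheses on the left: with them, x repeats the list, so a false condition contributes nothing at all rather than an empty string.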

Eric Strom 2009-10-31 08:49:34

Answer 60

+1 A:

Interpolation of match regular expressions. A useful application of this is when matching on a blacklist. Without using interpolation it is written like so:

#detecting blacklist words in the current line
/foo|bar|baz/;

Can instead be written

@blacklistWords = ("foo", "bar", "baz");
$anyOfBlacklist = join "|", (@blacklistWords);
/$anyOfBlacklist/;

This is more verbose, but allows for population from a datafile. Also if the list is maintained in the source for whatever reason, it is easier to maintain the array than the RegExp.
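
One caveat when the words come from a datafile: anything that looks like a regex metacharacter will be treated as one. A hedged sketch of a more defensive variant, using quotemeta and qr// (the word list and test lines are invented for the example):

```perl
use strict;
use warnings;

# Hypothetical blacklist; in practice this could be read from a file.
my @blacklist_words = ("foo", "a.b", "c++");

# quotemeta escapes regex metacharacters so each entry matches literally.
my $pattern      = join "|", map { quotemeta($_) } @blacklist_words;
my $blacklist_re = qr/$pattern/;    # compile once, reuse many times

my @blocked = grep { $_ =~ $blacklist_re }
              ("plain text", "uses c++ daily", "axb is fine");

print "blocked: $_\n" for @blocked;
```

Without quotemeta, "c++" would not even compile as a pattern, and "a.b" would also match "axb".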

Erick 2009-11-05 16:21:50

Answer 61

+1 A:

Using hashes (where keys are unique) to obtain the unique elements of a list:

my %unique = map { $_ => 1 } @list;
my @unique = keys %unique;
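
keys returns the elements in no particular order. If the original order matters, a common variant (sketched here with sample data) uses a %seen hash inside grep:

```perl
use strict;
use warnings;

my @list = qw(pear apple pear fig apple);

# Keep only the first occurrence of each element, preserving order:
# $seen{$_}++ is false (0) the first time and true afterwards.
my %seen;
my @unique = grep { !$seen{$_}++ } @list;

print "@unique\n";
```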

Nick Dixon 2009-11-19 12:21:56

Answer 62

A:

Using bare blocks with redo or other control words to create custom looping constructs.

For example, traverse a linked list of objects returning the first ->can('print') method:

sub get_printer {
    my $self = shift;
    {$self->can('print') or $self = $self->next and redo}
}

Eric Strom 2010-01-14 06:50:33

Answer 63

A:

Add one for the unpack() and pack() functions, which are great if you need to import and/or export data in a format which is used by other programs.

Of course these days most programs will allow you to export data in XML, and many commonly used proprietary document formats have associated Perl modules written for them. But this is one of those features that is incredibly useful when you need it, and pack()/unpack() are probably the reason that people have been able to write CPAN modules for so many proprietary data formats.
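
A small sketch of a round trip through a made-up binary record layout (the template and field values are purely illustrative, not any real file format):

```perl
use strict;
use warnings;

# Hypothetical record layout, for illustration only:
#   A4  4-byte space-padded ASCII tag
#   n   16-bit unsigned integer, big-endian
#   C   8-bit unsigned integer
my $template = "A4 n C";

my $record = pack($template, "HDR", 513, 7);
printf "packed %d bytes: %s\n", length($record), unpack("H*", $record);

my ($tag, $size, $flags) = unpack($template, $record);
print "tag=$tag size=$size flags=$flags\n";
```

The same template string drives both directions, which is a large part of why format-handling modules built on pack and unpack stay so compact.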

Peter 2010-01-25 22:51:46

Answer 64

A:

There is a more powerful way to check a program for syntax errors:

perl -w -MO=Lint,no-context myscript.pl

The most important thing it can do is report calls to nonexistent subroutines.

Alexey 2010-04-13 10:33:13

Answer 65

+1 A:

use re debug
Doc on use re debug

and

perl -MO=Concise[,OPTIONS]
Doc on Concise

Besides being exquisitely flexible, expressive and amenable to programming in the style of C, Pascal, Python and other languages, there are several pragmas and command switches that make Perl my 'goto' language for initial kanoodling on an algorithm, a regex, or a quick problem that needs to be solved. These two are unique to Perl, I believe, and are among my favorites.

use re debug: Most modern flavors of regular expressions owe their current form and function to Perl. While there are many Perl forms of regex that cannot be expressed in other languages, there are almost no forms of other languages' flavor of regex that cannot be expressed in Perl. Additionally, Perl has a wonderful regex debugger built in to show how the regex engine is interpreting your regex and matching against the target string.

Example: I recently was trying to write a simple CSV routine. (Yes, yes, I know, I should have been using Text::CSV...) but the CSV values were not quoted and simple.

My first take was /^(?:(.*?),){$i}/ to extract record i of n CSV records. That works fine -- except for the last record, n of n. I could see that without the debugger.

Next I tried /^(?:(.*?),|$){$i}/. This did not work, and I could not see immediately why. I thought I was saying (.*?) followed by a comma or EOL. Then I added use re debug at the top of a small test script. Ahh yes, the alternation between , and $ was not being interpreted that way; it was being interpreted as ((.*?),) | ($) -- not what I wanted.

A new grouping was needed. So I arrived at the working /^(?:(.*?)(?:,|$)){$i}/. While I was in the regex debugger, I was surprised how many loops it took for a match towards the end of the string. It is the .*? term that is quite ambiguous and requires excessive backtracking to satisfy. So I tried /^(?:(?:^|,)([^,]*)){$i}/. This does two things: 1) it reduces backtracking because of the greedy match of everything but a comma, and 2) it allows the regex optimizer to use the alternation only once, on the first field. Using Benchmark, this is 35% faster than the first regex. The regex debugger is wonderful and few use it.

perl -MO=Concise[,OPTIONS]: The B and Concise frameworks are tremendous tools to see how Perl is interpreting your masterpiece. Using -MO=Concise prints the result of the Perl interpreter's translation of your source code. There are many options to Concise, and with B you can write your own presentation of the OP codes.

As in this post, you can use Concise to compare different code structures. You can interleave your source lines with the OP codes those lines generate. Check it out.

drewk 2010-04-16 20:08:08

Answer 66

A:

Two things that work well together: IO handles on in-core strings, and using function prototypes to enable you to write your own functions with grep/map-like syntax.

sub with_output_to_string(&) {           # allows compiler to accept "yoursub {}" syntax.
  my $function = shift;
  my $string   = '';
  open(my $handle, '>', \$string) || die $!; # in-memory filehandle writing to $string
  my $old_handle = select $handle;
  eval { $function->() };
  select $old_handle;
  die $@ if $@;
  return $string;
}

my $greeting = with_output_to_string {
  print "Hello, world!";
};

print $greeting, "\n";

Danny Woods 2010-05-26 08:44:19

Answer 67

A:

Perl is great as a flexible awk/sed.

For example, let's use a simple replacement for ls | xargs stat, naively done like:

$ ls | perl -pe 'print "stat "' | sh

This doesn't work well when the input (filenames) have spaces or shell special characters like |$\. So single quotes are frequently required in the Perl output.

One complication with calling perl via the command line -ne is that the shell gets first nibble at your one-liner. This often leads to torturous escaping to satisfy it.

One 'hidden' feature that I use all the time is \x27 to include a single quote instead of trying to use shell escaping '\''

So:

$ ls | perl -nle 'chomp; print "stat '\''$_'\''"' | sh

can be more safely written:

$ ls | perl -pe 's/(.*)/stat \x27$1\x27/' | sh

That won't work with funny characters in the filenames, even quoted like that. But this will:

$ ls | perl -pe 's/\n/\0/' | xargs -0 stat

Terry 2010-05-26 09:54:13

Answer 68

+1 A:

Next time you're at a geek party pull out this one-liner in a bash shell and the women will swarm you and your friends will worship you:

find . -name "*.txt"|xargs perl -pi -e 's/1:(\S+)/uc($1)/ge'

Process all *.txt files and do an in-place find and replace using perl's regex. This one converts text after a '1:' to upper case and removes the '1:'. Uses Perl's 'e' modifier to treat the second part of the find/replace regex as executable code. Instant one-line template system. Using xargs lets you process a huge number of files without running into bash's command line length limit.

Mark Maunder 2010-05-26 10:01:12

Answer 69

+1 A:

The ability to use a hash as a seen filter in a loop. I have yet to see something quite as nice in a different language. For example, I have not been able to duplicate this in python.

For example, I want to print a line if it has not been seen before.

my %seen;

for (<LINE>) {
  print $_ unless $seen{$_}++;
}

Jauder Ho 2010-05-26 10:35:01

Answer 70

+1 A:

The new -E option on the command line:

> perl -e "say 'hello'" # does not work 

String found where operator expected at -e line 1, near "say 'hello'"
        (Do you need to predeclare say?)
syntax error at -e line 1, near "say 'hello'"
Execution of -e aborted due to compilation errors.

> perl -E "say 'hello'" 
hello

knb 2010-05-26 11:28:35

Answer 71

+1 A:

You can expand function calls in a string, for example:

print my $foo = "foo @{[scalar(localtime)]} bar";

foo Wed May 26 15:50:30 2010 bar

trapd00r 2010-05-26 13:50:09

Answer 72

+1 A:

You can use different quotes on HEREDOCS to get different behaviors.

my $interpolation = "We will interpolate variables";
print <<"END";
With double quotes, $interpolation, just like normal HEREDOCS.
END

print <<'END';
With single quotes, the variable $foo will *not* be interpolated.
(You have probably seen this in other languages.)
END

## this is the fun and "hidden" one
my $shell_output = <<`END`;
echo With backticks, these commands will be executed in shell.
echo The output is returned.
ls | wc -l
END

print "shell output: $shell_output\n";

Justin 2010-05-26 14:10:53

Answer 73

+3 A:

@Schwern mentioned turning warnings into errors by localizing $SIG{__WARN__}. You can also do this (lexically) with use warnings FATAL => "all";. See perldoc lexwarn.
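
A minimal sketch of the lexical version, wrapped in eval so the script can show the resulting exception instead of dying (variable names are arbitrary):

```perl
use strict;
use warnings;

my $error;
{
    # Inside this block every warning is raised as an exception.
    use warnings FATAL => 'all';

    my $num = "two";
    my $sum = eval { 1 + $num };   # the "isn't numeric" warning becomes a die
    $error  = $@;
}

print "caught: $error" if $error;
```

Because the pragma is lexically scoped, code outside the block keeps ordinary, non-fatal warnings.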

On that note, since Perl 5.12, you've been able to say perldoc foo instead of the full perldoc perlfoo. Finally! :)

wolverian 2010-08-19 23:05:10

Answer 74

A:

"now"

sub _now { 
        my ($now) = localtime() =~ /([:\d]{8})/;
        return $now;
}

print _now(), "\n"; #  15:10:33

Jet 2010-09-06 17:25:31

That's been answered at http://stackoverflow.com/questions/161872/hidden-features-of-perl/162060#162060 already.

Olfan 2010-09-07 08:47:51

Hidden features of Perl?

Hidden Features also found in other languages' Hidden Features:

Other Hidden Features:

New Block Operations

Source Filters

Signal Hooks

`overload::constant`

Grouped Integer Literals
