When using system() calls in Perl, do you have to escape the shell args, or is that done automatically?

The arguments will be user input, so I want to make sure this isn't exploitable.

+23  A: 

If you use system $cmd, @args rather than system "$cmd @args" (a list rather than a single string), then you do not have to escape the arguments, because no shell is invoked (see the system docs). system {$cmd} $cmd, @args will not invoke a shell either, even if $cmd contains metacharacters and @args is empty (this is documented as part of exec). If the args are coming from user input, you will still want to untaint them. See -T in the perlrun docs, and the perlsec docs.
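As a minimal sketch of the list form, the snippet below passes a deliberately hostile string to a child process. The variable names and the input are made up for illustration, and the child is the running perl itself ($^X) so that the example does not depend on any external program.

```perl
use strict;
use warnings;

# Hypothetical hostile input: a shell would treat ';' as a command separator.
my $user_input = 'hello; echo INJECTED';

# List form: the program is executed directly, no shell is involved,
# so the whole string arrives in the child as a single argument.
my @cmd = ($^X, '-e', 'print scalar(@ARGV), ": $ARGV[0]\n"', $user_input);
my $status = system(@cmd);
die "child failed: $?" if $status != 0;

# Indirect-object form: never uses a shell, even with an empty argument list.
my $prog = $^X;
my $status2 = system { $prog } $prog, '-e', '1';
die "child failed: $?" if $status2 != 0;
```

The first child prints `1: hello; echo INJECTED`, showing that the semicolon was never interpreted. This only demonstrates the absence of shell parsing; it does not replace untainting.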

If you need to read the output or send input to the command, qx and readpipe have no list-form equivalent. Instead, use open my $output, "-|", $cmd, @args or open my $input, "|-", $cmd, @args, although this is not portable, as it requires a real fork, which means Unix only... I think. Maybe it'll work on Windows with its simulated fork. A better option is something like IPC::Run, which will also handle the case of piping commands to other commands, which neither the multi-arg form of system nor the 4-arg form of open will handle.
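A small sketch of the list form of open for reading a command's output follows. The input string is invented for the example, and the child is again the running perl ($^X) to keep the snippet self-contained.

```perl
use strict;
use warnings;

# Hypothetical user input containing shell metacharacters.
my $user_input = 'a b; echo INJECTED';

# "-|" with a list: run the command without a shell and read its stdout.
open(my $out, '-|', $^X, '-e', 'print "[$_]\n" for @ARGV', $user_input)
    or die "cannot start child: $!";
my @lines = <$out>;
close($out) or die "child exited with status $?";

chomp @lines;
print "child printed: $_\n" for @lines;
```

The child sees one argument and echoes it back in brackets, so nothing after the semicolon is run as a command. Writing to a command works the same way with "|-".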

runrig
+1 for I-never-noticed-they-added-that-syntax. Lovely.
chaos
As an addition, `system {'cmd'} 'cmd'` always bypasses `sh` even if `'cmd'` contains characters that would normally be interpreted by the shell.
ephemient
You should add that the *reason* why you don't have to escape shell metacharacters with "system 'cmd', @args" is that no shell is being invoked in this case (since the OP asked whether shell metachars would be escaped "automatically", which is not the case).
8jean
post updated from comments.
runrig
+1, I'm with chaos -- I never heard of the indirect-object syntax before!
j_random_hacker
The "-|" and "|-" piped open() modes work fine on Windows when used in the 3-or-more-arguments form -- no simulated fork() is required (for that, anyway). What doesn't work is using them in the 2-arg form to try to communicate with a forked child process.
j_random_hacker
+6  A: 

On Windows, the situation is a bit nastier. Basically, all Win32 programs receive one long command-line string -- the shell (usually cmd.exe) may do some interpretation first, removing < and > redirections for example, but it does not split it up at word boundaries for the program. Each program must do this parsing itself (if it wishes -- some programs don't bother). In C and C++ programs, routines provided by the runtime libraries supplied with the compiler toolchain will generally perform this parsing step before main() is called.

The problem is, in general, you don't know how a given program will parse its command line. Many programs are compiled with some version of MSVC++, whose quirky parsing rules are described here, but many others are compiled with different compilers that use different conventions.

This is compounded by the fact that cmd.exe has its own quirky parsing rules. The caret (^) is treated as an escape character that quotes the following character, and text inside double quotes is treated as quoted if a list of tricky criteria is met (see cmd /? for the full gory details). If your command contains any strange characters, it's very easy for cmd.exe's idea of which parts of the text are "quoted" and which aren't to get out of sync with your target program's, and all hell breaks loose.

So, the safest approach for escaping arguments on Windows is:

  1. Escape arguments in the manner expected by the command-line parsing logic of the program you're calling. (Hopefully you know what that logic is; if not, try a few examples and guess.)
  2. Join the escaped arguments with spaces.
  3. Prefix every single non-alphanumeric character of the resulting string with ^.
  4. Append any redirections or other shell trickery (e.g. joining commands with &&).
  5. Run the command with system() or backticks.
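Steps 1 to 3 above could be sketched roughly as follows. This assumes the target program parses its command line the way MSVC++-compiled programs usually do (backslash-escaped double quotes); the helper names quote_msvc_arg and build_cmd_line and the program name are hypothetical, and the sketch makes no attempt to handle newlines inside arguments.

```perl
use strict;
use warnings;

# Step 1 (assumed MSVC-style parsing): wrap the argument in double quotes,
# escape embedded quotes, and double the backslashes that precede a quote.
sub quote_msvc_arg {
    my ($arg) = @_;
    $arg =~ s/(\\*)"/$1$1\\"/g;   # backslashes before a quote doubled, quote escaped
    $arg =~ s/(\\+)\z/$1$1/;      # trailing backslashes doubled (closing quote follows)
    return qq{"$arg"};
}

# Steps 2 and 3: join with spaces, then put ^ before every
# non-alphanumeric character so cmd.exe leaves it alone.
sub build_cmd_line {
    my (@args) = @_;
    my $line = join ' ', map { quote_msvc_arg($_) } @args;
    $line =~ s/([^A-Za-z0-9])/^$1/g;
    return $line;
}

# Hypothetical usage: only build and print the string here.
my $cmd_line = 'prog.exe ' . build_cmd_line('a&b', 'say "hi"');
print "$cmd_line\n";
```

Redirections (step 4) would be appended to the resulting string unescaped, and the whole string then goes to the single-argument form of system() or to backticks, since it is cmd.exe that has to strip the carets.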
j_random_hacker
Interesting information - thank you. It doesn't endear Windows to this Unixophile, but it helps to know what happens behind the scenes. (The ref'd page is a bit quiet about the role of caret! It mentions it, but only by exception. It is not clear how it handles ^\ or ^", for example.)
Jonathan Leffler
I agree with Jonathan Leffler. That is (in my opinion) an awful way to handle command-line arguments.
Chris Lutz
I totally agree that it's a terrible situation. Though in fairness, most of the terribleness probably arises from MS's laudable devotion to maintaining backwards compatibility. (To see just how obsessive they are, check out Raymond Chen's excellent blog sometime.)
j_random_hacker
@Jonathan: To clarify, two levels of encoding are necessary -- the caret is seen *only* by cmd.exe, which removes it when it passes the command line to the program being run. The rules on that page describe how a MSVC++-compiled program will parse its cmd line (i.e. the 2nd layer of parsing).
j_random_hacker
A: 

Doesn't look like it.

$ perl -e '@args = ("echo", "\""); system(@args);'
"

But if you only use 1 argument, you will need to.

$ perl -e '@args = ("echo \""); system(@args);'
sh: -c: line 0: unexpected EOF while looking for matching `"'
sh: -c: line 1: syntax error: unexpected end of file
ashawley
-1. You're partly right, but this is covered in the first sentence of the top answer. You're partly wrong because if you call with a single argument you *do* need to quote.
j_random_hacker
I did call with a single argument. The command isn't an *argument*. Technically, it's argument zero, but that's a generalization. The question by the OP was about arguments.
ashawley
Well, this is one area where Perl is tricky. system() has two distinct behaviours depending on whether it's called with 1 argument or >1. If you call with a single array argument, that actually gets expanded to multiple arguments, triggering the latter behaviour, which doesn't need quoting.
j_random_hacker
Again, you're confusing "shell arguments" with "subroutine arguments" for Perl.
ashawley
I think you're confused -- whether or not quoting is needed depends on whether or not the Perl system() function is called with 1 argument or more than one. If system() gets 2 or more arguments, as in your example, the shell is never even called.
j_random_hacker
You've called system with 2 arguments, 'echo' and one backslash, so it doesn't need to be escaped. If you want the same results with one argument to system, e.g., system("echo $backslash"), then $backslash would need to be two backslashes.
runrig
So I've been told. The original question asked about "Shell args", this is a bit unclear.
ashawley
s/backslash/double-quote/ in comment above.
runrig