My (Perl-based) application needs to let users input regular expressions, to match various strings behind the scenes. My plan so far has been to take the string and wrap it in something like

$regex = eval { qr/$text/ };
if (my $error = $@) {
    # mangle $error to extract user-facing message
}

($text having been stripped of newlines ahead of time, since it's actually multiple regular expressions in a multi-line text-field that I split).

Are there any potential security risks with doing this, such as some weird input that could lead to arbitrary code execution (besides buffer overflow vulnerabilities in the regular expression engine, like CVE-2007-5116)? If so, are there ways to mitigate them?

Is there a better way to do this? Any Perl modules which help abstract the operations of turning user input into regular expressions (such as extracting error messages ... or providing modifiers like /i, which I don't strictly need here, but would be nice)? I searched CPAN and didn't find much that was promising, but entertain the possibility that I missed something.
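
For the error-message and modifier parts of the question, a minimal sketch in plain Perl might look like the following. The helper name `compile_user_regex`, the whitelist of modifiers, and the heuristic for trimming Perl's trailing " at FILE line N." are all assumptions made for illustration; none of this comes from a CPAN module.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any CPAN module): compile one line of user
# input into a regex. Returns ($regex, undef) on success or (undef, $message)
# on failure.
sub compile_user_regex {
    my ($text, $mods) = @_;
    $mods = '' unless defined $mods;

    # Allow only a small, assumed whitelist of inline modifiers.
    return (undef, "unsupported modifiers '$mods'")
        unless $mods =~ /\A[imsx]*\z/;

    # An inline (?i)-style prefix applies the flags to the rest of the pattern.
    my $source = length $mods ? "(?$mods)$text" : $text;

    my $regex = eval { qr/$source/ };
    if (!defined $regex) {
        my $error = $@ || 'unknown error';
        # Heuristic: drop the trailing " at FILE line N." that Perl appends.
        $error =~ s/\A(.*) at .+? line \d+\.?\s*\z/$1/s;
        return (undef, $error);
    }
    return ($regex, undef);
}

my ($re, $err) = compile_user_regex('fo+bar', 'i');
print "matched\n" if defined $re && 'FOOBAR' =~ $re;

($re, $err) = compile_user_regex('(unclosed');
print "error: $err\n" if defined $err;
```

The trimmed message still contains Perl's own "marked by <-- HERE" text, so further mangling may be wanted before showing it to users.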

+3  A: 

The best way is not to give users too much privilege. Provide an interface that does just enough for users to do what they want (like an ATM with only buttons for the various options and no need for keyboard input). Of course, if you need users to key in input, then provide a text box and use Perl at the back end to process the request (e.g. sanitizing it). The motive behind letting your users input a regex is to search for string patterns, right? In that case, the simplest and most secure way is to have them input just the string, and then use Perl's regex at the back end to search for it. Is there any other compelling reason to have users input regexes themselves?

ghostdog74
Presumably if they want to search for *patterns*, searching for plain strings is going to be orders of magnitude less powerful than being able to search by regexes.
Wooble
Yes. $customers demand more flexibility than a simple string match is capable of providing in this case. As for privileges, though, only moderately-trusted users get to do the regular expressions anyway. I just don't want to extend these users system("rm -rf /") capabilities and the like.
fennec
+4  A: 

With the (?{ code }) construct, user input could be used to execute arbitrary code. See the example in perlre#code and where it says

local $cnt = $cnt + 1,

replace it with the expression

system("rm -rf /home/fennec"); print "Ha ha.\n";

(Actually, don't do that.)

mobrule
Fortunately, `(?{ code })` causes a compile time error if the regex includes variable interpolation unless you say `use re 'eval'` (for exactly this reason).
cjm
@cjm - But it's not an error to say `$re=eval{qr/$tainted/}` and then to use that regex, as the OP has done (unless you use `taintperl`)
mobrule
Ah, with the help of your pointer I found this within the docs: "Before Perl knew how to execute interpolated code within a pattern, this operation was completely safe from a security point of view, although it could raise an exception from an illegal pattern." This is comforting.
fennec
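
The behaviour discussed in this thread can be checked with a harmless stand-in for the user input. This is only a sketch: the embedded code just prints a line, and it assumes no `use re 'eval'` is in scope, in which case Perl as documented in perlre should refuse a code block that arrives through interpolation and `eval` should catch the exception.

```perl
use strict;
use warnings;

# Stand-in for user input; the embedded code is deliberately harmless.
my $text = '(?{ print "code block ran\n" })';

# No "use re 'eval'" is in scope here, so compiling the interpolated
# pattern is expected to die instead of accepting the code block.
my $regex = eval { qr/$text/ };
my $error = $@;

if (!defined $regex) {
    print "rejected: $error";
}
```

If `$regex` does come back defined on some setup, that is a sign `use re 'eval'` is active somewhere in scope and the input should not be trusted.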
+5  A: 

Using untrusted input as a regular expression creates a denial-of-service vulnerability, as described in perlsec:

Regular expressions - Perl's regular expression engine is so called NFA (Non-deterministic Finite Automaton), which among other things means that it can rather easily consume large amounts of both time and space if the regular expression may match in several ways. Careful crafting of the regular expressions can help but quite often there really isn't much one can do (the book "Mastering Regular Expressions" is required reading, see perlfaq2). Running out of space manifests itself by Perl running out of memory.

Greg Bacon
I can cope with exposing a DoS vulnerability. Goodness knows there are plenty in the rest of the application for people who can enter these regexpen. A magical 'wipe hard disk' button is another matter, though. :)
fennec
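
If the run-time cost ever does matter, one possible mitigation is to run each match in a child process with a time limit. The sketch below assumes a Unix-like system with `fork`; `match_with_timeout` is a hypothetical name, and it bounds only time, not memory. The child leaves SIGALRM at its default action, because a Perl-level `$SIG{ALRM}` handler may not run until the match operation has finished.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run one match in a child process with a time limit.
# Returns 1 (matched), 0 (no match), or undef (timed out or failed).
sub match_with_timeout {
    my ($regex, $string, $seconds) = @_;

    my $pid = fork();
    return undef unless defined $pid;    # fork failed

    if ($pid == 0) {
        # Child: with the default action, SIGALRM terminates the process
        # even if it arrives in the middle of a long-running match.
        $SIG{ALRM} = 'DEFAULT';
        alarm $seconds;
        my $matched = ($string =~ $regex) ? 1 : 0;
        # _exit skips END blocks and buffer flushing inherited from the parent.
        POSIX::_exit($matched ? 0 : 1);
    }

    # Parent: wait for the child and decode its exit status.
    waitpid($pid, 0);
    my $status = $?;
    return undef if $status & 127;       # child was killed by a signal
    my $code = $status >> 8;
    return $code == 0 ? 1 : $code == 1 ? 0 : undef;
}

my $result = match_with_timeout(qr/fo+/, 'foo', 2);
print defined $result ? "result: $result\n" : "gave up\n";
```

Forking per match is expensive, so this fits occasional checks better than tight loops. Captures made in the child are also not visible to the parent.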
+1  A: 

Perhaps you could use a different regex engine that does not have the dangerous code tag support.

I haven't tried it, but there is a PCRE for Perl. You may also be able to limit or remove code support using this info on creating custom regex engines.

daotoad