Is there a better way of doing data validation in Perl than regex?

I have Perl code that validates string and numeric data with complex regexes, which makes the code difficult for everyone to read and follow.

A: 

If you want to make sure variable values fit a certain pattern, there is no better way than to use the pattern matching facilities of Perl.

On the other hand, if you want to improve a specific pattern, you can ask here for advice.

You can make your own programs easier to follow by abstracting away such regular expressions and using facilities such as Regexp::Common.
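As a minimal sketch of what "abstracting away" could look like using only core Perl, the patterns can live in one table and be reached through a single named helper. The names `%PATTERN` and `is_valid`, and the two formats shown, are assumptions made up for illustration; they are not from Regexp::Common.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical named patterns; the formats are assumptions for illustration.
my %PATTERN = (
    us_zip   => qr/\A\d{5}(?:-\d{4})?\z/,
    username => qr/\A[a-z][a-z0-9_]{2,15}\z/,
);

# Returns 1 if $value matches the named pattern, 0 otherwise.
# Dies if asked about a pattern name that is not in the table.
sub is_valid {
    my ($kind, $value) = @_;
    my $re = $PATTERN{$kind} or die "Unknown pattern '$kind'";
    return (defined $value && $value =~ $re) ? 1 : 0;
}

print "ZIP looks fine\n"  if is_valid(us_zip => '12345-6789');
print "Bad username\n"    unless is_valid(username => '9lives');
```

Calling code then reads as `is_valid(us_zip => $input)` and never sees the regex itself, so a pattern can be tightened in one place.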

Sinan Ünür
+6  A: 

There is an absolutely awesome module for parameter validation in Perl: Params::Validate. It lets you check your parameters in a clean and nice way. We have used it everywhere since we discovered it.

maksymko
+6  A: 

A common mistake is to cram all your requirements into a single regular expression. This works the first time, but usually you get a regular expression that nobody will understand two weeks down the road.

Don't do that. Use one regular expression per requirement.
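A rough sketch of the "one regular expression per requirement" idea follows. The password rules and the names `@RULES` and `failed_rules` are hypothetical; the point is only that each rule is small and carries its own description, so a failure can say which requirement was missed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical password rules: one small pattern per requirement.
my @RULES = (
    [ 'at least 8 characters'        => qr/\A.{8,}\z/s ],
    [ 'contains a digit'             => qr/\d/         ],
    [ 'contains a lowercase letter'  => qr/[a-z]/      ],
    [ 'contains an uppercase letter' => qr/[A-Z]/      ],
);

# Returns the descriptions of every rule that $value fails
# (an empty list means the value passed all of them).
sub failed_rules {
    my ($value) = @_;
    return map { $_->[0] } grep { $value !~ $_->[1] } @RULES;
}

my @errors = failed_rules('secret1');
print "Failed: $_\n" for @errors;
```

Adding or dropping a requirement is then a one-line change to the table, and nobody has to decode a single combined pattern.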

innaM
+1  A: 

There are ways to do validation without regexps. But using regexps doesn't mean you can't make them readable.

There is the (often unused) /x flag for regexps, which lets you build very readable regexps with comments.
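For instance, here is a small sketch of a commented pattern using the /x flag. The date format is an assumption picked for illustration, and the pattern checks only the shape of the string, not whether the day exists in that month.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed format for illustration: an ISO-style date such as 2009-07-21.
# Under /x, whitespace in the pattern is ignored and # starts a comment.
my $date_re = qr/
    \A
    (\d{4})                    # year: four digits
    -
    (0[1-9] | 1[0-2])          # month: 01 to 12
    -
    (0[1-9] | [12]\d | 3[01])  # day: 01 to 31
    \z
/x;

for my $input ('2009-07-21', '2009-13-01') {
    if (my ($year, $month, $day) = $input =~ $date_re) {
        print "$input: year=$year month=$month day=$day\n";
    }
    else {
        print "$input: not in the expected date format\n";
    }
}
```

One thing to watch: a comment inside the pattern must not contain the regex delimiter (here `/`), because that would end the pattern early.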

Of course this doesn't mean that you should validate everything with regexps - even if technically possible - it is often insane (think 4KB long regexp for email address validation).

depesz
You could just use http://search.cpan.org/perldoc/Email::Valid instead of pasting the 4 KB regex into your code ... **That** would make your code more readable.
Sinan Ünür
Of course. But it doesn't negate the fact that the regexp exists. And it is not a sane solution to the problem at hand.
depesz
+6  A: 

Don't reinvent the wheel. Use Regexp::Common from CPAN:

#!/usr/bin/perl

use strict;
use warnings;

use Regexp::Common qw(number);

my $val = '500.345';

print "Good float\n" if $val =~ /^$RE{num}{real}$/;

CPAN is your friend.

xcramps
You must anchor your patterns.
Sinan Ünür
If Perl had strong typing, then we wouldn't have to validate, correct?
not-exactly-a-unixhater
@not-exactly-a-unixhater: Have a look at Variable::Strongly::Typed or Lexical::Types on CPAN. If it's parameter-related, then take a look at the "signature"-related modules.
draegtun
A: 

There might be.

Sometimes a regular expression is the best approach, sometimes it is not. You have to examine it on a case-by-case basis.

David Dorward
This is not helpful. Adding an example of each case would be helpful.
ire_and_curses
That would require that the OP provided the cases in the first place. I can't suggest good solutions for sanity checking vaguely described data formats.
David Dorward
A: 

How you validate data depends on what you are trying to do. If the data merely needs to look like some pattern, that's regex territory. If the data must be an exact value from an enumeration, a hash is better. If a number has to be within a range, numeric comparisons are the tool you should use.
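The three cases might be sketched like this. The field names, the order-ID format, and the limits are all hypothetical; `looks_like_number` comes from the core Scalar::Util module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical set of allowed values, chosen only for illustration.
my %VALID_STATUS = map { $_ => 1 } qw(open closed pending);

# Pattern: the value only has to look a certain way, so use a regex.
sub looks_like_order_id {
    my ($value) = @_;
    return $value =~ /\A[A-Z]{2}\d{6}\z/ ? 1 : 0;
}

# Enumeration: the value must be one of a fixed set, so use a hash lookup.
sub is_status {
    my ($value) = @_;
    return exists $VALID_STATUS{$value} ? 1 : 0;
}

# Range: confirm the value is numeric, then use plain numeric comparisons.
sub is_in_range {
    my ($value, $min, $max) = @_;
    return 0 unless looks_like_number($value);
    return ($value >= $min && $value <= $max) ? 1 : 0;
}

print "order id ok\n" if looks_like_order_id('AB123456');
print "status ok\n"   if is_status('pending');
print "percent ok\n"  if is_in_range(42.5, 0, 100);
```

Only the first check needs a regex at all; the other two stay readable because they use the data structure or operator that fits the requirement.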

In general, the answer to this sort of question is "maybe".

brian d foy