I'm trying to create a regex that will accept the following values:

  • (blank)
  • 0
  • 00
  • 00.0
  • 00.00

I came up with ([0-9]){0,2}\.([0-9]){0,2}, which to me says "the digits 0 through 9 occurring 0 to 2 times, followed by a '.' character (which should be optional), followed by the digits 0 through 9 occurring 0 to 2 times." If only 2 digits are entered, the '.' is not necessary. What's wrong with this regex?

+15  A: 

You didn't make the dot optional:

[0-9]{0,2}(\.[0-9]{1,2})?
Joachim Sauer
Perfect! Thank you.
James Cadd
That will also match `.0`. Is that okay? If not, you should change the first part to `[0-9]{1,2}`. That will in turn cause (blank) to fail. You could solve that problem by wrapping the whole thing in (...)? but it probably makes more sense to check for (blank) separately.
Patrick McElhaney
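A minimal sketch of that last suggestion, assuming Python (the thread does not name a language) and a hypothetical helper called `is_valid`: require at least one leading digit in the regex and check for (blank) separately.

```python
import re

# One or two leading digits are required, so ".0" is rejected.
NUMBER = re.compile(r"[0-9]{1,2}(\.[0-9]{1,2})?")

def is_valid(value: str) -> bool:
    """Hypothetical helper: accept blank input or 0, 00, 00.0, 00.00 style values."""
    if value == "":
        return True  # (blank) is handled outside the regex
    # fullmatch() succeeds only if the entire string matches the pattern
    return NUMBER.fullmatch(value) is not None
```

With this split, `is_valid("")` and `is_valid("00.0")` are accepted while `is_valid(".0")` is rejected.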
One caveat: This regex will match anything because all parts are optional. If you ask it to match "a", it will succeed because it matches the empty space "before the a". Maybe you should anchor the regex with ^ and $ (since you do want to match an empty string).
Tim Pietzcker
@Tim: it will not match "a". It will find a match in "a" (2 to be precise). The problem is that in some languages/environments the default behaviour of regex is "find" (i.e. find a substring that matches) and in others it's "match" (i.e. test whether the entire string matches the regexp). Therefore your caveat is applicable, but only in some environments.
Joachim Sauer
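To illustrate the "find" versus "match" distinction, here is a small sketch that assumes Python's `re` module, where `search`/`finditer` have "find" semantics and `fullmatch` has "match" semantics. Other environments expose the same distinction under different names.

```python
import re

pattern = re.compile(r"[0-9]{0,2}(\.[0-9]{1,2})?")

# "find": every part of the pattern is optional, so an empty match
# is found before and after the "a".
empty_matches = [m.span() for m in pattern.finditer("a")]
print(empty_matches)  # [(0, 0), (1, 1)]

# "match": the whole string must match, so "a" is rejected.
whole_a = pattern.fullmatch("a")
print(whole_a)  # None

# The intended values still pass under "match" semantics.
whole_ok = all(pattern.fullmatch(s) for s in ["", "0", "00", "00.0", "00.00"])
print(whole_ok)  # True
```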
Making every element optional is not a good way to include the empty string in the set of strings that will satisfy a regular expression, because it opens the floodgates to every string, as Tim points out. Far better is to use a strategy like this: `^$|foo`
FM
+3  A: 

First off, {0-2} should be {0,2} as it was in the first instance.

Secondly, you need to group the repetition sections as well.

Thirdly, you need to make the whole last part optional. Since a dot must be followed by at least one digit, you should also change the second repetition to {1,2}.

([0-9]{0,2})(\.([0-9]{1,2}))?
Samir Talwar
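As a rough illustration of what the grouping in this pattern captures, here is a sketch assuming Python's `re` module (the answer itself is language-neutral). Group 1 is the integer part, group 2 the dot plus fraction, and group 3 the fraction digits alone.

```python
import re

pattern = re.compile(r"([0-9]{0,2})(\.([0-9]{1,2}))?")

with_fraction = pattern.fullmatch("12.34").groups()
print(with_fraction)  # ('12', '.34', '34')

# Groups that take no part in the match are reported as None.
integer_only = pattern.fullmatch("7").groups()
print(integer_only)  # ('7', None, None)
```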
+2  A: 

There are a few problems with your regex:

  1. The dot is a special character, and acts as a wildcard; if you want a literal dot, you need to escape it (\.).
  2. Even if you replaced the dot to not be a wildcard, your regex will match strings like "0." because you did not tell the regular expression engine to only match the dot if there are numbers following it.
  3. Because your expression isn't anchored, it could match strings that contain the pattern within another word (e.g. ab12 would match).

A better pattern would be something like:

/\b[0-9]{0,2}(?:\.[0-9]{1,2})?\b/

Note that (?:...) makes the group non-capturing, so it does not create a backreference; a captured group is probably not needed in your case.

Daniel Vandersluis
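The effect of `(?:...)` can be seen in a short sketch, assuming Python's `re` module. The `\b` anchors are left out here so that only the grouping differs between the two patterns.

```python
import re

capturing = re.compile(r"[0-9]{0,2}(\.[0-9]{1,2})?")
non_capturing = re.compile(r"[0-9]{0,2}(?:\.[0-9]{1,2})?")

# Both accept the same text; only the recorded groups differ.
captured = capturing.fullmatch("12.5").groups()
print(captured)  # ('.5',)

not_captured = non_capturing.fullmatch("12.5").groups()
print(not_captured)  # ()

whole = non_capturing.fullmatch("12.5").group()
print(whole)  # 12.5
```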
+1  A: 

Here is one way, illustrated in Perl, to match only the strings you listed. The important part is its method for matching empty strings: it does not make every pattern element optional, a strategy that has the undesirable effect of matching almost every string.

use warnings;
use strict;

my @data = (
    '',
    '0',
    '00',
    '00.0',
    '00.00',
    'foo',    # Should not match.
    '.0',     # Should not match.
);

for (@data){
    print $_, "\n" if /^$|^[0-9]{1,2}(\.[0-9]{1,2})?$/;
}
FM
A: 

Most of the above examples don't anchor the beginning ^ and ending $ of the data.

I would solve it with one of the following:

  • ^[[:digit:]]{0,2}([.][[:digit:]]{1,2})?$
  • ^\d{0,2}([.]\d{1,2})?$
  • ^[0-9]{0,2}([.][0-9]{1,2})?$

For readability, I generally prefer using [.] to \. and using POSIX classes like [[:digit:]].

nicerobot
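A brief sketch of how these spellings compare, assuming Python's `re` module. Python's `re` does not support POSIX classes such as `[[:digit:]]`, so only `[.]`, `\.`, `\d` and `[0-9]` are compared here. The variable names are illustrative only.

```python
import re

# "[.]" and "\." both match exactly one literal dot.
dot_class = re.fullmatch(r"[.]", ".") is not None
dot_escape = re.fullmatch(r"\.", ".") is not None
dot_is_literal = re.fullmatch(r"[.]", "x") is None

# In Python 3, \d on str patterns also matches non-ASCII decimal digits,
# while [0-9] does not; the re.ASCII flag restricts \d to 0-9.
arabic_three = "\u0663"  # ARABIC-INDIC DIGIT THREE
d_unicode = re.fullmatch(r"\d", arabic_three) is not None
range_unicode = re.fullmatch(r"[0-9]", arabic_three) is not None
d_ascii = re.fullmatch(r"\d", arabic_three, re.ASCII) is not None

print(dot_class, dot_escape, dot_is_literal)  # True True True
print(d_unicode, range_unicode, d_ascii)      # True False False
```

If only the ASCII digits 0 through 9 should be accepted, `[0-9]` (or `\d` with `re.ASCII`) is the safer choice in this environment.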