Hi,

What would be a regular expression for the following type of string?

E.g. 1, 2-3, 4..5, <6, <=7, >8, >=9. Here I am using equality, range (-), sequence (..), and less-than/greater-than (optionally "or equal to") operators on numbers less than 100. The items are separated by commas.

Please help me write a regular expression for this. Thanks in advance.

  • Atul
+4  A: 

How about something like this:

^(\d+(-|\.\.)|[<>]=?)?\d+$

Example using Python:

>>> import re
>>> pattern = r'^(\d+(-|\.\.)|[<>]=?)?\d+$'
>>> for s in '1, 2-3, 4..5, <6, <=7, >8, >=9'.split(','):
...     print(re.match(pattern, s.strip()).group(0))
... 
1
2-3
4..5
<6
<=7
>8
>=9

To be clear, this regex matches only one element in the list. I highly recommend that you preprocess your input by splitting it on commas and trimming the individual elements, like I did in the example above. Even though that's not strictly necessary (you can add this logic to the regex I gave here), splitting first will be quite a bit more efficient and readable.
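
A minimal sketch of that split-and-trim approach, wrapped in a helper that also copes with elements that fail to match (re.match returns None for those, so calling .group(0) on the result would raise). The name is_valid_list is made up for illustration:

```python
import re

# One element: a number, a range (2-3), a sequence (4..5), or a comparison (<6, >=9).
ELEMENT = re.compile(r'(\d+(-|\.\.)|[<>]=?)?\d+')

def is_valid_list(text):
    """Return True if every comma-separated element matches ELEMENT."""
    # fullmatch anchors at both ends, so ^ and $ are not needed in the pattern.
    return all(ELEMENT.fullmatch(part.strip()) for part in text.split(','))

print(is_valid_list('1, 2-3, 4..5, <6, <=7, >8, >=9'))  # True
print(is_valid_list('1, 2-, 3'))                        # False
```

Note that an empty element (as in '1,,2') is rejected as well, since the pattern requires at least one digit.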

How the regex works:

  • Observe that every valid string ends with one or more digits, thus \d+$.
  • There may or may not be something before that, thus, ^(...)?\d+$.
  • Those prefixes are either the start of a range, or a comparison:
    • \d+(-|\.\.) matches a number followed by a dash or two periods.
    • <=? matches "<" as well as "<=". Likewise for >=?. We can abbreviate this to [<>]=?.
  • Combining these two options using a pipe (|), which signifies choice, we get

    ^(\d+(-|\.\.)|[<>]=?)?\d+$
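
The same breakdown can be written into the pattern itself with Python's re.VERBOSE flag, which ignores whitespace and allows comments inside the regex. This is only a sketch; it uses non-capturing groups (?:...) instead of the capturing ones above, which does not change what is matched:

```python
import re

pattern = re.compile(r'''
    ^
    (?:                 # optional prefix:
        \d+ (?:-|\.\.)  #   start of a range or sequence, e.g. "2-" or "4.."
      |                 #   or
        [<>] =?         #   a comparison operator: <, <=, >, >=
    )?
    \d+                 # every element ends with a number
    $
''', re.VERBOSE)

for s in ['1', '2-3', '4..5', '<=7', '>8', '2-', '=5']:
    print(s, bool(pattern.match(s)))
```

The first five strings match; '2-' and '=5' do not.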
    
Stephan202
A: 

Try this expression:

^(?:\d+(?:(?:\.\.|-)\d+)?|[<>]=?\d+)(?:,\s*(?:\d+(?:(?:\.\.|-)\d+)?|[<>]=?\d+))*$

It consists of the alternation of

  • \d+(?:(?:\.\.|-)\d+)? for a number followed by an optional expression for a range or sequence, and
  • [<>]=?\d+ for the inequalities.

That’s repeated in the second part, preceded by a comma and optional whitespace, for the rest of the list.

And for the condition of only allowing numbers less than 100, you can replace \d+ with [1-9]\d? for 1..99 or (?:0|[1-9]\d?) for 0..99.
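
One way to sketch the whole-list pattern with the under-100 restriction is to build it from named pieces in Python. The names NUM, ITEM and LIST are made up for illustration, and the sketch assumes that leading zeros such as "07" should be rejected:

```python
import re

# A number from 0 to 99 without leading zeros (assumption: "07" is invalid).
NUM = r'(?:0|[1-9]\d?)'
# One list item: a number, a range/sequence, or a comparison.
ITEM = rf'(?:{NUM}(?:(?:\.\.|-){NUM})?|[<>]=?{NUM})'
# The whole list: one item, then any number of ", item" repetitions.
LIST = re.compile(rf'^{ITEM}(?:,\s*{ITEM})*$')

print(bool(LIST.match('1, 2-3, 4..5, <6, <=7, >8, >=9')))  # True
print(bool(LIST.match('1, 100')))                           # False
print(bool(LIST.match('1, 07')))                            # False
```

Wrapping ITEM in its own group keeps the comma outside the alternation, so the comma is required before every kind of item.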

Gumbo
A: 

You should totally use a regular expression tool like RegexBuddy.

You're trying to verify that your string generally looks like the sample?

1, 2-3, 4..5, <6, <=7, >8, >=9

matches

\s*(\d+|\d+-\d+|\d+\.\.\d+|[<>]=?\d+)\s*(,\s*(\d+|\d+-\d+|\d+\.\.\d+|[<>]=?\d+)\s*)*

It's easier to split on , and then match each part with

\s*(\d+|\d+-\d+|\d+\.\.\d+|[<>]=?\d+)\s*

That reads: with surrounding whitespace trimmed, match digits, or digits dash digits, or digits dot dot digits, or a less-than or greater-than sign with an optional equals sign followed by digits.

You can compress that down to the harder-to-read:

\s*((\d+(-|\.\.)|[<>]=?)?\d+)\s*

If you want all your digits to be 1-2 digits only, then change all the \d+ to \d{1,2} or \d\d?
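
Note that none of these patterns are anchored, so once the digits are limited it matters how the pattern is applied. A small sketch in Python, using re.fullmatch so that the whole element has to match:

```python
import re

# Element pattern with every number limited to one or two digits.
element = re.compile(r'\s*((\d{1,2}(-|\.\.)|[<>]=?)?\d{1,2})\s*')

# match() only anchors at the start, so it accepts the "12" prefix of "123".
print(bool(element.match('123')))         # True
# fullmatch() requires the whole string to match.
print(bool(element.fullmatch('123')))     # False
print(bool(element.fullmatch(' 4..5 ')))  # True
```

In engines without a fullmatch equivalent, adding ^ and $ around the pattern has the same effect.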

dlamblin