Hi all,

I've got a string which is pipe '|' delimited. I need a regex to validate the number of items, based on the pipe character. So a regex which will do the following:

If the max number of items is three:

asdfasdf|asdfasdf|asdfasdf = VALID

asdfasdf|asdfasdf|asdfasdf|asdfasf = Not Valid

Also, this string may be empty.

Any help would be much appreciated

Regards

+3  A: 

Are you using a programming language? If so, it probably has a specific function for this. Using regex for everything string-related is bad if it can be avoided.

PHP:

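$items = explode('|', $mystr);
// explode() returns an array, so compare its element count, not the array itself
if (count($items) > $max) throw new InvalidArgumentException('too many items');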

Python:

items = mystr.split('|')
# named max_items so the built-in max() is not shadowed
if len(items) > max_items: raise ValueError('too many items')
Tor Valamo
+4  A: 
^(?:[^|]+(?:\|[^|]+){0,2})?$

This will match an empty string or up to three pipe delimited items, where an item can be any character other than a pipe. Each item needs to be at least one character long in this pattern; if you want to allow for blank items, change the +s to *s.

If you want to change the upper limit of how many items are allowed, change {0,2} to {0,max-1}, where max is the limit you want (because you will have at most max - 1 pipes in your string for it to be valid).
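
As a rough illustration of the `{0,max-1}` substitution described above, here is a minimal Python sketch that builds the pattern for a given limit. The function name `build_item_limit_pattern` and the `allow_empty_items` flag are made up for this example and are not part of the answer; the flag simply swaps the `+`s for `*`s as suggested.

```python
import re


def build_item_limit_pattern(max_items: int, allow_empty_items: bool = False) -> re.Pattern:
    """Compile a pattern accepting '' or up to max_items pipe-delimited items.

    Hypothetical helper: the name and the flag are assumptions for illustration.
    """
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    # '+' forces every item to be non-empty; '*' allows blank items.
    quantifier = "*" if allow_empty_items else "+"
    item = f"[^|]{quantifier}"
    # A valid string has at most max_items - 1 pipes, hence {0,max_items-1}.
    return re.compile(rf"^(?:{item}(?:\|{item}){{0,{max_items - 1}}})?$")


pattern = build_item_limit_pattern(3)
print(pattern.pattern)
print(bool(pattern.match("asdfasdf|asdfasdf|asdfasdf")))
print(bool(pattern.match("asdfasdf|asdfasdf|asdfasdf|asdfasf")))
print(bool(pattern.match("")))
```

With a limit of three this reproduces the pattern given above, so the first and last sample strings are accepted and the four-item string is rejected.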

Daniel Vandersluis
Just curious, but isn't the grouping unnecessary? I.e., couldn't you remove the grouping and have it be just as effective/correct? I might be missing something...
Randyaa
`(?:...)` defines a non-capturing group (i.e. it does not create a backreference). Both groups are necessary -- the outer one allows the whole pattern to be made optional by the `?`, which permits an empty string; the inner one groups the pipe with its following item.
Daniel Vandersluis
Even though I know the original question asked for a regular expression, that's overkill for such a simple problem. I think the better solution is to simply iterate over every character and count the pipes.
Bryan Oakley
I didn't catch the `?` at the end of the outermost group, my bad. Thanks for the explanation. Minus the requirement of matching the empty string, you can eliminate that group though, i.e. `^[^|]+(?:\|[^|]+){0,2}$`. Also, if you don't need to worry about the start/end of the sequence, you can eliminate those characters too, making it even shorter/easier, i.e. `[^|]+(?:\|[^|]+){0,2}`. For the described problem, this would be my preferred expression.
Randyaa
The pattern needs to be anchored in order to enforce the maximum number of items -- your pattern will match a|b|c|d|e|f (twice in fact). Also, the question stated that the empty string is valid.
Daniel Vandersluis
Both valid points, I stand corrected :(
Randyaa
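
The anchoring point raised in the comments above can be checked directly. The following is a small Python sketch (the variable names are just for illustration) that runs the unanchored and anchored patterns from this thread against the six-item string mentioned there.

```python
import re

sample = "a|b|c|d|e|f"  # six items, so it should be rejected when the limit is three

unanchored = re.compile(r"[^|]+(?:\|[^|]+){0,2}")
anchored = re.compile(r"^(?:[^|]+(?:\|[^|]+){0,2})?$")

# Without anchors the engine is free to match substrings of the input,
# so it finds two separate three-item runs.
unanchored_matches = unanchored.findall(sample)
print(unanchored_matches)  # ['a|b|c', 'd|e|f']

# With anchors the whole string has to fit the pattern, so there is no match.
anchored_match = anchored.match(sample)
print(anchored_match)  # None
```

The unanchored version therefore reports a match for an input that should fail validation, which is why the `^` and `$` cannot be dropped here.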
A: 

It's probably easiest here to just count the | characters, e.g. in PowerShell:

PS> $valid = 'asdfasdf|asdfasdf|asdfasdf'
PS> $notvalid= 'asdfasdf|asdfasdf|asdfasdf|asdfasf'
PS> ($valid.ToCharArray() | where {$_ -eq '|' }).Count -lt 3
True
PS> ($notvalid.ToCharArray() | where {$_ -eq '|' }).Count -lt 3
False
Joey
A: 
^[^|]*(?:\|[^|]*){0,2}$

This assumes that the items between the pipes may be empty.

Manu
A: 
^([^|]*\|){0,2}[^|]*$

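Allows empty items and up to three items in total. Change the 0 or the 2 in {0,2} to alter the min/max range; the quantifier counts the pipe-terminated items, so the maximum number of items is one more than the upper bound.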
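
For what it's worth, the two empty-item patterns posted in this thread appear to accept the same strings. Here is a quick Python sketch comparing them on a handful of sample inputs; the sample list and variable names are assumptions chosen for illustration.

```python
import re

# The two patterns quoted in the answers above; both permit empty items.
trailing_group = re.compile(r"^[^|]*(?:\|[^|]*){0,2}$")
leading_group = re.compile(r"^([^|]*\|){0,2}[^|]*$")

# Hypothetical sample inputs: zero to four items, some of them empty.
samples = ["", "a", "a|b|c", "||", "a||c", "a|b|c|d", "|||"]

# Map each sample to (result of first pattern, result of second pattern).
results = {
    s: (bool(trailing_group.match(s)), bool(leading_group.match(s)))
    for s in samples
}

for s, (first, second) in results.items():
    print(repr(s), first, second)
```

Both patterns accept everything with at most two pipes and reject the two samples with three pipes, so the choice between them is mostly a matter of taste.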

Kazar
+1  A: 

Why use a regular expression? Just iterate over the string and count each pipe. This effectively does what the regex does, but without all of the pattern-matching bookkeeping.
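
A minimal Python sketch of this counting approach might look like the following. The function name `has_at_most_items` is made up for the example, and note that, unlike the `+` version of the regex, it treats empty items such as `a||b` as valid.

```python
def has_at_most_items(text: str, max_items: int) -> bool:
    """Return True if text holds at most max_items pipe-delimited items.

    Hypothetical helper for illustration; empty items count as items.
    """
    pipes = 0
    for char in text:
        if char == "|":
            pipes += 1
            # max_items items are separated by max_items - 1 pipes,
            # so stop as soon as that number is exceeded.
            if pipes > max_items - 1:
                return False
    return True


print(has_at_most_items("asdfasdf|asdfasdf|asdfasdf", 3))
print(has_at_most_items("asdfasdf|asdfasdf|asdfasdf|asdfasf", 3))
print(has_at_most_items("", 3))
```

If the early exit is not needed, `text.count("|") < max_items` should give the same answer in one line.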

Bryan Oakley