Can anyone help me build a regular expression that matches a string of any length, containing any characters, as long as it contains exactly 21 commas?

Thanks

+13  A: 
/^([^,]*,){21}[^,]*$/

That is:

^     Start of string
(     Start of group
[^,]* Any character except comma, zero or more times
,     A comma
){21} End and repeat the group 21 times
[^,]* Any character except comma, zero or more times again
$     End of string
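As a quick check of the pattern above, here is a minimal JavaScript sketch. The helper name `hasExactly21Commas` and the sample strings are made up for illustration and are not part of the original answer.

```javascript
// Pattern from the answer above: 21 "field + comma" groups, then a last field.
const exactly21 = /^([^,]*,){21}[^,]*$/;

// Hypothetical helper: true when the string contains exactly 21 commas.
function hasExactly21Commas(s) {
  return exactly21.test(s);
}

// Illustrative inputs: 22 fields joined by commas contain 21 commas.
const fields22 = Array.from({ length: 22 }, (_, i) => "f" + i).join(",");
const fields23 = fields22 + ",extra"; // 22 commas

console.log(hasExactly21Commas(fields22));       // true
console.log(hasExactly21Commas(fields23));       // false
console.log(hasExactly21Commas(",".repeat(21))); // true (empty fields are allowed)
```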
Greg
Very nice explanation. Just make sure you put that in the comment in the code.
Robert
Thanks, that's really good. The only thing is that ,,,,,,,,,,,,,,,,, etc. is allowed. How can I disallow this?
This regex will allow 21 consecutive commas - to prevent that, change both of the * to +
Peter Boughton
Also, using *+ (or ++) instead of * (or +) may be faster, if your regex engine supports it.
Peter Boughton
Also, regarding Robert's comment - see my answer below which uses (?x) to enable commenting, so both the regex and explanation can live together.
Peter Boughton
@Greg the explanation bit would be better in <pre> tags to prevent the distracting highlighting.
ShuggyCoUk
@Peter, JavaScript doesn't support possessive quantifiers. I doubt it would make a noticeable difference here anyway.
Alan Moore
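The suggestion in the comments above, changing both `*` to `+`, can be sketched in JavaScript as follows. The variable names and sample strings are hypothetical, chosen only to show the effect on empty fields.

```javascript
// Variant with + instead of *: each of the 22 fields must be non-empty.
const nonEmptyFields = /^([^,]+,){21}[^,]+$/;

// 22 one-character fields: x,x,...,x (21 commas)
const filled = Array.from({ length: 22 }, () => "x").join(",");
// 21 commas, but the second field is empty
const oneEmpty = "a,,b" + ",c".repeat(19);

console.log(nonEmptyFields.test(filled));         // true
console.log(nonEmptyFields.test(",".repeat(21))); // false: every field is empty
console.log(nonEmptyFields.test(oneEmpty));       // false: one field is empty
```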
+1  A: 
^(?:[^,]*)(?:,[^,]*){21}$
Trumpi
You can remove the '?:' if you're not using .NET
Trumpi
Irrespective of regex version, you should leave the ?: unless you want to capture the group.
Peter Boughton
A: 
.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,.*,
John D. Cook
This will match more than 21 commas
Greg
And it's unreadable too! There's a reason {num} syntax exists.
Peter Boughton
Agreed, it's unreadable. The {21} syntax is preferable. But it makes the point visibly: just repeat what you want. It also has the advantage of working in old regex environments that don't support the repeat count syntax.
John D. Cook
It's also extremely inefficient. The first dot-star will initially consume the whole string, then backtrack just enough to let the following comma match. Then the second dot-star-comma will have nothing to match, so the first one will be forced to backtrack again, and so on. This is a recipe for catastrophic backtracking. http://www.regular-expressions.info/catastrophic.html
Alan Moore
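Greg's point that the unanchored dot-star pattern accepts more than 21 commas can be checked with a short JavaScript sketch. It assumes the pattern in the answer is `.*,` repeated 21 times and builds it with `repeat()`. The variable names are illustrative only.

```javascript
// Assumption: the answer's pattern is ".*," written out 21 times.
const dotStar = new RegExp(".*,".repeat(21));
// Anchored pattern from the accepted answer, for comparison.
const anchored = /^([^,]*,){21}[^,]*$/;

const commas25 = ",".repeat(25);

console.log(dotStar.test(commas25));       // true: 25 commas still match
console.log(anchored.test(commas25));      // false
console.log(dotStar.test(",".repeat(20))); // false: fewer than 21 commas
```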
+3  A: 

Exactly 21 commas:

^([^,]*,){21}[^,]*$

At least 21 commas:

^([^,]*,){21}.*$
Daniel Brückner
The first will match more than 21 commas (because it's not anchored), and if you anchor it the string will have to end with a comma.
Greg
Already fixed ... ^^
Daniel Brückner
Oops ... not fixed ... but now ...
Daniel Brückner
A: 

if exactly 21:

/^[^,]*(,[^,]*){21}$/

if at least 21:

/(,[^,]*){21}/

However, I would suggest not using a regex for such a simple task, because it's slow.
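To see how the "exactly 21" and "at least 21" patterns above differ, here is a small JavaScript sketch. The helper `make` and the variable names are hypothetical.

```javascript
const exactly = /^[^,]*(,[^,]*){21}$/; // anchored: exactly 21 commas
const atLeast = /(,[^,]*){21}/;        // unanchored: 21 or more commas

// Hypothetical helper: builds a string containing n commas, e.g. "a,a,a" for n = 2.
const make = (n) => "a" + ",a".repeat(n);

console.log(exactly.test(make(21)), atLeast.test(make(21))); // true true
console.log(exactly.test(make(22)), atLeast.test(make(22))); // false true
console.log(exactly.test(make(20)), atLeast.test(make(20))); // false false
```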

kcwu
A: 

What language? There's probably a simpler method.

For example...

In CFML, you can just see if ListLen(MyString) is 22

In Java, you can compare MyString.split(",", -1).length to 22 (the -1 limit keeps trailing empty strings, which split otherwise drops)

etc...

Peter Boughton
+2  A: 

Might be faster and more understandable to iterate through the string, count the number of commas found and then compare it to 21.
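The counting approach described above might look like this in JavaScript. It is a minimal sketch, and the function name `countCommas` is made up for illustration.

```javascript
// Hypothetical helper: counts the commas in a string with a plain loop.
function countCommas(s) {
  let count = 0;
  for (let i = 0; i < s.length; i++) {
    if (s[i] === ",") {
      count++;
    }
  }
  return count;
}

const sample = "a" + ",b".repeat(21); // illustrative input with 21 commas
console.log(countCommas(sample) === 21);  // true
console.log(countCommas("a,b,c") === 21); // false (only 2 commas)
```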

zimbu668
+1  A: 

If you're using a regex flavor that supports possessive quantifiers (e.g. Java), you can do:

^(?:[^,]*+,){21}[^,]*+$

A possessive quantifier can perform better than a greedy one, because it never gives back what it has matched and so avoids backtracking.


Explanation:

(?x)    # enables comments, so this whole block can be used in a regex.
^       # start of string
(?:     # start non-capturing group
[^,]*+  # as many non-commas as possible, but none required
,       # a comma
)       # end non-capturing group
{21}    # 21 of previous entity (i.e. the group)
[^,]*+  # as many non-commas as possible, but none required
$       # end of string
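JavaScript, as noted in an earlier comment, does not support possessive quantifiers, and as far as I know it has no `(?x)` comment mode either. One rough way to keep the explanation next to the pattern there is to assemble it from commented pieces. This is only a sketch using plain greedy quantifiers, and the names `parts` and `commentedRegex` are hypothetical.

```javascript
// Each piece of the pattern carries its own comment.
const parts = [
  "^",     // start of string
  "(?:",   // start non-capturing group
  "[^,]*", // as many non-commas as possible, but none required (greedy here)
  ",",     // a comma
  ")",     // end non-capturing group
  "{21}",  // 21 of the previous entity (i.e. the group)
  "[^,]*", // as many non-commas as possible, but none required
  "$",     // end of string
];

const commentedRegex = new RegExp(parts.join(""));

console.log(commentedRegex.source);                  // ^(?:[^,]*,){21}[^,]*$
console.log(commentedRegex.test(",".repeat(21)));    // true
console.log(commentedRegex.test(",".repeat(22)));    // false
```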
Peter Boughton
A: 
var valid = ((" " + input + " ").split(",").length == 22);

or...

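var valid = 21 == (function(input){
    // count the commas one character at a time
    var ret = 0;
    for (var i=0; i<input.length; i++)
        if (input.charAt(i) == ",")
            ret++;
    return ret;
})(input); // pass the string in; calling with no argument would throw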

Will perform better than...

var valid = (/^([^,]*,){21}[^,]*$/).test(input);
Tracker1