tags:

views:

89

answers:

3

I'm looking for a regex to validate initials. The only format I want it to allow is:

a capital letter followed by a period, repeated one or more times

Valid examples:

A.
A.B.
A.B.C.

Invalid examples:

a.
a
A
A B
A B C
AB
ABC

Using The Regulator and some websites I have found the following regex, but it only allows exactly one upper (or lower!) case character followed by a period:

^[A-Z][/.]$

Basically I only need to know how to force upper case characters, and how I can repeat the validation to allow more than one occurrence of an upper case character followed by a period.

+4  A: 

You almost had it right: + means "one or more occurrences", and the period is escaped as \., not /.

Wrapping it in () denotes that it's a group.

^([A-Z]\.)+$
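
As a rough check, the pattern above can be run against the examples from the question. This is only a sketch in Python (the thread does not say which language or tool is in use), and the names `INITIALS`, `ORIGINAL` and `is_initials` are made up for illustration. It also shows why the original `[/.]` was a problem: inside brackets, `/` is just another allowed character.

```python
import re

# The pattern from this answer: one or more (capital letter + literal period) pairs.
INITIALS = re.compile(r"^([A-Z]\.)+$")

# The pattern from the question: [/.] is a character class that matches
# either a slash or a period, and nothing repeats the pair.
ORIGINAL = re.compile(r"^[A-Z][/.]$")

valid = ["A.", "A.B.", "A.B.C."]
invalid = ["a.", "a", "A", "A B", "A B C", "AB", "ABC"]


def is_initials(text):
    """Return True if text consists only of capital-plus-period pairs."""
    return INITIALS.match(text) is not None


for sample in valid + invalid:
    print(repr(sample), is_initials(sample))

# The question's pattern also accepts a slash in place of the period.
print(ORIGINAL.match("A/") is not None)
```

One Python-specific caveat: `$` also matches just before a trailing newline, so `re.fullmatch` (or `\Z` instead of `$`) is the stricter choice if the input may end in a line break.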
simendsjo
Thanks. I had to change your regex a bit in order to make it work in my system: ^([A-Z][/.])+$
iar
I'd recommend adding ?: to the beginning of the group like this: `^(?:[A-Z]\.)+$`. This would not capture this group.
Max
It depends on how it should be used. If he just wants to see whether it matches, then there's no reason to capture the group.
simendsjo
+2  A: 

The regex you want is this:

^(?:[A-Z]\.)+$

The ?: marks the group as non-capturing.

Case sensitivity, however, is a flag that is handled differently in every language, but in most implementations it is active by default.
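
To illustrate the point about case sensitivity, here is a small Python sketch (Python is an assumption, as is the helper name `matches`). Python's `re` module is case sensitive unless `re.IGNORECASE` is passed, so `[A-Z]` rejects lower case letters by default.

```python
import re

PATTERN = r"^(?:[A-Z]\.)+$"


def matches(text, flags=0):
    """Return True if text matches PATTERN under the given flags."""
    return re.match(PATTERN, text, flags) is not None


# Default behaviour: case sensitive, so lower case initials are rejected.
print(matches("A.B."))
print(matches("a.b."))

# With the ignore-case flag, [A-Z] also accepts lower case letters.
print(matches("a.b.", re.IGNORECASE))
```

If a tool accepts `a.` with this pattern, an ignore-case option is probably switched on somewhere.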

seanizer
He wrote "case sensitivity (...) is active by default". However, in many editors with regex support, it's the other way around.
Tim Pietzcker
@Tim, ah yes, I mis-read. Thanks.
Bart Kiers
I was talking about languages, not editors. I'm pretty sure that Java, JavaScript, PHP and Perl have case sensitivity turned on by default.
seanizer
@seanizer, yes, as I commented: I mis-read your answer. I thought you meant that case-insensitive matching was enabled by default. I haven't drunk enough coffee, I guess...
Bart Kiers
+3  A: 

Here's a quick regular expression lesson:

  • a matches exactly one a
  • a+ matches one or more a in a row
  • ab matches a followed by b
  • ab+ matches a followed by one or more b in a row
  • (ab)+ matches one or more of a followed by b

So in this case, something like this should work:

^([A-Z][.])+$
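
The difference between `ab+` and `(ab)+` from the lesson above can be checked directly. The following is a sketch in Python, with `full`, `checks` and the sample list as illustrative names only. It also compares the `[.]` form used here with the escaped `\.` form on the question's examples.

```python
import re


def full(pattern, text):
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


# (pattern, input, expected result) for the rules in the lesson.
checks = [
    ("a+", "aaa", True),
    ("ab+", "abbb", True),     # + applies to b only
    ("ab+", "abab", False),
    ("(ab)+", "abab", True),   # + applies to the whole group
    ("(ab)+", "abbb", False),
]
for pattern, text, expected in checks:
    print(pattern, text, full(pattern, text) == expected)

# [.] and \. both match a literal period, so the two spellings agree.
samples = ["A.", "A.B.", "A.B.C.", "a.", "A", "A B", "AB", "ABC"]
bracket = [full(r"([A-Z][.])+", s) for s in samples]
escaped = [full(r"([A-Z]\.)+", s) for s in samples]
print(bracket == escaped)
```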

Variations

You can also use something like this:

^(?:[A-Z]\.)+$

The (?:pattern) is a non-capturing group. The \. is how you match a literal ., because otherwise it's a metacharacter that means "(almost) any character".
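
A short Python sketch (the language choice and variable names are assumptions) shows what the two kinds of group leave behind after a match, and what happens if the period is left unescaped.

```python
import re

text = "A.B.C."

capturing = re.match(r"^([A-Z]\.)+$", text)
non_capturing = re.match(r"^(?:[A-Z]\.)+$", text)

# A repeated capturing group keeps only its last repetition.
print(capturing.groups())      # ('C.',)

# A non-capturing group stores nothing.
print(non_capturing.groups())  # ()

# With an unescaped dot, the "period" position accepts any character,
# so "AB" (a capital followed by any character) wrongly passes.
unescaped_ok = re.match(r"^(?:[A-Z].)+$", "AB") is not None
print(unescaped_ok)            # True
```

For pure validation both groups accept and reject the same inputs; the non-capturing form just avoids storing a substring that is never used.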

Even more variations

Since you said you're matching initials, you may want to impose some restriction on what is a reasonable number of repetitions.

A limited repetition syntax in regex is something like this:

^(?:[A-Z]\.){1,10}$

This will match at least one, and at most 10, repetitions of a letter followed by a period (see on rubular.com).
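
The bounds can be checked with a minimal Python sketch. The limit of 10 comes from the pattern above; the names `BOUNDED` and `is_reasonable_initials` are hypothetical.

```python
import re

# Between 1 and 10 (capital letter + period) pairs, and nothing else.
BOUNDED = re.compile(r"^(?:[A-Z]\.){1,10}$")


def is_reasonable_initials(text):
    """Return True for 1 to 10 capital-plus-period pairs."""
    return BOUNDED.match(text) is not None


print(is_reasonable_initials("A.B.C."))   # within the limit
print(is_reasonable_initials("A." * 10))  # exactly at the upper bound
print(is_reasonable_initials("A." * 11))  # one pair too many
print(is_reasonable_initials(""))         # zero pairs
```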

polygenelubricants
. matches any character. You need to escape it: ^([A-Z][\.])+$. And it's not necessary to put the dot in brackets as it's only one character.
simendsjo
@simen: You don't need to escape `.` inside a character class definition. `[.]` is a character class definition that matches only one character, the period. I sometimes prefer `[.]` to `\.` (or `\\.` in string literals of some languages) but that's just me.
polygenelubricants
@polygenelubricants: Ah, makes sense. \. has always served me well though :)
simendsjo
the nice thing about [.] is you don't have to worry about escaping (how many slashes to use in which context)
seanizer
@seanizer: yep, and one would hope at least that if `\.` is significantly faster than `[.]`, then a decent regex engine would optimize the latter to the former.
polygenelubricants
Thanks. I have given the first answer the label "accepted answer" already, but your answer is helpful as well.
iar