views:

80

answers:

3

Regexp in Java

I want to write a regexp that verifies whether a word has the form [0-9A-Za-z][._-'][0-9A-Za-z]. Examples of valid words:

A21a_c32 
daA.da2
das'2
dsada
ASDA
12SA89

Invalid words:

dsa#da2
34$

Thanks

+1  A: 

^[0-9A-Za-z]+[._'-]?[0-9A-Za-z]+$ (see matches on rubular.com)

Key points:

  • ^ is the start of the string anchor
  • $ is the end of string anchor
  • + is "one-or-more repetition of"
  • ? is "zero-or-one repetition of" (i.e. "optional")
  • - in a character class definition is special (range definition)...
    • unless it's escaped, or first, or last
  • . unescaped outside of a character class definition is special...
    • but in a character class definition it's just a period
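
As a minimal sketch of how this pattern could be used from Java, the class below checks the sample words from the question. The class name WordCheck and the method isValid are made up for illustration. Note that String/Matcher matches() already requires the whole input to match, so the ^ and $ anchors are redundant there but harmless.

```java
import java.util.regex.Pattern;

class WordCheck {
    // Pattern from the answer above, compiled once and reused.
    static final Pattern WORD =
        Pattern.compile("^[0-9A-Za-z]+[._'-]?[0-9A-Za-z]+$");

    // Hypothetical helper: true if the whole string matches the pattern.
    static boolean isValid(String s) {
        return WORD.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "A21a_c32", "daA.da2", "das'2", "dsada", "ASDA", "12SA89", // expected valid
            "dsa#da2", "34$"                                           // expected invalid
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

One consequence of having + on both sides: a one-character word such as "a" is rejected, because each of the two alphanumeric runs needs at least one character.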

polygenelubricants
Yes, I understand... but the + sign is not OK: as in my example (ASDA), the special chars ._-' can be missing. By the way, thanks for your answer.
tinti
@tinti: I don't understand your comment. Are you saying there's something wrong with my pattern? If so, then let's fix it. Go back and forth with me on rubular. Give me input that should/shouldn't be matched.
polygenelubricants
Correction: ? is "zero-or-one", not "zero-or-more". "zero-or-more" is *
dty
@Danny: yep, thanks for catching that. Corrected now.
polygenelubricants
@polygene: good explanation and a great reference site. I love a good regex. How about one for pure speed: ^(?>[0-9a-z]*)([-_.'][a-z0-9])?(?>[a-z0-9]*)$ The (?>...) parts are atomic groups, an alternate syntax for possessive quantifiers. Java supports the normal possessive syntax too: ^[0-9a-z]*+([-_.'][a-z0-9])?[a-z0-9]*+$ (untested version). This is hardly worth it for small strings, but for large ones, avoiding excessive backtracking can make a HUGE difference.
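
A rough sketch for trying out the two variants from this comment side by side. The class and method names are hypothetical, and the CASE_INSENSITIVE flag is an assumption added here because the patterns as written only list lowercase letters. It only compares match results and makes no attempt to measure speed.

```java
import java.util.regex.Pattern;

class PossessiveCheck {
    // Atomic-group form, as written in the comment.
    static final Pattern ATOMIC = Pattern.compile(
        "^(?>[0-9a-z]*)([-_.'][a-z0-9])?(?>[a-z0-9]*)$",
        Pattern.CASE_INSENSITIVE);

    // Possessive-quantifier form of the same pattern.
    static final Pattern POSSESSIVE = Pattern.compile(
        "^[0-9a-z]*+([-_.'][a-z0-9])?[a-z0-9]*+$",
        Pattern.CASE_INSENSITIVE);

    static boolean matchesAtomic(String s) {
        return ATOMIC.matcher(s).matches();
    }

    static boolean matchesPossessive(String s) {
        return POSSESSIVE.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"A21a_c32", "ASDA", "das'2", "dsa#da2", "34$", "a_"};
        for (String s : samples) {
            System.out.println(s + " -> atomic=" + matchesAtomic(s)
                + ", possessive=" + matchesPossessive(s));
        }
    }
}
```

Because both alphanumeric runs use *, these variants also accept the empty string and a word that starts with the separator, such as "_a".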
Java Drinker
+1  A: 

If the [._'-] separator is optional, group it with the characters that follow it and put the ? on the whole group, like this:

[0-9A-Za-z]+([._'-][0-9A-Za-z]+)?
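
A small sketch of how this could be checked in Java, assuming the whole string must match (which is what matches() does, so no anchors are needed). The class name OptionalGroupCheck and the method isValid are invented for the example.

```java
import java.util.regex.Pattern;

class OptionalGroupCheck {
    // The separator and the run after it are optional as a unit.
    static final Pattern WORD =
        Pattern.compile("[0-9A-Za-z]+([._'-][0-9A-Za-z]+)?");

    // Hypothetical helper: true if the whole string matches the pattern.
    static boolean isValid(String s) {
        return WORD.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"A21a_c32", "ASDA", "a", "a_", "_a", "dsa#da2", "34$"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

With this grouping a one-character word such as "a" is accepted, while a leading or trailing separator ("_a", "a_") is still rejected.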
True Soft
+1  A: 
"(\\p{Alnum})*([.'_-])?(\\p{Alnum})*"

In this solution I assume that the delimiter is optional, the empty string is also legal, and that the string may start/end with the delimiter, or be composed only of the delimiter.
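
To make those assumptions concrete, here is a short sketch (class and method names are hypothetical) that runs the pattern against the edge cases just listed. In Java, \p{Alnum} covers ASCII letters and digits unless the UNICODE_CHARACTER_CLASS flag is set.

```java
import java.util.regex.Pattern;

class LenientCheck {
    // Pattern from this answer: both alphanumeric runs and the delimiter are optional.
    static final Pattern WORD =
        Pattern.compile("(\\p{Alnum})*([.'_-])?(\\p{Alnum})*");

    // Hypothetical helper: true if the whole string matches the pattern.
    static boolean isValid(String s) {
        return WORD.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Empty string, delimiter only, leading/trailing delimiter, then ordinary cases.
        String[] samples = {"", "_", "_abc", "abc_", "A21a_c32", "dsa#da2", "a__b"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isValid(s));
        }
    }
}
```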

Eyal Schneider