I want to validate names such as:

John, John Paul, etc.

I use this regex:

String regex = "[A-Z]([a-z]+|\\s[a-z]+)";

but when I do:

boolean ok = Pattern.matches(regex, "John Paul");

the match fails.

Why? I want to use matches to validate the string as a whole.

Is that regex wrong?

+4  A: 

Paul has a capital P and your regex doesn't allow for capitalization at the start of the second word.

Noel M
+1  A: 

Try something like this:

[A-Z][a-z]+( [A-Z][a-z]+)?

The ? makes the preceding group optional, so the last name may be present or absent. When it is present, the last name (with its preceding space) is captured in group 1. You can use a non-capturing group (?:...) if you don't need this capture.
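As a rough sketch of how the pattern above behaves with a whole-string match, something like the following could be used. The class and method names (NameRegexDemo, isValidName, lastName) are made up for illustration and are not from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative only.
class NameRegexDemo {
    // Capitalized first name, optionally followed by a space and a capitalized last name.
    static final Pattern NAME = Pattern.compile("[A-Z][a-z]+( [A-Z][a-z]+)?");

    // matches() requires the entire input to match, like Pattern.matches(regex, input).
    static boolean isValidName(String input) {
        return NAME.matcher(input).matches();
    }

    // Returns the last name without the leading space, or null if there is none.
    static String lastName(String input) {
        Matcher m = NAME.matcher(input);
        if (!m.matches() || m.group(1) == null) {
            return null;
        }
        return m.group(1).trim();
    }

    public static void main(String[] args) {
        System.out.println(isValidName("John"));      // true
        System.out.println(isValidName("John Paul")); // true
        System.out.println(isValidName("John paul")); // false
        System.out.println(lastName("John Paul"));    // Paul
    }
}
```

Group 1 is null when the optional part did not participate in the match, which is why lastName checks for it before trimming.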

Problem with original pattern

Here's the original pattern:

[A-Z]([a-z]+|\s[a-z]+)

Expanding the alternation, this matches either:

[A-Z][a-z]+

Or:

[A-Z]\s[a-z]+

This does match John, and J paul, but it clearly doesn't match John Paul.
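A quick way to check this is to run the original pattern against the three inputs. This is only a sketch; the class name OriginalPatternDemo and the helper matchesOriginal are invented for the example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for the original pattern from the question.
class OriginalPatternDemo {
    // The regex exactly as written in the question (Java string escaping for \s).
    static final String ORIGINAL = "[A-Z]([a-z]+|\\s[a-z]+)";

    // Whole-string match, the same call the question uses.
    static boolean matchesOriginal(String input) {
        return Pattern.matches(ORIGINAL, input);
    }

    public static void main(String[] args) {
        System.out.println(matchesOriginal("John"));      // true:  [A-Z][a-z]+
        System.out.println(matchesOriginal("J paul"));    // true:  [A-Z]\s[a-z]+
        System.out.println(matchesOriginal("John Paul")); // false: nothing allows " Paul" after "John"
    }
}
```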

polygenelubricants
Rubular: original pattern http://www.rubular.com/r/brGWN77bma and proposed pattern http://www.rubular.com/r/DWM2pNGfR2
polygenelubricants
A: 

You are going to have lots of problems with validating names - there are lots of different types of names. Consider:

  • Jéan-luc Picard
  • Carmilla Parker-Bowles
  • Joan d'Arc
  • Matt LeBlanc
  • Chan Kong-sang (Jackie Chan's real name)
  • P!nk
  • Love Symbol #2

The easiest thing to do is to get the user to enter their name and accept it as is. If you want to break it up into personal name and family names for things such as personalisation, then I suggest you break up the input fields into two (or more) parts, or simply ask for a "preferred name" or "nickname" field.

I doubt you'll find a regex that can validate all the variety of names out there - get a big set of sample data (preferably real-world) before you start trying.
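To illustrate the point about sample data, here is a small sketch that runs the simple "capitalized word, optional second capitalized word" pattern suggested elsewhere in this thread against the names listed above. The class name NameSampleCheck and its members are hypothetical, and the sample list is just the examples from this answer, not real-world data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical harness: checks a candidate regex against a list of sample names.
class NameSampleCheck {
    // The simple pattern proposed in another answer in this thread.
    static final Pattern CANDIDATE = Pattern.compile("[A-Z][a-z]+( [A-Z][a-z]+)?");

    // The example names from this answer (\u00e9 is the accented e).
    static final List<String> SAMPLES = Arrays.asList(
            "J\u00e9an-luc Picard",
            "Carmilla Parker-Bowles",
            "Joan d'Arc",
            "Matt LeBlanc",
            "Chan Kong-sang",
            "P!nk",
            "Love Symbol #2");

    // Returns the samples that the candidate pattern rejects.
    static List<String> rejected(List<String> names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (!CANDIDATE.matcher(name).matches()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> bad = rejected(SAMPLES);
        System.out.println(bad.size() + " of " + SAMPLES.size() + " sample names rejected");
        for (String name : bad) {
            System.out.println("  rejected: " + name);
        }
    }
}
```

Every name in the list is rejected, because each one contains an accent, hyphen, apostrophe, inner capital, digit, or symbol that the pattern does not allow.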

Robert Watkins
Don't forget the tetragrammaton יהוה‎
polygenelubricants