I'd like a Regular Expression for C# that matches "Johnson", "Del Sol", or "Del La Range"; in other words, it should match words with spaces in the middle but no space at the start or at the end.
The ? modifier is your friend. Placed after a quantifier, it makes the shortest possible match instead of a greedy one. Use it for the first name, as in:
^(.+?) (.+)$
Group 1 grabs everything up to the first space, group 2 gets the rest.
Of course, now what do you do if the first name contains spaces?
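Here is a minimal C# sketch of that split, assuming .NET's System.Text.RegularExpressions. The names `NameSplitter` and `Split` are made up for illustration and are not part of any library:

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameSplitter
{
    // The lazy first group stops at the first space; the greedy second
    // group then takes everything that is left.
    private static readonly Regex Pattern = new Regex(@"^(.+?) (.+)$");

    // Returns (first, rest), or null when the input has no interior space.
    public static Tuple<string, string> Split(string fullName)
    {
        Match m = Pattern.Match(fullName);
        if (!m.Success)
        {
            return null;
        }
        return Tuple.Create(m.Groups[1].Value, m.Groups[2].Value);
    }
}
```

With this sketch, `NameSplitter.Split("John Del Sol")` yields "John" and "Del Sol", while a single word such as "Johnson" yields null because the pattern requires at least one space.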
This should do the job:
^[a-zA-Z][a-zA-Z ]*[a-zA-Z]$
Edit: Here's a slight improvement that allows one-letter names and hyphens/apostrophes in the name:
^[a-zA-Z']([a-zA-Z' -]*[a-zA-Z'])?$
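For what it's worth, here is a small C# sketch of a validator built on this idea. It uses a variant of the pattern with two adjustments that are my own assumptions about the intent: the hyphen is moved to the end of the character class (written as `'- `, it is parsed as a range from apostrophe to space, which .NET rejects as a reversed range), and the optional part is a group, so that a name longer than one character still cannot end in a space. `AsciiNameCheck` and `IsValid` are illustrative names only:

```csharp
using System;
using System.Text.RegularExpressions;

public static class AsciiNameCheck
{
    // The hyphen sits last in the class so it is a literal, not a range operator.
    // The optional group allows one-letter names while still requiring that
    // a longer name ends in a letter or apostrophe, never a space.
    private static readonly Regex Pattern =
        new Regex(@"^[a-zA-Z']([a-zA-Z' -]*[a-zA-Z'])?$");

    public static bool IsValid(string name)
    {
        return Pattern.IsMatch(name);
    }
}
```

This accepts "Johnson", "Del Sol", "Del La Range", "X", "O'Connor" and "Smith-Jones", and rejects " Johnson", "Johnson " and "R2D2". One thing to keep in mind: in .NET, `$` also matches just before a final newline, so use `\z` instead if a trailing "\n" must be rejected too.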
^\p{L}+(\s+\p{L}+)*$
This regex has the following features:
- Will match a one letter last name (e.g. Malcolm X's last name)
- Will not match last names containing numbers (like anything with a \w or a [^ ] will)
- Matches Unicode letters
But what about last names like "O'Connor" or hyphenated last names ... hmm ...
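One possible way to handle those cases, sketched in C#. The first pattern is the one given above; the second is an assumed extension (not something tested against real name data) that lets each word contain single apostrophes or hyphens between runs of letters. `UnicodeNameCheck` and both method names are hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnicodeNameCheck
{
    // Pattern from the answer: runs of Unicode letters separated by whitespace.
    private static readonly Regex LettersOnly =
        new Regex(@"^\p{L}+(\s+\p{L}+)*$");

    // Assumed extension: inside a word, an apostrophe or hyphen may join
    // two runs of letters, so a word can neither start nor end with one.
    private static readonly Regex WithPunctuation =
        new Regex(@"^\p{L}+(?:['-]\p{L}+)*(?:\s+\p{L}+(?:['-]\p{L}+)*)*$");

    public static bool IsLettersOnly(string name)
    {
        return LettersOnly.IsMatch(name);
    }

    public static bool IsNameWithPunctuation(string name)
    {
        return WithPunctuation.IsMatch(name);
    }
}
```

Under these assumptions, "O'Connor" and "Chiang Kai-shek" fail `IsLettersOnly` but pass `IsNameWithPunctuation`, while "Smith-" and " Sol" fail both.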
I think this is more what you were looking for:
^[^ ][a-zA-Z ]+[^ ]$
This should match a non-space character at the beginning of the line, then one or more letters or spaces, then a non-space character at the end.
I tested this in irb (Ruby), but I have used similar regexes in C#.
(=~ returns the index where the match starts, so zero is good; nil means the match failed.)
>> "Di Giorno" =~ /^[^ ][a-zA-Z ]+[^ ]$/
=> 0
>> "DiGiorno" =~ /^[^ ][a-zA-Z ]+[^ ]$/
=> 0
>> " DiGiorno" =~ /^[^ ][a-zA-Z ]+[^ ]$/
=> nil
>> "DiGiorno " =~ /^[^ ][a-zA-Z ]+[^ ]$/
=> nil
>> "Di Gior no" =~ /^[^ ][a-zA-Z ]+[^ ]$/
=> 0
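A rough C# equivalent of the irb session above, as a sketch: `Regex.IsMatch` returns a bool rather than a match index, and `IrbPort` is just an illustrative name.

```csharp
using System;
using System.Text.RegularExpressions;

public static class IrbPort
{
    // Same pattern as in the irb session: a non-space character, then one or
    // more letters or spaces, then a non-space character.
    private static readonly Regex Pattern = new Regex(@"^[^ ][a-zA-Z ]+[^ ]$");

    public static bool Matches(string s)
    {
        return Pattern.IsMatch(s);
    }
}
```

`IrbPort.Matches` returns true for "Di Giorno", "DiGiorno" and "Di Gior no", and false for " DiGiorno" and "DiGiorno ". Note two side effects of this pattern: it needs at least three characters, so "Li" is rejected, and the first and last characters can be anything except a space, so "1ab2" is accepted.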
In the name "Ṣalāḥ ad-Dīn Yūsuf ibn Ayyūb" (see http://en.wikipedia.org/wiki/Saladdin), which is the first name, and which is the last? What about in the name "Roberto Garcia y Vega" (invented)? "Chiang Kai-shek" (see http://en.wikipedia.org/wiki/Chang_Kai-shek)?
Spaces in names are the least of your problems! See http://stackoverflow.com/questions/620118/personal-names-in-a-global-application-what-to-store.