What's the best way to split the string "Parisi, Kenneth" into "Kenneth" and "Parisi"?
I am still learning how to parse strings with regular expressions, and I'm not yet familiar with how to assign the matched (or unmatched) parts of a string to variables and output them.

+1  A: 

Something like this should do the trick for names without Unicode characters:

my ($lname, $fname) = $var =~ /([a-z]+),\s+([a-z]+)/i ? ($1, $2) : ();

To break it down:

  • ([a-z]+) match one or more letters and capture them as the first group, $1
  • , match a literal comma
  • \s+ match one or more whitespace characters (if the whitespace is optional, change the + to *)
  • ([a-z]+) match one or more letters and capture them as the second group, $2
  • i make the match case-insensitive

You can change the character class [a-z] to include characters you think are valid for names.
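
As a rough sketch of the capture-group approach above with a widened character class: the helper name parse_name and the class [a-z'-] are assumptions for illustration, not part of the original answer, and the class still does not cover Unicode letters or names containing spaces.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (first, last), or an empty list if the
# input does not look like "Last, First".
sub parse_name {
    my ($var) = @_;
    # [a-z'-] is an assumed character class; the '-' comes last so it
    # is treated as a literal hyphen rather than a range.
    if ($var =~ /([a-z'-]+),\s+([a-z'-]+)/i) {
        my ($lname, $fname) = ($1, $2);   # copy the groups right away
        return ($fname, $lname);
    }
    return;
}

my ($fname, $lname) = parse_name("Parisi, Kenneth");
print "$fname $lname\n";    # prints "Kenneth Parisi"
```

Checking the match before reading $1 and $2 matters, because after a failed match they still hold whatever the last successful match captured.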

codelogic
Won't work with names like d'Angeli or Jean-Pierre...
PhiLho
The [a-z] class can be extended to include all valid name characters.
codelogic
Yeah, PhiLho is right in this case, and I actually do have last names in both of the formats he gave as examples.
CheeseConQueso
Oh, but thanks for the breakdown. That's what I really needed the most help with.
CheeseConQueso
If you have non-alpha characters in the names, you can add them to the matching pattern: ([a-z'-]+)
Bruce Alderman
Cool, thanks. I'll keep that in mind if I run into a special-case name and need to switch or add code.
CheeseConQueso
Both ' and - are valid name characters (e.g. O'Reilly, Drake-Brockman).
cletus
Some last names even contain spaces.
bart
And depending on how you validate your inputs, someone might even use Unicode to represent their own name more correctly.
Adam Bellaire
Try Unicode? If you are working with names and don't support more characters than simply [a-z], your code is badly broken.
innaM
@Manni, yes, and as has been stated at least 3 times, [a-z] is simply an example character class, which can be replaced with whatever the user requires. The OP asked for info on regex grouping, hence my response. In addition, no specific requirements were stated.
codelogic
+12  A: 
my ($lname, $fname) = split(/,\s*/, $fullname, 2);

Note the third argument, which limits the result to two fields. It's not strictly required, but it's good practice nonetheless, IMHO.
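
A small sketch of how the limit behaves on a few inputs may help. The helper name split_name and the sample strings with an extra or missing comma are assumptions added for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: splits "Last, First" on the first comma only.
sub split_name {
    my ($fullname) = @_;
    my ($lname, $fname) = split /,\s*/, $fullname, 2;
    return ($lname, $fname);
}

my ($lname, $fname) = split_name("Parisi, Kenneth");
print "$fname $lname\n";    # prints "Kenneth Parisi"

# With the limit of 2, text after a second comma stays in the second field.
my (undef, $rest) = split_name("Parisi, Kenneth, Jr.");
print "$rest\n";            # prints "Kenneth, Jr."

# With no comma in the input, the second field is undef.
my (undef, $missing) = split_name("Kenneth Parisi");
print defined $missing ? "defined\n" : "undef\n";    # prints "undef"
```

If input without a comma is possible, checking defined on the second field before using it avoids "uninitialized value" warnings.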

cletus
... assuming every name in the input is in "last, first" format. If the comma is missing, won't $fname be undefined?
Bruce Alderman
@aardvark: Yes, but garbage in, garbage out. OP doesn't mention this as a requirement.
cletus
Good call. Never regex when a split will do.
Robert P
@Robert: at the risk of being pedantic, the first arg to split is a regex. :-)
cletus
If there's more than one comma, you've lost part of the name.
bart
@bart: again, garbage in, garbage out.
cletus
@bart: but probably worth adding the limit arg. It's good to use it as a rule of thumb.
cletus
@cletus - Touché!
Robert P