I admit regex is a strange world, and I have not really been able to get my head wrapped around it. But I have a problem that I believe belongs in the regex world: I would like to change last names like "o'brian" to "O'Brian", "macdonald" to "MacDonald", "who-knew" to "Who-Knew", or "who knew" to "Who Knew".

So far all I have is this, which only upper-cases the first letter of each match:

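    string setCaps(string s)
    {
        // Matches a lower-case letter, then a letter or apostrophe, then word characters.
        string result = Regex.Replace(s, @"\b[a-z]['a-z]\w+", delegate(Match match)
        {
            // Only the first character of the match is upper-cased,
            // so "o'brian" comes back as "O'brian", not "O'Brian".
            string ch = match.ToString();
            return char.ToUpper(ch[0]) + ch.Substring(1);
        });

        return result;
    }

    string name = setCaps("o'brian");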

Thanks

+10  A: 

I'm not actually sure this is possible for your Mac names. For example, while macdonald should be MacDonald, Mrs Macey really doesn't want to be Mrs MacEy. And what if it's company names? Smith's Machinery doesn't want to be Smith's MacHinery!

The "O" prefix could be problematic also. Consider Mr O'Pera, or Mrs O'Pal!

The best thing to do with Mac and Mc prefixes is to keep an exception list that you check names against. There are only a finite number of names in this style!
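As a rough illustration of the exception-list idea, here is a minimal sketch. The class name `SurnameExceptions`, the method `Capitalize`, and the three entries in the list are all made up for the example; a real list would be much longer and loaded from a maintained source. Names that are not on the list just get their first letter upper-cased.

```csharp
using System;
using System.Collections.Generic;

public static class SurnameExceptions
{
    // Hypothetical, deliberately tiny exception list (assumption for this sketch).
    // Keys are compared case-insensitively.
    private static readonly Dictionary<string, string> Known =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "macdonald", "MacDonald" },
            { "mcgregor", "McGregor" },
            { "o'brian", "O'Brian" }
        };

    // Returns the listed spelling if the name is on the exception list;
    // otherwise upper-cases only the first letter and leaves the rest untouched.
    public static string Capitalize(string name)
    {
        if (string.IsNullOrEmpty(name))
            return name;

        string listed;
        if (Known.TryGetValue(name, out listed))
            return listed;

        return char.ToUpperInvariant(name[0]) + name.Substring(1);
    }
}
```

With this list, `SurnameExceptions.Capitalize("macdonald")` gives "MacDonald", while "macey" and "machinery" come back as "Macey" and "Machinery" because they are not listed. It still cannot know which spelling a particular person prefers.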

The following should help start: http://dgmweb.net/genealogy/FGS/Indices/EveryNameIndex-Mc.shtml

James Wiseman
Agreed. When it comes to name capitalization you should probably leave it to the user typing it in rather than trying to second-guess how it should be done. There are some people who have strange names with strange preferred capitalization who might get irritated if you try to force what they consider to be incorrect capitalization on them.
Welbog
...and not all Macdonalds are spelled MacDonald!
dtb
...same goes for Macintosh!
James Wiseman
A: 

Simple regular expressions won't easily do the job; the problem is quite complex. I suggest trying the following.

  1. Split the input into "words" and separators.
    "o'brian"   => "o"    "'"  "brian"
    "macdonald" => "mac"  ""   "donald"
    "who-knew"  => "who"  "-"  "knew"
    "who knew"  => "who"  " "  "knew"
  2. Process all words, making the first letter upper case and all remaining letters lower case.

  3. Join the words together again and maybe modify the separators.

You will at least need a list of possible separators and a list of words that can occur joined together without a separator, like "Mac" in "MacDonald".
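The three steps above could be sketched roughly like this. The separator set (whitespace, apostrophe, hyphen) and the prefix list ("mac", "mc") are assumptions for the example, as are the names `NameCaps` and `SetCaps`. It relies on `Regex.Split` keeping the captured separators in the returned array, so words and separators alternate.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class NameCaps
{
    // Assumed separator set: whitespace, apostrophe, hyphen.
    // The capturing group makes Regex.Split keep the separators in its result.
    private static readonly Regex Separators = new Regex(@"([\s'\-]+)");

    // Assumed list of prefixes that may be joined to the next word without a separator.
    private static readonly string[] JoinedPrefixes = { "mac", "mc" };

    // First letter upper case, all remaining letters lower case.
    private static string CapitalizeWord(string word)
    {
        if (word.Length == 0)
            return word;
        return char.ToUpperInvariant(word[0]) + word.Substring(1).ToLowerInvariant();
    }

    public static string SetCaps(string input)
    {
        var result = new StringBuilder();
        string[] parts = Separators.Split(input);

        for (int i = 0; i < parts.Length; i++)
        {
            string part = parts[i];

            // Odd positions hold the captured separators; copy them unchanged.
            if (i % 2 == 1)
            {
                result.Append(part);
                continue;
            }

            // Split off a known joined prefix, if the word is longer than the prefix.
            int prefixLength = 0;
            foreach (string prefix in JoinedPrefixes)
            {
                if (part.Length > prefix.Length &&
                    part.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
                {
                    prefixLength = prefix.Length;
                    break;
                }
            }

            result.Append(CapitalizeWord(part.Substring(0, prefixLength)));
            result.Append(CapitalizeWord(part.Substring(prefixLength)));
        }

        return result.ToString();
    }
}
```

This turns "o'brian" into "O'Brian", "macdonald" into "MacDonald", "who-knew" into "Who-Knew" and "who knew" into "Who Knew". It also turns "macey" into "MacEy", so a blind prefix list on its own is not enough.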

Daniel Brückner
As @dtb said in a comment on @James's answer, not every Macdonald is spelled MacDonald.
Ray Hayes