What is the best way to convert names to proper case in C#? I have a table with firstname and lastname columns in it. For example, mcdonalds should become McDonalds, o'brien should become O'Brien, and so on.
This is an interesting problem. I don't think there's an 'out of the box' solution.
I have bookmarked the following article which may be close to what you want:
Lost and Found Identity Proper Case Format Provider (IFormatProvider implementation)
I haven't tried the code and this solution pretty much requires manually handling all cases. But it is a start and maybe you'll find it useful.
There is absolutely no way for a computer just to magically know that the first "D" in "McDonalds" should be capitalized. So, I think there are two choices.
Someone out there may have a piece of software or a library that will do this for you.
Barring that, your only choice is to take the following approach. First, I'd look up the name in a dictionary of words that have "interesting" capitalization. Obviously you'd have to provide this dictionary yourself, unless one exists already. Second, apply an algorithm that corrects some of the obvious ones, like Celtic names beginning with O', Mac, and Mc, although given a large enough pool of names, such an algorithm will undoubtedly produce a lot of false positives. Lastly, capitalize the first letter of every name that doesn't meet the first two criteria.
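The three steps above could be sketched roughly as follows. This is only an illustration: the class name `NameCaser`, the method `Fix`, and the two entries in the exception dictionary are made up for the example, and a real exception list would be far larger and maintained by you.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class NameCaser
{
    // Hypothetical exception list, keyed case-insensitively.
    // These are names the prefix rules below would get wrong.
    private static readonly Dictionary<string, string> KnownNames =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "macey", "Macey" },   // the "mac" rule would produce "MacEy"
            { "mack", "Mack" },     // the "mac" rule would produce "MacK"
        };

    // Prefixes after which the next letter is usually capitalized.
    private static readonly string[] Prefixes = { "o'", "mac", "mc" };

    public static string Fix(string name)
    {
        if (string.IsNullOrEmpty(name)) return name;

        // Step 1: a match in the exception dictionary wins.
        string known;
        if (KnownNames.TryGetValue(name, out known)) return known;

        string lower = name.ToLower(CultureInfo.InvariantCulture);

        // Step 2: heuristic prefix rules (expect false positives).
        foreach (string prefix in Prefixes)
        {
            if (lower.Length > prefix.Length &&
                lower.StartsWith(prefix, StringComparison.Ordinal))
            {
                return Capitalize(prefix) + Capitalize(lower.Substring(prefix.Length));
            }
        }

        // Step 3: otherwise capitalize only the first letter.
        return Capitalize(lower);
    }

    // Callers guarantee s is non-empty.
    private static string Capitalize(string s)
    {
        return char.ToUpper(s[0], CultureInfo.InvariantCulture) + s.Substring(1);
    }
}
```

With these rules, `NameCaser.Fix("mcdonalds")` gives `McDonalds`, `NameCaser.Fix("o'brien")` gives `O'Brien`, and `NameCaser.Fix("macey")` gives `Macey` because the dictionary is consulted before the prefix rules. Any name the dictionary doesn't cover still goes through the heuristics, so the results are only as good as the exception list.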
The hard part of this is the algorithm that decides on the capitalization. The string manipulation itself is pretty easy. There isn't a perfect way, since there are no hard "rules" for cases. One strategy might be a set of rules, such as "capitalize the first letter...usually" and "capitalize the 3rd letter if the first two letters are mc...usually".
Starting with a dictionary of real names and checking your own names against it for matches will help. You could also take a dictionary of real names, generate a Markov chain from it, and throw any new names at the Markov chain to determine the capitalization. That's a crazy, complicated solution.
The ultimate perfect solution is to use humans to correct the data.
Doing this requires that your program be able to interpret the English language to an extent. At the very least, it must be able to break a string down into a set of words. There is no API built into the .NET Framework that can achieve this.
However, if there were, you could use the following code.
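using System;
using System.Text;

// isWord decides whether the text accumulated so far forms a complete word.
public string ProperCase(string str, Func<string,bool> isWord) {
    var word = new StringBuilder();
    var cur = new StringBuilder();
    for ( var i = 0; i < str.Length; i++ ) {
        // Capitalize the first character of each candidate word.
        cur.Append(cur.Length == 0 ? Char.ToUpper(str[i]) : str[i]);
        if ( isWord(cur.ToString()) ) {
            word.Append(cur.ToString());
            cur.Length = 0;
        }
    }
    // Keep any trailing characters that never formed a word.
    if ( cur.Length > 0 ) {
        word.Append(cur);
    }
    return word.ToString();
}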
It's not a perfect solution, but it gives you a general idea of the outline.
You could consider using a search engine to help you. Submit a query and see how the results have capitalized the name.
I wrote the following extension methods. Feel free to use them.
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class StringExtensions
{
    // Lowercases the whole string, then re-capitalizes it word by word.
    public static string ToProperCase( this string original )
    {
        if( string.IsNullOrEmpty( original ) )
            return original;
        string result = _properNameRx.Replace( original.ToLower( CultureInfo.CurrentCulture ), HandleWord );
        return result;
    }

    // Capitalizes the first character of a single word.
    public static string WordToProperCase( this string word )
    {
        if( string.IsNullOrEmpty( word ) )
            return word;
        if( word.Length > 1 )
            return Char.ToUpper( word[0], CultureInfo.CurrentCulture ) + word.Substring( 1 );
        return word.ToUpper( CultureInfo.CurrentCulture );
    }

    private static readonly Regex _properNameRx = new Regex( @"\b(\w+)\b" );
    // Prefixes after which the following letter is also capitalized.
    private static readonly string[] _prefixes = { "mc" };

    private static string HandleWord( Match m )
    {
        string word = m.Groups[1].Value;
        foreach( string prefix in _prefixes )
        {
            if( word.StartsWith( prefix, StringComparison.CurrentCultureIgnoreCase ) )
                return prefix.WordToProperCase() + word.Substring( prefix.Length ).WordToProperCase();
        }
        return word.WordToProperCase();
    }
}
You could check the lower/mixed case surname against a dictionary (file) that has the correct casings in it, then return the 'real' value from the dictionary.
I had a quick google to see if one exists, but to no avail!
using System.Globalization;
using System.Threading;

CultureInfo cultureInfo = Thread.CurrentThread.CurrentCulture;
TextInfo textInfo = cultureInfo.TextInfo;
string txt = textInfo.ToTitleCase("texthere");
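One thing to watch with `ToTitleCase`: it leaves words that are entirely uppercase alone (it treats them as acronyms), and it knows nothing about prefixes like Mc. A small wrapper along these lines may help with the first issue. The class and method names here are made up for the example, and I've used the invariant culture just to keep the behavior predictable.

```csharp
using System;
using System.Globalization;

public static class TitleCaseDemo
{
    public static string ToTitle(string name)
    {
        if (string.IsNullOrEmpty(name)) return name;

        TextInfo textInfo = CultureInfo.InvariantCulture.TextInfo;

        // Lowercase first, because ToTitleCase does not change
        // words that are already all uppercase.
        return textInfo.ToTitleCase(textInfo.ToLower(name));
    }
}
```

`TitleCaseDemo.ToTitle("JOHN SMITH")` returns `John Smith`, but `TitleCaseDemo.ToTitle("mcdonalds")` returns `Mcdonalds`, so this only covers the simple "capitalize the first letter" case from the question.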