I need a function that will take a string and "pascal case" it. The only indicator that a new word starts is an underscore. Here are some example strings that need to be cleaned up:

  1. price_old => Should be PriceOld
  2. rank_old => Should be RankOld

I started working on a function that makes the first character upper case:

public string FirstCharacterUpper(string value)
{
    if (value == null || value.Length == 0)
        return string.Empty;
    if (value.Length == 1)
        return value.ToUpper();
    var firstChar = value.Substring(0, 1).ToUpper();
    return firstChar + value.Substring(1, value.Length - 1);
}

What the function above doesn't do is remove the underscore and upper-case the character to the right of it.

Also, any ideas on how to pascal case a string that doesn't have any word indicators (like the underscore)? For example:

  1. companysource
  2. financialtrend
  3. accountingchangetype

The major challenge here is determining where one word ends and another starts. I guess I would need some sort of lookup dictionary to determine where new words start? Are there libraries out there that do this sort of thing already?

Thanks,

Paul

+7  A: 

Well the first thing is easy:

string.Join("", "price_old"
    .Split(new [] { '_' }, StringSplitOptions.RemoveEmptyEntries)
    .Select(s => s.Substring(0, 1).ToUpper() + s.Substring(1))
    .ToArray());

returns PriceOld

The second thing is way more difficult, since companysource could be CompanySource or maybe CompanysOurce. It can be automated, but the results will be quite error-prone. You will need an English dictionary and will have to do some guessing (well, a lot of guessing) about which combination of words is correct.
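To make the dictionary idea concrete, here is a minimal sketch of one way the guessing could be done. It assumes you supply your own word list as a set of lowercase words; the class and method names (WordSplitter, Split, ToPascalCase) are made up for illustration and do not come from any library. When several splits are possible it simply prefers the one with the fewest words, which is a crude heuristic and will still guess wrong on some inputs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordSplitter
{
    // Hypothetical helper: returns one way to split `input` into dictionary
    // words, or null if no split exists.
    public static List<string> Split(string input, ISet<string> dictionary)
    {
        // best[i] holds a split of the first i characters, or null if none was found.
        var best = new List<string>[input.Length + 1];
        best[0] = new List<string>();

        for (int end = 1; end <= input.Length; end++)
        {
            for (int start = 0; start < end; start++)
            {
                if (best[start] == null) continue;

                string word = input.Substring(start, end - start);
                if (!dictionary.Contains(word)) continue;

                // Assumption: fewer words is the better guess.
                if (best[end] == null || best[start].Count + 1 < best[end].Count)
                {
                    best[end] = new List<string>(best[start]) { word };
                }
            }
        }

        return best[input.Length];
    }

    public static string ToPascalCase(string input, ISet<string> dictionary)
    {
        var words = Split(input.ToLowerInvariant(), dictionary);
        if (words == null) return input; // no split found; leave the input unchanged

        return string.Concat(
            words.Select(w => char.ToUpperInvariant(w[0]) + w.Substring(1)));
    }
}
```

With a word list containing "company" and "source", WordSplitter.ToPascalCase("companysource", words) gives CompanySource. The quality of the result depends entirely on the word list: if it also contained "companys" and "ource", the tie between the two splits would be decided arbitrarily.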

Jan Jongboom
@Jan As you so effectively pointed out, dealing with words is hard. I guess there is no way around it, I'll have to do some sort of dictionary lookup. I guess I was hoping someone already developed something I could use.
Paul Fryer
+1: for pointing out the dictionary solution for *second thing*
KMan
+3  A: 

You can use the TextInfo.ToTitleCase method then remove the '_' characters.

So, using the extension methods I've got:

http://theburningmonk.com/2010/08/dotnet-tips-string-totitlecase-extension-methods

you can do something like this:

var s = "price_old";
var result = s.ToTitleCase().Replace("_", string.Empty);
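If you'd rather not pull in the linked extension methods, a rough equivalent using the framework's own TextInfo.ToTitleCase might look like the sketch below. The class and method names are made up for illustration. It is a slight variation on the snippet above: it turns underscores into spaces before title-casing, so it does not rely on how ToTitleCase treats the underscore, and it assumes lowercase input (ToTitleCase leaves all-uppercase words untouched).

```csharp
using System;
using System.Globalization;

public static class TitleCaseExample
{
    // Hypothetical helper, not part of the linked extension methods.
    public static string ToPascalCase(string value)
    {
        if (string.IsNullOrEmpty(value)) return string.Empty;

        TextInfo textInfo = CultureInfo.InvariantCulture.TextInfo;

        // Replace underscores with spaces so every segment is clearly its own word,
        // title-case the words, then drop the spaces again.
        string titled = textInfo.ToTitleCase(value.Replace('_', ' '));
        return titled.Replace(" ", string.Empty);
    }
}
```

TitleCaseExample.ToPascalCase("price_old") gives PriceOld.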
theburningmonk
Interesting approach!
Rubens Farias
@theburningmonk I like what I'm seeing so far... might just end up using this approach.
Paul Fryer
@theburningmonk It works like a charm! Thanks again.
Paul Fryer
@Paul - no probs ;-) glad I could help!
theburningmonk
MSDN states that this function is implemented incorrectly and may therefore be subject to change; so be careful with new releases of .NET.
Jan Jongboom
@Jan - but then anything in the framework can be subject to change, even if the relevant MSDN article doesn't specifically say so. I don't think that's reason enough to refrain from using it.
theburningmonk
+2  A: 

Try this:

public static string GetPascalCase(string name)
{
    return Regex.Replace(name, @"^\w|_\w", 
        (match) => match.Value.Replace("_", "").ToUpper());
}

Console.WriteLine(GetPascalCase("price_old")); // => Should be PriceOld
Console.WriteLine(GetPascalCase("rank_old" )); // => Should be RankOld
Rubens Farias
However, this is about four times as slow as just splitting and substringing, and still twice as slow when the regex is compiled (measured over 100,000 iterations).
Jan Jongboom
Can I have your benchmark, @Jan?
Rubens Farias
A: 

With underscores:

s = Regex.Replace(s, @"(?:^|_)([a-z])",
      m => m.Groups[1].Value.ToUpper());

Without underscores:

You're on your own there. But go ahead and search; I'd be surprised if nobody has done this before.

Alan Moore
A: 

For your 2nd problem of splitting concatenated words, you could utilize our best friends Google & Co. If your concatenated input is made up of common English words, the search engines have a good hit rate at suggesting the separated words as an alternative search query.

If you enter your sample input, Google and Bing suggest the following:

original             | Google                | Bing
=====================================================================
companysource        | company source        | company source 
financialtrend       | financial trend       | financial trend
accountingchangetype | accounting changetype | accounting change type

See this example.

Writing a small screen scraper for that should be fairly easy.

Frank Bollack