I'm looking for a function to convert a string of text from UpperCase to SentenceCase. All the examples I can find turn the text into TitleCase instead.

Sentence case in a general sense describes the way that capitalization is used within a sentence. Sentence case also describes the standard capitalization of an English sentence, i.e. the first letter of the sentence is capitalized, with the rest being lower case (unless requiring capitalization for a specific reason, e.g. proper nouns, acronyms, etc.).

Can anyone point me in the direction of a script or function for SentenceCase?

+4  A: 

There isn't anything built in to .NET. However, this is one of those cases where regular expression processing may actually work well. I would start by converting the entire string to lower case and then, as a first approximation, use a regex to find all sequences like [a-z]\.\s+(.) and use ToUpper() to convert the captured group to upper case. The Regex class has an overloaded Replace() method that accepts a MatchEvaluator delegate, which lets you define how each matched value is replaced.

Here's a code example of this at work:

// requires: using System.Text.RegularExpressions;
var sourcestring = "THIS IS A GROUP. OF CAPITALIZED. LETTERS.";
// start by converting the entire string to lower case
var lowerCase = sourcestring.ToLower();
// matches the first letter of the string, as well as the start of each subsequent sentence
var r = new Regex(@"(^[a-z])|\.\s+(.)", RegexOptions.ExplicitCapture);
// the MatchEvaluator upper-cases the whole match (the period and whitespace are unaffected)
var result = r.Replace(lowerCase, s => s.Value.ToUpper());

// result is: "This is a group. Of capitalized. Letters."

This could be refined in a number of different ways to better match a broader variety of sentence patterns (not just those ending in a letter+period).
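
As one possible refinement, here is a minimal sketch that also treats `!` and `?` as sentence terminators and accepts any Unicode lower-case letter rather than just a-z. The class and method names (`SentenceCaseSketch`, `ToSentenceCase`) are made up for illustration, and the sketch still ignores abbreviations, proper nouns and acronyms.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SentenceCaseSketch
{
    // Group 1: start of the string (plus optional leading whitespace), or a
    //          terminator (. ! ?) followed by whitespace.
    // Group 2: the lower-case letter that starts the sentence.
    private static readonly Regex SentenceStart =
        new Regex(@"(^\s*|[.!?]\s+)(\p{Ll})");

    public static string ToSentenceCase(string input)
    {
        // null or empty input is returned unchanged
        if (string.IsNullOrEmpty(input))
            return input;

        string lower = input.ToLowerInvariant();

        // Keep the prefix (group 1) as-is and upper-case only the letter (group 2).
        return SentenceStart.Replace(
            lower,
            m => m.Groups[1].Value + m.Groups[2].Value.ToUpperInvariant());
    }
}
```

For example, `SentenceCaseSketch.ToSentenceCase("IS THIS A GROUP? YES! OF CAPITALIZED. LETTERS.")` returns "Is this a group? Yes! Of capitalized. Letters."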

LBushkin
A: 

This works for me.

/// <summary>
/// Converts a string to sentence case.
/// </summary>
/// <param name="input">The string to convert.</param>
/// <returns>The input in lower case, with its first character in upper case.</returns>
public static string SentenceCase(string input)
{
    // null or empty input is returned unchanged
    if (string.IsNullOrEmpty(input))
        return input;

    string sentence = input.ToLower();
    return char.ToUpper(sentence[0]) + sentence.Substring(1);
}
Ed B
If the input contains multiple sentences, you will also need to split it into sentences, using the dot as delimiter.
PoweRoy
Or any other valid punctuation marks
SwDevMan81
"dot as delimiter" doesn't really cut it. `Mr. and Mrs. Smith have $1,000.00 each; they live on Magnolia Blvd. in the blue house.`
Jay
It also doesn't take into account the other reasons to capitalize mentioned in the question: "(unless requiring capitalization for a specific reason, e.g. proper nouns, acronyms, etc.)".
JLWarlow
Yep, it's a simple method... I very rarely have to use it anyway. It's not something you would use in content management.
Ed B
+2  A: 

I found this sample on MSDN.

devio
This seems like a very complicated way of converting a string to sentence case. I think this is a problem better suited for regular expressions, personally.
LBushkin
A: 

If your input string is not a sentence, but many sentences, this becomes a very difficult problem.

Regular expressions will prove an invaluable tool, but (1) you'll have to know them quite well to be effective, and (2) they might not be up to doing the job entirely on their own.

Consider this sentence:

"Who's on 1st," Mr. Smith -- who wasn't laughing -- replied.

This sentence doesn't start with a letter; it has a digit, various punctuation, a proper name, and a . in the middle.

The complexities are enormous, and this is one sentence.

One of the most important things when using RegEx is to "know your data." If you know the breadth of types of sentences you'll be dealing with, your task will be more manageable.
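
To illustrate how knowing your data helps, here is a rough sketch that assumes you can list the abbreviations that occur in your input. A period after one of them is not treated as the end of a sentence. The abbreviation list and the names `AbbreviationAwareSketch` and `ToSentenceCase` are hypothetical, and proper nouns such as "Smith" are still left in lower case.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AbbreviationAwareSketch
{
    // Hypothetical, deliberately tiny list; a real one depends on your data.
    private static readonly HashSet<string> Abbreviations =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "mr", "mrs", "dr", "blvd" };

    // Group 1: the word before the terminator (may be empty).
    // Group 2: the terminator (. ! ?) plus the whitespace after it.
    // Group 3: the lower-case letter that may start a new sentence.
    private static readonly Regex Boundary =
        new Regex(@"(\p{L}*)([.!?]\s+)(\p{Ll})");

    public static string ToSentenceCase(string input)
    {
        if (string.IsNullOrEmpty(input))
            return input;

        string lower = input.ToLowerInvariant();

        string result = Boundary.Replace(lower, m =>
        {
            string word = m.Groups[1].Value;
            string terminator = m.Groups[2].Value;
            string next = m.Groups[3].Value;

            // A period after a known abbreviation does not end the sentence.
            bool afterAbbreviation = terminator[0] == '.' && Abbreviations.Contains(word);
            return word + terminator + (afterAbbreviation ? next : next.ToUpperInvariant());
        });

        // Upper-case the first character of the whole string (no effect on non-letters).
        return char.ToUpperInvariant(result[0]) + result.Substring(1);
    }
}
```

With this list, "BLVD. IN" is not mistaken for a sentence boundary, while "HOUSE. THEY" still is. Digits followed by a period, as in "$1,000.00", are unaffected because no whitespace follows the period.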

In any event, you'll have to toy with your implementation until you are satisfied with your results. I suggest writing some automated tests with some sample input -- as you work on your implementation, you can run the tests regularly to see where you're getting close and where you're still missing the mark.
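
A minimal sketch of such a test harness, without any test framework, might look like the following. The sample pairs and the names `SentenceCaseChecks`, `Samples` and `Run` are invented for illustration; replace the samples with sentences taken from your own data.

```csharp
using System;
using System.Collections.Generic;

public static class SentenceCaseChecks
{
    // Hypothetical sample data: input and the output you expect for it.
    public static readonly (string Input, string Expected)[] Samples =
    {
        ("HELLO WORLD.", "Hello world."),
        ("FIRST. SECOND.", "First. Second."),
        ("IS IT? YES!", "Is it? Yes!"),
    };

    // Runs every sample through the converter under test and
    // returns one description per failing sample.
    public static List<string> Run(Func<string, string> convert)
    {
        var failures = new List<string>();
        foreach (var (input, expected) in Samples)
        {
            string actual = convert(input);
            if (actual != expected)
                failures.Add($"input: {input} | expected: {expected} | actual: {actual}");
        }
        return failures;
    }
}
```

Pass whatever implementation you are working on as the `convert` argument and print the returned list after each change; an empty list means every sample passed.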

Jay