I'm trying to format my article titles properly. I'm currently using TextInfo.ToTitleCase for formatting. It handles most cases well, but it's not perfect.

For example:

  • Original String: war and peace
  • Expected Result: War and Peace
  • Actual Result: War And Peace

Microsoft also uses this as its example, so it's obviously a known problem. My plan is to write a list by hand of words like "a", "and", "or", etc. (I'm not sure whether I can get a complete list). Would that be the best solution for this problem?

A: 

I haven't seen a solution to this problem in a provided library... It looks like a great candidate for an extension method. Interestingly, it's slightly more complex than just a list of words, and it has a few variations.

The Chicago Manual of Style suggests this:

  1. Always capitalize the first and the last word.

  2. Capitalize all nouns, pronouns, adjectives, verbs, adverbs, and subordinate conjunctions ("as", "because", "although").

  3. Lowercase all articles, coordinate conjunctions ("and", "or", "nor"), and prepositions regardless of length, when they are other than the first or last word.

  4. Lowercase the "to" in an infinitive.

The last case seems particularly hard, as you need to parse the title to determine whether "to" is used in an infinitive.
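As a rough starting point, rules 1 and 3 could be sketched as an extension method like the one below. This is only a sketch under some assumptions: the name ToHeadlineCase and the SmallWords list are made up for illustration, the list is hand-written and deliberately incomplete (it covers only a handful of articles, conjunctions, and prepositions), and words are split on spaces only. It also lowercases "to" whenever it is not the first or last word, without checking whether it is part of an infinitive.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class TitleCaseExtensions
{
    // Hypothetical, hand-maintained list of words to keep lowercase.
    // It is an assumption for illustration and is far from complete.
    private static readonly HashSet<string> SmallWords = new HashSet<string>(
        new[] { "a", "an", "the", "and", "but", "or", "nor", "for",
                "at", "by", "in", "of", "on", "to", "with" },
        StringComparer.OrdinalIgnoreCase);

    // Hypothetical helper name; not part of the .NET framework.
    public static string ToHeadlineCase(this string title)
    {
        if (string.IsNullOrWhiteSpace(title)) return title;

        TextInfo textInfo = CultureInfo.InvariantCulture.TextInfo;

        // Split on spaces only; runs of spaces collapse to a single space.
        string[] words = title.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        for (int i = 0; i < words.Length; i++)
        {
            // Rule 1: the first and last word are always capitalized.
            bool isFirstOrLast = i == 0 || i == words.Length - 1;

            // Rule 3: listed small words stay lowercase elsewhere.
            words[i] = !isFirstOrLast && SmallWords.Contains(words[i])
                ? words[i].ToLowerInvariant()
                : textInfo.ToTitleCase(words[i]);
        }

        return string.Join(" ", words);
    }
}
```

With this in place, "war and peace".ToHeadlineCase() gives "War and Peace", and "of mice and men".ToHeadlineCase() gives "Of Mice and Men". Rule 2 is only approximated here: anything not in the list gets capitalized, so the quality of the result depends entirely on how complete the list is.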

Andrew Flanagan
A: 

Here is a JavaScript implementation from a source I trust and have used myself: http://ejohn.org/blog/title-capitalization-in-javascript/

In the source code, he has a list of lowercase-only exceptions that you (I believe correctly) assumed you would need.

The work would be in converting it to something ASP.NET could use server-side, of course, but a lot of thought has already gone into the logic, which should help you with whatever you end up rolling yourself.

Good luck!

Funka
I forgot to mention that he based his implementation on something written in Perl, and there are several ports to other languages that you might find in the comments on that page...
Funka