

What is Title Case?

Disclaimer: I'm not an English teacher (that's my mom), so my description of title casing in English almost certainly has exceptions and variations.

Title casing has an interesting history in computer programming.  Programmers like to use CamelCase to make variable names more readable, and, particularly among developers who are native speakers of some languages, there's an idea that title casing is interesting, hence APIs such as TextInfo.ToTitleCase() in .NET and, in Windows 7, LCMapString(LCMAP_TITLECASE).  Most title casing algorithms are linguistically bad, even in English.  For other languages it's worse.

ToTitleCase() takes a very simple approach to title casing.  Maybe in the future it'll be smarter, but for now it just uppercases the first letter in a group of letters, and tries to pay attention to non-letters and word breaks.  It also tries to keep acronyms all upper-case.
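To make that concrete, here's a rough sketch of that kind of algorithm in Python.  This is not the actual ToTitleCase() implementation; the function name naive_title_case, the regular expression used to find "groups of letters", and the choice to lowercase the rest of each word are all my assumptions for illustration.

```python
import re


def naive_title_case(text):
    """Uppercase the first letter of each run of letters (illustrative sketch only)."""

    def fix_word(match):
        word = match.group(0)
        # Assumption: an all-uppercase word of two or more letters is an acronym, so keep it.
        if len(word) > 1 and word.isupper():
            return word
        # Assumption: the rest of the word is lowercased.
        return word[0].upper() + word[1:].lower()

    # [^\W\d_]+ matches a run of Unicode letters; anything else acts as a word break.
    return re.sub(r"[^\W\d_]+", fix_word, text)


print(naive_title_case("what is title case?"))  # What Is Title Case?
print(naive_title_case("the NATO summit"))      # The NATO Summit
```

Notice that nothing in there knows anything about grammar or language; it only looks at letters and the gaps between them.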

Even in English this is a simplistic approach.  The title of this post is "What is Title Case?"  The "is" is supposed to be lower case, but ToTitleCase() would capitalize it.  Additionally, unexpected word breaks or punctuation could trick the algorithm.  Even the acronym test isn't complete, since it only looks for all-upper-case words, and sometimes acronyms keep lower-case letters from the full title.  It also messes up names like DiSilva or McConnell.  Contractions can also be messed up.
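You can see the same kinds of problems with any simple title-caser.  As a stand-in, the snippet below uses Python's built-in str.title(), which is a different implementation from ToTitleCase() (for one thing, it makes no attempt to keep acronyms), so treat it as an analogy and not as a record of what .NET returns.  The "wanted" values are just the casing I'd expect a person to write.

```python
# Each input is paired with the casing a human would probably want.
samples = {
    "what is title case?": "What is Title Case?",
    "McConnell": "McConnell",
    "DiSilva": "DiSilva",
    "don't stop": "Don't Stop",
}

# str.title() uppercases the first letter of each run of letters and lowercases the rest.
results = {raw: raw.title() for raw in samples}

for raw, wanted in samples.items():
    print(f"{raw!r}: got {results[raw]!r}, wanted {wanted!r}")
# 'what is title case?': got 'What Is Title Case?', wanted 'What is Title Case?'
# 'McConnell': got 'Mcconnell', wanted 'McConnell'
# 'DiSilva': got 'Disilva', wanted 'DiSilva'
# "don't stop": got "Don'T Stop", wanted "Don't Stop"
```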

Outside of English, ToTitleCase() rapidly gets silly.  In English we capitalize everything except articles, short prepositions and some other short words.  In German a title is cased just like a normal sentence, with only nouns getting capitalized, so the slightly over-eager capitalization behavior in English becomes very over-eager.  Other languages can also have letters before the main word, e.g. l'État, so the ToTitleCase() rules can mess up those words as well.
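Here's a small illustration of both problems, again using Python's built-in str.title() as a stand-in for a simple title-caser (an assumption on my part; ToTitleCase() may treat apostrophes differently).  The sample phrases are ones I picked for the example.

```python
french = "l'état, c'est moi"
german = "die verwandlung und andere erzählungen"

french_titled = french.title()
german_titled = german.title()

# The elided article gets capitalized, and so does the letter after each apostrophe.
print(french_titled)  # L'État, C'Est Moi

# "und" and "andere" are not nouns, so German would leave them lower case.
print(german_titled)  # Die Verwandlung Und Andere Erzählungen
```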

And then there're scripts/languages that don't even have an upper/lower case distinction, so ToTitleCase gets pointless.
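A quick way to see this, sketched in Python with sample strings of my own choosing: for text in a script with no case distinction, upper-casing, lower-casing and title-casing all hand back the input unchanged.

```python
# Japanese, Chinese and Hindi samples; none of these scripts has upper/lower case.
samples = ["こんにちは世界", "你好世界", "नमस्ते दुनिया"]

# True when every casing operation is a no-op for that string.
unchanged = [s == s.upper() == s.lower() == s.title() for s in samples]

print(unchanged)  # [True, True, True]
```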

Anyway, use care when using ToTitleCase().  It might work in some cases, but don't expect it to work linguistically, particularly globally, particularly in non-English cases.  Also maybe we'll get smarter and figure out a more correct way to do it in the future.

 -Shawn

Comments

  • Anonymous
    August 18, 2009
    If it's so bad (and Michael Kaplan agrees with you), why was it added to Windows 7?

  • Anonymous
    August 18, 2009
    I almost added a paragraph to answer that :) I actually added it to Win7.  (And I even recycled the bits of the UPPER and LOWER flags, combining them is TITLE, isn't that sneaky?) Mostly it was added for .NET SDK parity. Although titlecase does have quite a few problems, applications still find it convenient and better than some alternatives.  In limited cases, where the linguistic problems are understood, it might be a bit helpful.  It's also possible that in the very long term we'll figure out how to do something smart and linguistically appropriate.  (So don't complain if LCMAP_TITLECASE gets smarter some day.)

  • Anonymous
    June 22, 2010
    The comment has been removed

  • Anonymous
    June 22, 2010
    Yes, the ligatures are annoying special cases, however the "Title Casing" rules for several languages differ.  "Is" in "What is Title Case?" shouldn't be capitalized, but ToTitleCase() will do that.