String Data Management
Visual C++ provides several ways to manage string data:
String Manipulation, for working with C-style NULL-terminated strings
Win32 API functions for managing strings
MFC's CString class (based on CStringT), which provides flexible, resizable string objects
The CStringT class, which provides an MFC-independent string object with the same functionality as CString
Nearly all programs work with string data. MFC's CString class is often the best solution for flexible string handling. Starting with version 7.0, CString can be used in MFC or MFC-independent programs. Both the run-time library and CString support strings containing wide (Unicode) or multibyte (MBCS) characters.
This article describes the general-purpose services that the class library provides for string manipulation. Topics covered in this article include Unicode and MBCS portability, CString objects and const char pointers, and CString reference counting.
The CStringT class provides support for manipulating strings. It's intended to replace and extend the functionality normally provided by the C run-time library string package. The CString class supplies member functions and operators for simplified string handling, similar to those found in Basic. The class also provides constructors and operators for constructing, assigning, and comparing CString objects and standard C++ string data types. Because CString isn't derived from CObject, you can use CString objects independently of most of the Microsoft Foundation Class Library (MFC).
CString objects follow "value semantics." A CString object represents a unique value. Think of a CString as an actual string, not as a pointer to a string.
A CString object represents a sequence of a variable number of characters. CString objects can be thought of as arrays of characters.
Unicode and MBCS Provide Portability
With MFC version 3.0 and later, MFC, including CString, is enabled for both Unicode and multibyte character sets (MBCS). This support makes it easier for you to write portable applications that you can build for either Unicode or ANSI characters. To enable this portability, each character in a CString object is of type TCHAR, which is defined as wchar_t if you define the symbol _UNICODE when you build your application, or as char if not. In Visual C++, a wchar_t character is 16 bits wide. MBCS is enabled if you build with the symbol _MBCS defined. MFC itself is built with either the _MBCS symbol (for the NAFX libraries) or the _UNICODE symbol (for the UAFX libraries) defined.
Note
The CString examples in this and the accompanying articles on strings show literal strings properly formatted for Unicode portability, using the _T macro, which translates the literal string to the form:
L"literal string"
which the compiler treats as a Unicode string. For example, the following code:
CString strName = _T("Name");
is translated as a Unicode string if _UNICODE is defined or as an ANSI string if not. For more information, see the article Unicode and Multibyte Character Set (MBCS) Support.
A CString object can store up to INT_MAX (2,147,483,647) characters. The TCHAR data type is used to get or set individual characters inside a CString object. Unlike character arrays, the CString class has a built-in memory allocation capability. This allows CString objects to automatically grow as needed (that is, you don't have to worry about growing a CString object to fit longer strings).
CStrings and const char Pointers
A CString object also can act like a literal C-style string (a PCXSTR, which is the same as const char* if not under Unicode). The CSimpleStringT::operator PCXSTR conversion operator allows CString objects to be freely substituted for character pointers in function calls. The CString(LPCWSTR pszSrc) constructor allows character pointers to be substituted for CString objects.
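The two substitutions described above rely on ordinary C++ conversion rules, which a minimal sketch can make concrete. MiniString, CountChars, IsChicago, and ConversionDemo are hypothetical names used only for illustration. MiniString is not CString and models nothing but the two conversions, using a narrow-character std::string for storage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in that models only the two conversions discussed above.
class MiniString {
public:
    // Converting constructor: a const char* can be passed where a
    // MiniString is expected.
    MiniString(const char* src) : data_(src ? src : "") {}

    // Conversion operator: a MiniString can be passed where a
    // const char* is expected.
    operator const char*() const { return data_.c_str(); }

private:
    std::string data_;
};

// Takes a C-style string, as many run-time library functions do.
inline std::size_t CountChars(const char* text) { return std::strlen(text); }

// Takes a string object by const reference.
inline bool IsChicago(const MiniString& city) {
    return std::strcmp(city, "Chicago") == 0;   // city converts to const char*
}

inline void ConversionDemo() {
    MiniString city("Chicago");
    assert(CountChars(city) == 7);     // object used as a character pointer
    assert(IsChicago("Chicago"));      // character pointer used as an object
    assert(!IsChicago("Boston"));
}
```

The pointer returned by the conversion operator is read-only and is valid only while the object is alive and unmodified, which is why a separate mechanism is needed for writable access.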
No attempt is made to fold CString objects. If you make two CString objects containing Chicago, for example, the characters in Chicago are stored in two places. (This may not be true of future versions of MFC, so you shouldn't depend on it.)
Note
Use the CSimpleStringT::GetBuffer and CSimpleStringT::ReleaseBuffer member functions when you need to directly access a CString as a nonconstant pointer to a character.
Note
Use the CStringT::AllocSysString and CStringT::SetSysString member functions to allocate and set BSTR objects used in Automation (formerly known as OLE Automation).
Note
Where possible, allocate CString objects on the stack frame rather than on the heap. This saves memory and simplifies parameter passing.
The CString class isn't implemented as a Microsoft Foundation Class Library collection class, though CString objects can certainly be stored as elements in collections.
CString Reference Counting
As of MFC version 4.0, when CStringT objects are copied, MFC increments a reference count rather than copying the data. This makes passing parameters by value and returning CString objects by value more efficient. These operations cause the copy constructor to be called, sometimes more than once. Incrementing a reference count reduces that overhead for these common operations and makes using CString a more attractive option.
As each copy is destroyed, the reference count in the original object is decremented. The original CString object isn't destroyed until its reference count is reduced to zero.
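The general technique can be sketched with standard C++ facilities. This is an illustration of reference-counted copying, not MFC's actual implementation. SharedText and RefCountDemo are hypothetical names, std::shared_ptr supplies the count, and the copy-on-write step in Append is an assumption added so that each object keeps value semantics.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical illustration of reference-counted copying; not CStringT.
class SharedText {
public:
    explicit SharedText(std::string text)
        : buffer_(std::make_shared<std::string>(std::move(text))) {}

    // The implicitly generated copy constructor copies the shared_ptr, which
    // increments the reference count instead of copying the characters.

    const std::string& Get() const { return *buffer_; }

    // Number of SharedText objects currently sharing this buffer.
    long RefCount() const { return buffer_.use_count(); }

    // A write detaches from the other owners first (copy-on-write), so a
    // change made through one object is never visible through another.
    void Append(const std::string& more) {
        if (buffer_.use_count() > 1) {
            buffer_ = std::make_shared<std::string>(*buffer_);
        }
        buffer_->append(more);
    }

private:
    std::shared_ptr<std::string> buffer_;
};

inline void RefCountDemo() {
    SharedText original("Chicago");
    {
        SharedText copy = original;               // shares the buffer
        assert(original.RefCount() == 2);
        assert(&copy.Get() == &original.Get());   // same characters in memory
    }                                             // copy destroyed, count drops
    assert(original.RefCount() == 1);

    SharedText edited = original;
    edited.Append(", IL");                        // edited gets its own buffer
    assert(original.Get() == "Chicago");
    assert(edited.Get() == "Chicago, IL");
    assert(original.RefCount() == 1);
}
```

Copying costs one counter increment regardless of string length, and the characters are duplicated only if one of the sharing objects is later modified.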
You can use the CString member functions CSimpleStringT::LockBuffer and CSimpleStringT::UnlockBuffer to disable or enable reference counting.