Byte Classification
Each of these routines tests a specified byte of a multibyte character for satisfaction of a condition. Except where specified otherwise, the output value is affected by the LC_CTYPE category setting of the locale; see setlocale for more information. The versions of these functions without the _l suffix use the current locale for this locale-dependent behavior; the versions with the _l suffix are identical except that they use the locale parameter passed in instead.
Note
By definition, the ASCII characters between 0 and 127 are a subset of all multibyte-character sets. For example, the Japanese katakana character set includes ASCII as well as non-ASCII characters.
The predefined constants in the following table are defined in CTYPE.H.
Multibyte-Character Byte-Classification Routines
Routine | Byte Test Condition | .NET Framework equivalent |
---|---|---|
isleadbyte, _isleadbyte_l | Lead byte; test result depends on LC_CTYPE category setting of current locale | Not applicable, but see System::Globalization::CultureInfo |
_ismbbalnum, _ismbbalnum_l | isalnum \|\| _ismbbkalnum | Not applicable, but see System::Globalization::CultureInfo |
_ismbbalpha, _ismbbalpha_l | isalpha \|\| _ismbbkalnum | Not applicable, but see System::Globalization::CultureInfo |
_ismbbgraph, _ismbbgraph_l | Same as _ismbbprint, but _ismbbgraph does not include the space character (0x20) | Not applicable, but see System::Globalization::CultureInfo |
_ismbbkalnum, _ismbbkalnum_l | Non-ASCII text symbol other than punctuation. For example, in code page 932 only, _ismbbkalnum tests for katakana alphanumeric | Not applicable, but see System::Globalization::CultureInfo |
_ismbbkana, _ismbbkana_l | Katakana (0xA1 – 0xDF), code page 932 only | Not applicable, but see System::Globalization::CultureInfo |
_ismbbkprint, _ismbbkprint_l | Non-ASCII text or non-ASCII punctuation symbol. For example, in code page 932 only, _ismbbkprint tests for katakana alphanumeric or katakana punctuation (range: 0xA1 – 0xDF). | Not applicable, but see System::Globalization::CultureInfo |
_ismbbkpunct, _ismbbkpunct_l | Non-ASCII punctuation. For example, in code page 932 only, _ismbbkpunct tests for katakana punctuation. | Not applicable, but see System::Globalization::CultureInfo |
_ismbblead, _ismbblead_l | First byte of multibyte character. For example, in code page 932 only, valid ranges are 0x81 – 0x9F, 0xE0 – 0xFC. | Not applicable, but see System::Globalization::CultureInfo |
_ismbbprint, _ismbbprint_l | isprint \|\| _ismbbkprint. _ismbbprint includes the space character (0x20) | Not applicable, but see System::Globalization::CultureInfo |
_ismbbpunct, _ismbbpunct_l | ispunct \|\| _ismbbkpunct | Not applicable, but see System::Globalization::CultureInfo |
_ismbbtrail, _ismbbtrail_l | Second byte of multibyte character. For example, in code page 932 only, valid ranges are 0x40 – 0x7E, 0x80 – 0xFC. | Not applicable, but see System::Globalization::CultureInfo |
_ismbslead, _ismbslead_l | Lead byte (in string context) | Not applicable, but see System::Globalization::CultureInfo |
_ismbstrail, _ismbstrail_l | Trail byte (in string context) | Not applicable, but see System::Globalization::CultureInfo |
_mbbtype, _mbbtype_l | Return byte type based on previous byte | Not applicable, but see System::Globalization::CultureInfo |
_mbsbtype, _mbsbtype_l | Return type of byte within string | Not applicable, but see System::Globalization::CultureInfo |
mbsinit | Tests whether an mbstate_t object describes an initial multibyte conversion state. | Not applicable, but see System::Globalization::CultureInfo |
The MB_LEN_MAX macro, defined in LIMITS.H, expands to the maximum length in bytes that any multibyte character can have. The MB_CUR_MAX macro, defined in STDLIB.H, expands to the maximum length in bytes of any multibyte character in the current locale.