Byte Classification

Article
12/18/2017

Each of these routines tests whether a specified byte of a multibyte character satisfies a condition. Except where specified otherwise, the output value is affected by the LC_CTYPE category setting of the locale; see setlocale for more information. The versions of these functions without the _l suffix use the current locale for this locale-dependent behavior; the versions with the _l suffix are identical except that they use the locale parameter passed in instead.

Note

By definition, the ASCII characters between 0 and 127 are a subset of all multibyte-character sets. For example, the Japanese katakana character set includes ASCII as well as non-ASCII characters.

The predefined constants in the following table are defined in CTYPE.H.

Multibyte-Character Byte-Classification Routines

Routine	Byte Test Condition	.NET Framework equivalent
isleadbyte, _isleadbyte_l	Lead byte; test result depends on `LC_CTYPE` category setting of current locale	Not applicable, but see System::Globalization::CultureInfo
_ismbbalnum, _ismbbalnum_l	`isalnum || _ismbbkalnum`	Not applicable, but see System::Globalization::CultureInfo
_ismbbalpha, _ismbbalpha_l	`isalpha || _ismbbkalnum`	Not applicable, but see System::Globalization::CultureInfo
_ismbbgraph, _ismbbgraph_l	Same as `_ismbbprint`, but `_ismbbgraph` does not include the space character (0x20)	Not applicable, but see System::Globalization::CultureInfo
_ismbbkalnum, _ismbbkalnum_l	Non-ASCII text symbol other than punctuation. For example, in code page 932 only, `_ismbbkalnum` tests for katakana alphanumeric	Not applicable, but see System::Globalization::CultureInfo
_ismbbkana, _ismbbkana_l	Katakana (0xA1 – 0xDF), code page 932 only	Not applicable, but see System::Globalization::CultureInfo
_ismbbkprint, _ismbbkprint_l	Non-ASCII text or non-ASCII punctuation symbol. For example, in code page 932 only, `_ismbbkprint` tests for katakana alphanumeric or katakana punctuation (range: 0xA1 – 0xDF).	Not applicable, but see System::Globalization::CultureInfo
_ismbbkpunct, _ismbbkpunct_l	Non-ASCII punctuation. For example, in code page 932 only, `_ismbbkpunct` tests for katakana punctuation.	Not applicable, but see System::Globalization::CultureInfo
_ismbblead, _ismbblead_l	First byte of multibyte character. For example, in code page 932 only, valid ranges are 0x81 – 0x9F, 0xE0 – 0xFC.	Not applicable, but see System::Globalization::CultureInfo
_ismbbprint, _ismbbprint_l	`isprint || _ismbbkprint`. `_ismbbprint` includes the space character (0x20)	Not applicable, but see System::Globalization::CultureInfo
_ismbbpunct, _ismbbpunct_l	`ispunct || _ismbbkpunct`	Not applicable, but see System::Globalization::CultureInfo
_ismbbtrail, _ismbbtrail_l	Second byte of multibyte character. For example, in code page 932 only, valid ranges are 0x40 – 0x7E, 0x80 – 0xFC.	Not applicable, but see System::Globalization::CultureInfo
_ismbslead, _ismbslead_l	Lead byte (in string context)	Not applicable, but see System::Globalization::CultureInfo
_ismbstrail, _ismbstrail_l	Trail byte (in string context)	Not applicable, but see System::Globalization::CultureInfo
_mbbtype, _mbbtype_l	Return byte type based on previous byte	Not applicable, but see System::Globalization::CultureInfo
_mbsbtype, _mbsbtype_l	Return type of byte within string	Not applicable, but see System::Globalization::CultureInfo
mbsinit	Tracks the state of a multibyte character conversion.	Not applicable, but see System::Globalization::CultureInfo
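
The difference between testing a byte in isolation and testing it in string context can be illustrated with a small hand-written sketch. It does not call the CRT routines; the names `byte_kind`, `cp932_is_lead`, `cp932_is_kana`, and `cp932_kind_at` are hypothetical, and the code assumes code page 932 with the lead-byte and katakana ranges listed in the table above. Trail bytes are not range-checked here, which is a simplification.

```c
#include <stddef.h>

/* Hypothetical byte categories for this sketch (not the CRT's own constants). */
typedef enum { BYTE_NONE, BYTE_SINGLE, BYTE_LEAD, BYTE_TRAIL } byte_kind;

/* Lead-byte ranges for code page 932, as listed in the table. */
static int cp932_is_lead(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Single-byte katakana range for code page 932, as listed in the table. */
static int cp932_is_kana(unsigned char b)
{
    return b >= 0xA1 && b <= 0xDF;
}

/*
 * Classify the byte at index pos by scanning from the start of the buffer.
 * A byte is a trail byte only if the byte before it was consumed as a lead
 * byte, so the answer depends on context, not on the byte value alone.
 * A lead-range byte with nothing after it is treated as a single byte.
 * Returns BYTE_NONE when pos is outside the buffer.
 */
static byte_kind cp932_kind_at(const unsigned char *s, size_t len, size_t pos)
{
    size_t i = 0;
    while (i < len) {
        if (cp932_is_lead(s[i]) && i + 1 < len) {
            if (pos == i)     return BYTE_LEAD;
            if (pos == i + 1) return BYTE_TRAIL;
            i += 2;  /* skip the two-byte character */
        } else {
            if (pos == i)     return BYTE_SINGLE;
            i += 1;
        }
    }
    return BYTE_NONE;
}
```

For the byte sequence `'A', 0x83, 0x41, 0xB1`, the value 0x41 appears twice: index 0 is a single-byte character, while index 2 is a trail byte because it follows the lead byte 0x83. A per-byte test cannot tell these apart, which is why the string-context routines take the whole string.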

The MB_LEN_MAX macro, defined in LIMITS.H, expands to the maximum length in bytes that any multibyte character can have. MB_CUR_MAX, defined in STDLIB.H, expands to the maximum length in bytes of any multibyte character in the current locale.
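
As a rough illustration of how these two macros relate, the sketch below uses only standard C library calls (`setlocale`, `mbsinit`). The helper names `max_char_bytes_in` and `fresh_state_is_initial` are hypothetical, and the sketch assumes that the requested locale name is available on the host.

```c
#include <limits.h>  /* MB_LEN_MAX */
#include <locale.h>  /* setlocale, LC_CTYPE */
#include <stdlib.h>  /* MB_CUR_MAX */
#include <string.h>  /* memset */
#include <wchar.h>   /* mbstate_t, mbsinit */

/*
 * Hypothetical helper: switch the LC_CTYPE category to locale_name and
 * report MB_CUR_MAX for it. Returns 0 if the locale is unavailable.
 * Note that this changes the program's global locale.
 */
static size_t max_char_bytes_in(const char *locale_name)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return 0;
    return (size_t)MB_CUR_MAX;
}

/*
 * Hypothetical helper: a zero-filled mbstate_t represents the initial
 * conversion state, so mbsinit returns nonzero for it.
 */
static int fresh_state_is_initial(void)
{
    mbstate_t st;
    memset(&st, 0, sizeof st);
    return mbsinit(&st) != 0;
}
```

MB_LEN_MAX is a compile-time bound suitable for sizing buffers, while MB_CUR_MAX is evaluated at run time and never exceeds MB_LEN_MAX. In the "C" locale, `max_char_bytes_in("C")` is expected to report 1 on common implementations.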
