GetStringTypeW
4/8/2010
This function returns character-type information for the characters in the specified source string. For each character in the string, the function sets one or more bits in the corresponding 16-bit element of the output array. Each bit identifies a given character type, such as whether the character is a letter, a digit, or neither.
Syntax
BOOL GetStringTypeW(
DWORD dwInfoType,
LPCWSTR lpSrcStr,
int cchSrc,
LPWORD lpCharType
);
Parameters
dwInfoType
[in] Value that specifies the type of character information to retrieve. The types are divided into levels; see the Remarks section for the information included in each level. This parameter can be one of the following character-type flags.
Value | Description |
CT_CTYPE1 | Retrieve character type information. |
CT_CTYPE2 | Retrieve bidirectional layout information. |
CT_CTYPE3 | Retrieve text processing information. |
lpSrcStr
[in] Pointer to the string for which character types are requested. This must be a Unicode string. If cchSrc is -1, the string is assumed to be null terminated.
cchSrc
[in] Size, in characters, of the string pointed to by the lpSrcStr parameter. If this count includes a terminating null character, the function also returns character type information for that null character. If this value is -1, the string is assumed to be null terminated and the length is calculated automatically.
lpCharType
[out] Pointer to an array of 16-bit values. The array must be large enough to receive one 16-bit value for each character counted by the cchSrc parameter. When the function returns, this array contains one word for each Unicode character in the source string.
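The parameter rules above can be sketched as follows. This is a minimal illustration, not a definitive implementation: ctype_buffer_len and get_ctype1 are hypothetical helper names, and reserving one extra element when cchSrc is -1 is a conservative assumption, because this page does not state whether the terminating null character is reported in that case. The part that calls the API is compiled only on Windows targets.

```c
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: number of 16-bit elements lpCharType should hold.
 *   cchSrc >= 0  : one element per counted character.
 *   cchSrc == -1 : null-terminated string; one extra element is reserved
 *                  for the terminator (conservative assumption).
 * Returns 0 for a NULL string or any other negative count. */
size_t ctype_buffer_len(const wchar_t *src, int cchSrc)
{
    if (src == NULL)
        return 0;
    if (cchSrc >= 0)
        return (size_t)cchSrc;
    if (cchSrc == -1)
        return wcslen(src) + 1;
    return 0;
}

#ifdef _WIN32
/* Hypothetical wrapper: retrieves CT_CTYPE1 words into a heap array.
 * Returns NULL on failure; the caller frees the result with free(). */
WORD *get_ctype1(LPCWSTR src, int cchSrc, size_t *count)
{
    size_t n = ctype_buffer_len(src, cchSrc);
    WORD *types = n ? (WORD *)calloc(n, sizeof(WORD)) : NULL;

    if (types == NULL)
        return NULL;
    if (!GetStringTypeW(CT_CTYPE1, src, cchSrc, types)) {
        free(types);            /* GetLastError() holds the reason */
        return NULL;
    }
    if (count != NULL)
        *count = n;
    return types;
}
#endif
```

Keeping the size calculation separate from the call makes it easy to check the buffer arithmetic without touching the API.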
Return Value
Nonzero indicates success. Zero indicates failure. To get extended error information, call the GetLastError function. Possible values for GetLastError include the following:
- ERROR_INVALID_FLAGS
- ERROR_INVALID_PARAMETER
Remarks
The GetStringTypeW function is designed for Unicode strings only.
The lpSrcStr and lpCharType pointers must not be the same. If they are the same, the function fails and GetLastError returns ERROR_INVALID_PARAMETER.
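A caller can catch the aliasing case before calling the function. The sketch below uses a hypothetical helper, precheck_string_type_args. The identical-pointer check follows the rule stated above; rejecting NULL pointers is an added assumption that this page does not spell out. The fallback value 87 for ERROR_INVALID_PARAMETER is the one defined in winerror.h and is used only when that header has not been included.

```c
#include <stddef.h>

#ifndef ERROR_INVALID_PARAMETER
#define ERROR_INVALID_PARAMETER 87UL   /* value defined in winerror.h */
#endif

/* Hypothetical pre-flight check. Returns 0 when the two buffers look
 * usable, or the error code the caller should expect otherwise. */
unsigned long precheck_string_type_args(const void *lpSrcStr,
                                        const void *lpCharType)
{
    /* Assumption: NULL pointers are treated as invalid parameters. */
    if (lpSrcStr == NULL || lpCharType == NULL)
        return ERROR_INVALID_PARAMETER;

    /* Documented rule: source and output must not be the same buffer. */
    if (lpSrcStr == lpCharType)
        return ERROR_INVALID_PARAMETER;

    return 0;
}
```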
The character-type bits are divided into several levels. The information for one level can be retrieved by a single call to this function. Each level is limited to 16 bits of information so that the other mapping routines, which are limited to 16 bits of representation per character, can also return character-type information.
The character types supported by this function include the following.
Ctype 1
These types support ANSI C and POSIX (LC_CTYPE) character-typing functions. A combination of these values is returned in the array pointed to by the lpCharType parameter when the dwInfoType parameter is set to CT_CTYPE1.
Name | Value | Description |
C1_UPPER | 0x0001 | Uppercase |
C1_LOWER | 0x0002 | Lowercase |
C1_DIGIT | 0x0004 | Decimal digits |
C1_SPACE | 0x0008 | Space characters |
C1_PUNCT | 0x0010 | Punctuation |
C1_CNTRL | 0x0020 | Control characters |
C1_BLANK | 0x0040 | Blank characters |
C1_XDIGIT | 0x0080 | Hexadecimal digits |
C1_ALPHA | 0x0100 | Any linguistic character: alphabetic, syllabary, or ideographic |
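Because Ctype 1 values are bit flags that can be combined, each element of the result array is tested with a bitwise AND against a mask. The following sketch assumes the array has already been filled by a CT_CTYPE1 call; count_c1 is a hypothetical helper, and the constants repeat the values listed above so that the fragment compiles without winnls.h.

```c
#include <stddef.h>

/* Ctype 1 bit values as listed above; the guard avoids redefinition
 * when winnls.h has already supplied them. */
#ifndef C1_UPPER
#define C1_UPPER  0x0001
#define C1_LOWER  0x0002
#define C1_DIGIT  0x0004
#define C1_SPACE  0x0008
#define C1_PUNCT  0x0010
#define C1_CNTRL  0x0020
#define C1_BLANK  0x0040
#define C1_XDIGIT 0x0080
#define C1_ALPHA  0x0100
#endif

/* Hypothetical helper: counts the elements of a CT_CTYPE1 result array
 * that have at least one of the bits in mask set. */
size_t count_c1(const unsigned short *types, size_t n, unsigned short mask)
{
    size_t hits = 0;
    size_t i;

    for (i = 0; i < n; ++i) {
        if (types[i] & mask)
            ++hits;
    }
    return hits;
}
```

For example, count_c1(types, n, C1_UPPER | C1_LOWER) counts the cased letters in a string, and count_c1(types, n, C1_DIGIT) counts its decimal digits.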
The following character types are either constant or computable from basic types and do not need to be supported by this function.
Type | Description |
Alphanumeric | Alphabetic characters and digits (C1_ALPHA and C1_DIGIT) |
Printable | Graphic characters and blanks (all C1_* types except C1_CNTRL) |
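The two derived types can be computed from a CT_CTYPE1 word roughly as follows. This is one reading of the descriptions above, not an official definition: c1_is_alnum and c1_is_printable are hypothetical helpers, C1_LISTED_BITS is a hypothetical mask covering only the nine flags listed on this page, and treating any word with C1_CNTRL set as non-printable is an assumption.

```c
/* Ctype 1 bit values as listed on this page; guarded in case winnls.h
 * has already defined them. */
#ifndef C1_UPPER
#define C1_UPPER  0x0001
#define C1_LOWER  0x0002
#define C1_DIGIT  0x0004
#define C1_SPACE  0x0008
#define C1_PUNCT  0x0010
#define C1_CNTRL  0x0020
#define C1_BLANK  0x0040
#define C1_XDIGIT 0x0080
#define C1_ALPHA  0x0100
#endif

/* Hypothetical mask: every C1_* flag listed on this page. */
#define C1_LISTED_BITS 0x01FF

/* Alphanumeric: the word carries C1_ALPHA or C1_DIGIT. */
int c1_is_alnum(unsigned short w)
{
    return (w & (C1_ALPHA | C1_DIGIT)) != 0;
}

/* Printable: at least one listed flag other than C1_CNTRL is set, and
 * C1_CNTRL itself is clear (assumption: control characters that are
 * also spaces, such as a tab, do not count as printable). */
int c1_is_printable(unsigned short w)
{
    if (w & C1_CNTRL)
        return 0;
    return (w & (C1_LISTED_BITS & ~C1_CNTRL)) != 0;
}
```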
Ctype 2
These types support proper layout of Unicode text. The direction attributes are assigned so that the bidirectional layout algorithm standardized by Unicode produces accurate results. These types are mutually exclusive. For more information about the use of these attributes, see The Unicode Standard: Worldwide Character Encoding, Volumes 1 and 2, Addison Wesley Publishing Company: 1991, 1992, ISBN 0201567881.
Category | Name | Value | Description |
Strong | C2_LEFTTORIGHT | 0x0001 | Left to right |
Strong | C2_RIGHTTOLEFT | 0x0002 | Right to left |
Weak | C2_EUROPENUMBER | 0x0003 | European number, European digit |
Weak | C2_EUROPESEPARATOR | 0x0004 | European numeric separator |
Weak | C2_EUROPETERMINATOR | 0x0005 | European numeric terminator |
Weak | C2_ARABICNUMBER | 0x0006 | Arabic number |
Weak | C2_COMMONSEPARATOR | 0x0007 | Common numeric separator |
Neutral | C2_BLOCKSEPARATOR | 0x0008 | Block separator |
Neutral | C2_SEGMENTSEPARATOR | 0x0009 | Segment separator |
Neutral | C2_WHITESPACE | 0x000A | White space |
Neutral | C2_OTHERNEUTRAL | 0x000B | Other neutrals |
Not applicable | C2_NOTAPPLICABLE | 0x0000 | No implicit directionality (for example, control codes) |
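Because Ctype 2 values are mutually exclusive codes rather than combinable flags, they should be compared for equality instead of tested with a bitwise AND. For instance, C2_EUROPENUMBER (0x0003) shares a bit with C2_RIGHTTOLEFT (0x0002), so a mask test would misclassify a European digit as right-to-left. The sketch below illustrates this; the dir_class enumeration and both helper names are hypothetical.

```c
/* Ctype 2 values as listed on this page; guarded in case winnls.h has
 * already defined them. */
#ifndef C2_LEFTTORIGHT
#define C2_LEFTTORIGHT      0x0001
#define C2_RIGHTTOLEFT      0x0002
#define C2_EUROPENUMBER     0x0003
#define C2_EUROPESEPARATOR  0x0004
#define C2_EUROPETERMINATOR 0x0005
#define C2_ARABICNUMBER     0x0006
#define C2_COMMONSEPARATOR  0x0007
#define C2_BLOCKSEPARATOR   0x0008
#define C2_SEGMENTSEPARATOR 0x0009
#define C2_WHITESPACE       0x000A
#define C2_OTHERNEUTRAL     0x000B
#define C2_NOTAPPLICABLE    0x0000
#endif

/* Hypothetical grouping that mirrors the categories on this page. */
typedef enum { DIR_NONE, DIR_STRONG, DIR_WEAK, DIR_NEUTRAL } dir_class;

/* Maps a CT_CTYPE2 word to its category by exact value. */
dir_class c2_class(unsigned short w)
{
    switch (w) {
    case C2_LEFTTORIGHT:
    case C2_RIGHTTOLEFT:
        return DIR_STRONG;
    case C2_EUROPENUMBER:
    case C2_EUROPESEPARATOR:
    case C2_EUROPETERMINATOR:
    case C2_ARABICNUMBER:
    case C2_COMMONSEPARATOR:
        return DIR_WEAK;
    case C2_BLOCKSEPARATOR:
    case C2_SEGMENTSEPARATOR:
    case C2_WHITESPACE:
    case C2_OTHERNEUTRAL:
        return DIR_NEUTRAL;
    default:
        return DIR_NONE;    /* C2_NOTAPPLICABLE or an unlisted value */
    }
}

/* Equality test: (w & C2_RIGHTTOLEFT) would also match 0x0003. */
int c2_is_rtl(unsigned short w)
{
    return w == C2_RIGHTTOLEFT;
}
```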
Ctype 3
These types are intended to be placeholders for extensions to the POSIX types required for general text processing or for the standard C library functions. A combination of these values is returned when dwInfoType is set to CT_CTYPE3.
Name | Value | Description |
C3_NONSPACING | 0x0001 | Nonspacing mark |
C3_DIACRITIC | 0x0002 | Diacritic nonspacing mark |
C3_VOWELMARK | 0x0004 | Vowel nonspacing mark |
C3_SYMBOL | 0x0008 | Symbol |
C3_KATAKANA | 0x0010 | Katakana character |
C3_HIRAGANA | 0x0020 | Hiragana character |
C3_HALFWIDTH | 0x0040 | Half-width character |
C3_FULLWIDTH | 0x0080 | Full-width character |
C3_IDEOGRAPH | 0x0100 | Ideographic character |
C3_KASHIDA | 0x0200 | Arabic Kashida character |
C3_LEXICAL | 0x0400 | Punctuation which is counted as part of the word (Kashida, hyphen, feminine/masculine ordinal indicators, equal sign, and so forth) |
C3_ALPHA | 0x8000 | All linguistic characters (alphabetic, syllabary, and ideographic) |
C3_NOTAPPLICABLE | 0x0000 | Not applicable |
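Ctype 3 values are combinable flags, so a single word can describe several properties at once. The sketch below shows the two usual tests, assuming the word came from a CT_CTYPE3 call: an "any of" test for a group of related flags and an "all of" test for a specific combination. The helper names are hypothetical, and the constants repeat the values listed on this page.

```c
/* Ctype 3 flag values as listed on this page; guarded in case winnls.h
 * has already defined them. */
#ifndef C3_NONSPACING
#define C3_NONSPACING 0x0001
#define C3_DIACRITIC  0x0002
#define C3_VOWELMARK  0x0004
#define C3_SYMBOL     0x0008
#define C3_KATAKANA   0x0010
#define C3_HIRAGANA   0x0020
#define C3_HALFWIDTH  0x0040
#define C3_FULLWIDTH  0x0080
#define C3_IDEOGRAPH  0x0100
#define C3_KASHIDA    0x0200
#define C3_LEXICAL    0x0400
#define C3_ALPHA      0x8000
#endif

/* "Any of": true when the word carries at least one mark flag. */
int c3_is_mark(unsigned short w)
{
    return (w & (C3_NONSPACING | C3_DIACRITIC | C3_VOWELMARK)) != 0;
}

/* "Any of": true for either kana script. */
int c3_is_kana(unsigned short w)
{
    return (w & (C3_KATAKANA | C3_HIRAGANA)) != 0;
}

/* "All of": true only when both flags are present together. */
int c3_is_halfwidth_katakana(unsigned short w)
{
    const unsigned short both = C3_KATAKANA | C3_HALFWIDTH;
    return (w & both) == both;
}
```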
Requirements
Header | winnls.h |
Library | coredll.lib |
Windows Embedded CE | Windows CE 1.0 and later |
Windows Mobile | Windows Mobile Version 5.0 and later |