COMMENT This algorithm maps a Unicode string encoded in UTF-16 to a
string in the specified ANSI codepage. The supported ANSI codepages
are limited to those that can be set as system codepage.
It requires the following externally specified values:
1) CodePage: An integer value to represent an ANSI codepage value.
If CodePage value is CP_ACP (0), use the system default ANSI codepage from
the OS.
If CodePage value is CP_OEMCP (1), use the sysstem default OEM codepage from
the OS.
2) UnicodeString: A string encoded in UTF-16. Every Unicode code point
is an unsigned 16-bit ("WORD") value. A surrogate pair is not
supported in this algorithm.
3) UnicodeStringLength: The string length in 16-bit ("WORD") unit for
UnicodeString. When UnicodeStringLength is 0, the length is
decided by counting from the beginning of the string to a NULL
character (Unicode value U+0000), including the null character.
4) MultiByteString: A string encoded in ANSI codepage. Every
character can be an 8-bit (byte) unsigned value or two 8-bit
unsigned values.
5) MultiByteStringLength: The length in bytes, including
the byte for NULL terminator. When MultiByteStringLength is 0,
the MultiByteString value will not be used in this algorithm.
Instead, the length of the result string in ANSI codepage will be
returned.
6) lpDefaultChar
Optional. Point to the byte to use if a character cannot be represented in
the specified codepage. The application sets this parameter to NULL if
the function is to use a system default value. The common default value is
0x3f, which is the ASCII value for the question mark.
PROCEDURE WideCharToMultiByteFromCodepageDataFile
IF CodePage is CP_ACP THEN
COMMENT Windows operating system keeps a systemwide value of
default ANSI system codepage. It is used to provide a default
COMMENT system codepage to be used by legacy ANSI application.
SET CodePage to the default ANSI system codepage from the Windows
operating system.
ELSE IF CodePage is CP_OEMCP THEN
COMMENT Windows keeps a systemwide value of
default OEM system codepage. It is used to provide a default
COMMENT system codepage to be used by legacy console application.
SET CodePage to the default OEM system codepage from Windows.
ENDIF
IF CodePage is CP_UTF8 THEN
CALL Utf8ConversionAlgorithm
COMMENT For UTF-8 use the algorithm in 3.1.5.1.6
RETURN
ENDIF
IF UnicodeStringLength is 0 THEN
COMPUTE UnicodeStringLength as the string length in 16-bit units
of UnicodeString as a NULL-terminated string, including
NULL terminator.
ENDIF
IF MultiByteStringLength is 0 THEN
SET IsCountingOnly to True
ELSE
SET IsCountingOnly to False
ENDIF
SET ResultMultiByteLength to 0
SET CodePageFileName to the concatenation of strings "Bestfit",
CodePage as a string, and ".txt"
IF lpDefaultChar is null THEN
COMMENT No default char is specified by the caller. Read the default
COMMENT char from CPINFO in the data file
OPEN SECTION CharacterInfo where section name is CPINFO
from file with the name of CodePageFileName
SET lpDefaultChar to CharacterInfo.Field3
ENDIF
OPEN SECTION WideCharMapping where section name is WCTABLE from file
with the name of CodePageFileName
FOR each Unicode codepoint UnicodeChar in UnicodeString
SELECT MappingData from WideCharMapping
where field 1 matches UnicodeChar
IF MappingData is null THEN
COMMENT There is no mapping for this Unicode character, use
COMMENT the default character
IF IsCountingOnly is False THEN
SET MultiByteString[ResultMultiByteLength]
to lpDefaultChar
ENDIF
INCREMENT ResultMultiByteLength
CONTINUE FOR loop
ENDIF
SET MultiByteResult to MappingData.Field2
IF MultiByteResult is less than 256 THEN
COMMENT This is a single byte result
IF IsCountingOnly is True THEN
INCREMENT ResultMultiByteLength
ELSE
SET MultiByteString[ResultMultiByteLength]
to MultiByteResult
INCREMENT ResultMultiByteLength
ENDIF
ELSE
COMMENT This is a double byte result
IF IsCountingOnly is True THEN
COMPUTE ResultMultiByteLength as
ResultMultiByteLength added by 2
ELSE
SET MultiByteString[ResultMultiByteLength] to
MultiByteResult divided by 256
INCREMENT ResultMultiByteLength
SET MultiByteString[ResultMultiByteLength] to
the remainder of MultiByteResult divided by 256
INCREMENT ResultMultiByteLength
ENDIF
ENDIF
END FOR
RETURN ResultMultiByteLength as a 32-bit unsigned integer