Handling White Space (Windows Embedded CE 6.0)
1/6/2010
The Microsoft XML Parser (MSXML), or processor, for Windows Embedded CE is at the heart of XML-related processing in Internet Explorer and can be used through scripts and programming languages such as Microsoft Visual C++®. To use MSXML from these scripts and programming languages, you must understand how the parser handles white space.
The XML 1.0 specification describes an XML processor as a software application that reads an XML file or stream and ensures its legitimacy as XML. This legitimacy is of two kinds: well-formedness, which is checked on all XML documents, and validity, which is checked on XML documents that reference a document type definition (DTD). In other words, an XML processor is what has come to be more commonly known as an XML parser.
MSXML contains a built-in parser that performs all the functions of a conformant XML 1.0 parser, including white space handling in an XML document. All white space in the document is preserved and passed to a downstream application.
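The parser-level behavior described above can be observed with any conformant XML 1.0 parser. The following is a minimal sketch that uses Python's standard `xml.parsers.expat` module as a stand-in; it is not MSXML, and the helper names `parser_events` and `reported_text` and the `SAMPLE` document are assumptions made for illustration only. It records every event the parser reports, so the white space between elements shows up as ordinary character data.

```python
import xml.parsers.expat


def parser_events(xml_text):
    """Return the (kind, value) events the parser reports, in document order."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    # White space between elements arrives here like any other character data.
    parser.CharacterDataHandler = lambda data: events.append(("text", data))
    parser.Parse(xml_text, True)  # True marks this as the final chunk of input
    return events


def reported_text(xml_text):
    """Concatenate all character data the parser passed downstream."""
    return "".join(value for kind, value in parser_events(xml_text) if kind == "text")


# Hypothetical sample document with indentation between the elements.
SAMPLE = "<root>\n  <item>a b</item>\n</root>"

if __name__ == "__main__":
    for kind, value in parser_events(SAMPLE):
        print(kind, repr(value))
```

The parser may split character data into more than one chunk, so the sketch joins the chunks rather than relying on where the boundaries fall. The point is only that nothing is discarded at this stage; any stripping happens later, in whatever consumes the parser's output.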
The built-in XSLT processor is not the downstream application that receives the output of the MSXML parser. This can be confusing to developers who are new to MSXML. The following illustration shows an intervening step that receives the parser's output and builds a Document Object Model (DOM) tree, caching it in memory for ready access later.
This has great significance for the way the XSLT processor handles white space in the "input document" it is working with, because this input document is actually the DOM tree in memory. By default, the Microsoft DOM tree builder removes extraneous white space from the document as passed to it by the parser.
Extraneous refers to any white space that is not in the scope of an xml:space attribute whose value is "preserve". The MSXML processor honors the xml:space="preserve" attribute as expected.
To override this default and retain the original document's extraneous white space in the DOM tree, you must set the preserveWhiteSpace property to TRUE at the time the DOM is loaded. In other words, preserving the original document's extraneous or insignificant white space cannot be accomplished using XSLT alone. By the time the XSLT processor sees what it thinks is the document, all such white space has, by default, been stripped.
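The effect of this default, and of overriding it, can be modeled outside MSXML. The sketch below uses Python's standard `xml.dom.minidom` and is a simplified simulation, not the MSXML API. The `load_dom` function, its `preserve_white_space` parameter (named after the preserveWhiteSpace property), and the sample document are hypothetical. It also assumes that "extraneous" means text nodes consisting only of white space, so the real tree builder may differ in detail.

```python
from xml.dom import minidom

XML_NAMESPACE = "http://www.w3.org/XML/1998/namespace"
XML_WHITE_SPACE = " \t\r\n"  # the four white space characters defined by XML 1.0


def load_dom(xml_text, preserve_white_space=False):
    """Hypothetical loader that mimics the tree-builder default described above."""
    document = minidom.parseString(xml_text)
    if not preserve_white_space:
        _strip_extraneous(document.documentElement, preserving=False)
    return document


def _strip_extraneous(element, preserving):
    """Drop white-space-only text nodes outside the scope of xml:space="preserve"."""
    space = element.getAttributeNS(XML_NAMESPACE, "space")
    if space == "preserve":
        preserving = True
    elif space == "default":
        preserving = False
    # Iterate over a copy because children are removed during the walk.
    for child in list(element.childNodes):
        if child.nodeType == child.ELEMENT_NODE:
            _strip_extraneous(child, preserving)
        elif (child.nodeType == child.TEXT_NODE
              and not preserving
              and not child.data.strip(XML_WHITE_SPACE)):
            element.removeChild(child)


# Hypothetical sample: indentation between elements plus one preserved element.
SAMPLE = '<root>\n  <a>x</a>\n  <b xml:space="preserve">  </b>\n</root>'

if __name__ == "__main__":
    print(load_dom(SAMPLE).documentElement.toxml())
    print(load_dom(SAMPLE, preserve_white_space=True).documentElement.toxml())
```

In this model the flag is applied while the tree is being built. That mirrors the point above: once the tree exists without the extraneous white space, a later consumer such as an XSLT processor has no way to recover it.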
The only way to set the preserveWhiteSpace property to TRUE is through script or programming language DOM processing. For more information, see Controlling White Space with the DOM.