More on "Status of XQuery in the .NET Framework 2.0"

As Soumitra Sengupta and Charlie Heinemann have officially announced on MSDN, and as several of us have blogged about previously, Microsoft will not ship an implementation of XQuery in the .NET Framework version 2.0.  This decision has generated a certain amount of discussion, and perhaps some confusion, which the MSDN article attempts to clear up.  This post explains my own understanding of the underlying rationale in somewhat more detail and offers a personal perspective on the situation.

XQuery enables transformation and querying of large volumes of XML data using a declarative and typed language.  Since there are a large number of customer use cases for this capability, we will ship an XQuery implementation in SQL Server 2005.  Michael Rys has outlined the rationale for supporting XQuery in the database engine even before the W3C Recommendation is final:

1. SQL Server (unlike the .NET Framework) already has mechanisms to deal with future changes that break backwards-compatibility (the database compatibility levels). Thus, if we implement something that is not according to the final standard, either on purpose or because the standard changes after we ship, we can provide a way for users to choose the compatibility level when we align the behavior with the standard in a future release. …
2. Since we are planning on using XQuery to query XML data, we did not want to provide a stop-gap language such as XPath 1.0. That would have cost us almost as much in implementation cost and would have created instant legacy….
3. By scoping the XQuery implementation to a subset, we minimize the risk of unanticipated breaking changes while still providing value to our customers. It also allows us to grow our implementation into what users really need in XQuery instead of wasting resources on implementing, testing, and documenting functionality that is rarely used.

These arguments, however, do not apply to XQuery support in the next version of the .NET Framework.  We provided an implementation of XQuery in Whidbey Beta 1 as a preview technology and got feedback from our customers.  Based on this feedback and on updated assessments of the XQuery timetable at W3C,  it was clear that we should remove this code from Whidbey Beta 2 and the final .NET Framework 2.0. 

The most important consideration is timing: one can realistically expect that XQuery will become a finalized Recommendation in the early 2006 timeframe, whereas .NET Framework 2.0 and Visual Studio 2005 will ship in the summer of 2005. Microsoft has learned the hard way that supporting draft W3C Recommendations in core technology components is simply a bad idea. As many readers will recall, Microsoft supported what was expected to be a nearly-final version of XSLT 1.0 in IE 5, and that turned out to be a mistake when the really final version incorporated several incompatible changes.  That created a de facto standard flavor of XSLT that nobody wanted, and which created considerable confusion and support costs that linger to this day.  “Never Again!” seems to be the watchword here whenever the subject of shipping implementations of draft W3C specifications comes up. In short, supporting XQuery in .NET 2.0 would create an unacceptable risk of repeating the IE 5.0 XSLT fiasco.  Considering that releasing XQuery in the .NET Framework now means baking it into the OS for the Longhorn release in 2006, supporting a preliminary draft is clearly not the right thing to do for our customers.

Another consideration is that SQL Server 2005 will support only a subset of XQuery, and it is important to align support for a given specification across various products.  In other words, we don’t want to create a situation in which XQuery code developed for .NET won’t work with SQL Server.  Supporting the SQL Server subset in .NET is not the right choice because that is not the right subset to solve key client side scenarios. 

Furthermore, we have simply listened to the customers:  While our customers are asking for XML datatype and XQuery support in SQL Server 2005 to enable storing and retrieving semi-structured and marked up data, they are not pointing to any compelling new scenarios for XQuery in the client.  The whole point of the Whidbey Beta 1 preview release was to get feedback from potential customers, and as far as we could tell the feedback indicated a lack of enthusiasm for XQuery in the client / middle tier at this point.

We do support XSLT on the client side and we hear from customers that it solves a large number of important XML application scenarios.  In Whidbey, we are shipping a new XSLT .NET compiler in the client that will meet or beat our existing XSLT performance numbers on the native stack.  The one area where XSLT 1.0  does not subsume the functionality of a scoped-down XQuery is in strongly-typed query support.  While strong typing is necessary in the server for query optimization, our customers tell us that it is not as critical in the client where most transformations and aggregations are done against un-typed XML.

Thus, the WebData XML group weighed the risk of shipping XQuery in the .NET platform against the risk of being out of alignment with the W3C standard and Microsoft's server implementation, and determined the right thing to do for our customers is not to ship it in the client at this point.  We are committed to completing the XQuery / XSLT2 standards work in W3C, and created the position I hold in order to support that commitment.  Working with my colleagues Paul Cotton and Michael Rys, a big part of my responsibility is to help ensure that XQuery becomes a W3C Recommendation as soon as humanly possible.

So, to summarize:
1.    We will ship a subset of XQuery in SQL Server 2005.  This will enable important customer scenarios for storing and retrieving data using the new XML datatype.  This implementation will be part of Yukon B3 as well. 
2.    We will ship our new compiled XSLT implementation in .NET Framework 2.0 and a brand new XSLT debugger in Visual Studio 2005. These will enable customer scenarios for filtering and transforming XML on the client side.
3.    We will continue to drive the XQuery standards in the W3C. We will also actively monitor the progress of  XSLT 2.0 in the W3C and its uptake by the XML developer community.  We will remain deeply engaged with our customers regarding improving our query and transformation story in the frameworks and tools to determine the right strategy and product plans. 

OK, that’s more or less the consensus around here.  Moving on to my personal perspective…

•   There is no doubt in my mind that XQuery is going to be successful as a query language for XML data stores. While some first-generation XML database products got by with offering XPath 1.0-based query languages, XQuery offers several important advantages over XPath 1.0. These include the ability to do joins across XML collections, the ability to query on data types rather than just text representations, and the ability to restructure output within the query environment.  XQuery actually has very little competition in this niche: theoretically XSLT would fit the bill as both a query and a transformation language, but very few people have taken the idea seriously.  Alternatively, SQL extended with XPath can do this, but in practice the mismatch between the relational and XML data models makes this very messy.  (As I understand it, the next version of the SQL standard will reference XQuery normatively rather than try to define an alternative.)

•    There is a LOT of doubt in my mind about XQuery’s future on the client side or middle tier as a data integration language and/or a replacement for XSLT as a transformation language.  The WebData XML group bet heavily on this idea a few years ago, and it didn't work out for the reasons noted in the blog posts referenced above.  That's not to say that the official use case for XQuery as a way of integrating across the relational and XML worlds is misguided, but simply to argue that this is not at all proven in the real world.  Right now the corner cases where SQL, programming language, and XQuery data types do not mesh cleanly (dates are a notorious example), and the common cases where tricky semantic alignments are needed to integrate real-world data, are best handled by procedural code that handles these in a domain-specific manner. A couple of companies have bet heavily on XQuery as a framework for a general solution in this area, and perhaps they will make it work.  Dana Florescu, who has contributed greatly to the development of XQuery over the years, offers an enthusiastic perspective in a recent interview.  It is quite possible that this vision will be realized in the next few years; we shall see.
Still, I’m afraid I have to agree with Dr. Florescu’s colleagues at the CIDR conference who (as she notes in the interview) gave her an award for the "idea the world is least ready for" :-)

•    I am growing increasingly skeptical that XQuery-based applications will be easily portable across implementations.  Part of this skepticism is theoretical, based on the sheer size of the XQuery spec and the reality that no commercial DBMS vendors have implemented the whole thing. Conversely, since XQuery 1.0 will not define insert/delete/update operations, all DBMS implementers have to add proprietary extensions in order to meet obvious customer needs. But another part of my skepticism is based on the reality that few real-world SQL applications are portable across products.  XQuery is at least as complex as SQL and forged in the same competitive environment, so it is unclear to me why we can expect it to be any more portable across implementations than SQL.

•     I've given up on the idea of XQuery as an XML-aware general purpose programming language for real-world developers. I very much like the vision of a development language that can integrate the typed object, RDBMS, and XML worlds, and at one time it looked as though XQuery could hit a sweet spot there. I suspect, however, that XQuery missed its window of opportunity; now that dynamic languages with built-in XML libraries have been accepted into the mainstream, the problems with the XSD type system on which XQuery tries to build become increasingly obvious, and the prospect of conventional languages extended to handle XML natively is becoming tangible, it's just not as exciting an idea as it once sounded.

•    I've also rethought my previous position that XQuery is easier for ordinary mortals to learn than XSLT.  Part of the reason for that is that a recent month-long debate on the xml-dev mailing list brought out a lot of people who passionately admire and know how to exploit XSLT, and only a few testimonials (from stakeholders!) for XQuery as anything beyond an XML DB query language.  Furthermore, I've been exposed to the XSLT debugger in the next version of VisualStudio.NET -- I think that once people can watch a stylesheet execute, they will come to grok XSLT's oddly powerful paradigm and learn to apply it to their data manipulation problems.
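
To make the first point above concrete, here is a minimal sketch of the kind of cross-collection join and output restructuring that XQuery expresses declaratively and that a single XPath 1.0 expression cannot. It is written in Python with the standard `xml.etree.ElementTree` module purely for illustration. The sample documents, the element and attribute names, and the `summarize` function are all hypothetical, and the XQuery shown in the comment is an untested approximation rather than code from any shipping engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: two separate XML "collections".
ORDERS = """
<orders>
  <order id="1" customer="c1"><total>40</total></order>
  <order id="2" customer="c2"><total>75</total></order>
  <order id="3" customer="c1"><total>10</total></order>
</orders>
"""

CUSTOMERS = """
<customers>
  <customer id="c1"><name>Ada</name></customer>
  <customer id="c2"><name>Grace</name></customer>
</customers>
"""

# Roughly the XQuery this emulates (illustrative approximation only):
#   for $c in $customers/customers/customer
#   let $o := $orders/orders/order[@customer = $c/@id]
#   return <summary name="{$c/name}" total="{sum($o/total)}"/>


def summarize(orders_xml, customers_xml):
    """Join orders to customers by key and emit a restructured report."""
    orders = ET.fromstring(orders_xml)
    customers = ET.fromstring(customers_xml)
    report = ET.Element("report")
    for cust in customers.findall("customer"):
        # The join: select the orders whose key matches this customer.
        matching = [o for o in orders.findall("order")
                    if o.get("customer") == cust.get("id")]
        # Typed aggregation: totals are summed as integers, not as text.
        total = sum(int(o.findtext("total")) for o in matching)
        # Restructuring: build output elements with a new shape.
        ET.SubElement(report, "summary",
                      name=cust.findtext("name"), total=str(total))
    return report


if __name__ == "__main__":
    print(ET.tostring(summarize(ORDERS, CUSTOMERS), encoding="unicode"))
```

The procedural loop does by hand what the three-line query in the comment states declaratively, which is the gap between XPath 1.0 plus custom code and a real XML query language.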

So, several of us in the WebData XML group have explained why we collectively and individually have concluded that XQuery shouldn't be supported in the .NET Framework at this point.  What do you think, and what about in the future?  Are there any passionate admirers of XQuery as something other than an XML database query language who think that MS should be seriously considering client and middle-tier use cases for XQuery once XQuery is a Recommendation?  We’re waiting to hear from you!

Comments

  • Anonymous
    January 28, 2005
    I've been studying XQuery and planning to incorporate it into an application for months now. Our App translates XML datastores and web services into other sources (HTML, but also PowerPoint, Word, and Excel) using XPath queries in templates.

    It's completely .NET developed and so far users love it. The only problem is that I'm finding myself writing plenty of custom code to add functionality as XPath statements just aren't powerful enough... My users want to have a PowerPoint template say {if (//pct_complete <50) then "Behind Schedule" else "On Schedule"}. I'm continuously building in much of the major functionality of XQuery... painful as I'm hoping that I'm matching my syntax to what will be used in the future.

    Is there any chance to have a separate download for those that accept the risk of not exactly meeting the W3C standard?

  • Anonymous
    January 28, 2005
    The comment has been removed

  • Anonymous
    January 28, 2005
    I like that last idea of breaking XQuery (and perhaps other bits which are based on evolving standards) out of the BCL into separate APIs -- similar to how the WSE is implemented -- so long as there is a solid way to deploy these other packages.

  • Anonymous
    January 28, 2005
    > Should we add SQL integration to C# on the level that XML integration has been
    > added to it in C-Omega?

    I for one would like to see tuples and XML documents treated as first-class data objects in programming languages. I understand that this expands the scope of the languages, but it seems like a natural evolutionary step. Over the last 30 years, strings, data structures, and objects have gradually moved from application-level to language-level constructs; continuing that trend seems like a good idea to me, FWIW.

  • Anonymous
    January 28, 2005
    Mike Champion has a post on Microsoft's decision to cut XQuery from Whidbey (but leave a subset in SQL Server). I completely agree with their decision and rationale for pulling XQuery from the 2.0 version of the Framework class library, but I have to disagree with his assessment of XSLT...

  • Anonymous
    January 28, 2005
    You're making the right decision. No reason to set yourselves up to be burned by changes to the spec and years of hearing people complain "MS has proprietary XQuery!".

    Ship it as a separate assembly or as part of a 2.x upgrade to System.Xml.dll.

  • Anonymous
    January 28, 2005
    The comment has been removed

  • Anonymous
    January 28, 2005
    Of OLUG, XQuery and mild surprises.

  • Anonymous
    January 28, 2005
    The comment has been removed

  • Anonymous
    January 29, 2005
    Kent Tegels makes some very interesting points. First, "why don't we do the best job providing a tool that we can and see what developers use it for" is exactly the right question, and exactly the question we discuss internally. All I'm saying here is that at one point it looked like the answer was "XQuery", and now it doesn't. (More later on what we conclude the answer is.)

    Second, if XQuery does in fact become the "real standard" for querying, reshaping, and managing data represented as XML, what I'm saying here could change. Right now it looks like it will be the real standard for querying, it will compete (IMHO unsuccessfully) for mindshare/resources with XSLT as the real standard for reshaping, and could possibly (after 1.1 is a Recommendation in a few years) be a contender as the one true standard for managing XML data. So, we are betting on XQuery as a query language, investing more in XSLT as the reshaping standard for the time being, and keeping an open mind about data management/manipulation.

    There seemed to be an implication in the post that maybe MS should cover all the good guesses about which XML technologies will find good uses and let the market sort it all out. That's how I used to think about MS anyway -- "they could easily afford to support [X, Y, Z], why don't they?" From the inside it looks a bit different -- I see budget decisions driven by business case analyses, lots of people spending lots of energy maintaining / testing / securing code based on long-ago guesses about what might be useful, an immense list of immediate needs and good ideas competing for resources, and a need to make hard choices going forward. I'm pretty sure that the choices we make will enable all sorts of unforeseen use cases better than investing our finite resources in XQuery on the client today would have.

  • Anonymous
    January 31, 2005
    The comment has been removed

  • Anonymous
    February 03, 2005
    In doing some research (http://www.wpclabs.org) for possible use of XQuery in a simple translation engine, I (possibly incidentally :-) ) came to the same conclusion: In reviewing 5 or 6 XQuery client side implementations, MS is not "down the path" far enough just yet, and it would take a crystal ball to be sure of a final solution [before the fact] that aligned with the final standard [after the fact].

    Our system is a parser/processor which, on completion, needs to pass off the resultant data, in XML form, to a translator for final conversion to a legacy format. This could have been done a number of ways, but I was interested in trying to incorporate greater flexibility in the future and XQuery seems to present that promise.

    The Saxon.NET engine has me chomping at the bit, but for now, we're using the Java Saxon with some proprietary interop stuff between that and our .NET listeners. Granted, it sounds like our purposes would have been met by the "partial solution", but the majority of early adopters are looking for comprehensive support to leverage in their next gen apps.

    It would be nice if the Native .NET XQuery support were prepared in such a way that, once the W3C standard becomes gospel, release to market would be minimal...
