How not to re-use XML technologies

I like XML. Call me crazy but I find it relatively easy to work with. Whether it is hand-editing an XML document in a text editor, manipulating it programmatically (as long as you pick a reasonable API, e.g. XOM in Java), transforming it (e.g. XSLT) or querying an XML back-end through XPath/XQuery. Sure it carries useless features that betray its roots in the publishing world (processing instructions anyone?), sure the whole attribute/element overlap doesn’t have much value for systems modeling, but overall it hits a good compromise between human readability and machine processing and it has a pretty solid extensibility story with namespaces.

In addition, the XML toolbox of specifications is very large and offers standard-based answers to many XML-related tasks. That’s good, but when composing a solution it also means that one needs to keep two things in mind:

  • not all these XML specifications are technically sound (even if they carry a W3C stamp of approval), and
  • just because XML’s inherent flexibility lets one stretch a round hole, it doesn’t mean it’s a good idea to jam a square peg into it.

The domain of IT management provides examples for both of these risks. These examples constitute some of the technical deficiencies of management-related XML specifications that I mentioned in the previous post. More specifically, let’s look at three instances of XML mis-use that relate to management-related specifications. We will see:

  • a terrible XML specification that infects any solution it touches (WS-Addressing, used in WS-Management),
  • a mediocre XML specification that has plenty of warts but can be useful for a class of problems, except in this case it isn’t (XSD, used in SML), and
  • a very good XML specification except it is used in the wrong place (XPath, used in CMDBf).

Let’s go through them one by one.

WS-Addressing in WS-Management

The main defect of WS-Management (and of WSDM before it) is probably its use of WS-Addressing. SOAP needs WS-Addressing like a migraine patient needs a bullet in the head (actually, four bullets in the head since we got to deal with four successive versions). SOAP didn’t need a new addressing model, it already had URIs. It just needed a message correlation mechanism. But what we got is many useless headers (like wsa:Action) and the awful EPR construct which solves a problem that didn’t exist and creates many very real new ones. One can imagine nifty hacks that would be enabled by a templating mechanism for SOAP (I indulged myself and sketched one to facilicate mash-up style integrations with SOAP) but if that’s what we’re after then there is no reason to limit it to headers.

XSD in SML

The words “Microsoft” and “bully” often appear in the same sentence, but invariably “Microsoft” is the subject not the object of the bullying. Well, to some extent we have a reverse example here, as unlikely as it may seem. Microsoft created an XML-based meta-model called SDM that included capabilities that looked like parts of XSD. When they opened it up to the industry and floated the idea of standardizing it, they heard back pretty loudly that it would have to re-use XSD rather than “re-invent” it. So they did and that ended up as SML. Except it was the wrong choice and in retrospect I think it would have been better to improve on the original SDM to create a management-specific meta-model than swallow XSD (SML does profile out a few of the more obscure features of XSD, like xs:redefine, but that’s marginal). Syntactic validation of documents is very different from validation of IT models. Of course this may all be irrelevant anyway if SML doesn’t get adopted, which at this point still looks like the most likely outcome (due to things like the failure of CML to produce any model element so far, the ever-changing technical strategy for DSI and of course the XSD-induced complexity of SML).

XPath in CMDBf

I have already covered this in my review of CMDBf 1.0. The main problem is that while XML is a fine interchange format for the CMDBf specification, one should not assume that it is the native format of the data stores that get connected. Using XPath as a selector language makes life difficult for those who don’t use XML as their backend format. Especially when it is not just XPath 1.0 but also the much more complex XPath 2.0. To make matters worse, there is no interoperable serialization format for XPath 1.0 nodesets, which will prevent any kind of interoperability on this. That omission can be easily fixed (and I am sure it will be fixed in DMTF) but that won’t address the primary concern. In the context of CMDBf, XPath/XQuery is an excellent implementation choice for some situations, but not something that should be pushed at the level of the protocol. For example, because XPath is based on the XML model, it has clear notions of order of elements. But what if I have an OO or an RDF-based backend? What am I to make of a selector that says that the “foo” element has to come after the “bar” element? There is no notion of order in Java attributes and/or RDF properties.

Revisionism?

My name (in the context of my previous job at HP) appears in all three management specifications listed above (in increasing level of involvement as contributor for WS-Management, co-author for SML and co-editor for CMDBf) so I am not a neutral observer on these questions. My goal here is not to de-associate myself from these specifications or pick and choose the sections I want to be associated with (we can have this discussion over drinks if anyone is interested). Some of these concerns I had at the time the specifications were being written and I was overruled by the majority. Other weren’t as clear to me then as they are now (my view of WS-Addressing has moved over time from “mostly harmless” to “toxic”). I am sure all other authors have a list of things they wished had come out differently. And while this article lists deficiencies of these specifications, I am not throwing the baby with the bathwater. I wrote recently about WS-Management’s potential for providing consistency for resource manageability. I have good hopes for CMDBf, now in the DTMF, not necessarily as a federation technology but as a useful basis for increased interoperability between configuration repositories. SML has the most dubious fate at this time because, unlike the other two, it hasn’t (yet?) transcended its original supporter to become something that many companies clearly see fitting in their plans.

[UPDATED 2008/3/27: For an extreme example of purposely abusing XML technologies (namely XPath in that case) in a scenario in which it is not the right tool for the job (graph queries), check out this XPath brain teasers article.]

4 Comments

Filed under CMDB Federation, CMDBf, Everything, IT Systems Mgmt, Microsoft, SML, SOAP, SOAP header, Specs, Standards, Tech, WS-Management, XOM

4 Responses to How not to re-use XML technologies

  1. Pingback: William Vambenepe’s blog » Blog Archive » Manageability, management integration and WS-Management

  2. William,

    I have been coming up to speed on CMDBf and have found your blog very helpful. While I am just beginning to understand the spec, your comment “I have good hopes for CMDBf, now in the DTMF, not necessarily as a federation technology but as a useful basis for increased interoperability between configuration repositories” echoes what was going through my head.

    I work for a database virtualization company that uses query federation as our core enabling technology. While I have a vested interest here, seems to me that query federation could be a useful tool for implementing a federated CMDB, esp. one like ours that speaks SQL and XQuery and can publish results as Web Services. I guess my question is why build a spec for federation when there are already ways to do it; building a schema or transport message format does make sense.

    Thanks,

    Tim

  3. vbp

    Hi Tim. I think there is room for CMDB-specific federation technology to be created but I agree with you that efforts around federating CMDBs should also leverage generic query federation capabilities when available. It’s a matter of finding the right balance between generic federation technology, like what your company provides and CMDB-specific parts. The reason I believe there is room for CMDB-specific parts is that query federation in the generic sense is a very hard problem to solve, one that requires compromises in either scale, performance, freshness or other such dimensions. Focusing on a specific domain, like IT configuration management allows one to make simplifying assumptions and optimizations that are appropriate for that domain and make the problem more tractable.
    In that sense, query federation is different from other technologies like encryption which are pretty well addressed by generic technology so there is no need to develop CDMB-specific encryption technology.

  4. Pingback: William Vambenepe’s blog » Blog Archive » Microsoft ditches SML, returns to SDM?