William Vambenepe's blog

IT management in a changing IT world

This teaching relationship is the basis of buy cheapest viagra the physician doctor, which originally meant "teacher" in Latin.This original definition excluded naturally occurring herbal viagra for women, such as gastric juice and hydrogen peroxide (they kill micro-organisms but are not produced by micro-organisms), and also excluded synthetic compounds such as the sulfonamides (which are antimicrobial agents).Human evolution is characterized by a number of important natural herbal viagra, developmental, physiological and behavioural changes, which have taken place since the split between the last common ancestor of humans and chimpanzees.It was at that time known that milk fermented with lactic-acid bacteria female viagra uk the growth of proteolytic bacteria because of the low pH produced by the fermentation of lactose.See negative effects of the fight-or-flight order generic viagra.

Archive for the 'WS-ResourceTransfer' Category

24
Apr
2009

A post-mortem on the previous IT management revolution

by William (@vambenepe on Twitter)

Before rushing to standardize “Cloud APIs”, let’s take a look back at the previous attempt to tackle the same problem, which is one of IT management integration and automation. I am referring to the definition of specifications that attempted to use the then-emerging SOAP-based Web services framework to easily integrate IT management systems and their targets.

Leaving aside the “Cloud” spin of today and the “Web services” frenzy of yesterday, the underlying problem remains to provide IT services (mostly applications) in a way that offers the best balance of performance, availability, security and economy. Concretely, it is about being able to deploy whatever IT infrastructure and application bits need to be deployed, configure them and take any required ongoing action (patch, update, scale up/down, optimize…) to keep them humming so customers don’t notice anything bothersome and you don’t break any regulation. Or rather so that any disruption a customer sees and any mandate you violate cost you less than it would have cost to avoid them.

The realization that IT systems are moving more and more towards distributed/connected applications was the primary reason that pushed us towards the definition of Web services protocols geared towards management interactions. By providing a uniform and network-friendly interface, we hoped to make it convenient to integrate management tasks vertically (between layers of the IT stack) and horizontally (across distributed applications). The latter is why we focused so much on managing new entities such as Web services, their execution environments and their conversations. I’ll refer you to the WSMF submission that my HP colleagues and I made to OASIS in 2003 for the first consistent definition of such a management framework. The overview white paper even has a use case called “management as a service” if you’re still not convinced of the alignment with today’s Cloud-talk.

Of course there are some differences between Web service management protocols and Cloud APIs. Virtualization capabilities are more advanced than when the WS effort started. The prospect of using hosted resources is more realistic (though still unproven as a mainstream business practice). Open source component are expected to play a larger role. But none of these considerations fundamentally changes the task at hand.

Let’s start with a quick round-up and update on the most relevant efforts and their status.

Protocols

WSMF (Web Services Management Framework): an HP-created set of specifications, submitted to the OASIS WSDM working group (see below). Was subsumed into WSDM. Not only a protocol BTW, it includes a basic model for Web services-related artifacts.

WS-Manageability: An IBM-led alternative to parts of WSDM, also submitted to OASIS WSDM.

WSDM (Web Services Distributed Management): An OASIS technical committee. Produced two standards (a protocol, “Management Using Web Services” and a model of Web services, “Management Of Web Services”). Makes use of WSRF (see below). Saw a few implementations but never achieved real adoption.

OGSI (Open Grid Services Infrastructure): A GGF (the organization now known as OGF) standard to provide a service-oriented resource manipulation infrastructure for Grid computing. Replaced with WSRF.

WSRF: An OASIS technical committee which produced several standards (the main one is WS-ResourceProperties). Started as an attempt to align the GGF/OGSI approach to resource access with the IT management approach (represented by WSDM). Saw some adoption and is currently quietly in use under the cover in the GGF/OGF space. Basically replaced OGSI but didn’t make it in the IT management world because its vehicle there, WSDM, didn’t.

WS-Management: A DMTF standard, based on a Microsoft-led submission. Similar to WSDM in many ways. Won the adoption battle with it. Based on WS-Transfer and WS-Enumeration.

WS-ResourceTransfer (aka WS-RT): An attempt to reconcile the underlying foundations of WSDM and WS-Management. Stalled as a private effort (IBM, Microsoft, HP, Intel). Was later submitted to the W3C WS-RA working group (see below).

WSRA (Web Services Resource Access): A W3C working group created to standardize the specifications that WS-Management is built on (WS-Transfer etc) and to add features to them in the form of WS-RT (which was also submitted there, in order to be finalized). This is (presumably) the last attempt at standardizing a SOAP-based access framework for distributed resources. Whether the window of opportunity to do so is still open is unclear. Work is ongoing.

WS-ResourceCatalog : A discovery helper companion specification to WS-Management. Started as a Microsoft document, went through the “WSDM/WS-Management reconciliation” effort, emerged as a new specification that was submitted to DMTF in May 2007. Not heard of since.

CMDBf (Configuration Management Database Federation): A DMTF working group (and soon to be standard) that mainly defines a SOAP-based protocol to query repositories of configuration information. Not linked with (or dependent on) any of the specifications listed above (it is debatable whether it belongs in this list or is part of a new breed).

Modeling

DCML (Data Center Markup Language): The first comprehensive effort to model key elements of a data center, their relationships and their policies. Led by EDS and Opsware. Never managed to attract the major management vendors. Transitioned to an OASIS member section and died of being ignored.

SDM (System Definition Model): A Microsoft specification to model an IT system in a way that includes constraints and validation, with the goal of improving automation and better linking the different phases of the application lifecycle. Was the starting point for SML.

SML (Service Modeling Language): Currently a W3C “proposed recommendation” (soon to be a recommendation, I assume) with the same goals as SDM. It was created, starting from SDM, by a consortium of companies that eventually submitted it to W3C. No known adoption other than the Eclipse COSMOS project (Microsoft was supposed to use it, but there hasn’t been any news on that front for a while). Technically, it is a combination of XSD and Schematron. It appears dead, unless it turns out that Microsoft is indeed using it (I don’t know whether System Center is still using SDM, whether they are adopting SML, whether they are moving towards M or whether they have given up on the model-centric vision).

CML (Common Model Library): An effort by the SML authors to create a set of model elements using the SML metamodel. Appears to be dead (no news in a long time and the cml-project.org domain name that was used seems abandoned).

SDD (Solution Deployment Descriptor): An OASIS standard to define a packaging mechanism meant to simplify the deployment and configuration of software units. It is to an application archive what OVF is to a virtual disk. Little adoption that I know of, but maybe I have a blind spot on this.

OVF (Open Virtualization Format): A recently released DMTF standard. Defines a packaging and descriptor format to distribute virtual machines. It does not defined a common virtual machine format, but a wrapper around it. Seems to have some momentum. Like CMDBf, it may be best thought of as part of a new breed than directly associated with WS-Management and friends.

This is not an exhaustive list. I have left aside the eventing aspects (WS-Notification, WS-Eventing, WS-EventNotification) because while relevant it is larger discussion and this entry to too long already (see here and here for some updates from late last year on the eventing front). It also does not cover the Grid work (other than OGSI/WSRF to the extent that they intersect with the IT management world), even though a lot of the work that took place there is just as relevant to Cloud computing as the IT management work listed above. Especially CDDLM/CDL an abandoned effort to port SmartFrog to the then-hot XML standards, from which there are plenty of relevant lessons to extract.

The lessons

What does this inventory tell us that’s relevant to future Cloud API standardization work? The first lesson is that protocols are easy and models are hard. WS-Management and WSDM technically get the job done. CMDBf will be a good query language. But none of the model-related efforts listed above seem to have hit the mark of “doing the job”. With the possible exception of OVF which is promising (though the current expectations on it are often beyond what it really delivers). In general, the more focused and narrow a modeling effort is, the more successful it seems to be (with OVF as the most focused of the list and CML as the other extreme). That’s lesson learned number two: models that encompass a wide range of systems are attractive, but impossible to deliver. Models that focus on a small sub-area are the way to go. The question is whether these specialized models can at least share a common metamodel or other base building blocks (a type system, a serialization, a relationship model, a constraint mechanism, etc), which would make life easier for orchestrators. SML tries (tried?) to be all that, with no luck. RDF could be all that, but hasn’t managed to get noticed in this context. The OVF and SDD examples seems to point out that the best we’ll get is XML as a shared foundation (a type system and a serialization). At this point, I am ready to throw the towel on achieving more modeling uniformity than XML provides, and ready to do the needed transformations in code instead. At least until the next window of opportunity arrives.

I wish that rather than being 80% protocols and 20% models, the effort in the WS-based wave of IT management standards had been the other way around. So we’d have a bit more to show for our work, for example a clear, complete and useful way to capture the operational configuration of application delivery services (VPN, cache, SSL, compression, DoS protection…). Even if the actual specification turns out to not make it, its content should be able to inform its successor (in the same way that even if you don’t use CIM to model your server it is interesting to see what attributes CIM has for a server).

It’s less true with protocols. Either you use them (and they’re very valuable) or you don’t (and they’re largely irrelevant). They don’t capture domain knowledge that’s intrinsically valuable. What value does WSDM provide, for example, now that’s it’s collecting dust? How much will the experience inform its successor (other than trying to avoid the WS-Addressing disaster)? The trend today seems to be that a more direct use of HTTP (“REST”) will replace these protocols. Sure. Fine. But anyone who expects this break from the past to be a vaccination against past problems is in for a nasty surprise. Because, and I am repeating myself, it’s the model, stupid. Not the protocol. Something I (hopefully) explained in my comments on the Sun Cloud API (before I knew that caring about this API might actually become part of my day job) and something on which I’ll come back in a future post.

Another lesson is the need for clear use cases. Yes, it feels silly to utter such an obvious statement. But trust me, standards groups still haven’t gotten this. It’s not until years spent on WSDM and then WS-Management that I realized that most people were not going after management integration, as I was, but rather manageability. Where “manageability” is concerned with discovering and monitoring individual resources, while “management integration” is concerned with providing a systematic view of the environment, with automation as the goal. In other words, manageability standards can allow you to get a traditional IT management console without the need for agents. Management integration standards can allow you to coordinate your management systems and automate their orchestration. WS-Management is for manageability. CMDBf is in the management integration category. Many of the (very respectful and civilized) head-butting sessions I engaged in during the WSDM effort can be traced back to the difference between these two sets of use cases. And there is plenty of room for such disconnect in the so-loosely-defined “Cloud” world.

We have also learned (or re-learned) that arbitrary non-backward compatible versioning, e.g. for political or procedural reasons as with WS-Addressing, is a crime. XML namespaces (of the XSD and WSDL types, as well as URIs used in similar ways in specifications, e.g. to identify a dialect or profile) are tricky, because they don’t have backward compatibility metadata and because of the practice to use organizations domain names in the URI (as opposed to specification-specific names that can be easily transferred, e.g. cmdbf.org versus dmtf.org/cmdbf). In the WS-based management world, we inherited these problems at the protocol level from the generic WS stack. Our hands are more or less clean, but only because we didn’t have enough success/longevity to generate our own versioning problems, at the model level. But those would have been there had these models been able to see the light of day (CML) or see adoption (DCML).

There are also practical lessons that can be learned about the tactics and strategies of the main players. Because it looks like they may not change very much, as corporations or even as individuals. Karla Norsworthy speaks for IBM on Cloud interoperability standards in this article. Andrew Layman represented Microsoft in the post-Manifestogate Cloud patch-up meeting in New York. Winston Bumpus is driving the standards strategy at VMWare. These are all veterans of the WS-Management, WSDM and related wars collaborations (and more generally the whole WS-* effort for the first two). For the details of what there is to learn from the past in that area, you’ll have to corner me in a hotel bar and buy me a few drinks though. I am pretty sure you’d get your money’s worth (I am not a heavy drinker)…

In summary, here are my recommendations for standardizing Cloud API, based on lessons from the Web services management effort. The theme is “focus on domain models”. The line items:

  • Have clear goals for each effort. E.g. is your use case to deploy and run an existing application in a Cloud-like automated environment, or is it to create new applications that efficiently take advantage of the added flexibility. Very different problems.
  • If you want to use OVF, then beef it up to better apply to Cloud situations, but keep it focused on VM packaging: don’t try to grow it into the complete model for the entire data center (e.g. a new DCML).
  • Complement OVF with similar specifications for other domains, like the application delivery systems listed above. Informally try to keep these different specifications consistent, but don’t over-engineer it by repeating the SML attempt. It is more important to have each specification map well to its domain of application than it is to have perfect consistency between them. Discrepancies can be bridged in code, or in a later incarnation.
  • As you segment by domain, as suggested in the previous two bullets, don’t segment the models any further within each domain. Handle configuration, installation and monitoring issues as a whole.
  • Don’t sweat the protocols. HTTP, plain old SOAP (don’t call it POS) or WS-* will meet your need. Pick one. You don’t have a scalability challenge as much as you have a model challenge so don’t get distracted here. If you use REST, do it in the mindset that Tim Bray describes: “If you’re going to do bits-on-the-wire, Why not use HTTP? And if you’re going to use HTTP, use it right. That’s all.” Not as something that needs to scale to Web scale or as a rebuff of WS-*.
  • Beware of versioning. Version for operational changes only, not organizational reasons. Provide metadata to assert and encourage backward compatibility.

This is not a recipe for the ideal result but it is what I see as practically achievable. And fault-tolerant, in the sense that the failure of one piece would not negate the value of the others. As much as I have constrained expectations for Cloud portability, I still want it to improve to the extent possible. If we can’t get a consistent RDF-based (or RDF-like in many ways) modeling framework, let’s at least apply ourselves to properly understanding and modeling the important areas.

In addition to these general lessons, there remains the question of what specific specifications will/should transition to the Cloud universe. Clearly not all of them, since not all of them even made it in the “regular” IT management world for which they were designed. How many then? Not surprisingly (since IBM had a big role in most of them), Karla Norsworthy, in the interview mentioned above, asserts that “infrastructure as a service, or virtualization as a paradigm for deployment, is a situation where a lot of existing interoperability work that the industry has done will surely work to allow integration of services”. And just as unsurprisingly Amazon’s Adam Selipsky, who’s company has nothing to with the previous wave but finds itself in leadership position WRT to Cloud Computing is a lot more circumspect: “whether existing standards can be transferred to this case [of cloud computing] or if it’s a new topic is [too] early to say”. OVF is an obvious candidate. WS-Management is by far the most widely implemented of the bunch, so that gives it an edge too (it is apparently already in use for Cloud monitoring, according to this press release by an “innovation leader in automated network and systems monitoring software” that I had never heard of). Then there is the question of what IBM has in mind for WS-RT (and other specifications that the WS-RA working group is toiling on). If it’s not used as part of a Cloud API then I really don’t know what it will be used for. But selling it as such is going to be an uphill battle. CMDBf is a candidate too, as a model-neutral way to manage the configuration of a distributed system. But here I am, violating two of my own recommendations (“focus on models” and “don’t isolate config from other modeling aspects”). I guess it will take another pass to really learn…

[UPDATED 2009/5/7: Senior moment! When writing this entry I forgot that I wrote an earlier entry (in late 2007) specifically to describe the difference between "manageability" and "management integration". So here it is, if you care for more details on this topic.]

18
Dec
2008

Less is more: inventory of XPath subsets

by William (@vambenepe on Twitter)

Many specifications that manipulate XML content have taken the step to create their own subset of XPath. Typically, they need an XML query/pointer language but full XPath (or XPointer) is overkill for their purpose (I am talking about XPath 1.0 here, XPath 2.0 is usually over-over-kill-and-then-some). Defining a subset of XPath rather than inventing a query language from scratch is attractive because:

  • people are relatively familiar with XPath (at least the most common parts)
  • it is already specified so you can leverage the W3C spec-writing work
  • implementers who have access to an XPath engine get an implementation of the subset “for free” since the XPath engine will process statements from any XPath subset

Here is a quick inventory of spec-defined XPath subsets that I am aware of.

XML Schema

Section 3.11.6 of the XML Schema specification (part 1) defines a subset of XPath used to point to a set of elements that are the target of an identity constraint. Here is the BNF of the abbreviated form of the subset (you can also use the functionally equivalent full-length notation):

Selector   ::=    Path ( '|' Path )*
Path       ::=    ('.//')? Step ( '/' Step )*
Step       ::=    '.' | NameTest
NameTest   ::=    QName | '*' | NCName ':' '*'

Actually, there is a second subset defined, to point to the identifying key from the identified element. It is very similar to the previous one but it allows attribute nodes to be selected, via this modification:

Path       ::=    ('.//')? ( Step '/' )* ( Step | '@' NameTest )

According to the specification, these subsets were defined “in order to reduce the burden on implementers, in particular implementers of streaming processors”. As this article points out, stream-friendliness of an XPath subset is a relative notion. But the subsets above seems, indeed, to fit the bill. They also include many simplifications that “reduce the burden on implementers” but have nothing to do with streaming, such as removing functions and all predicates (rather than simply restricting the content of predicates).

WS-Management and WS-ResourceTransfer

WS-Managment also defines two subsets (it calls them dialects) of XPath. I won’t copy the BNF definitions here, as they are pretty long. You can find them in Appendix D of the specification. They are called “XPath level 1″ and “XPath level 2″.

The main use cases driving these dialects had to do with implementing WS-Management in resource-constrained environments, e.g. the board management controller of a server.

WS-ResourceTransfer took this idea from WS-Management and it too defines an “XPath level 1″ dialect (see Appendix I of the specification)

Windows EventLog Remoting Protocol v6

Version 6.0  of the Microsoft EventLog Remoting Protocol, new in Vista, adds a filter mechanism (to select events in the log) that is based on a subset of XPath. Streaming appears to be a concern there too (“evaluation of each event MUST be restricted to forward-only, in-order, depth-first traversal of the XML”). And, being Microsoft, they also add some extensions to XPath.

CMDBf (coming soon)

CMDBf also defines a subset of XPath. It is not defined via a restricted syntax but via a limitation of the type of objects returned. There is no BNF provided. You can write your XPath any way you want as long as it only returns objects of the right types (e.g. nodesets containing comment nodes are out of luck). Think of it as “management by objectives” rather than micromanagement. The main driver here is not support for  streaming. It’s that since XPath nodeset serialization is a pain we only want to do it where there is a compelling use case. There is no point creating interoperability challenges for no practical benefit.

Others?

Except for XSD, all the examples above come from the IT management world, because that’s where I live. There are probably plenty of specifications in other domains which took similar steps, such as the Digital Talking Book ANSI standard (see this section).

And those are only XPath subsets defined as part of specifications. There are also XPath subsets defined by implementations (e.g. ElementTree’s limited XPath support). And others defined for research purposes (e.g. “Univariate XPath”, from this ACM article, that is quickly described in the previously-mentioned post about stream-friendly XPath subsets).

If you know of other interesting XPath subsets, please leave a comment.

12
Nov
2008

WS Resource Access working group starting at W3C

by William (@vambenepe on Twitter)

Things went quiet for a while, but the W3C Web Services Resource Access Working Group has finally taken life, as was announced last week. It’s a well-know PR trick to announce bad news on a Friday such that it goes undetected, is it a coincidence that W3C picked a Friday for this announcement?

As you can tell by this last remark, I have no trouble containing my enthusiasm about this new group. Which should not come as a surprise to regular readers of this blog (see this, this, this and this, chronologically).

The most obvious potential pushback against this effort is the questionable architectural need to redo over SOAP what can be done over simple HTTP. Along the lines of Erik Wilde’s “HTTP over SOAP over HTTP” post. But I don’t expect too much noise about this aspect, because even on the blogosphere people eventually get tired of repeating the same arguments. If some really wanted to put up a fight against this, it would have been done when the group was first announced, not now. That resource modeling party is over.

While I understand the “WS-Transfer is just HTTP over SOAP over HTTP” argument, this is not my problem with this group. For one thing, this group is not really about WS-Transfer, it’s about WS-ResourceTransfer (WS-RT) which adds fine-grained resource access on top of WS-Transfer. Which is not something that HTTP gives you out of the box. You may argue that this is not needed (just model your addressable resources in a fine-grained way and use “hypermedia” to navigate between them) but I don’t really buy this. At least not in the context of IT management models, which is where the whole thing started. You may be able to architect an IT management system in such RESTful way, but even if you can it’s too far away from current IT modeling practices to be practical in many scenarios (unfortunately, as it would be a great complement to an RDF-based IT model). On the other hand, I am not convinced that this fine-grained access needs to go beyond “read” (i.e. no need for “fine-grained write”).

The next concern along that “HTTP over SOAP over HTTP” line of thought might then be why build this on top of SOAP rather than on top of HTTP. I don’t really buy this one either. SOAP, through the SOAP processing model (mainly the use of headers, something that WS-RT unfortunately butchers) is better suited than HTTP for such extensions. And enough of them have already been defined that you may want to piggyback on. The main problem with SOAP is the WS-Addressing tumor that grew on it (first I thoughts it was just a wart, but then it metastatized). WS-RT is affected by it, but it’s not intrinsic to WS-RT.

Finally, it would be a little hard for me to reject SOAP-based resources access altogether, having been associated with many such systems: WSMF, WSDM/WSRF, WS-Management and even WS-RT in its pre-submission days (and my pre-Oracle days). Not that I have signed away my rights to change my mind.

So my problem with WS-RAWG is not a fundamental architectural problem. It’s not even a problem with the defects in the current version of WS-RT. They are fixable and the alternative specifications aren’t beauty queens either.

Rather, my concerns are focused on the impact on the interoperability landscape.

When WS-RT started (when I was involved in it), it was as part of a convergence effort between HP, IBM, Intel and Microsoft. With the plan to use this to unify the competing WS-Management and WSDM/WSRF stacks. Sure it was also an opportunity to improve things a bit, but 90% of the value came from the convergence/unification aspect, not technical improvements.

With three of the four companies having given up on this, it isn’t much of a convergence anymore. Rather then paring-down the number of conflicting options that developers have to chose from (a choice that usually results in “I won’t pick either sine there is no consensus, I’ll just do it my own way”), this effort is going to increase it. One more candidate. WS-Management is not going to go away, and it’s pretty likely that in W3C WS-RT will move further away from it.

Not to mention the fact that CMDBf (and its SOAP-based graph-oriented query protocol) has since emerged and is progressing towards standardization. At this point, my (notoriously buggy) crystal ball shows a mix of WS-management and CMDBf taking the prize overall. With WS-Management used to access individual resources and CMDBf used to access any kind of overall system view. Which, as a side note, means that DMTF has really taken this game over (at least in the IT management domain) from W3C and OASIS. Not that W3C really wanted to be part of the game in the first place…

23
Sep
2008

State modeling: party over, go home now.

by William (@vambenepe on Twitter)

Is the Northwest weather softening Savas? Is it the food? I just read the “how do I model state? let me count the ways” article that he, Ian Foster, Paul Watson and Mark McKeown published in the September 2008 Communications of the ACM. In the article, the authors attempt to recap (and advance?) the 5 years-old debate between the WSRF, HTTP-only and “no convention” (e.g. Zen-SOAP as used in CMIS) approaches to interacting with stateful resources over the Web. If you were anywhere near OGF (then called GGF) around 2003, you know what I am talking about. And you remember how heated the arguments were. There was something about this subject (or maybe it was the people involved) that consistently generated great showmanship (and some bruised egos) in the debates.

With that in mind, reading this article felt like watching a Chinese opera adaptation of Apocalypse Now. Or listening to Heavy Metal with the base dialed down to zero.

This would have been a very useful article to have in 2003. At the time, it would have clearly framed the question, shown the overwhelming similarities and small differences between the approaches and allowed people to see that there wasn’t actually that much to debate at a fundamental level, but mainly practical considerations to juggle. It may have prevented the quasi-religious war that erupted.

It took a while, but that period of religious war is well over now and we are firmly in the “I’ve heard you, you’ve heard me, do what you want I’ll do what I want” stage. WSRF people are still doing WSRF (or equivalent like WSRT). REST people are HTTPing right and left. They don’t meet much but when they do they don’t bump shoulders anymore. And in a way this article is a good illustration of this much more dispassionate environment.

So why am I complaining? Because these fights were fun! At least from a spectator’s point of view, but I suspect that Savas and the gang had plenty of fun too (not sure about the other side who, at least at first, expected “why are you throwing away OGSI” kind of pushback rather than this more radical-sounding response).

I printed this ACM article a little bit on the off chance that it would provide some new way to look at the problem, one that hadn’t emerged in the past five years. But in retrospect I think my true motivation was that I expected it to capture, like in the days, some of the entertainment value of a radio talk show. Instead, the excitement level in this article is in the league of NPR’s StarDate astronomy report.

I feel cheated. I haven’t learned anything new and I haven’t been entertained either. This article feels like the end of the party, when the bottles are being put away, the lights are flickering and bad music is playing to nudge the last guests out of the house.

Now that I am grumpy, I guess I have to point out a few highly questionable statements in the article in retribution:

“Fortunately, there seems to be industry support for an integration of the WS-Transfer and WS-RF approaches, based on a WS-Transfer substrate – the WS-ResourceTransfer specification.” See the last two paragraphs of this entry.

“Support for WS-Addressing has since become quasi-universal, and now few find its use objectionable.” Time to pull out the Victor Hugo quote I have been saving for a special occasion: “Et s’il n’en reste qu’un, je serai celui-là“. But frankly I very much doubt that I am the only one still shaking his head sadly in contemplation of WS-Addressing.

In fact, Stu agrees with me on this (see item #6a in his list of disagreements with the article). Looks like he too was made a bit grumpy by the article, for different reasons.

There is one more debatable choice in this article, and it’s more serious than the two above. It introduces an arbitrary difference between the WS-Transfer and HTTP approaches. Compare the third lines of tables 4 and 5 (retrieving the status of a specific job). According to the article, WS-Transfer gives you the choice between two options:

  • retrieve the entire state of the job and fish for the status field inside of it (the approach in table 4), or
  • “a new operation (for example GetEPRtoPart) is defined that requests that a new state representation be exposed, through a different EPR, representing parts of the original state representation”

The way it works for HTTP, on the other hand is through an “application-specific convention” (in this example, appending “/status” at the end of the URL).

Except there is no reason why this third approach cannot be used in the WS-Transfer scenario. The article says that  “in WS-Transfer, the same effect [accessing a subset of the resource state] can be achieved, but only by defining an auxiliary operation that returns an EPR to a desired subset”. What, pray tell, prevents a WS-Transfer implementation from having an “application-specific convention” just like the HTTP kids next door? It can be at the URL level (e.g. adding “/status”). Or at the EPR reference parameter level. The latter is actually exactly what WS-Management does, using the wsman:SelectorSet header. It does not, as the article claims, define a special operation to get these fine-grained EPR. It uses an application convention to do so (which, in the case of WS-Management, happens to be “whatever Windows implements”, but that’s a different debate).

By the way, this question of “convention over specification” is where I don’t quite follow Stu (see his point #4 in his aforementioned list of disagreements) and his invocation of the “hypermedia constraint”. I don’t see how any of the four specifications he calls to the rescue (HTML form submission, XForms submission options, Atompub service documents and URI templates) would prevent me from having to have an application-specific agreement about how to retrieve the state (as opposed to another subset of the representation, like the creation date). URI templates, for example, might support how this agreement is expressed but it doesn’t replace it.

The article does a pretty good job at showing how close the alternatives are (even though, as illustrated above, it still portrays them as more different than they need to be). I am not saying it’s a bad article for the Communications of the ACM. I am saying that the Communications of the ACM is a bad medium for one of the few nerdy debates that have genuine entertainment value.

[UPDATED 2008/10/2: Jim Webber, Savas Parastatidis and Ian Robinson provide a full REST example for InfoQ: how to GET a cup of coffee. Includes state considerations discussed in the ACM article.]

11
Sep
2008

CMIS, APP, Zen-SOAP and WS-KitchenSink: some data points

by William (@vambenepe on Twitter)

The recent release of an early draft of a content management specification (CMIS, for Content Management Interoperability Services) provides an interesting perspective on not just SOAP-versus-REST but also Zen-SOAP versus WS-KitchenSink.

I know little about content management and I have no comment about the specification from that respect. Others have better informed opinions on that aspect.

What is of interest to me, and where I have some experience, is the way the spec-defined operations are bound to underlying protocols. Here is the way the specification is structured: Part I describes the data model and the operations exposed by all the services. Part II comes in two flavors: a REST binding (based on APP, the Atom Publishing Protocol) and a Web services binding (based on SOAP).

This is the first time, to my knowledge, that someone (who presumably isn’t a participant in the SOAP/REST religious war but simply wants to get something done) describes two ways to achieve a real-life task, using either APP or SOAP. I expect that this will attract a lot of attention and provide data in the SOAP versus REST debate.

But this is not what I want to write about. I’ll just point out that the REST binding specification somehow is twice as long as the SOAP binding specification, which I find intriguing but not necessarily meaningful (things are looking good for your bet Sanjiva).

What really caught my attention is how SOAP is used in CMIS. You can hardly tell it’s SOAP. CMIS just defines XML messages to be used as payload for requests and responses. You would be excused for forgetting halfway through your implementation that you’re supposed to wrap those in a SOAP envelope. Headers are a no-show. The specification says it uses SOAP faults but it actually goes out of its way to avoid the existing elements for fault code and fault message and instead invent its own. The only SOAP feature it really uses is MTOM.

Except for the MTOM part, this reminds me of what SOAP was at the beginning of the decade, before any header had been defined (other than those used as illustration in the SOAP specification itself). I want to call it Zen-SOAP, by opposition to the WS-KitchenSink approach in which even simple, synchronous, clear-text, request-response SOAP exchanges somehow get saddled with a half dozen WS-Addressing headers before they’ve even left the gate (did I mention that I don’t like WS-Addressing?).

Another comedian in the WS-KitchenSink theater troupe is the WS-Transfer stack and especially WS-ResourceTransfer (WS-RT). Unless I read too much into this draft of CMIS, its content is devastating in two ways for WS-ResourceTransfer: in one fell swoop it shows that the specification is mostly useless and it destroys the argument that WS-ResourceTransfer needs to be stand-alone as opposed to just a part of WS-Management.

In “who needs XPath fragment-level PUT?”, I tried to make the case that the use of XPath in WS-RT to do fine-grained updates is a case of over-engineering. That there is no real need for it. Still, in that article I try to think of cases where the feature might be justified. I came up with two and I wrote that “one is if the resource actually is a document (as opposed to having its state represented by a document). For example, a wiki page”. But I dismissed it because wiki-land is REST country. I didn’t think of it at the time, but there is an “enterprise” version of wiki, a world in which, presumably, SOAP is well-regarded: Content Management Systems. Surely, if there is a domain that needs a fine-grained SOAP-based document editing protocol it’s the CMS world.

Today’s release of CMIS demolishes this use case with two punches to the guts:

  • They do have a query language, but it is SQL-based, not XPath-based.
  • The query is only used for reads, not for updates. Updates are done through specialized operations (addObjectToFolder, moveObject, updateProperties, createRelationship…).

This goes beyond not using a generic fine-grained update mechanism. It also goes against using any generic GET/SET operation. The blow reaches all the way to WS-Transfer. For all this, CMIS comes out a much simpler specification and it also frees itself from the web of dependencies (on specifications at different stages of standardization) that has plagued specifications that use WS-Transfer and will plague WS-Federation for using WS-RT.

It will be interesting to see what happens when the WS-* architects and Microsoft and IBM get hold of the CMIS specification and of its authors in their companies. I am especially worried about the fate of the IBM CMIS authors. The recent news about Oslo show that the XML people at Microsoft are a lot more willing to put the XML tools back in the box when needed.

In truth, the CMIS authors do appear to need some help from the SOAP experts in their companies, if only to fix the way they use SOAP faults and to help the poor soul who put this comment in the WSDL:

<!– had to use include – .net wsdl.exe code generator doesn’t seem to like imports on the schema –>

But they might be getting more “suggestions” than they bargained for. In the same way that the WS-Federation folks were going on their own merry way until it was “suggested” to them by someone (who probably had an agenda) to use WS-RT. I’ll try to keep an eye on how CMIS evolves.

In the meantime, I find in CMIS data points that reinforce my opinion that WS-Transfer should be absorbed by WS-Management, WS-MeX and WS-Federation should return to defining their own operations and WS-RT should be left to die (or, for a more positive spin, be used as inspiration in the next version of WS-Management).

[UPDATED 2008/10/02: Roy Fielding doesn't like the so-called-RESTful binding. Sam Ruby cautiously defends it. Links via Billy Cripe.]

[UPDATED 2009/5/1: For some reason this entry is attracting a lot of comment spam, so I am disabling comments. Contact me if you'd like to comment.]

10
Jul
2008

WS Resource Access at W3C: the good, the bad and the ugly

by William (@vambenepe on Twitter)

As far as I know, the W3C is still reviewing the proposal that was made to them to create a new working group to standardize WS-Transfer, WS-ResourceTransfer, WS-Enumeration and WS-MetadataExchange. The suggested name, “Web Services Resource Access Working Group” or WS-RAWG is likely, if it sticks, to end up being shortened to WS-RAW. Which is a bit more cruel than needed. I’d say it’s simply half-baked.

There are many aspects to the specifications and features covered by the proposal. Some goodness, some badness and some ugliness. This post analyzes the good, points at the bad and hints at the ugly. Like your average family-oriented summer movie.

The good

The specifications proposed for W3C standardization describe a way to provide some generally useful features for SOAP messages. Some SOAP messages can get very long. In some cases, I know ahead of time what portion of the long messages promised by the contract (e.g. WSDL) I want. Wouldn’t it be nice, as an optimization, to let the message sender know about this so they can, if they are able to, filter down the message to just the part I want? Alternatively, maybe I do want the full response but I can’t consume it as one big message so I would like to get it in chunks.

You’ll notice that the paragraph above says nothing about “resources”. We are just talking about messaging features for SOAP messages. There are precedents for this. WS-Security can be used to encrypt a message. Any message. WS-ReliableMessaging can be used to ensure delivery of a message. Any message. These “quality of service” specifications are mostly orthogonal to the message content.

WS-RT and WS-Enumeration provide a solution to the “message filtering” and “message chunking”, respectively. But they only address them in the context of a GET-like operation. They can’t be layered on top of any SOAP message. How useful would WS-Security and WS-ReliableMessaging be if they had such a restriction?

If W3C takes on part of the work listed in the proposal, I hope they’ll do so in a way that expends the utility of these features to all SOAP messages.

And just like WS-Security and WS-ReliableMessaging, these features should be provided in a way that leverages the SOAP processing model. Such that I can judiciously use the soap:mustUnderstand header to not break existing services. If I’d like the message to be paired down but I can handle the complete message if need be, I’ll set this attribute to false. If I can’t handle the full message, I’ll set the attribute to true and I’ll get an error if the other party doesn’t understand this extension. At which point I can pick an alternative way to get the task accomplished. Sounds pretty basic but it’s amazing how often this important feature of SOAP (which heralds from and extends XML’s must-ignore semantics) is neglected and obstructed by designers of SOAP messages.

And then there is WS-MetadataExchange. While I am not a huge fan of this specification, I agree with the need for a simple, reliable way to retrieve different types of metadata for an endpoint.

So that’s the (potential) good. A flexible and generally useful way to pair-down long SOAP messages, to chunk them and to retrieve metadata for SOAP endpoints.

The bad

The bad is the whole “resource access” spin. It is not actually intrinsically bad. There are scenarios where such a pattern actually fits. But the way that pattern is being addressed by WS-RT and friends is overly generalized and overly XML-centric. By the latter I mean that it takes XML from an agreed-upon on-the-wire interchange format to an implicit metamodel (e.g. it assumes not just that you agree to exchange XML-formated data but that your model and your business logic are organized and implemented around an XML representation of the domain, which is a much more constraining requirement). I could go on and on about this, especially the use of XPath in the PUT operation. In fact I did go on and on with it, but I spun that off as a separate entry.

In the context of the W3C proposal at hand, this is bad because it burdens the generally useful features (see the “good” section above) with an unneeded and limiting formalism. Not to mention the fact that W3C kind of already has its resource access mechanism, but I’ll leave that aspect of the question to Mark and various bloggers (see a short list of relevant posts at the end of this entry).

The resource access part might be worth doing (one more time), but probably not in the same group as things like metadata discovery, message filtering and message chunking, which are not specific to “resource access” situations. And if someone is going to do this again, rather than repeating the not too useful approaches of the past, it may be good to consider alternatives.

The ugly

That’s the politics around this whole deal. There is, as you would expect, a lot more to it than meets the eye. The underlying drivers for all this have little to do with REST/WS or other architecture considerations. They have a lot to do with control. But that’s a topic for another post (maybe) when more of it can be publicly discussed.

A lot of what I describe in this post was already explained in the WS-ManagementHammer post from a couple of months ago. But that was before the W3C proposal and before WS-MetadataExchange was dragged into the deal. So I thought it might be useful to put the analysis in the context of that proposal. And BTW, this is a personal opinion, not an Oracle position (which is true in general for everything on this blog but is worth repeating specifically for this post).

10
Jul
2008

Who needs XPath fragment-level PUT?

by William (@vambenepe on Twitter)

WS-Management and WS-ResourceTransfer (WS-RT) both provide a mechanism to modify the XML representation of the state of a resource in a fine-grained way. The mechanisms differ a bit: WS-Management defines a SOAP header and distinguishes PUT from DELETE at the WS-Transfer operation level, while WS-RT uses the SOAP body and tunnels “modes” (remove, modify, insert) on top of the PUT WS-Transfer operation. But in their complete form both use XPath to point to any arbitrary nodeset and update it.

WS-ResourceProperties (WS-RP) takes a simpler approach. While it too supports XPath-driven retrieval of the content, it doesn’t attempt to provide an XPath-like level of flexibility when it comes to updating the content. All it offers is SET, INSERT, UPDATE and DELETE operations at the level of a property (a top-level child of the XML representation) and nothing more granular.

In this respect at least, WS-RP makes a better choice than its competitor and its aspiring successor.

First, XPath-driven updates sound easy but in fact are hard to specify. Not surprisingly, the current specifications do a pretty incomplete job at it. They often seem to assume that the XPath used to target the value to change returns only one node, but nothing guarantees this. If it picks up more than one node, do you replace all these nodes by the new values as a block (the new values get inserted once, presumably at the location of the first selected node) or do you replace each selected node by all the new values (in which case they get duplicated as needed)? Also, the specifications say nothing about what constitutes compatibility between the targeted nodes and the replacement nodes. One might assume that a “don’t be stupid” approach is all that’s needed. But there is no obvious line between “stupid” and “useful”. Does a request to replace a text node by an attribute node make sense? Not in a strongly-typed world, but a more forgiving implementation might just insert the text value of the attribute in the place of the text node to get to a valid result. What about replacing an element by a text node? Some may reject it for incompatible types but, unless the schema prevents mixed content, it may well result in a perfectly valid document. All in all, specifying a reliable way to edit XML is a pretty hairy task. Much harder than reading XML. It requires very careful considerations that have very little do with on-the-wire protocol considerations. Which is why doing this as part of a SOAP specification is a strange choice. The XQuery group is much more qualified for this. There must be a reason why that group decided to punt on this until they had taken care of the easier “read” case.

Second, it’s usually not all that useful anyway. Which is why the lack of precision in WS-Management’s specification of the fragment PUT haven’t really been a problem so far: people haven’t fully implemented that feature. A lot of the implementations are backed by a CIMOM, an MBean or some other OO store. In these stores, the exposed granularity is typically at the attribute level. The interactions used by programmers and consoles are also at that level. The XPath-driven update is then only used as a mechanism to update many properties at once (rather than going deep into individual properties) but that’s using a machine gun to kill a fly. The WS-RP approach supports these use cases without calling on XPath.

Third, XPath-driven PUT is really hard to implement unless your back-end store happens to be an XML database. You may end up having to write your own XPath parser and interpreter, an exercise during which you will face some impedance mismatches. Your back-end store may not have notions of property order for example, or attribute versus element. How do you handle these XPath instructions? And what kind of interoperability results from implementers having to make these decisions on their own? Implementing XPath selection on a GET is a lot simpler. All it assumes is that there is an XML serialization of the result, on which you can run the XPath expression before shipping it out. That XML serialization is a given in the SOAP world already. But doing an XPath-driven PUT injects XML considerations in your store itself, not just in the communication path.

Those are the practical reasons. In short, it makes the specifications at best complex and at worst non-interoperable, for a feature that is rarely needed. That should be enough already, but there are some architectural reasons to stay away too.

WS-Transfer is sometimes sold for REST over SOAP. And fragment-level WS-Transfer (what WS-Management and WS-RT do) is then REST on steroids. Sorry, not true. REST on crack if anything.

I am not a REST expert, but I know enough to understand that “everything has a URI” really means “anything meaningful has a URI”. It’s the difference between a crystal structure and a pile of mud. REST lets you interact directly with any node in the crystal, but there is a limited number of entities that are considered worthwhile of being a node. There is design involved (sorry, you can’t suddenly fire your architects, as attractive as that sounds). You can’t point to the space between two nodes in the crystal. XPath-on-top-of-WS-Transfer, on the other hand, lets you plunge your spoon anywhere in the pile of mud and scoop out whatever happens to be there.

Let’s take a look at WS-Federation (here is the latest draft), the only specification in a standard body that I know of that is currently using WS-RT. Whether it’s a wise choice or not for them, from a governance perspective, is a separate topic that I won’t cover here (answer: no. oops).

From a technical perspective, it is interesting to see how they went about using WS-RT PUT. They use it to update pseudonyms. But even though there is an XML representation for the pseudonyms, they don’t want to allow users to update any arbitrary part of that XML. So they create a specific dialect (the fed:FilterPseudonyms defined in section 6.1) that lets you, based on semantics that are meaningful in the specific domain covered by the specification, point to pseudonyms.

I believe most potential users of WS-RT PUT are in the same case as WS-Federation and are better served by a domain-specific way to identify entities of interest. At least the WS-Federation authors realized it rather than saying “great, WS-RT XPath fragment PUT gives us all this flexibility for free” and settling their implementers with the impossible task of producing interoperable implementations. Of course this begs the question of why WS-Federation uses WS-RT in the first place. A charitable interpretation is to pin this on overzealous re-use of all things WS-*. A more cynical interpretation sees this as a contrived precedent manufactured in an attempt “prove” that WS-RT provides features of general use rather than specific to the management domain.

Having described at length why XPath-driven updates aren’t as useful as they may seem, I can still think of two cases where a such a generic mechanism to modify an XML document could be useful. One is if the resource actually is a document (as opposed to having its state represented by a document). For example, a wiki page. But I haven’t exactly noticed wiki creators and users clamoring for wiki-over-SOAP, have you? The other situation is if you have a true model-driven system that is supported by a comprehensive system description and validation framework. The kind of thing that SML is trying to deliver. By using Schematron (rather than just XSD which is very limited in its expressivity beyond mere syntactical validation) to provide model validation. This would, in theory, allow the requester to validate the updated model before sending the change request. The change would still be validated on the receiver side (either explicitly or implicitly because a non-valid new model would simply fail when applied to the system), but the existence of the validation framework guarantees a high rate of successs (the sender would rarely send non-valid change requests). That’s very nice and exciting, but we don’t have this. SML is, as far as I can see, going nowhere fast in terms of adoption. Standardizing a model exchange protocol for that use case is, at this point in time, premature. Maybe one day.

23
Jun
2008

WS-Transfer, WS-ResourceTransfer, WS-Enumeration and WS-MetadataExchange on their way to W3C

by William (@vambenepe on Twitter)

A bit over a month ago, I mentioned my hope that WS-ResourceTransfer (WS-RT) would be allowed to rest in peace. This is apparently not to be and the specification is now on its way to W3C, along with WS-Transfer, WS-MetadataExchange and WS-Enumeration. This is not all that surprising and I had even hazarded a guess of who would join IBM in doing this. My list was IBM, CA, Fujitsu and Cisco. I got three out of four right, but Oracle replaced Cisco. The fact that the company I got wrong happens to be my employer is something I can’t really comment on, other than acknowledging the irony…

This is a very important development in the area of management standards. Some of the specifications listed here are used by WS-Management. They are also clearly intended to replace the WS-ResourceFramework stack that underpins WSDM. This is especially true of WS-RT which almost directly overlaps with WS-ResourceProperties. Users of both WS-Management and WSDM will take notice. As will those who have been standing on the side, waiting for things to stabilize…

If you are trying to relate this announcement to the WS-Management/WSDM convergence previously going on between Microsoft, IBM, HP and Intel (which is the forum in which WS-RT was originally produced), it looks like this is what the “convergence” has turned into. Except that three of the four vendors seem to have dropped out, thus my quotation marks around the word “convergence”.

The applicability of these specifications outside of the management domain seems to be assumed in this submission. It’s been often asserted but, in my mind, not yet proven. I don’t see the use of WS-RT by WS-Federation as a proof of this relevance (one of these days I’ll write a post to explain why).

It will be interesting to see how the W3C responds to this offer. The expected retort didn’t take long. If WS-RT wasn’t allowed to rest in peace, it won’t be allowed to REST in peace either. You can expect the blogosphere to light up with “WS-Transfer for RESTful applications” discussions (mostly making fun of WS-Transfer’s HTTP envy) very soon. Even though that’s just one of the many angles from which you can view this development, and not the most interesting one.

[UPDATED 2008/7/6: It took a little longer than expected, but the snarky/ironic blog posts have started: Steve, Mark, Tim, Bill, Stefan]

14
May
2008

WS-ManagementHammer: don’t do it but if you are going to do it anyway then…

by William (@vambenepe on Twitter)

With the IBM/Microsoft/Intel/HP WSDM/WS-Management convergence now implicitly (if not yet officially) dead, it will be interesting to see what IBM is going to do with WSRF. WSRF is being used today, rarely explicitly but rather in an embedded fashion. People who use WSDM use it, people who use CDDLM use it, people who use the Globus Toolkit use it, etc. IBM could write off the convergence work (WS-ResourceTransfer, which was published as a draft, and WS-ResourceEnumeration and WS-EventNotification which were never published) and stick to using the existing WSRF specifications when they need the corresponding functionality. That’s what I hope they do.

Alternatively, they could decide to get the forceps out of the drawer. They can create a new, IBM-friendly (e.g. Fujitsu, CA, Cisco…) private consortium to take over the unfinished drafts (if the IBM/Microsoft/Intel/HP legal agreement allows this) or start new ones. Or they could go directly to W3C, OASIS or OGF and push for a new working group to do the work in the open (and since no-one else would really care about this work IBM should have relatively free hands there, the way Microsoft did in DMTF when IBM chose to boycott WS-Management). Why W3C would care and why OASIS or OGF would want to start commitees to obsolete their existing work is a separate question.

While I hope that IBM doesn’t try to push another pile of WS-* resouce management specifications on an industry that already has too many, if they do I hope that at least they’ll do it right. And that means doing away with the approach embedded in WS-ResourceTransfer. Having personally been involved in many iterations on this problem, I hope to have some insight to contribute.

Along the lines of the age-old parental advice “don’t do it but if you are going to do it then use a condom”, here is my advice to anyone thinking of doing another iteration on the WSRF question: don’t do it but if you are going to do it then be specific about what problem you are addressing.

First, let’s separate three scenarios.

Database query

WS-ResourceTransfer should not be seen as a way to query an XML database. Use XQuery for this.

REST

While architecturally it should be possible to build RESTful applications on top of WS-Transfer’s operations, this is simply not what is happening. WS-Transfer is being used either by CIM people (who get to it via WS-Management) or by big-SOA people (who get is as part of the whole WS-* stack) and neither of them is doing anything remotely RESTful. So just leave that aside and don’t see WS-ResourceTransfer as a way to do “fine-grained REST”. No REST user is loosing sleep over WS-ResourceTransfer being in limbo.

A flexible way to interact with a complex system

This is the use case that you should focus on. You have a system made up of many parts (e.g. a composite application or a server that is made of many components) that you can represent as an XML document. The XML repesentation contains some important information about the system, but it isn’t the system. There are identified resources within the system that have lifecycles, management capabilities and internal parameters. Not everything relevant is captured in the XML model. This is why it is different from an XML database.

In general, I don’t think that XML is the best way to represent complex IT systems. It has plenty of complications that are not relevant to IT management and it doesn’t elegantly support the representation of graphs, often the most natural way to represent such a system (more on this here). CMDBf, with its graph-oriented approach, is a better choice in general. But there are plenty of areas (especially smaller, well-defined, sub-systems) in which XML formats have been defined to represent systems. SCA and SML for example.

In the case where you are dealing with such an XML-described system, then there is value in standard ways to simplify interactions with the system and its parts. But here too, we need to distinguished different patterns rather than trying to handle them all in the same way.

Filtering/sequencing of returned data

Complex IT systems can generate a lot of configuration and/or monitoring data and often you only care for a small subset. For example, an asset record has dozens of elements (lease terms, owner, assigned user…) but you may only care to retrieve the date the lease expires. When you do a GET on the record, you want to qualify it by specifying that only that date needs to be returned. That’s what WS-RP, WS-RT and the WS-Management wsman:TransferFragment header allow. In a variation of this, you want all the data but you don’t want it in one go, you want to pull it piece by piece. That’s what WS-Enumeration gives you. The problem with all these specifications is that they only offer that feature when you are retrieving the resource representation (a WS-Transfer GET or equivalent), not for other operations. But how is this different from invoking an AirlineBooking operation and saying that you only want to be sent the confirmation code, not the full itinerary, equipment type, assigned seat, etc? Bundling this inside WS-RT (or equivalent) is not helpful. A generic SOAP header that can go on any message would be more appropriate (the definition of this header would need to pay special attention to security considerations, especially if the response is signed, because it could be abused to trick the server into sending, and signing, specifically-crafted messages).

Interacting with a sub-element of the system

If you have a handle to a computer system resource and you know that it has one CPU and that this CPU is represented by the /comp:CPU element of the system, why would you need to use some out-of-band discovery mechanism to interact with that CPU? It’s right there, you can see it, you can point to it. Surely there must be a way to address operations to it directly, right? WS-Management tries to do it with its wsman:Selector mechanism, but the selectors are not tied to the model and require, effectively, a separate out-of-band agreement for addressing. There shouldn’t be a need for such an additional agreement once an agreement has already been reached on the model.

What is needed is a way, for systems that have a known XML model, to address message to subpart by using the model itself to support that addressing. Call it SOAPy mashup if you want to feel like you are part of the cool kids. I described such a mechanism a while ago. In effect, it is an improvement on wsman:Selector that an eventual new iteration of WSRF should at least consider.

In some cases, namely when the operation is a WS-Transfer GET, this capability overlaps with the “filtering of returned data” capability. One way to look at it is that you are doing a GET at the level of the overall computer system and filtering the results down to the part that represents the CPU. Another way to look at it is that you are pinpointing the message to a subset of the model (the CPU part) and doing an unmodified GET on it. It doesn’t matter how you choose to think about it. In my proposal, these two ways produce the same message. Like the wave view and particle view of a photon, that in the end, describe the same physical entity with each being the best representation for a set of situations.

The problem with WS-RT and its predecessors is that it doesn’t recognise that this is just the intersection of two orthogonal concerns (filering of output versus addressing of sub-elements) and only handles that intersection.

Interacting with a set of resources as a set

The same kind of expression (typically XPath) that lets you point at a sub-element inside of a system also lets you point at a set of such sub-elements. But even though from an XPath perspective there isn’t much of a different (the first one just happens to return a nodeset that contains only one node), from an architectural perspective it is a very different use case. If you want to support such a use case then you have handle it as such and define all the associated semantics (sequential/parallel execution, fault handling, partial completion, resource-specific permissions…). You can’t just cross your fingers and assume that you get such features “for free” just because XPath can return a nodeset.

I know that this post illustrates a way of giving free advice that virtually ensures that it gets ignored. Similar (if you’ll allow the big stretch) to the way Chirac and Villepin were arguing againt an Iraq invasion in ways that probably reinforced the Bush administration’s determination to do it. When will the world finally learn to appreciate the oh-so-slightly obnoxious undertone that is inherently French (because, let me tell you, we’re not about to loose it)? At least, when my grandchildren ask me “where were you when IBM invented WS-ManagementHammer?” I can point to this post and say “I tried to stop it, I tried”.

[UPDATED 2008/5/15: How timely! Just after publishing this I find, via Coté, what looks like another example of French abrasiveness in the systems management world: the attitude, name and the way Jeff ends with a French-language quote make it quite likely that the "Jacques" person discounting the fact that his company's SNMP agent is broken is indeed a compatriot. French obnoxiousness aside, and despite my respect for standards, my advice to Jeff is that if a given SNMP agent works with HP, IBM, BMC and CA you will probably save yourself time in the long run by finding a way to support it (even if it is not spec-compliant) rather than getting the vendor to change. There are lots of sites out there that work fine with Firefox and IE but are not compliant with Web standards. Good luck getting them all fixed.]

[UPDATED 2008/7/14: I don't really plan to turn this post into a ongoing set of updates about "French attitude" but since today is Bastille Day I'll point to this map of the world as seen from Paris. If I wasn't on strike right now, I'd explain why the commenter is wrong to assert that "French self-deprecating humour" is rare.]

07
May
2008

The elusive XPath nodeset serialization

by William (@vambenepe on Twitter)

I have been involved in various capacity with five different specifications that define a GET (or GET-like) operation that takes as input an XPath expression used to pinpoint the subset of the XML document that should be retrieved (here is a quick history as of a couple of years ago, more has happened since). And I must shamefully admit that all but one are simply impossible to implement in an interoperable way.

That’s because they instruct implementers to return an XPath nodeset in the response SOAP message but say nothing about how to serialize the nodeset. While an XPath nodeset contains the kind of things that make up an XML document, it is not an XML document by itself. There is an infinite number of possible ways to serialized an XPath nodeset into XML. To have any hope of interoperability on this, a serialization algorithm has to be clearly described by the specification. Which hasn’t happened.

Let’s start with WS-ResourceProperties (WS-RP). It has a QueryResourceProperties operation that takes an XPath expression as input. The specification says that “the response MUST contain an XML serialization of the results of evaluating the QueryExpression against the resource properties document“. Great, thanks. The example provided happens to return a nodeset with only one node (a boolean), which is implicitly serialized into the text representation of that boolean. What if there is more than one node in the nodeset? What about other types of nodes?

Moving on to WS-Management, which defines a SOAP header that uses XPath to qualify a WS-Transfer GET request such that it only retrieves a subset of the target XML document. While it does a better job than WS-RP at describing the input (e.g. it specifies the context node and what namespace declarations are in scope for the XPath evaluation) it is even more cavalier than WS-RP in describing the output: “the output (lines 53-55) is like that supplied by a typical XPath processor and might or might not contain XML namespace information or attributes“. By “a typical XPath processor” we should understand MSXML I suppose. But as far as I know a “typical XML processor” doesn’t return XML, it returns language-specific data structures (e.g. a C# or Java object, like a nu.xom.Nodes instance). And here too, the examples only use single-node nodesets.

WS-ResourceTransfer (WS-RT) was supposed to be the convergence of these two efforts, so presumably it would have learned from their mistakes. While it is better written in general than its predecessors, it fails just as badly with regards to specifying the nodeset serialization. And once again, the example provided uses a nodeset with just one node.

And then came the CMDBf query operation which, for some unclear reason, was deemed in need of a built-in XPath transformation of records. As I pointed out in my review of CMDBf 1.0 at the time, this feature was added without taking the pain to define the XML serialization of the resulting nodeset. And there isn’t even an example of the XPath serialization.

It is sad in a way, but the only specification that acknowledges the problem and addresses it came before any of the four above even got started. It is the WSMF (Web Services Management Framework) work that we did at HP, and more specifically the “note on dynamic attributes and meta information” (not available at HP anymore but available from archive.org) . This specification was the first one to define a GET operation that is qualified by an XPath expression. Unlike its successors it also explicitly narrowed down the types of nodes that could be selected (“The manager MUST NOT send as input an XPath statement that returns a nodeset containing nodes other than element, attribute and namespace nodes“). And for those valid types it described how to serialized them in XML (“When a node in the result nodeset is an attribute node, for the sake of the response it is serialized as an element node which has the same name as the name of the original attribute (see example 4 for an illustration). The element is in the same namespace as the namespace the attribute it represents is in. This applies to namespace nodes as well, they are serialized like an attributes in the xmlns namespace“). Turning an attribute into an element of the same QName might not be the smartest thing in retrospect (after all there may be an element by that QName already) but at least we recognized and addressed the problem.

But all is good now, I am told, because XPath 2.0 is here, along with a clean data model and a well-described serialization.

Not so. Anyone wanting to use XPath for a SOAP-based query language still would have to specify a serialization.

The first problem with the W3C serialization is that the XML output method doesn’t work for all nodesets. Try to use it on a nodeset that contains a top-level attribute node and you get error err:SENR0001. And even for the nodesets it accepts, it sometimes returns less-than-useful results. For example, if your XPath is of the form /employee/name/text() and you have four employees, the result will look something like this:

“Joe SmithKathy O’ConnorHelen MartinBrian Jones”

Concatenated text values without separators. I guess W3C is like a department store, they don’t offer complimentary wrapping anymore…

That’s why the nux.xom.xquery.ResultSequenceSerializer class had to define its own wrapping mechanims to produce a useful XML serialization. The API gives you the choice between the W3C_ALGORITHM and the WRAP_ALGORITHM.

Bottom line, and however much some would like to think of it that way, XPath (1 or 2) is not an XML subsetting/transformation mechanism. It could be used to create one (as XSLT does), but you have to do your own plumbing.

In addition to the technical aspects of this discussion, what else can be learned from this sad state of things? The fact that all these specifications define an XPath-driven query mechanism that is simply broken (beyond the simplest use cases) withouth anyone even noticing tells me that there isn’t a real need for full XPath query over SOAP (and I am talking about XPath 1.0, the introduction of XPath 2.0 in CMDBf is even more out there). A way to retrieve individual elements (and maybe text values) is all that is needed for 99% of the use cases addressed by these specifications. Users would be better served (especially in a version 1.0) by specifications that cover the simple case correctly than by overly generic, complex and poorly documented features. There is always time to add features later if the initial specification is successful enough that users encounter its limitations.

Categories