William Vambenepe's blog

IT management in a changing IT world

free russian ringtonesvulgar ringtonescrazy frog ringtoneselo ringtones

Archive for the 'Specs' Category

14
May
2008

WS-ManagementHammer: don’t do it but if you are going to do it anyway then…

by William Vambenepe

With the IBM/Microsoft/Intel/HP WSDM/WS-Management convergence now implicitly (if not yet officially) dead, it will be interesting to see what IBM is going to do with WSRF. WSRF is being used today, rarely explicitly but rather in an embedded fashion. People who use WSDM use it, people who use CDDLM use it, people who use the Globus Toolkit use it, etc. IBM could write off the convergence work (WS-ResourceTransfer, which was published as a draft, and WS-ResourceEnumeration and WS-EventNotification which were never published) and stick to using the existing WSRF specifications when they need the corresponding functionality. That’s what I hope they do.

Alternatively, they could decide to get the forceps out of the drawer. They can create a new, IBM-friendly (e.g. Fujitsu, CA, Cisco…) private consortium to take over the unfinished drafts (if the IBM/Microsoft/Intel/HP legal agreement allows this) or start new ones. Or they could go directly to W3C, OASIS or OGF and push for a new working group to do the work in the open (and since no-one else would really care about this work IBM should have relatively free hands there, the way Microsoft did in DMTF when IBM chose to boycott WS-Management). Why W3C would care and why OASIS or OGF would want to start commitees to obsolete their existing work is a separate question.

While I hope that IBM doesn’t try to push another pile of WS-* resouce management specifications on an industry that already has too many, if they do I hope that at least they’ll do it right. And that means doing away with the approach embedded in WS-ResourceTransfer. Having personally been involved in many iterations on this problem, I hope to have some insight to contribute.

Along the lines of the age-old parental advice “don’t do it but if you are going to do it then use a condom”, here is my advice to anyone thinking of doing another iteration on the WSRF question: don’t do it but if you are going to do it then be specific about what problem you are addressing.

First, let’s separate three scenarios.

Database query

WS-ResourceTransfer should not be seen as a way to query an XML database. Use XQuery for this.

REST

While architecturally it should be possible to build RESTful applications on top of WS-Transfer’s operations, this is simply not what is happening. WS-Transfer is being used either by CIM people (who get to it via WS-Management) or by big-SOA people (who get is as part of the whole WS-* stack) and neither of them is doing anything remotely RESTful. So just leave that aside and don’t see WS-ResourceTransfer as a way to do “fine-grained REST”. No REST user is loosing sleep over WS-ResourceTransfer being in limbo.

A flexible way to interact with a complex system

This is the use case that you should focus on. You have a system made up of many parts (e.g. a composite application or a server that is made of many components) that you can represent as an XML document. The XML repesentation contains some important information about the system, but it isn’t the system. There are identified resources within the system that have lifecycles, management capabilities and internal parameters. Not everything relevant is captured in the XML model. This is why it is different from an XML database.

In general, I don’t think that XML is the best way to represent complex IT systems. It has plenty of complications that are not relevant to IT management and it doesn’t elegantly support the representation of graphs, often the most natural way to represent such a system (more on this here). CMDBf, with its graph-oriented approach, is a better choice in general. But there are plenty of areas (especially smaller, well-defined, sub-systems) in which XML formats have been defined to represent systems. SCA and SML for example.

In the case where you are dealing with such an XML-described system, then there is value in standard ways to simplify interactions with the system and its parts. But here too, we need to distinguished different patterns rather than trying to handle them all in the same way.

Filtering/sequencing of returned data

Complex IT systems can generate a lot of configuration and/or monitoring data and often you only care for a small subset. For example, an asset record has dozens of elements (lease terms, owner, assigned user…) but you may only care to retrieve the date the lease expires. When you do a GET on the record, you want to qualify it by specifying that only that date needs to be returned. That’s what WS-RP, WS-RT and the WS-Management wsman:TransferFragment header allow. In a variation of this, you want all the data but you don’t want it in one go, you want to pull it piece by piece. That’s what WS-Enumeration gives you. The problem with all these specifications is that they only offer that feature when you are retrieving the resource representation (a WS-Transfer GET or equivalent), not for other operations. But how is this different from invoking an AirlineBooking operation and saying that you only want to be sent the confirmation code, not the full itinerary, equipment type, assigned seat, etc? Bundling this inside WS-RT (or equivalent) is not helpful. A generic SOAP header that can go on any message would be more appropriate (the definition of this header would need to pay special attention to security considerations, especially if the response is signed, because it could be abused to trick the server into sending, and signing, specifically-crafted messages).

Interacting with a sub-element of the system

If you have a handle to a computer system resource and you know that it has one CPU and that this CPU is represented by the /comp:CPU element of the system, why would you need to use some out-of-band discovery mechanism to interact with that CPU? It’s right there, you can see it, you can point to it. Surely there must be a way to address operations to it directly, right? WS-Management tries to do it with its wsman:Selector mechanism, but the selectors are not tied to the model and require, effectively, a separate out-of-band agreement for addressing. There shouldn’t be a need for such an additional agreement once an agreement has already been reached on the model.

What is needed is a way, for systems that have a known XML model, to address message to subpart by using the model itself to support that addressing. Call it SOAPy mashup if you want to feel like you are part of the cool kids. I described such a mechanism a while ago. In effect, it is an improvement on wsman:Selector that an eventual new iteration of WSRF should at least consider.

In some cases, namely when the operation is a WS-Transfer GET, this capability overlaps with the “filtering of returned data” capability. One way to look at it is that you are doing a GET at the level of the overall computer system and filtering the results down to the part that represents the CPU. Another way to look at it is that you are pinpointing the message to a subset of the model (the CPU part) and doing an unmodified GET on it. It doesn’t matter how you choose to think about it. In my proposal, these two ways produce the same message. Like the wave view and particle view of a photon, that in the end, describe the same physical entity with each being the best representation for a set of situations.

The problem with WS-RT and its predecessors is that it doesn’t recognise that this is just the intersection of two orthogonal concerns (filering of output versus addressing of sub-elements) and only handles that intersection.

Interacting with a set of resources as a set

The same kind of expression (typically XPath) that lets you point at a sub-element inside of a system also lets you point at a set of such sub-elements. But even though from an XPath perspective there isn’t much of a different (the first one just happens to return a nodeset that contains only one node), from an architectural perspective it is a very different use case. If you want to support such a use case then you have handle it as such and define all the associated semantics (sequential/parallel execution, fault handling, partial completion, resource-specific permissions…). You can’t just cross your fingers and assume that you get such features “for free” just because XPath can return a nodeset.

I know that this post illustrates a way of giving free advice that virtually ensures that it gets ignored. Similar (if you’ll allow the big stretch) to the way Chirac and Villepin were arguing againt an Iraq invasion in ways that probably reinforced the Bush administration’s determination to do it. When will the world finally learn to appreciate the oh-so-slightly obnoxious undertone that is inherently French (because, let me tell you, we’re not about to loose it)? At least, when my grandchildren ask me “where were you when IBM invented WS-ManagementHammer?” I can point to this post and say “I tried to stop it, I tried”.

[UPDATED 2008/5/15: How timely! Just after publishing this I find, via Coté, what looks like another example of French abrasiveness in the systems management world: the attitude, name and the way Jeff ends with a French-language quote make it quite likely that the "Jacques" person discounting the fact that his company's SNMP agent is broken is indeed a compatriot. French obnoxiousness aside, and despite my respect for standards, my advice to Jeff is that if a given SNMP agent works with HP, IBM, BMC and CA you will probably save yourself time in the long run by finding a way to support it (even if it is not spec-compliant) rather than getting the vendor to change. There are lots of sites out there that work fine with Firefox and IE but are not compliant with Web standards. Good luck getting them all fixed.]

[UPDATED 2008/7/14: I don't really plan to turn this post into a ongoing set of updates about "French attitude" but since today is Bastille Day I'll point to this map of the world as seen from Paris. If I wasn't on strike right now, I'd explain why the commenter is wrong to assert that "French self-deprecating humour" is rare.]

07
May
2008

The elusive XPath nodeset serialization

by William Vambenepe

I have been involved in various capacity with five different specifications that define a GET (or GET-like) operation that takes as input an XPath expression used to pinpoint the subset of the XML document that should be retrieved (here is a quick history as of a couple of years ago, more has happened since). And I must shamefully admit that all but one are simply impossible to implement in an interoperable way.

That’s because they instruct implementers to return an XPath nodeset in the response SOAP message but say nothing about how to serialize the nodeset. While an XPath nodeset contains the kind of things that make up an XML document, it is not an XML document by itself. There is an infinite number of possible ways to serialized an XPath nodeset into XML. To have any hope of interoperability on this, a serialization algorithm has to be clearly described by the specification. Which hasn’t happened.

Let’s start with WS-ResourceProperties (WS-RP). It has a QueryResourceProperties operation that takes an XPath expression as input. The specification says that “the response MUST contain an XML serialization of the results of evaluating the QueryExpression against the resource properties document“. Great, thanks. The example provided happens to return a nodeset with only one node (a boolean), which is implicitly serialized into the text representation of that boolean. What if there is more than one node in the nodeset? What about other types of nodes?

Moving on to WS-Management, which defines a SOAP header that uses XPath to qualify a WS-Transfer GET request such that it only retrieves a subset of the target XML document. While it does a better job than WS-RP at describing the input (e.g. it specifies the context node and what namespace declarations are in scope for the XPath evaluation) it is even more cavalier than WS-RP in describing the output: “the output (lines 53-55) is like that supplied by a typical XPath processor and might or might not contain XML namespace information or attributes“. By “a typical XPath processor” we should understand MSXML I suppose. But as far as I know a “typical XML processor” doesn’t return XML, it returns language-specific data structures (e.g. a C# or Java object, like a nu.xom.Nodes instance). And here too, the examples only use single-node nodesets.

WS-ResourceTransfer (WS-RT) was supposed to be the convergence of these two efforts, so presumably it would have learned from their mistakes. While it is better written in general than its predecessors, it fails just as badly with regards to specifying the nodeset serialization. And once again, the example provided uses a nodeset with just one node.

And then came the CMDBf query operation which, for some unclear reason, was deemed in need of a built-in XPath transformation of records. As I pointed out in my review of CMDBf 1.0 at the time, this feature was added without taking the pain to define the XML serialization of the resulting nodeset. And there isn’t even an example of the XPath serialization.

It is sad in a way, but the only specification that acknowledges the problem and addresses it came before any of the four above even got started. It is the WSMF (Web Services Management Framework) work that we did at HP, and more specifically the “note on dynamic attributes and meta information” (not available at HP anymore but available from archive.org) . This specification was the first one to define a GET operation that is qualified by an XPath expression. Unlike its successors it also explicitly narrowed down the types of nodes that could be selected (”The manager MUST NOT send as input an XPath statement that returns a nodeset containing nodes other than element, attribute and namespace nodes“). And for those valid types it described how to serialized them in XML (”When a node in the result nodeset is an attribute node, for the sake of the response it is serialized as an element node which has the same name as the name of the original attribute (see example 4 for an illustration). The element is in the same namespace as the namespace the attribute it represents is in. This applies to namespace nodes as well, they are serialized like an attributes in the xmlns namespace“). Turning an attribute into an element of the same QName might not be the smartest thing in retrospect (after all there may be an element by that QName already) but at least we recognized and addressed the problem.

But all is good now, I am told, because XPath 2.0 is here, along with a clean data model and a well-described serialization.

Not so. Anyone wanting to use XPath for a SOAP-based query language still would have to specify a serialization.

The first problem with the W3C serialization is that the XML output method doesn’t work for all nodesets. Try to use it on a nodeset that contains a top-level attribute node and you get error err:SENR0001. And even for the nodesets it accepts, it sometimes returns less-than-useful results. For example, if your XPath is of the form /employee/name/text() and you have four employees, the result will look something like this:

“Joe SmithKathy O’ConnorHelen MartinBrian Jones”

Concatenated text values without separators. I guess W3C is like a department store, they don’t offer complimentary wrapping anymore…

That’s why the nux.xom.xquery.ResultSequenceSerializer class had to define its own wrapping mechanims to produce a useful XML serialization. The API gives you the choice between the W3C_ALGORITHM and the WRAP_ALGORITHM.

Bottom line, and however much some would like to think of it that way, XPath (1 or 2) is not an XML subsetting/transformation mechanism. It could be used to create one (as XSLT does), but you have to do your own plumbing.

In addition to the technical aspects of this discussion, what else can be learned from this sad state of things? The fact that all these specifications define an XPath-driven query mechanism that is simply broken (beyond the simplest use cases) withouth anyone even noticing tells me that there isn’t a real need for full XPath query over SOAP (and I am talking about XPath 1.0, the introduction of XPath 2.0 in CMDBf is even more out there). A way to retrieve individual elements (and maybe text values) is all that is needed for 99% of the use cases addressed by these specifications. Users would be better served (especially in a version 1.0) by specifications that cover the simple case correctly than by overly generic, complex and poorly documented features. There is always time to add features later if the initial specification is successful enough that users encounter its limitations.

28
Apr
2008

Unhealthy fun with IP aspects of optionality in specifications

by William Vambenepe

The previous blog post has re-awaken the spec lawyer in me (on the hobby glamor scale, spec lawyering ranks just below collecting dead bugs). Which brought back to my mind a peculiar aspect of the “Microsoft Open Specification Promise“.

The promise was published to address fears some people had that adopting Microsoft-created specifications (especially non-standard ones) would put them at risk of patent claims from Microsoft. The core of the promise is only two paragraphs long. The first one contains this section:

“To clarify, ‘Microsoft Necessary Claims’ are those claims of Microsoft-owned or Microsoft-controlled patents that are necessary to implement only the required portions of the Covered Specification that are described in detail and not merely referenced in such Specification.”

That seams to pretty clearly state that only the required portions of a specification are covered by this promise. Which is a very significant limitation, as specifications often tend to (over-) use optional features. But if you read further, the list of “Covered Specifications” (those to which the promise applies), contains this statement:

“this Promise also applies to the required elements of optional portions of such specifications.”

I find this very puzzling because it seems to contradict the previous statement. And more importantly, it’s hard to understand what it really means. That’s where the fun starts:

For example, if my spec defines a document <a> with an optional element <b> that itself has an optional sub-element <c>, as in:

<a>
  ...
  <b>
    ...
    <c>...</c>
  </b>
</a>

The <b> element is a required part of the “b” optional portion of the spec (the portion of the spec that defines that element), so I guess it is covered, but is <c>? That’s an optional element of an optional portion (the “b” portion) of the spec, so it isn’t. Unless you consider the portion of the spec that defines <c> (the “c” portion of the spec) to be an optional portion of the spec itself. In which case the <c> element is covered.

But if you take that second line of reasoning, then everything in the spec is covered because for any feature, no matter how “optional” it is, there is a portion (optional or not) of the specification that describes this feature. And if you are implementing that portion, for example the portion that defines element <foo>, by definition element <foo> is required for it (how can an element not be a required part of its own definition?). But if Microsoft intended to cover all parts of the specification, why not say so rather than this recursion-inducing “required elements of optional portions” statement? And if not, why do they choose to only cover optional elements that are one degree removed from the base of the specification?

Wouldn’t it be fun to see a court of law deal with a suit that hinges on this statement (provided that you’re not a party in the suit, of course)?

When a real spec lawyer took a look at this promise, he didn’t comment on the second statement, the one that raises the most questions in my mind.

[UPDATED 2008/4/29: The "promise" has seen many updates. The original (which is the one Andy Updegrove reviewed at the previous link) came out on 2006/9/12. The one I reviewed is dated 2008/3/25. There is no change history on the Microsoft site, but the Wayback machine has archived some older versions. The oldest one I can find is dated 2006/10/23 and it does not contain the sentence about "required elements of optional portions" that puzzles me. So it's likely that the version Andy reviewed didn't include this either and as such was clearly limited to required portions of the specifications (something that Andy pointed out).]

26
Apr
2008

WS-Transfer, its WSDL and its WS-I compliance: the art of engineered uselessness

by William Vambenepe

Several years ago, Chris Ferris wrote a blog entry in which he explains that WS-Transfer is not WS-I Basic Profile (BP) compliant.

Chris’ main point is correct: the WSDL document in appendix II of the WS-Transfer specification is not compliant with the WS-I Basic Profile. But what does this mean and why should one care?

If you search for the word “wsdl” in WS-Transfer, you first find it in the table that declares namespace prefixes used in the specification. But the prefix is not used in the specification, so it could just as well be removed from that table.

We see it next mentioned in the “compliance” boilerplate where it is declared to be the least authoritative of all information in the specification.

The next occurrence is all the way down in section 8, as a reference to the WSDL 1.1 W3C note. The only place where that reference is used, is further below, in Appendix II.

In short, for all practical purposes there is no mention of WSDL in WS-Transfer except for this one appendix that contains a WSDL document. Since there is no MUST or REQUIRED statement that refers to it, it is at best a testing tool that one can use to validate WS-Transfer messages produced. There is no requirement at all that the implementation produces that WSDL (e.g. as a response to a WS-MeX request) or consumes it.

And if you look at the content of the WSDL, it is mostly XML gymnastics aimed at creating “empty” and “any” types to express almost nothing useful about the messages sent and received.

You don’t have to take my statement that the WS-Transfer WSDL is useless at face value. Here are two other proofs:

  • Chris doesn’t just point out the WS-I BP violation in the WS-Transfer WSDL, he also proposes a way to fix it. He writes: “I actually think that a more appropriate approach to handling WS-Transfer’s ‘Get’ would be to specify the output message as you would any doc-literal operation and merely annotate the operation with the appropriate wsa:Action attribute values” (he also provides an example). And he is perfectly right. If you really want a WSDL for your WS-Transfer operations, create one that is specific to the resource type (server, toaster…) that you are dealing with. By definition that WSDL can’t be baked into the model-agnostic WS-Transfer specification. While Chris doesn’t say it, the natural conclusion of his remark is that there is not point for a WSDL in WS-Transfer (because any resource-agnostic WSDL is useless).
  • The WS-Transfer XSD and WSDL have been modified, sometimes in backward-incompatible ways, without changing the target namespace. From the original version to the first W3C submission, some minor changes (message names, introduction of WS-Addressing). From the first W3C submission to the current submission, some potentially backward-incompatible changes (the GET input can now be non-empty, the CREATE response can now contain anything as a result of trying to support different versions of WS-Addressing). On top of that, all these XSD and WSDL documents embedded in various versions of the spec are “non-normative”. The normative versions are said to be the ones at xmlsoap.org (XSD, WSDL). Those have not changed, which means that both versions on the W3C web site contain an incorrect version of the XSD/WSDL in the spec. Shouldn’t that lack of XML hygiene be a big deal for a specification that is implemented (via WS-Management, which references the W3C submission) in resources with long product development cycles, such as servers from Dell, HP and others that have WS-Management support directly on the motherboard? It would, if the XSD and WSDL had any relevance for the implementers. The fact that there was no outcry is yet another proof that the WS-Transfer XSD and the WSDL are irrelevant.

So yes, Chris is right that the WS-Transfer WSDL (BTW all versions have the problem that Chris describes even though it could have been fixed in a backward-compatible way when the WSDL was altered) is not WS-I BP compliant. But since that WSDL is useless anyway, this shouldn’t keep anyone up at night. The WS-Transfer WSDL serves no purpose other than to annoy people who like things to be WS-I BP compliant.

But is it just the WS-Transfer WSDL that’s useless, or it is all of WS-Transfer?

I am not planning to go into WS-* vs. REST territory here. To those who are confused by the similarity between the names of WS-Transfer operations and HTTP methods and see WS-Transfer as a way to do “REST over SOAP” I’ll just point out that WS-Transfer is rarely used on its own but rather in conjunction with many other SOAP messages (like those defined by WS-Eventing and WS-Enumeration, plus countless custom operations). So much for uniform interfaces. WS-Transfer, at least as it is used today, is not about REST.

Rather, the reasons why I question the usefulness of WS-Transfer are more pragmatic than architectural. I can think of three potential justifications to carve out WS-Transfer as a separate specification, none of which is really convincing at this point in time.

The first reason is simply to avoid repeating the same text over and over again. If many specifications are going to describe the same SOAP message, just describe it once and refer to that description. Sounds good. But I know of three specifications that use WS-Transfer: WS-Management, WS-MeX and the Devices Profile for Web Services.

WS-MeX and the Devices Profile only use the GET operation. Which means that the only specification text that they can re-use from WS-Transfer is something like “send an empty get request and get something back”. WS-Transfer can’t say what that something is, only the domain-specific specifications can. As a result, you are spending as much time referencing WS-Transfer as would be spent defining a simple GET operation. For all practical purposes, you can implement WS-MeX and the Devices Profile without ever reading WS-Transfer.

The second potential reason is to provide a stand-alone piece of functionality that can be implemented once (e.g. as a library/module) and re-used for different purposes. Something that automatically kicks in when a WS-Transfer wsa:Action is detected. Think of a stand-alone encryption/decryption library for example, that looks for specific SOAP headers. Or WS-Eventing, for which a library can take over the task of managing the subscription lifecycle. Except WS-Transfer defines so little that it’s not clear what a stand-alone WS-Transfer implementation would do. Receive messages and do what with them? It is so tied to the back-end that there isn’t much you can do in a general fashion. Unless you are creating a library for a database product and you see WS-Transfer as a query interface for your database. But this only makes sense if you want to provide more fine-grained access to the XML content, which WS-Transfer does not do.

Which takes us to the third potential value of WS-Transfer, as a foundational specification on which to build extensions. Of the three this is the only one that I believed in at some point. WS-ResourceTransfer (WS-RT) was the main attempt at doing this. Any service that uses WS-Transfer could, via the magic of the SOAP processing model, offer a more precise/powerful access to the resources. But while this was possible in theory it hasn’t really panned out in practice for many reasons:

  • Some people (hints: Armonk; Blue) pushed hard to put WS-RT instructions in the body rather than in headers, seriously compromising its ability to seamlessly compose with existing SOAP messages.
  • WS-MeX and the Devices Profile typically deal with documents small enough that manipulating them as a whole is rarely a problem. This only leaves WS-Management which has its own “fragment transfer” mechanism so it doesn’t really need a stand-alone mechanism.
  • XQuery is now developing support for an update capability.

What then is left, in the Spring of 2008, to justify the need for WS-Transfer as a separate layer, rather than considering it an integral part of WS-Management? Not much. WS-MeX, in an earlier version, used to define its own GET operation and it wouldn’t be any worse off if it had stayed that way (or returned to it). Ditto for the Device Profile. At this point, it’s mostly a matter of pragmatically cleaning up the mess without creating another one.

In retrospect (color me partially guilty), maybe one shouldn’t use the same architectural rules when attempting to design an interoperable standard stack for an industry than when refactoring a software project. Maybe one should resist the urge to refactor the “code” (or rather the PowerPoint stack) every time one detects the smallest conceptual redundancy. There is a cost in constant changes. There is a cost in specification cross-dependencies. WSDM experienced it firth hand with the different versions of WS-Addressing (another dependency that didn’t need to be). WS-Management is seeing it from the perspective of standardization.

15
Apr
2008

IGF and GIF: it’s not a typo

by William Vambenepe

With the Oracle announcements at the RSA conference this month (things like Oracle Role Manager and this white paper), the Identity Governance Framework (IGF) is back in the news. And since HP publicly released the Governance Interoperability Framework (GIF) earlier this year, there is some potential for confusion between the two (akin to the OSGi/OGSI confusion). I am not an author or even an expert in either, but I know enough about both that I can at least help reduce the confusion.

They are both frameworks, they are both about governance, they both try to enable interoperability, they both define XML formats, they were both privately designed and they are both pushed by their authors (and supporters) towards standardization. To add to the confusion, Oracle is listed as a supporter of HP’s GIF and HP is listed as a supporter of Oracle’s IGF.

And yet they are very different.

GIF is an attempt to address SOA governance, which mostly relates to the lifecycle of services and their artifacts (like WSDL, XSD and policies). So you can track versions, deployment status, ownership, dependencies, etc. HP is making the specification available to all (here but you need to register) and has talked about submission to a standards body but as far as I know this hasn’t happened yet.

IGF is a set of specifications and APIs that pull access policy for identity related information out of the application logic and into well-understood XML declarations. With the goal of better controlling the flow of such information. The keystones are the CARML specification used to describe what identity related information an application needs and its counterpart the AAPML specification, used to describe the rules and constraints that an application puts on usage of the identity-related information it owns. The framework also defines relevant roles and service interfaces. Unlike GIF, which is still controlled by HP, IGF is now under the control of the Liberty Alliance Project. Oracle is just one participant (albeit a leading one).

Could they ever meet?

A Web service managed through a GIF-like SOA governance system could have policies related to accessing identity-related information, as addressed by IGF (and realized through CARML and AAPML elements). GIF doesn’t really care about the content of the policies. Studying the positions of the IGF and GIF specifications relative to WS-Policy would be a good way to concretely understand how they operate at a different level from one another. While there could theoretically be situations in which IGF and GIF are both involved, they do not do the same thing and have no interdependency whatsoever.

[UPDATED 2008/4/18: Phil Hunt (co-author of IGF) has a blog where he often writes about IGF. He also wrote a good overview of IGF and its applicability to governance and SOX-style compliance.]

31
Mar
2008

Where will you be when the Semantic Web gets Grid’ed?

by William Vambenepe

I see the tide rising for semantic technologies. On the other hand, I wonder if they don’t need to fail in order to succeed.

Let’s use the Grid effort as an example. By “Grid effort” I mean the work that took place in and around OGF (or GGF as it was known before its merger w/ EGA). That community, mostly made of researchers and academics, was defining “utility computing” and creating related technology (e.g. OGSA, OGSI, GridFTP, JSDL, SAGA as specs, Globus and Platform as implementations) when Amazon was still a bookstore. There was an expectation that, as large-scale, flexible, distributed computing became a more pressing need for the industry at large, the Grid vision and technology would find their way into the broader market. That’s probably why IBM (and to a lesser extent HP) invested in the effort. Instead, what we are seeing is a new approach to utility computing (marketed as “cloud computing”), delivered by Amazon and others. It addresses utility computing with a different technology than Grid. With X86 virtualization as a catalyst, “cloud computing” delivers flexible, large-scale computing capabilities in a way that, to the users, looks a lot like their current environment. They still have servers with operating systems and applications on them. It’s not as elegant and optimized as service factories, service references (GSR), service handle (GSH), etc but it maps a lot better to administrators’ skills and tools (and to running the current code unchanged). Incremental changes with quick ROI beat paradigm shifts 9 times out of 10.

Is this indicative of what is going to happen with semantic technologies? Let’s break it down chronologically:

  1. Trailblazers (often faced with larger/harder problems than the rest of us) come up with a vision and a different way to think about what computers can do (e.g. the “computers -> compute grid” transition).
  2. They develop innovative technology, with a strong theoretical underpinning (OGSA-BES and those listed above).
  3. There are some successful deployments, but the adoption is mostly limited to a few niches. It is seen as too complex and too different from current practices for broad adoption.
  4. Outsiders use incremental technology to deliver 80% of the vision with 20% of the complexity. Hype and adoption ensue.

If we are lucky, the end result will look more like the nicely abstracted utility computing vision than the “did you patch your EC2 Xen images today” cloud computing landscape. But that’s a necessary step that Grid computing failed to leapfrog.

Semantic web technologies can easily be mapped to the first three bullets. Replace “computers -> computer grid” with “documents/data -> information” in the first one. Fill in RDF, RDFS, OWL (with all its flavors), SPARQL etc as counterparts to OGSA-BES and friends in the second. For the third, consider life sciences and defense as niche markets in which semantic technologies are seeing practical adoption. What form will bullet #4 take for semantic technology (e.g. who is going to be the EC2 of semantic technology)? Or is this where it diverges from Grid and instead gets adopted in its “original” form?

26
Feb
2008

Unintentional comedy

by William Vambenepe

With these two words, “unintentional comedy”,

  • the predictability,
  • the unstated rules of the genre,
  • the stereotypical roles that keep reappearing: the bully, the calculator, the rambler, the simple-minded (that’s the one I used to play),
  • the pretentiousness,
  • the importance of appearances,
  • the necessity of conflict and tension,
  • the repetitiveness,
  • and the fact that after a while people tend to behave as caricatures of themselves.

I don’t mind being (with many others) the butt of the joke when the joke is right on. Plus, I made a similar analogy in the past: Commedia dell (stand)arte (once there, make sure you also follow the link to Umit’s verses).

To be fair, I don’t think this is limited to IT management standards. Other standard areas behave alike (OOXML vs. ODF anyone?). You can also see the bullet points above in action in many open source mailing lists. And most of all in the blogosphere. BTW Damon, why do you think the server for this blog is stage.vambenepe.com and not a more neutral blog.vambenepe.com? It’s not that I got mixed up between my staging server and my production server. It’s that I see a lot of comedy aspects to our part of the blogosphere and I wanted to acknowledge that I, like others, assume a persona on my blog and through it I play a role in a big comedy. Which is not as dismissive as it sounds, comedy can be an excellent vehicle to convey important and serious ideas. But we need people like you to remind us, from time to time, that comedy it is.

23
Feb
2008

SCA, OGSi and Spring from an IT management perspective

by William Vambenepe

March starts next week and the middleware blogging bees are busy collecting OSGi-nectar, Spring-nectar, SCA-nectar, bringing it all back to the hive and seeing what kind of honey they can make from it.

Like James Governor, I had to train myself to stop associating OSGi with OGSI (which was the framework created by GGF, now OGF, to implement OGSA, and was - not very successfully - replaced with OASIS’s WSRF, want more acronyms?). Having established that OSGi does not relate to OGSI, how does it relate to SCA and Spring? What with the Sprint-OSGi integration and this call to integrate OSGi and SCA (something Paremus says they already do)? The third leg of the triangle (SCA-Spring integration) is included in the base SCA framework. Call this a disclosure or a plug as you prefer, I’ll note that many of my Oracle colleagues on the middleware side of the house are instrumental in these efforts (Hal, Greg, Khanderao, Dave…).

There is also a white paper (getting a little dated but still very much worth reading) that describes the potential integrations in this triangle in very clear and concrete terms (a rare achievement for this kind of exercise). It ends with “simplicity, flexibility, manageability, testability, reusability. A key combination for enterprise developers”. I am happy to grant the “flexibility” (thanks OSGi), “testability” (thanks Spring) and “reusability” (thanks SCA) claims. Not so for simplicity at this point unless you are one of the handful of people involved in all three efforts. As for the “manageability”, let’s call it “manageability potential” and remain friends.

That last part, manageability, is of course what interests me the most in this area. I mentioned this before in the context of SCA alone but the conjunction of SCA with Spring and/or OSGi only increases the potential. What happened with BPEL adoption provides a good illustration of this:

There are lots of JEE management tools and technologies out there, with different levels of impact on application performance (ideally low enough that they are suitable for production systems). The extent to which enterprise Java has been instrumented, probed and analyzed is unprecedented. These tools are often focused on the performance more than the configuration/dependency aspects of the application, partly because that’s easier to measure. And while they are very useful, they struggle with the task of relating what they measure to a business view of the application, especially in the case of composite applications with many shared components. Enter BPEL. Like SCA, BPEL wasn’t designed for manageability. It was meant for increased productivity, portability and flexibility. It was designed to support the SOA vision of service re-use and to allow more tasks to be moved from Java coding to infrastructure configuration. All this it helps with indeed. But at the same time, it also provides very useful metadata for application management. Both in terms of highlighting the application flow (through activities) and in terms of clarifying the dependencies and associated policies (through partner links). This allowed a new breed of application management tools to emerge that hungrily consumer BPEL process definitions and use them to better relate application management to the user-visible aspects of the application.

But the visibility provided by BPEL only goes so far, and soon the application management tools are back in bytecode instrumentation, heap analysis, transaction tracing, etc. Using a mix of standard mechanisms and “top secret”, “patent pending” tricks. In addition to all of their well-known benefits, SCA, OGSi and Spring also help fill that gap. They provide extra application metadata that can be used by application management tools to provide more application context to management tasks. A simple example is that SCA’s service/reference mechanism extends BPEL partner links to components not implemented with BPEL (and provides a more complete policy framework). Of course, all this metadata doesn’t just magically organize itself in an application management framework and there is a lot of work to harness its value (thus the “potential” qualifier I added to “manageability”). But SCA, OSGi and Spring can improve application management in ways similar to what BPEL does.

Here I am again, taking exciting middleware technologies and squeezing them to extract boring management value. But if you can, like me, get excited about these management aspects then you want to follow the efforts around the conjunction of these three technologies. I understand SCA, but I need to spend more time on OGSi and Spring. Maybe this post is my way of motivating myself to do it (I wish my mental processes were instrumented with better metadata so I could answer this question with more certainty - oh please shoot me now).

And while this is all exciting, part of me also wonders whether it’s not too early to risk connecting these specifications too tightly. I have seen too many “standards framework” kind of powerpoint slides that show how a bunch of under-development specifications would precisely work together to meet all the needs of the world. I may have even written one myself. If one thing is certain in that space, it’s that the failure rate is high and over-eager re-use and linkage between specifications kills. That was one of the errors of WSDM. For a contemporary version, look at this “Leveraging CMDBf” plan at Eclipse. I am very supportive of the effort to create an open-source implementation of the CMDBf specification, but mixing a bunch of other unproven and evolving specifications (in addition to CMDBf, I see WS-ResourceCatalog, SML and a “TBD” WS API which I can’t imagine will be anything other than WS-ResourceTransfer) is very risky. And of course IBM’s good old CBE. Was this HTML page auto-generated from an IBM “standards strategy” powerpoint document? But I digress…

Bonus question: what’s the best acronym to refer to OGSi+SCA+Spring. OSS? Taken (twice). SOS? Taken (and too desperate-sounding). SSO? Taken (twice). OS2? Taken. S2O? Available, as far as I can tell, but who wants a name so easily confused with the stinky and acid-rain causing sulfur dioxide (SO2)? Any suggestion? Did I hear J3EE in the back of the room?

18
Feb
2008

JSR262 (JMX over WS-Management) public review

by William Vambenepe

If you care about exposing or accessing MBeans via WS-Management, now is a good time to read the public review draft of the JSR262 spec.

JSR262 is very much on the “manageability” side of the “manageability vs. management integration” chasm, which is not the most exciting side to me. But more commonality in manageability protocols is good, I guess, and this falls inside the WS-Management window of opportunity so it may help tip the balance.

There is also a nice white paper which does a nice job of retracing the history from JMX to JMX Remote API to JSR 262 and the different efforts along the way to provide access to the JMX API from outside of the local JVM. The white paper is actually too accurate for its own good: it explains well that models and protocols should be orthogonal (there is a section titled “The Holy Grail of Management: Model, Data and Protocol Independence”) which only highlights the shortcomings of JSR262 in that regard.

In a what looks from the outside like a wonderful exercise of “when you have a hammer” (and also “when you work in a hammer factory” like the JCP), this whole Java app management effort has been API-driven rather than model-driven. What we don’t get out of all this is a clearly defined metamodel and a set of model elements for Java apps with an XML serialization that can be queried and updated. What we do get is a mapping of “WS-Management protocol operations to MBean and MBean server operations” that “exposes JMX technology MBeans as WS-Management resources”.

Yes it now goes over HTTP so it can more easily fool firewalls, but I am yet to see such a need in manageability scenarios (other than from hackers who I am sure are very encouraged by the development). Yes it is easier for a non-Java endpoint to interact with a JSR262 endpoint than before but this is an incremental improvement above the previous JMX over RMI over IIOP because the messages involved still reflect the underlying API.

Maybe that’s all ok. There may very well not be much management integration possible at the level of details provided by JMX APIs. Management integration is probably better served at the SCA and OSGi levels anyway. Having JSR262 just provide incremental progress towards easier Java manageability by HP OVO and the like may be all we should ask of it. I told some of the JSR262 guys, back when they were creating their own XML over HTTP protocol to skirt the WS-Management vs. WSDM debate, that they should build on WS-Management and I am glad they took that route (no idea how much influence my opinion had on this). I just can’t get really excited about the whole thing.

All the details on the current status of JSR262 on Jean-Francois Denise’s blog.

15
Feb
2008

Comparing Joe Gregorio’s RESTful Partial Updates to WS-ResourceTransfer

by William Vambenepe

Joe Gregorio just proposed a way to do RESTful partial updates. I am not in that boat anymore but, along with my then-colleagues from HP, Microsoft, IBM and Intel, I have spent a fair bit of time trying to address the same problem, albeit in a SOAP-based way. That was WS-ResourceTransfer (WS-RT) which has been out as a draft since summer 2006. In a way, Joe’s proposal is to AtomPub what WS-ResourceTransfer is to WS-Transfer, retrofitting a partial resource update on top of a “full update” mechanism. Because of this, I read his proposal with interest. I have mentioned before that WS-RT isn’t the best-looking cow in the corral so I was ready to like Joe’s presumably simpler approach.

I don’t think it meets the bill for partial update requirements in IT management scenarios.

This is not a REST versus SOAP kind of thing and I am not about to launch in a “how do you do end to end encryption and reliable messaging” tirade. I think it is perfectly possible to meet most management scenarios in a RESTful way. And BTW, I think most management scenarios do not need partial updates at all.

But for those that do, there is just too little flexibility in Joe’s proposal. Not that it means it’s a bad proposal, I don’t have much of an idea of what his use cases are. The proposal might be perfectly adequate for them. But when I read his proposal, it’s IT management I was mentally trying to apply it to and it came short in that regard.

Joe’s proposal requires the server to annotate any element that can be updated. On the positive side, this “puts the server firmly in control of what sub-sections of a document it is willing to handle partial updates on” which can indeed be useful. On the negative side it is not very flexible. If you are interacting with a desired-state controller, the rules that govern what you can and cannot change are often a lot more complex than “you can change X, you can’t change Y”. Look at SML validation for an example.

Another aspect is that the requester has to explicitly name the elements to replace. That could make for a long list. And it creates a risk of race conditions. If I want to change all the elements that have an attribute “foo” with a value “bar” I need to retrieve them first so that I can find their xml:id. Then I need to send a message to update them. But if someone changed them in the meantime, they may not have the “bar” value anymore and I am going to end up updating elements that should not be updated. Again, not necessarily a problem in some domains. An update mechanism that lets you point at the target set via something like XPath helps prevent this round-tripping (at a significant complexity cost unfortunately, something WS-RT tries to address with simplified dialects of XPath).

Joe volunteers another obvious limitation when he writes that “this doesn’t solve all the partial update scenarios, for example this doesn’t help if you have a long sub-list that you want to append to”. Indeed. And it’s even worse if you care about element order. Not something that we normally care much about in IT management (UML, MOF, etc don’t have a notion of property order) but the overuse of XSD in system modeling has resulted in order being important to avoid validation failures (because it’s really hard to write an XSD that doesn’t care about order even though it is often not meaningful to the domain being modeled).

In early 2007, I wrote an implementation of WS-RT and in the process I found many gaps in the specification, especially in the PUT operation. It is not the ideal answer in any way. If one was to try to fix it, a good place to start might be to make the specification a bit less flexible (e.g. restricting the change granularity to the level of an element, not an attribute and not a text node). There is plenty of room to find the simplicity/flexibility sweetspot for IT management scenarios between what WS-RT tries to offer and what Joe’s proposal offers.