William Vambenepe's blog

IT management in a changing IT world

cheap generic viagrahow long does viagra last

Archive for the 'SOAP processing model' Category

10
Jul
2008

WS Resource Access at W3C: the good, the bad and the ugly

by William Vambenepe

As far as I know, the W3C is still reviewing the proposal that was made to them to create a new working group to standardize WS-Transfer, WS-ResourceTransfer, WS-Enumeration and WS-MetadataExchange. The suggested name, “Web Services Resource Access Working Group” or WS-RAWG is likely, if it sticks, to end up being shortened to WS-RAW. Which is a bit more cruel than needed. I’d say it’s simply half-baked.

There are many aspects to the specifications and features covered by the proposal. Some goodness, some badness and some ugliness. This post analyzes the good, points at the bad and hints at the ugly. Like your average family-oriented summer movie.

The good

The specifications proposed for W3C standardization describe a way to provide some generally useful features for SOAP messages. Some SOAP messages can get very long. In some cases, I know ahead of time what portion of the long messages promised by the contract (e.g. WSDL) I want. Wouldn’t it be nice, as an optimization, to let the message sender know about this so they can, if they are able to, filter down the message to just the part I want? Alternatively, maybe I do want the full response but I can’t consume it as one big message so I would like to get it in chunks.

You’ll notice that the paragraph above says nothing about “resources”. We are just talking about messaging features for SOAP messages. There are precedents for this. WS-Security can be used to encrypt a message. Any message. WS-ReliableMessaging can be used to ensure delivery of a message. Any message. These “quality of service” specifications are mostly orthogonal to the message content.

WS-RT and WS-Enumeration provide a solution to the “message filtering” and “message chunking”, respectively. But they only address them in the context of a GET-like operation. They can’t be layered on top of any SOAP message. How useful would WS-Security and WS-ReliableMessaging be if they had such a restriction?

If W3C takes on part of the work listed in the proposal, I hope they’ll do so in a way that expends the utility of these features to all SOAP messages.

And just like WS-Security and WS-ReliableMessaging, these features should be provided in a way that leverages the SOAP processing model. Such that I can judiciously use the soap:mustUnderstand header to not break existing services. If I’d like the message to be paired down but I can handle the complete message if need be, I’ll set this attribute to false. If I can’t handle the full message, I’ll set the attribute to true and I’ll get an error if the other party doesn’t understand this extension. At which point I can pick an alternative way to get the task accomplished. Sounds pretty basic but it’s amazing how often this important feature of SOAP (which heralds from and extends XML’s must-ignore semantics) is neglected and obstructed by designers of SOAP messages.

And then there is WS-MetadataExchange. While I am not a huge fan of this specification, I agree with the need for a simple, reliable way to retrieve different types of metadata for an endpoint.

So that’s the (potential) good. A flexible and generally useful way to pair-down long SOAP messages, to chunk them and to retrieve metadata for SOAP endpoints.

The bad

The bad is the whole “resource access” spin. It is not actually intrinsically bad. There are scenarios where such a pattern actually fits. But the way that pattern is being addressed by WS-RT and friends is overly generalized and overly XML-centric. By the latter I mean that it takes XML from an agreed-upon on-the-wire interchange format to an implicit metamodel (e.g. it assumes not just that you agree to exchange XML-formated data but that your model and your business logic are organized and implemented around an XML representation of the domain, which is a much more constraining requirement). I could go on and on about this, especially the use of XPath in the PUT operation. In fact I did go on and on with it, but I spun that off as a separate entry.

In the context of the W3C proposal at hand, this is bad because it burdens the generally useful features (see the “good” section above) with an unneeded and limiting formalism. Not to mention the fact that W3C kind of already has its resource access mechanism, but I’ll leave that aspect of the question to Mark and various bloggers (see a short list of relevant posts at the end of this entry).

The resource access part might be worth doing (one more time), but probably not in the same group as things like metadata discovery, message filtering and message chunking, which are not specific to “resource access” situations. And if someone is going to do this again, rather than repeating the not too useful approaches of the past, it may be good to consider alternatives.

The ugly

That’s the politics around this whole deal. There is, as you would expect, a lot more to it than meets the eye. The underlying drivers for all this have little to do with REST/WS or other architecture considerations. They have a lot to do with control. But that’s a topic for another post (maybe) when more of it can be publicly discussed.

A lot of what I describe in this post was already explained in the WS-ManagementHammer post from a couple of months ago. But that was before the W3C proposal and before WS-MetadataExchange was dragged into the deal. So I thought it might be useful to put the analysis in the context of that proposal. And BTW, this is a personal opinion, not an Oracle position (which is true in general for everything on this blog but is worth repeating specifically for this post).

07
Jul
2008

Three non-muppets walk into a bar…

by William Vambenepe

I can’t shake the feeling that if Steve Vinoski, Steve Jones and Stuart Charlton had a drink together they’d actually agree on pretty much any distributed computing question that is worded in specific and unambiguous terms.

If you are not subscribed to their three blogs (and I don’t understand why you would not be if you have enough free time to read mine), here is a quick summary of the discussion so far:

Steve Vinoski writes an article critical of RPC approaches. Steve Jones doesn’t agree and explains why in a review of the article. Steve Vinoski is not impressed by the content of the review and even less by the tone. Stu sides with Steve Vinoski.

I think they all agree that, all other things equal, it is a good thing to facilitate the task of developers by providing them with intuitive interfaces. They also all agree that you can’t write distributed applications that shield the developer from the existence of a network. The key questions then boil down to:

  • what degree of network awareness do you require from developers (or what degree do you award them, for a more positive spin)?
  • what are the most appropriate programming constructs to expose that “optimal” degree of network awareness to the developers?

These questions don’t necessarily require words like “REST”, “RPC” and “JAXM” to be thrown around, other than merely as illustrative examples. In fact, the discussion so far seems to indicate that the questions are less likely to be resolved as long as these words are involved.

Once these questions are answered, we can compare the existing toolkits/frameworks (and yes, even architectural styles) to see which ones come closer to the ideal level of network-awareness and which ones present the most useful abstractions for that level. Or how each one can be improved to come closer to the sweetspot. Of course, there isn’t one level of network-awareness that is ideal for all cases, but my guess is that most enterprise applications are not too far apart on this.

12
Jul
2007

Gutting the SOAP processing model

by William Vambenepe

One thing I have constantly witnessed in the SOAP design work I have been involved in is the fear of headers. There are several aspects to this, and it’s very hard to sort out the causes and the consequences as they are now linked in a vicious cycle.

For one thing, headers in SOAP messages don’t map well to RPC and we all know SOAP’s history there. Even now, as people have learned that they’re supposed to say they are not doing RPC (”look, my WSDL says doc/literal therefore I am not doing RPC”), the code is still RPC-ish with the grand-children of the body being serialized into Java (or another language) objects and passed as arguments to an operation inside a machine-generated stub. No room for pesky headers.

In addition (and maybe because of this view), tools often bend over backward to prevent you from processing headers along with the rest of the message, requiring instead some handlers to be separately written and registered/configured in whatever way (usually poorly documented as people aren’t expected to do this) the SOAP stack requires.

And the circle goes on, with poor tooling support for headers being used to justify staying away from them in SOAP-based interfaces.

I wrote above that “headers in SOAP messages don’t map well to RPC” but the more accurate sentence is “headers that are not handled by the infrastructure don’t map well to RPC”. If the headers are generic enough to be processed by the SOAP stack before getting to the programmer, then the RPC paradigm can still apply. Halleluiah. Which is why there is no shortage of specs that define headers. But those are the specs written by the “big boys” of Web services, the same who build SOAP stacks. They trust themselves to create code in their stacks to handle WS-Addressing, WS-Security and WS-ReliableMessaging headers for the developers. Those headers are ok. But the little guys shouldn’t be trusted with headers. They’d poke their eyes out. Or, as Elliotte Rusty Harold recently wrote (on a related topic, namely WS-* versus REST), “developers have to be told what to do and kept from getting their grubby little hands all over the network protocols because they can’t be trusted to make the right choices”. The difference is that Elliotte is blasting any use of WS-* while I am advocating a use of SOAP that gives developers full access to the network protocol. Which goes in the same direction but without closing the door on using WS-* specs when they actually help.

I don’t mean to keep banging on my IBM friends, but I’ve spent a lot of time developing specs with well respected IBM engineers who will say openly that headers should not contain “application information”, by which they mean that headers should be processed by the SOAP stack and not require the programmer to deal with them.

That sounds very kind, and certainly making life easy for programmers is good, but what it does is rob them of the single most useful feature in SOAP. Let’s take a step back and look at what SOAP is. If we charitably leave aside the parts that are lipstick on the SOAP-RPC pig (e.g. SOAP data model, SOAP encoding and other RPC features), we are left with:

  • the SOAP processing model
  • the SOAP extensibility model
  • SOAP faults

Of these three, the first two are all about headers. Loose the ability to define headers in your interface, you loose 2/3 of the benefits of SOAP (and maybe even more, I would argue that SOAP faults aren’t nearly as useful as the other two items in the list).

The SOAP processing model describes how headers can be processed by intermediaries and, most important of all, it clearly spells out how the mustUnderstand attribute works. So that everyone who processes a SOAP message knows (or should know) that a header with no such attribute (or where the attribute has a value of “false”) can be ignored if not recognized, while the message must not be processed if a non-understood header has this attribute set to “true”. It doesn’t sound like much, but it is huge in terms of reliably supporting features and extending an existing set of SOAP messages.

When they are deprived of this mechanism (or rather, scared away from using it), people who create SOAP interfaces end up re-specifying this very common behavior over and over by attaching it to parts of the body. But then it has to be understood and implemented correctly every time rather than leveraging a built-in feature of SOAP that any SOAP engine must (if compliance with the SOAP spec has any meaning) implement.

Case in point, the wsman:OptimizeEnumeration element in WS-Management. The normal pattern for an enumeration (see WS-Enumeration) is to first send an Enumerate request to retrieve an enumeration context and then a series of Pull requests to get the data. In this approach, the minimum number of interactions to retrieve even a small enumeration set is 2 (one Enumerate and one Pull). The OptimizeEnumeration element was designed to allow one to request that the Enumerate call be also the first Pull so that we can save one interaction. Not a big deal if you are going to send 500 Pull messages to retrieve the data, but if the enumeration set is small enough to retrieve all the data in one or two Pull messages then you can optimize traffic by up to 50%.

WS-Management decided to do that with an extension element in the body rather than a SOAP header. The problem then is what happens when a WS-Enumeration implementation that doesn’t use the WS-Management additions gets that element. There is no general rule in SOAP about how to handle this. In this case, the WS-Enumeration specs says to ignore the extensions, but it only says so in the “notation conventions” section, and as a SHOULD, not a MUST.

Not surprisingly, some implementers made the wrong assumption on what the presence of wsman:OptimizeEnumeration means and as a result their code faults on receiving a message with this option, rather than ignoring it. Something that would likely not have happened had the SOAP processing model been used rather than re-specifying this behavior.

Using a header not only makes it clear when an extension can be ignored, it also provides, at no additional cost, the ability to specify when it should not be ignored. Rather than baking this into the spec as WS-Management tries to do (not very successfully), using headers allows a sender to decide based on his/her specific needs and capabilities whether support for this extension is a must or not in this specific instance. Usually, flexibility and interoperability tug in opposing directions, but using headers rather than body elements for protocol extensions improves both!

Two more comments about the use of headers in WS-Management:

  • It shows that it’s not all IBM’s fault, because IBM had no involvement in the development of the WS-Management standard,
  • WS-Management is far from being the worst offender at under-using SOAP headers, since it makes good use of them in other places (which probably wouldn’t have been the case had the first bullet not been true). But not consistently enough.

Now let’s look at WS-ResourceTransfer for another example. This too illustrates that those who ignore the SOAP processing model just end up re-inventing it in the body, but it also gives a beautiful illustration of the kind of compromises that come from design by committee (and I was part of this compromise so I am as guilty as the other authors).

WS-ResourceTransfer creates a set of extensions on top of the WS-Transfer specification. The main one is the ability to retrieve only a portion of the document as opposed to the whole document. Defining this extension as a SOAP header (as WS-Management does it) leverages the SOAP processing model. Doing so means you don’t have to write any text in your spec about whether this header can be ignored or not, and the behavior can be specified clearly and interoperably by the sender, through the use of the mustUnderstand attribute. In some case (say if the underlying XML document is in fact an entire database with gigabytes of data) I definitely want the receiver to fault if it doesn’t understand that I only want a portion of the document, so I will set mustUnderstand to true. In other cases (if the subset is not much smaller than the entire document and I am willing to wad through the entire document to extract the relevant data if I have to) then I’ll set mustUnderstand to “false” so that if the receiver can’t honor the extension then I get the whole document instead of a fault.

But with WS-ResourceTransfer, we had to deal with the “headers should not contain application information” view of the world in which it is heresy to use headers for these extensions (why the portion of the document to retrieve is considered “application information” while addressing information is not is beyond my reach and frankly not a debate worth having because this whole dichotomy only makes sense in an RPC view of the world).

So, between the “this information must be in the body” camp and the “let’s leverage the SOAP processing model” camp (my view, as should be obvious by now), the only possible compromise was to do both, with a mustUnderstand header that just says “look in the body to find the extensions that you must follow”. So much for elegance and simplicity. And since that header applies to all the extensions, you have to make all of them mustUnderstand or none of them. For example, you can’t have a wsrt:Create message in which the wsmex:Metadata extension must be understood but the wsrt:Fragment extension can be ignored.

WS-Management and WS-ResourceTransfer share the characteristic of being designed as extensions to existing specifications. But even when creating a SOAP-based interaction from scratch, one benefits from using the SOAP processing model rather than re-inventing ways to express whether support for an option is required or not. I have seen several other examples (in interfaces that aren’t public like the two examples above) that also re-invent this very common behavior for no good reason. Simply because developers are scared of headers.

These examples illustrate a simple point. Most of the value in SOAP resides in how headers are used in the SOAP processing model and the SOAP extensibility model. Steering developers away from defining headers is robbing them of that value. Rather than thinking in terms of header and body, I prefer to think of them as header and tail.