William Vambenepe's blog

IT management in a changing IT world

Archive for the 'Implementation' Category

24
Jan
2006

Schema-based XPath tool

by William Vambenepe

Most XML editors offer an XPath tool that allows one to test and fine-tune XPath expressions by running them against XML documents. Very helpful but also potentially very deceptive. With such a tool it is very easy to convince oneself that an XPath expression is correct after running it against a few instance documents. And a month later the application behaves erratically (in many cases it probably won’t break; it will execute the request on the wrong element, which is worse) because the XPath expression is run on a different document and what it returns is not what the programmer had in mind. This is especially likely to occur as people use and abuse shortcuts such as “//” and ignore namespaces.

What we need is an XPath tool that can not only run the XPath against an instance document but can also run it against a schema. In the latter case, the tool would flag any portion of the schema that can possibly correspond to a node in the resulting nodeset. It would force programmers to realize that their //bar can match the /foo/bar that they want to reach but it could also match something that falls under the xsd:any at the end of the schema. And the programmer has to deal with that.
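To make the //bar trap concrete, here is a minimal sketch using Python’s standard ElementTree module. The instance document and its extension element are made up for illustration; the second bar stands in for whatever an xsd:any wildcard might let through. This only shows the problem on one instance document. It is not the schema-aware tool described above.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document: the second <bar> sits inside an extension
# element, the kind of content an xsd:any wildcard would allow.
DOC = """
<foo>
  <bar>wanted</bar>
  <extension>
    <bar>unexpected</bar>
  </extension>
</foo>
"""

root = ET.fromstring(DOC)

# Equivalent of //bar: every <bar> descendant of the root, at any depth.
loose = [e.text for e in root.findall(".//bar")]

# Equivalent of /foo/bar: only <bar> elements that are direct children of <foo>.
strict = [e.text for e in root.findall("./bar")]

print(loose)   # ['wanted', 'unexpected']
print(strict)  # ['wanted']
```

Against a test document that has no extension content, both expressions return the same single node, which is exactly why the instance-only test is deceptive.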

23
Jan
2006

The joys of code-generated WSDL

by William Vambenepe

I recently ran into a WSDL which included an operation called “main”. The type of the request message for the operation was of course:

<complexType name="ArrayOf_xsd_string">
  <complexContent>
    <restriction base="soapenc:Array">
      <attribute ref="soapenc:arrayType" wsdl:arrayType="xsd:string[]"/>
    </restriction>
  </complexContent>
</complexType>

Command line over SOAP. Nice…

05
Aug
2005

Apache WSRF, Pubscribe and Muse v1.0 Releases

by William Vambenepe

The WSRF, Pubscribe and Muse teams at Apache have reached a major milestone in their work: version 1.0 release. Congrats to the teams! Binary and source distributions can be downloaded from:

21
Jun
2005

New names for Apache projects

by William Vambenepe

As part of the move out of incubation into full-fledged Apache projects, the WSRF, WS-Notif and WSDM MUWS implementations in Apache have seen some name and URL changes. So here is the new list with the correct links:

20
Jun
2005

So you want to build an EPR?

by William Vambenepe

EPRs (Endpoint References, from WS-Addressing) are a shiny and exciting toy. But a sharp one too. So here is my contribution to try to prevent fingers from being cut and eyes from being poked out.

So far I have seen EPRs used for five main reasons, not all of them very inspired:

1) “Dispatching on URIs is not cool”

Some tools make it hard to dispatch on URI. As a result, when you have many instances of the same service, it is easier to write the service if the instance ID is in the message rather than in the endpoint URI. Fix the tools? Nah, let’s modify the messages instead. I guess that’s what happens when tool vendors drive the standards: you see specifications that fit the tools rather than the contrary. So EPRs are used to put information that should be in the URI in headers instead. REST-heads see this as a capital crime. I am not convinced it is so harmful in practice, but it is definitely not a satisfying justification for EPRs.

2) “I don’t want to send a WSDL doc around for just the endpoint URI”

People seem to have this notion that the WSDL is a “big and static” document and the EPR is a “small and dynamic” document. But WSDL was designed to allow design-time and run-time elements to be separated if needed. If all you want to send around is the URI at which the service is available, you can just send the URI. Or, if you want it wrapped, why not send a soap:address element (assuming the binding is well-known)? After all, in many cases EPRs don’t contain the optional service element and its port attribute. If the binding is not known and you want to specify it, send around a wsdl:port element which contains the soap:address as well as the QName of the binding. And if you want to be able to include several ports (for example to offer multiple transports) or use the wsdl:import mechanism to point to the binding and portType, then ship around a simplified wsdl:definitions with only one service that itself contains the port(s) (if I remember correctly, WS-MessageDelivery tried to formalize this approach by calling a WSRef a wsdl:service element where all the ports use the same portType). And you can hang metadata off of a service element just as well as off of an EPR.

For some reason people are happy sending an EPR that contains only the address of the endpoint but not comfortable with sending a piece of WSDL of the same size that says the same thing. Again, not a huge deal now that people seem to have settled on using EPRs rather than service elements, but clearly not a satisfying justification for inventing EPRs in the first place.
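As a rough illustration of “says the same thing”, the sketch below builds an address-only EPR and a wsdl:port with ElementTree and reads the endpoint URI back out of either one. The endpoint URL, the port name and the binding QName are invented for the example. The namespaces are the WS-Addressing 1.0 one (drafts of that era used other URIs) and the WSDL 1.1 ones.

```python
import xml.etree.ElementTree as ET

WSA = "http://www.w3.org/2005/08/addressing"    # WS-Addressing 1.0
WSDL = "http://schemas.xmlsoap.org/wsdl/"       # WSDL 1.1
SOAP = "http://schemas.xmlsoap.org/wsdl/soap/"  # WSDL 1.1 SOAP binding

ADDRESS = "http://example.com/services/tracking"  # hypothetical endpoint


def minimal_epr(address):
    """An EPR that carries nothing but the endpoint address."""
    epr = ET.Element(f"{{{WSA}}}EndpointReference")
    ET.SubElement(epr, f"{{{WSA}}}Address").text = address
    return epr


def minimal_port(address, name, binding):
    """A wsdl:port carrying the same address plus the binding QName."""
    port = ET.Element(f"{{{WSDL}}}port", {"name": name, "binding": binding})
    ET.SubElement(port, f"{{{SOAP}}}address", {"location": address})
    return port


def address_of(element):
    """Return the endpoint URI from either representation."""
    wsa_address = element.find(f"{{{WSA}}}Address")
    if wsa_address is not None:
        return wsa_address.text
    soap_address = element.find(f"{{{SOAP}}}address")
    if soap_address is not None:
        return soap_address.get("location")
    raise ValueError("no endpoint address found")


epr = minimal_epr(ADDRESS)
# The "tns" prefix is a placeholder; a real document would have to declare it.
port = minimal_port(ADDRESS, "TrackingPort", "tns:TrackingSoapBinding")

print(address_of(epr) == address_of(port))  # True
```

The consumer-side code is the same handful of lines either way; the port version simply has room for the binding when you need it.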

3) “I can manage contexts without thinking about it”

Dynamically generated EPRs can be used as a replacement for an explicit context mechanism, such as those provided by WS-Context and WS-Coordination. By using EPRs for this, you save yourself the expense of supporting yet-another-spec. What do you lose? This paper gives you a detailed answer (it focuses on comparing EPRs to WS-Context rather than WS-Coordination for pretty obvious reasons, but I assume that on a purely technical level the authors would also recommend WS-Coordination over EPRs, right Greg?). In a shorter and simplified way, my take on the reason why you want to be careful using dynamic EPRs for context is that by doing so you merge the context identifier on the one hand and the endpoint with which you use this context on the other hand into one entity. Once this is done you can’t reliably separate them and you lose potentially valuable information. For example, assume that your company buys from a bunch of suppliers and for each purchase you get an EPR that allows you to track the purchase as it is shipped. These EPRs are essentially one blob to you and the only way to know which one comes through FedEx versus UPS is to look at the address and try to guess based on the domain name. But you are at the mercy of any kind of redirection or load-balancing or other infrastructure reason that might modify the address. That’s not a problem if all you care about is checking the ETA on the shipment: each EPR gives you enough information to do that. But if you also want to consolidate the orders that UPS is delivering to you or if you read in the paper about a potential UPS drivers’ strike and want to see how it would impact you, it would be nice to have each shipment be an explicit context ID associated with a real service (UPS or FedEx), rather than a mix of both at the same time. This way you can also go to UPS.com, ask about your shipments and easily map each entry returned to an existing shipment you are tracking. With EPRs rather than explicit context you can’t do this without additional agreements.
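The difference in structure can be sketched with two small data types. Everything here (class names, addresses, shipment IDs) is hypothetical. The point is only that grouping by carrier is trivial when the service and the context ID are kept apart, and guesswork when all you hold is an address.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class OpaqueEpr:
    """Context and endpoint merged: all the consumer holds is an address."""
    address: str


@dataclass(frozen=True)
class TrackedShipment:
    """Explicit context: a service identity plus a context ID."""
    service: str      # e.g. "UPS" or "FedEx"
    context_id: str   # shipment identifier within that service
    address: str      # where to send requests today; free to change


def group_by_service(shipments):
    """Map each service name to the context IDs tracked with it."""
    groups = defaultdict(list)
    for shipment in shipments:
        groups[shipment.service].append(shipment.context_id)
    return dict(groups)


# Hypothetical data: the addresses sit behind a load balancer, so the host
# name says nothing reliable about which carrier is behind each one.
shipments = [
    TrackedShipment("UPS", "1Z-001", "http://lb7.example.net/t?k=9f3a"),
    TrackedShipment("FedEx", "FX-778", "http://lb2.example.net/t?k=77c1"),
    TrackedShipment("UPS", "1Z-002", "http://lb7.example.net/t?k=b210"),
]
eprs = [OpaqueEpr(s.address) for s in shipments]  # what the EPR-only consumer sees

by_service = group_by_service(shipments)
print(by_service["UPS"])  # ['1Z-001', '1Z-002']
```

There is no equivalent of group_by_service for the eprs list short of parsing the addresses and hoping.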

The ironic thing is that the kind of mess one can get into by using dynamic EPRs too widely instead of explicit context is very similar in nature to the management problems HP OpenView software solves. Discovery of resources, building relationship trees, impact analysis, event correlation, etc. We do it by using both nicely-designed protocols/models (the clean way) and by using heuristics and other hacks when needed. We do what it takes to make sense of the customer’s system. So we could just as well help you manage your shipments even if they were modeled as EPRs (in this example). But we’d rather work on solving existing problems and open new possibilities than fix problems that can be avoided. And BTW using dynamic EPRs is not always bad. Explicit contexts are sometimes overkill. But keep in mind that you are losing data by bundling the context with the endpoint. Actually, more than losing data, you are losing structure in your data. And these days the gold is less in the raw data than in its structure and the understanding you have of it.

4) “I use reference parameters to create new protocols, isn’t that cool!”

No it’s not. If you want to define a SOAP header, go ahead: define an XML element and then describe the semantics associated with this element when it appears as a SOAP header. But why oh why define it as a “reference parameter” (or “reference property” depending on your version of WS-A)? The whole point of an EPR is to be passed around. If you are going to build the SOAP message locally, you don’t need to first build an EPR and then deconstruct it to extract the reference parameters out of it and insert them as SOAP headers. Just build the SOAP message by putting in the SOAP headers you know are needed. If your tooling requires going through an EPR to build the SOAP message, fine, that’s your problem, but don’t force this view on people who may want to use your protocol. For example, one can argue for or against the value of WS-Management’s System and SelectorSet as SOAP headers, but it doesn’t make sense to define those as reference parameters rather than just SOAP headers (readers of this blog already know that I am the editor of the WSDM MUWS OASIS standard with which WS-Management overlaps so go ahead and question my motives for picking on WS-Management). Once they are defined as SOAP headers, one can make the adventurous decision to hard-code them in EPRs and to send the EPRs to someone else. But that’s a completely orthogonal decision (and the topic of the fifth way EPRs are used - see below). But using EPRs to define protocols is definitely not a justification for EPRs and one would have a strong case to argue that it violates the opacity of reference parameters specified in WS-Addressing.

5) “Look what I can do by hard-coding headers!”

The whole point of reference parameters is to make people include elements that they don’t understand in their SOAP headers (I don’t buy the multi-protocol aspect of WS-Addressing, as far as I am concerned it’s a SOAP thing). This mechanism is designed to open a door to hacking. Both in the good sense of the term (hacking as a clever use of technology, such as displaying Craig’s list rental data on top of Google maps without Craig’s List or Google having to know about it), and in the bad sense of the term (getting things to happen that you should not be able to make happen). Here is an example of good use for reference parameters: if the Google search SOAP input message accepted a header that specifies what site to limit the search on (equivalent to adding “site:vambenepe.com” in the Google text box on Google.com), I could distribute to people an EPR to the vambenepe.com search service by just giving them an EPR pointing to the Google search service and adding a reference parameter that corresponds to the header instructing Google to limit the search to vambenepe.com.
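Here is a minimal sketch of that mechanism, assuming a made-up ex:SiteRestriction header and a made-up search endpoint (Google defines no such header as far as I know). The sender copies every reference parameter into the SOAP header without needing to understand any of them. It is deliberately simplified and leaves out the other addressing headers a real message would carry.

```python
import copy
import xml.etree.ElementTree as ET

SOAP_ENV = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2
WSA = "http://www.w3.org/2005/08/addressing"          # WS-Addressing 1.0
EX = "urn:example:search"                             # hypothetical namespace


def build_epr(address, site):
    """What the EPR issuer hands out: an address plus one reference parameter."""
    epr = ET.Element(f"{{{WSA}}}EndpointReference")
    ET.SubElement(epr, f"{{{WSA}}}Address").text = address
    params = ET.SubElement(epr, f"{{{WSA}}}ReferenceParameters")
    ET.SubElement(params, f"{{{EX}}}SiteRestriction").text = site
    return epr


def envelope_for(epr):
    """What the EPR consumer does: copy the parameters blindly into headers."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
    ET.SubElement(header, f"{{{WSA}}}To").text = epr.findtext(f"{{{WSA}}}Address")
    params = epr.find(f"{{{WSA}}}ReferenceParameters")
    if params is not None:
        for param in params:
            # The consumer never inspects the element; it just forwards it.
            header.append(copy.deepcopy(param))
    ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    return envelope


epr = build_epr("http://search.example.com/soap", "vambenepe.com")
envelope = envelope_for(epr)
soap_header = envelope.find(f"{{{SOAP_ENV}}}Header")

print(soap_header.findtext(f"{{{EX}}}SiteRestriction"))  # vambenepe.com
```

Note that envelope_for has no idea what SiteRestriction means. That is the convenience, and it is also the opening for abuse.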

Some believe this is inherently evil and should be stopped, as expressed in this formal objection. I think this is a useful mechanism (to be used rarely and carefully) and I would like to see it survive. But there are two risks associated with this mechanism that people need to understand.

The first risk is that EPRs allow people to trick others into making statements that they don’t know they are making. This is explained in the formal objection from Anish and friends as their problem #1 (“Safety and Security”) and I agree with their description. But I don’t agree with the proposed solutions as they prevent reference parameters from being treated by the service like any other SOAP header. Back last November I made an alternative proposal, using a wsa:CoverMyRearside element that would not have this drawback and I know other people have made similar proposals. In any case, this risk can and should be addressed by the working group before the specification becomes a Recommendation, or people will refuse to process reference parameters after a few high-profile hacks. Reference parameters will become the ActiveX of SOAP.

The second risk is more subtle and that one cannot be addressed by the specification. It is the fragility that will result from applications that share too many assumptions. I get suspicious when someone gives me directions to their house with instructions such as “turn left after the blue van” or “turn right after the barking dog”, don’t you? “We’re the house after the green barn” is a little better but what if I want to re-use these directions a few years later? What’s the chance that the barn will be replaced or repainted? EPRs that contain reference parameters pose the same problem. Once you’ve sent the EPR, you don’t know how long it will be around, you don’t know who it will get forwarded to, you don’t know what the consumer will know. You need to spend at least as much effort picking what data you use as a reference parameter (if anything) as you spend designing schemas and WSDL documents. If your organization is smart enough to have a process to validate schemas (and you need that), that same process should approve any element that is put in a reference parameter.

Or you’ll poke your eye out.

08
Jun
2005

Apollo, Hermes, Muse out of incubation at Apache

by William Vambenepe

Apollo (WS-ResourceProperties open source implementation), Hermes (WS-Notification open source implementation) and Muse (WSDM MUWS open source implementation) are now full Apache projects, out of incubation mode. Congrats Ian and Sal!

09
Feb
2005

WSDM and WSRF progress in Apache

by William Vambenepe

The Apache Muse and Apollo teams put out their first releases today:

Congratulations to the teams! Looking forward to the interop sessions.

24
Nov
2004

Globus toolkit in Apache?

by William Vambenepe

As Savas already noted (and commented on), Globus seems to intend to contribute all of their toolkit to Apache (warning, this is a link to a Word doc). This follows an earlier co-submission, between HP and Globus, of the WSRF and WSN implementation. It will be interesting to see how the Web services project at Apache scales up with all these contributions.

09
Nov
2004

Can you hear the muse?

by William Vambenepe

Check out this proposal. It brings three new incubator projects to the Web services activity in Apache. And they come with working code.

The projects are:

  • Muse, an open source implementation of WSDM MUWS. With existing code contributed by HP.
  • Apollo, an open source implementation of the WS-ResourceFramework (WSRF) specifications. With existing code contributed by HP and Globus.
  • Hermes, an open source implementation of the WS-Notification specifications. With existing code contributed by HP and Globus. You’ll also notice in the description of this project that it includes support of WS-Eventing (but this is not included in the contributed code).