So you want to build an EPR?

EPR (Endpoint References, from WS-Addressing) are a shiny and exciting toy. But a sharp one too. So here is my contribution to try to prevent fingers from being cut and eyes from being poked out.

So far I have seen EPRs used for five main reasons, not all of them very inspired:

1) “Dispatching on URIs is not cool”

Some tools make it hard to dispatch on URI. As a result, when you have many instances of the same service, it is easier to write the service if the instance ID is in the message rather than in the endpoint URI. Fix the tools? Nah, let’s modify the messages instead. I guess that’s what happens when tool vendors drive the standards, you see specifications that fit the tools rather than the contrary. So EPRs are used to put information that should be in the URI in headers instead. REST-heads see this as a capital crime. I am not convinced it is so harmful in practice, but it is definitely not a satisfying justification for EPRs.

2) “I don’t want to send a WSDL doc around for just the endpoint URI”

People seem to have this notion that the WSDL is a “big and static” document and the EPR is a “small and dynamic” document. But WSDL was designed to allow design-time and run-time elements to be separated if needed. If all you want to send around is the URI at which the service is available, you can just send the URI. Or, if you want it wrapped, why not send a soap:address element (assuming the binding is well-known). After all, in many cases EPRs don’t contain the optional service element and its port attribute. If the binding is not known and you want to specify it, send a around a wsdl:port element which contains the soap:address as well as the QName of the binding. And if you want to be able to include several ports (for example to offer multiple transports) or use the wsdl:import mechanism to point to the binding and portType, then ship around a simplified wsdl:descriptions with only one service that itself contains the port(s) (if I remember correctly, WS-MessageDelivery tried to formalize this approach by calling a WSRef a wsdl:service element where all the ports use the same portType). And you can hang metadata off of a service element just as well as off of an EPR.

For some reason people are happy sending an EPR that contains only the address of the endpoint but not comfortable with sending a piece of WSDL of the same size that says the same thing. Again, not a huge deal now that people seem to have settled on using EPRs rather than service elements, but clearly not a satisfying justification for inventing EPRs in the first place.

3) “I can manage contexts without thinking about it”

Dynamically generated EPRs can be used as a replacement for an explicit context mechanism, such as those provided by WS-Context and WS-Coordination. By using EPRs for this, you save yourself the expense of supporting yet-another-spec. What do you loose? This paper gives you a detailed answer (it focuses on comparing EPRs to WS-Context rather than WS-Coordination for pretty obvious reasons, but I assume that on a purely technical level the authors would also recommend WS-Coordination over EPRs, right Greg?). In a shorter and simplified way, my take on the reason why you want to be careful using dynamic EPRs for context is that by doing so you merge the context identifier on the one hand and the endpoint with which you use this context on the other hand into one entity. Once this is done you can’t reliably separate them and you loose potentially valuable information. For example, assume that your company buys from a bunch of suppliers and for each purchase you get an EPR that allows you to track the purchase as it is shipped. These EPRs are essentially one blob to you and the only way to know which one comes through FedEx versus UPS is to look at the address and try to guess based on the domain name. But you are at the mercy of any kind of redirection or load-balancing or other infrastructure reason that might modify the address. That’s not a problem if all you care about is checking the ETA on the shipment, each EPR gives you enough information to do that. But if you also want to consolidate the orders that UPS is delivering to you or if you read in the paper about a potential UPS drivers strike and want to see how it would impact you, it would be nice to have each shipment be an explicit context Id associated to a real service (UPS or FedEx), rather than a mix of both at the same time. This way you can also go to UPS.com, ask about your shipments and easily map each entry returned to an existing shipment you are tracking. With EPRs rather than explicit context you can’t do this without additional agreements.

The ironic thing is that the kind of mess one can get into by using dynamic EPRs too widely instead of explicit context is very similar in nature to the management problems HP OpenView software solves. Discovery of resources, building relationship trees, impact analysis, event correlation, etc. We do it by using both nicely-designed protocols/models (the clean way) and by using heuristics and other hacks when needed. We do what it takes to make sense of the customer’s system. So we could just as well help you manage your shipments even if they were modeled as EPRs (in this example). But we’d rather work on solving existing problems and open new possibilities than fix problems that can be avoided. And BTW using dynamic EPRs is not always bad. Explicit contexts are sometimes overkill. But keep in mind that you are loosing data by bundling the context with the endpoint. Actually, more than loosing data, you are loosing structure in your data. And these days the gold is less in the raw data than in its structure and the understanding you have of it.

4) “I use reference parameters to create new protocols, isn’t that cool!”

No it’s not. If you want to define a SOAP header, go ahead: define an XML element and then describe the semantics associated with this element when it appears as a SOAP header. But why oh why define it as a “reference parameter” (or “reference property” depending on your version of WS-A)? The whole point of an EPR is to be passed around. If you are going to build the SOAP message locally, you don’t need to first build an EPR and then deconstruct it to extract the reference parameters out of it and insert them as SOAP headers. Just build the SOAP message by putting in the SOAP headers you know are needed. If your tooling requires going through an EPR to build the SOAP message, fine, that’s your problem, but don’t force this view on people who may want to use your protocol. For example, one can argue for or against the value of WS-Management‘s System and SelectorSet as SOAP headers, but it doesn’t make sense to define those as reference parameters rather than just SOAP headers (readers of this blog already know that I am the editor of the WSDM MUWS OASIS standard with which WS-Management overlaps so go ahead and question my motives for picking on WS-Management). Once they are defined as SOAP headers, one can make the adventurous decision to hard-code them in EPRs and to send the EPRs to someone else. But that’s a completely orthogonal decision (and the topic of the fifth way EPRs are used – see below). But using EPRs to define protocols is definitely not a justification for EPRs and one would have a strong case to argue that it violates the opacity of reference parameters specified in WS-Addressing.

5) “Look what I can do by hard-coding headers!”

The whole point of reference parameters is to make people include elements that they don’t understand in their SOAP headers (I don’t buy the multi-protocol aspect of WS-Addressing, as far as I am concerned it’s a SOAP thing). This mechanism is designed to open a door to hacking. Both in the good sense of the term (hacking as a clever use of technology, such as displaying Craig’s list rental data on top of Google maps without Craig’s List or Google having to know about it), and in the bad sense of the term (getting things to happen that you should not be able to make happen). Here is an example of good use for reference parameters: if the Google search SOAP input message accepted a header that specifies what site to limit the search on (equivalent to adding “site:vambenepe.com” in the Google text box on Google.com), I could distribute to people an EPR to the vambenepe.com search service by just giving them an EPR pointing to the Google search service and adding a reference parameter that corresponds to the header instructing Google to limit the search to vambenepe.com.

Some believe this is inherently evil and should be stopped, as expressed in this formal objection. I think this is a useful mechanism (to be used rarely and carefully) and I would like to see it survive. But there are two risks associated with this mechanism that people need to understand.

The first risk is that EPRs allow people to trick others into making statements that they don’t know they are making. This is explained in the formal objection from Anish and friends as their problem #1 (“Safety and Security”) and I agree with their description. But I don’t agree with the proposed solutions as they prevent reference parameters to be treated by the service like any other SOAP header. Back last November I made an alternative proposal, using a wsa:CoverMyRearside element that would not have this drawback and I know other people have made similar proposals. In any case, this risk can and should be addressed by the working group before the specification becomes a Recommendation or people will stop accepting to process reference parameters after a few high-profile hacks. Reference parameters will become the ActiveX of SOAP.

The second risk is more subtle and that one cannot be addressed by the specification. It is the fragility that will result from applications that share too many assumptions. I get suspicious when someone gives me directions to their house with instructions such as “turn left after the blue van” or “turn right after the barking dog”, don’t you? “We’re the house after the green barn” is a little better but what if I want to re-use these directions a few years later. What’s the chance that the barn will be replaced or repainted? EPRs that contain reference parameters pose the same problem. Once you’ve sent the EPR, you don’t know how long it will be around, you don’t know who it will get forwarded to, you don’t know what the consumer will know. You need to spend at least as much efforts picking what data you use as a reference parameter (if anything) as you spend designing schemas and WSDL documents. If your organization is smart enough to have a process to validate schemas (and you need that), that same process should approve any element that is put in a reference parameter.

Or you’ll poke your eye out.

2 Comments

Filed under Everything, Implementation, Security, Standards, Tech

2 Responses to So you want to build an EPR?