William Vambenepe's blog

IT management in a changing IT world

The Persian scientist Avicenna introduced experimental medicine, buy now online viagra contagious diseases, introduced quarantine and clinical trials, and described many anaesthetics and medical and therapeutic drugs, in The Canon of Medicine.Some antibiotics are still produced and isolated from living price for generic viagra, such as the aminoglycosides; in addition, many more have been created through purely synthetic means, such as the quinolones.These developments show promise as a means to free viagra order online the growing bacterial resistance to existing antibiotics.In general, most experts agree that "mental health" and "mental natural viagra alternative" are not opposites.While major developments in medicine were occurring in the generic viagra pills world during the medieval period, the Western world remained dependent upon the Greco-Roman theory of humorism, which led to questionable treatments such as bloodletting.

Archive for the 'Specs' Category

05
Aug
2010

Updates on Microsoft Oslo and “SSH on Windows”

by William (@vambenepe on Twitter)

I’ve been tracking the modeling technology previously known as “Microsoft Oslo” with a sympathetic eye for the almost three years since it’s been introduced. I look at it from the perspective of model-driven IT management but the news hadn’t been good on that front lately (except for Douglas Purdy’s encouraging hint).

The prospects got even bleaker today, at least according to the usually-well-informed Mary Jo Foley, who writes: “Multiple contacts of mine are telling me that Microsoft has decided to shelve Quadrant and ‘refocus’ M.” Is “M” the end of the SDM/SML/M model-driven management approach at Microsoft? Or is the “refocus” a hint that M is returning “home” to address IT management use cases? Time (or Doug) will tell…

While we’re talking about Microsoft and IT automation, I have one piece of free advice for the Microsofties: people *really* want to SSH into Windows servers. Here’s how I know. This blog rarely talks about Microsoft but over the course of two successive weekends over a year ago I toyed with ways to remotely manage Windows machines using publicly documented protocols. In effect, showing what to send on the wire (from Linux or any platform) to leverage the SOAP-based management capabilities in recent versions of Windows. To my surprise, these posts (1, 2, 3) still draw a disproportionate amount of traffic. And whenever I look at my httpd logs, I can count on seeing search engine queries related to “windows native ssh” or similar keywords.

If heterogeneous Cloud is something Microsoft cares about they need to better leverage the potential of the PowerShell Remoting Protocol. They can release open-source Python, Java and Ruby client-side libraries. Alternatively, they can drastically simplify the protocol, rather than its current “binary over SOAP” (you read this right) incarnation. Because the poor Kridek who is looking for the “WSDL for WinRM / Remote Powershell” is in for a nasty surprise if he finds it and thinks he’ll get a ready-to-use stub out of it.

That being said, a brave developer willing to suck it up and create such a Python/Ruby/Java library would probably make some people very grateful.

18
Jul
2010

Introducing the Oracle Cloud API

by William (@vambenepe on Twitter)

Oracle recently published a Cloud management API on OTN and also submitted a subset of the API to the new DMTF Cloud Management working group. The OTN specification, titled “Oracle Cloud Resource Model API”, is available here. In typical DMTF fashion, the DMTF-submitted specification is not publicly available (if you have a DMTF account and are a member of the right group you can find it here). It is titled the “Oracle Cloud Elemental Resource Model” and is essentially the same as the OTN version, minus sections 9.2, 9.4, 9.6, 9.8, 9.9 and 9.10 (I’ll explain below why these sections have been removed from the DMTF submission). Here is also a slideset that was recently used to present the submitted specification at a DMTF meeting.

So why two documents? Because they serve different purposes. The Elemental Resource Model, submitted to DMTF, represents the technical foundation for the IaaS layer. It’s not all of IaaS, just its core. You can think of its scope as that of the base EC2 service (boot a VM from an image, attach a volume, connect to a network). It’s the part that appears in all the various IaaS APIs out there, and that looks very similar, in its model, across all of them. It’s the part that’s ripe for a simple standard, hopefully free of much of the drama of a more open-ended and speculative effort. A standard that can come out quickly and provide interoperability right out of the gate (for the simple use cases it supports), not after years of plugfests and profiles. This is the narrow scope I described in an earlier rant about Cloud standards:

I understand the pain of customers today who just want to have a bit more flexibility and portability within the limited scope of the VM/Volume/IP offering. If we really want to do a standard today, fine. Let’s do a very small and pragmatic standard that addresses this. Just a subset of the EC2 API. Don’t attempt to standardize the virtual disk format. Don’t worry about application-level features inside the VM. Don’t sweat the REST or SOA purity aspects of the interface too much either. Don’t stress about scalability of the management API and batching of actions. Just make it simple and provide a reference implementation. A few HTTP messages to provision, attach, update and delete VMs, volumes and IPs. That would be fine. Anything else (and more is indeed needed) would be vendor extensions for now.

Of course IaaS goes beyond the scope of the Elemental Resource Model. We’ll need load balancing. We’ll need tunneling to the private datacenter. We’ll need low-latency sub-networks. We’ll need the ability to map multi-tier applications to different security zones. Etc. Some Cloud platforms support some of these (e.g. Amazon has an answer to all but the last one), but there is a lot more divergence (both in the “what” and the “how”) between the various Cloud APIs on this. That part of IaaS is not ready for standardization.

Then there are the extensions that attempt to make the IaaS APIs more application-aware. These too exist in some Cloud APIs (e.g. vCloud vApp) but not others. They haven’t naturally converged between implementations. They haven’t seen nearly as much usage in the industry as the base IaaS features. It would be a mistake to overreach in the initial phase of IaaS standardization and try to tackle these questions. It would not just delay the availability of a standard for the base IaaS use cases, it would put its emergence and adoption in jeopardy.

This is why Oracle withheld these application-aware aspects from the DMTF submission, though we are sharing them in the specification published on OTN. We want to expose them and get feedback. We’re open to collaborating on them, maybe even in the scope of a standard group if that’s the best way to ensure an open IP framework for the work. But it shouldn’t make the upcoming DMTF IaaS specification more complex and speculative than it needs to be, so we are keeping them as separate extensions. Not to mention that DMTF as an organization has a lot more infrastructure expertise than middleware and application expertise.

Again, the “Elemental Resource Model” specification submitted to DMTF is the same as the “Oracle Cloud Resource Model API” on OTN except that it has a different license (a license grant to DMTF instead of the usual OTN license) and is missing some resources in the list of resource types (section 9).

Both specifications share the exact same protocol aspects. It’s pretty cleanly RESTful and uses a JSON serialization. The credit for the nice RESTful protocol goes to the folks who created the original Sun Cloud API as this is pretty much what the Oracle Cloud API adopted in its entirety. Tim Bray described the genesis and design philosophy of the Sun Cloud API last year. He also described his role and explained that “most of the heavy lifting was done by Craig McClanahan with guidance from Lew Tucker“. It’s a shame that the Oracle specification fails to credit the Sun team and I kick myself for not noticing this in my reviews. This heritage was noted from the get go in the slides and is, in my mind, a selling point for the specification. When I reviewed the main Cloud APIs available last summer (the first part in a “REST in practice for IT and Cloud management” series), I liked Sun’s protocol design the best.

The resource model, while still based on the Sun Cloud API, has seen many more changes. That’s where our tireless editor, Jack Yu, with help from Mark Carlson, has spent most of the countless hours he devoted to the specification. I won’t do a point to point comparison of the Sun model and the Oracle model, but in general most of the changes and additions are motivated by use cases that are more heavily tilted towards private clouds and compatibility with existing application infrastructure. For example, the semantics of a Zone have been relaxed to allow a private Cloud administrator to choose how to partition the Cloud (by location is an obvious option, but it could also by security zone or by organizational ownership, as heretic as this may sound to Cloud purists).

The most important differences between the DMTF and OTN versions relate to the support for assemblies, which are groups of VMs that jointly participate in the delivery of a composite application. This goes hand-in-hand with the recently-released Oracle Virtual Assembly Builder, a framework for creating, packing, deploying and configuring multi-tier applications. To support this approach, the Cloud Resource Model (but not the Elemental Model, as explained above) adds resource types such as AssemblyTemplate, AssemblyInstance and ScalabilityGroup.

So what now? The DMTF working group has received a large number of IaaS APIs as submissions (though not the one that matters most or the one that may well soon matter a lot too). If all goes well it will succeed in delivering a simple and useful standard for the base IaaS use cases, and we’ll be down to a somewhat manageable triplet (EC2, RackSpace/OpenStack and DMTF) of IaaS specifications. If not (either because the DMTF group tries to bite too much or because it succumbs to infighting) then DMTF will be out of the game entirely and it will be between EC2, OpenStack and a bunch of private specifications. It will be the reign of toolkits/library/brokers and hell on earth for all those who think that such a bridging approach is as good as a standard. And for this reason it will have to coalesce at some point.

As far as the more application-centric approach to hypervisor-based Cloud, well, the interesting things are really just starting. Let’s experiment. And let’s talk.

25
May
2010

Dear Cloud API, your fault line is showing

by William (@vambenepe on Twitter)

Most APIs are like hospital gowns. They seem to provide good coverage, until you turn around.

I am talking about the dreadful state of fault reporting in remote APIs, from Twitter to Cloud interfaces. They are badly described in the interface documentation and the implementations often don’t even conform to what little is documented.

If, when reading a specification, you get the impression that the “normal” part of the specification is the result of hours of whiteboard debate but that the section that describes the faults is a stream-of-consciousness late-night dump that no-one reviewed, well… you’re most likely right. And this is not only the case for standard-by-committee kind of specifications. Even when the specification is written to match the behavior of an existing implementation, error handling is often incorrectly and incompletely described. In part because developers may not even know what their application returns in all error conditions.

After learning the lessons of SOAP-RPC, programmers are now more willing to acknowledge and understand the on-the-wire messages received and produced. But when it comes to faults, there is still a tendency to throw their hands in the air, write to the application log and then let the stack do whatever it does when an unhandled exception occurs, on-the-wire compliance be damned. If that means sending an HTML error message in response to a request for a JSON payload, so be it. After all, it’s just a fault.

But even if fault messages may only represent 0.001% of the messages your application sends, they still represent 85% of those that the client-side developers will look at.

Client developers can’t even reverse-engineer the fault behavior by hitting a reference implementation (whether official or de-facto) the way they do with regular messages. That’s because while you can generate response messages for any successful request, you don’t know what error conditions to simulate. You can’t tell your Cloud provider “please bring down your user account database for five minutes so I can see what faults you really send me when that happens”. Also, when testing against a live application you may get a different fault behavior depending on the time of day. A late-night coder (or a daytime coder in another time zone) might never see the various faults emitted when the application (like Twitter) is over capacity. And yet these will be quite common at peak time (when the coder is busy with his day job… or sleeping).

All these reasons make it even more important to carefully (and accurately) document fault behavior.

The move to REST makes matters even worse, in part because it removes SOAP faults. There’s nothing magical about SOAP faults, but at least they force you to think about providing an information payload inside your fault message. Many REST APIs replace that with HTTP error codes, often accompanied by a one-line description with a sometimes unclear relationship with the semantics of the application. Either it’s a standard error code, which by definition is very generic or it’s an application-defined code at which point it most likely overlaps with one or more standard codes and you don’t know when you should expect one or the other. Either way, there is too much faith put in the HTTP code versus the payload of the error. Let’s be realistic. There are very few things most applications can do automatically in response to a fault. Mainly:

  • Ask the user to re-enter credentials (if it’s an authentication/permission issue)
  • Retry (immediately or after some time)
  • Report a problem and fail

So make sure that your HTTP errors support this simple decision tree. Beyond that point, listing a panoply of application-specific error codes looks like an attempt to look “RESTful” by overdoing it. In most cases, application-specific error codes are too detailed for most automated processing and not detailed enough to help the developer understand and correct the issue. I am not against using them but what matters most is the payload data that comes along.

On that aspect, implementations generally fail in one of two extremes. Some of them tell you nothing. For example the payload is a string that just repeats what the documentation says about the error code. Others dump the kitchen sink on you and you get a full stack trace of where the error occurred in the server implementation. The former is justified as a security precaution. The latter as a way to help you debug. More likely, they both just reflect laziness.

In the ideal world, you’d get a detailed error payload telling you exactly which of the input parameters the application choked on and why. Not just vague words like “invalid”. Is parameter “foo” invalid for syntactical reasons? Is it invalid because inconsistent with another parameter value in the request? Is it invalid because it doesn’t match the state on the server side? Realistically, implementations often can’t spend too many CPU cycles analyzing errors and generating such detailed reports. That’s fine, but then they can include a link to a wiki a knowledge base where more details are available about the error, its common causes and the workarounds.

Your API should document all messages accurately and comprehensively. Faults are messages too.

02
Apr
2010

Enterprise application integration patterns for IT management: a blast from the past or from the future?

by William (@vambenepe on Twitter)

In a recent blog post, Don Ferguson (CTO at CA) describes CA Catalyst, a major architectural overhaul which “applies enterprise application integration patterns to the problem of integrating IT management systems”. Reading this was fascinating to me. Not because the content was some kind of revelation, but exactly for the opposite reason. Because it is so familiar.

For the better part of the last decade, I tried to build just this at HP. In the process, I worked with (and sometimes against) Don’s colleague at IBM, who were on the same mission. Both companies wanted a flexible and reliable integration platform for all aspects of IT management. We had decided to use Web services and SOA to achieve it. The Web services management protocols that I worked on (WSMF, WSDM, WS-Management and the “reconciliation stack”) were meant for this. We were after management integration more than manageability. Then came CMDBf, another piece of the puzzle. From what I could tell, the focus on SOA and Web services had made Don (who was then Mr. WebSphere) the spiritual father of this effort at IBM, even though he wasn’t at the time focused on IT management.

As far as I know, neither IBM nor HP got there. I covered some of the reasons in this post-mortem. The standards bickering. The focus on protocols rather than models. The confusion between the CMDB as a tool for process/service management versus a tool for software integration. Within HP, the turmoil from the many software acquisitions didn’t help, and there were other reasons. I am not sure at this point whether either company is still aiming for this vision or if they are taking a different approach.

But apparently CA is still on this path, and got somewhere. At least according to Don’s post. I have no insight into what was built beyond what’s in the post. I am not endorsing CA Catalyst, just agreeing with the design goals listed by Don. If indeed they have built it, and the integration framework resists the test of time, that’s impressive. And exciting. It apparently even uses some the same pieces we were planning to use, namely WS-Management and CMDBf (I am reluctantly associated with the first and proudly with the second).

While most readers might not share my historical connection with this work, this is still relevant and important to anyone who cares about IT management in the enterprise. If you’re planning to be at CA World, go listen to Don. Web services may have a bad name, but the technical problems of IT management integration remain. There are only a few routes to IT management automation (I count seven, the one taken by CA is #2). You can throw away SOAP if you want, you still need to deal with protocol compatibility, model alignment and instance reconciliation. You need to centralize or orchestrate the management operations performed. You need to be able to integrate with complementary products or at the very least to effectively incorporate your acquisitions. It’s hard stuff.

Bonus point to Don for not forcing a “Cloud” angle for extra sparkle. This is core IT management.

01
Mar
2010

Two versions of a protocol is one too many

by William (@vambenepe on Twitter)

There is always a temptation, when facing a hard design decision in the process of creating an interface or a protocol, to produce two (or more) versions. It’s sometimes a good idea, as a way to explore where each one takes you so you can make a more informed choice. But we know how this invariably ends up. Documents get published that arguably should not. It’s even harder in a standard working group, where someone was asked (or at least encouraged) by the group to create each of the alternative specifications. Canning one is at best socially awkward (despite the appearances, not everyone in standards is a psychopath or a sadist) and often politically impossible.

And yet, it has to be done. Compare the alternatives, then pick one and commit. Don’t confuse being accommodating with being weak.

The typical example these days is of course SOAP versus REST: the temptation is to support both rather than make a choice. This applies to standards and to proprietary interfaces. When a standard does this, it hurts rather than promote interoperability. Vendors have a bit more of an excuse when they offer a choice (“the customer is always right”) but in reality it forces customers to play Russian roulette whether they want it or not. Because one of the alternatives will eventually be left behind (either discarded or maintained but not improved). If you balance the small immediate customer benefit of using the interface style they are most used to with the risk of redoing the integration down the road, the value proposition of offering several options crumbles.

[Pedantic disclaimer: I use the term "REST" in this post the way it is often (incorrectly) used, to mean pretty much anything that uses HTTP without a SOAP wrapper. The technical issues are a topic for other posts.]

CMDBf

CMDBf v1 is a DMTF standard. It is a SOAP-based protocol. For v2, it has been suggested that there should a REST version. I don’t know what the CMDBf group (in which I participate) will end up doing but I’ve made my position clear: I could go either way (remain with SOAP or dump it) but I do not want to have two versions of the protocol (one SOAP one REST). If we think we’re better off with a REST version, then let’s make v2 REST-only. Supporting both mechanisms in v2 would be stupid. They would address the same use cases and only serve to provide political ass-coverage. There is no functional need for both. The argument that we need to keep supporting SOAP for the benefit of those who implemented v1 doesn’t fly. As an implementer, nobody is saying that you need to turn off your v1 services the second you launch the v2 version.

DMTF Cloud

Between the specifications submitted directly to DMTF, the specifications developed by DMTF “partner” organizations and the existing DMTF protocols, the DMTF Cloud effort is presented with a mix of SOAP, RESTful and XML-RPC-over-HTTP options. In the process of deciding what to create or adopt I am sure that the temptation will be high to take the easy route of supporting several versions to placate everyone. But such a “consensus” would be achieved on the back of the implementers so I very much hope it won’t be the case.

When it is appropriate

There are cases where supporting alternatives options is worth the cost. But it typically happens when they serve very different use cases. Think of SAX versus DOM, which have clearly differentiated sweetspots. In the Cloud world, Amazon S3 gives us interesting examples of both justified and extraneous alternatives. The extraneous one is the choice between REST and SOAP for the S3 API. I often praise AWS for its innovation and pragmatism, but this is an example of something that only looks pragmatic. On the other hand, the AWS import/export mechanism is a useful alternative. It allows you to physically ship a device with a few terabytes of data to Amazon. This is technically an alternative to the S3 programmatic interface, but one with obviously differentiated use cases. I recommend you reserve the use of “alternative APIs” for such scenarios.

If it didn’t work for Tiger Woods, it won’t work for your Cloud API either. Learn to commit.

[CLARIFICATION: based on some of the early Twitter feedback on this entry, I want to clarify that it's alternative versions that I am against, not successive versions (i.e. an evolution of the interface over time). How to manage successive versions properly is a whole other debate.]

19
Feb
2010

HP has submitted a specification to the DMTF Cloud incubator

by William (@vambenepe on Twitter)

When I lamented, in a previous post, that I couldn’t tell you about recent submissions to the DMTF Cloud incubator, one of those I had in mind was a submission from HP. I can now write this, because the author of the specification, Nigel Cook, has recently blogged about it. Unfortunately he is isn’t publishing the specification itself, just an announcement that it was submitted. Hopefully he is currently going through the long approval process to make the submitted document public (been there, done that, I know it takes time).

In the blog, Nigel makes a good argument for the need to go beyond a hypervisor-centric view of Cloud computing. Even at the IaaS layer there are cases of automated-but-not-virtualized deployment that have all the characteristics of Cloud computing and need to be supported by Cloud management APIs. Not to mention OS-level isolation like Solaris Containers.

Nigel also offers a spirited defense of SOAP-based protocols. I don’t necessarily agree with all his points (“one could easily map the web service definition I described to REST if that was important” suggests a “it’s just SOAP without the wrapper” view of REST), but I am glad he is launching this debate. We need to discuss this rather than assume that REST is the obvious answer. Remember, a few years ago SOAP was just as obvious an answer to any protocol question. It may well be that indeed REST comes out ahead of this discussion, but the process will force us to be explicit about what benefits of REST we are trying to achieve and will allow us to be practical in the way we approach it.

17
Feb
2010

Waiting for events (in Cloud APIs)

by William (@vambenepe on Twitter)

Events/alerts/notifications have been a central concept in IT management at least since the first SNMP trap was emitted, and probably even long before that. And yet they are curiously absent from all the Cloud management APIs/protocols. If you think that’s because “THE CLOUD CHANGES EVERYTHING” then you may have to think again. Over the last few days, two of the most experienced practitioners of Cloud computing pointed out that this omission is a real pain in the neck. RightScale’s Thorsten von Eicken was first to request “an event based interface instead of a request-reply based interface”, pointing out that “we run a good number of machines that do nothing but chew up 100% cpu polling EC2 to detect changes”. George Reese seconded and started to sketch a solution. And while these blog posts gave the issue increased visibility recently, it has been a recurring topic on the AWS Forum and other similar discussion boards for quite some time. For example, in this thread going back to 2006, an Amazon employee wrote that “this is a feature we’ve discussed recently and we’re looking at options” (incidentally, I see a post by Thorsten in that old thread). We’re still waiting.

Let’s look at what it would take to define such a feature.

I have some experience with events for IT management, having been involved in the WS-Notification family of specifications and having co-chaired the OASIS technical committee that standardized them. This post is not about foisting WS-Notification on Cloud APIs, but just about surfacing some of the questions that come up when you try to standardize such a mechanism. While the main use cases for WS-Notification came from IT (and Grid) management, it was supposed to be a generic mechanism. A Cloud-centric eventing protocol can be made simpler by focusing on fewer use cases (Cloud scenarios only). In addition, WS-Notification was marred by the complexity-is-a-sign-of-greatness spirit of the time . On this too, a Cloud eventing protocol could improve things by keeping IBM at bay simplicity in mind.

Types of event

When you pull the state of a resource to see if anything changed,  you don’t have to tell the provider what kind of change you are interested in. If, on the other hand, you want the provider to notify you, then they need to know what you care about. You may not want to be notified on every single change in the resource state. How do you describe the changes you care about? Is there an agreed-upon set of states for the resource and you are only notified on state transitions? Can you indicate the minimum severity level for an event to be emitted? Who determines the severity of an event? Or do you get to specify what fields in the resource state you want to watch? What about numeric values for which you may not want to be notified of every change but only when a threshold is crossed? Do you get to specify a query and get notified whenever the query result changes? In WS-Notification some of this is handled by WS-Topics which I still like conceptually (I co-edited it) but is too complex for the task at hand.

Event formats

What format are the events serialized in? How is the even metadata captured (e.g. time stamp of observation, which may not be the same as the time at which the notification message was sent)? If the event payload is a representation of the new state of the resource, does it indicate what field changes (and what the old value was)? How do you keep event payloads consistent with the resource representation in the request/response interactions? If many events occur near the same time, can you group them in one notification message for better scalability?

Subscription creation

Presumably you need a subscription mechanism. Is the subscription set in stone when the resource is created? Or can you come later and subscribe? If subscription is an operation on the resource itself, how do you subscribe for events on something that doesn’t exist yet (e.g. “create a VM and notify me once it’s started”)? Do you get to set subscriptions on a per-resource-basis? Or is this a global setting for all the resources that you own? Can you have two different subscriptions on the same resource (e.g. a “critical events only” subscription that exist throughout the life of the resource, plus a “lots of events please” subscription that you keep for a few hours while troubleshooting)?

Subscription management

Do you get to come back and update/pause/delete a subscription? Do you get to change what filter the subscription carries? Or is it set in stone until the subscription expires? Can you change the delivery endpoint? What if events fail to be delivered? Does the provider cancel your subscription? After how many failures? Does it just pause it for a few hours? Keep trying?

Subscription expiration

Who sets the expiration period? The subscriber? Can the provider set a max duration? Do you get a warning message before the subscription expires? Can you renew a subscription or do you have to create a new one? Do you get a message telling you that it has expired? Where are these subscription-lifecycle messages sent? To the same endpoint as the regular messages? What if your subscription is being killed because your deliver endpoint is down, clearly it makes no sense to send the warning message to that same endpoint. Do you provide a separate “subscription management” endpoint (different from the event delivery endpoint) when you subscribe? Alternatively, does an email message get sent to the registered user who set the subscription?

Delivery reliability

How reliable do you want the notifications to be? Should the emitter retry until they’ve received a confirmation? How long do they keep messages that can’t be delivered? Some may have a very short shelf life while others are still useful weeks later. If you don’t have a reliable mechanism but you really “need to know about a lost server within a minute of it disappearing” (the example Georges gives) then in reality you may still have to poll just to make sure that an event wasn’t lost. If you haven’t received an event in a while, how can you test if the subscription is still working? Should subscriptions send a heartbeat message once a while?

Delivery mechanism

How do you deliver notifications? Do you keep HTTP connections open through tricks similar to how self-updating web pages work (e.g. COMET, long polling and soon WebSockets)? Or do you just provide a listener endpoint to which the notifier tries to connect (which, in the case of public cloud deployments, means you need to have a publicly-addressable listener, but hopefully not on the same Cloud infrastructure). Do you use XMPP? AMQP? Email? Can I have you hold my events and let me come pull them?

Security

Do you need to verify the origin of the events you receive? Or do you assume they may be forged and always initiate a connection to the provider to double-check? And on the other side, what are the security requirements for event delivery? If a user looses some of their privileges, do you have to go and cancel the still-active subscriptions that they created?

Throttling

Is there a maximum event rate? Do you get charged for the events the Cloud provider sends you? How do you make sure that someone doesn’t create a subscription pointing to the wrong endpoint (either erroneously or maliciously, e.g. DoS). Do you send a test message at registration asking the delivery endpoint to acknowledge that they indeed want to receive these notifications?

Conclusion

My goal is not to argue that we cannot have a simple yet good enough notification system or to scare anyone from attempting to define it. It’s just to show that it’s not as simple as it may seem at first blush. But there probably is a sweetspot and people like Thorsten and George are very well qualified to find it.

[UPDATED 2010/4/7: Amazon releases AWS Simple notification Service. Not just as an eventing feature for the Cloud API, as a generic notification service. Which can, of course, also carry Cloud management events. Though at this point you're on your own to publish them from your instances, it doesn't look like the AWS infrastructure can do it for you. Which means, for example, that you're not going to be able to publish an event for a sudden crash.]

14
Feb
2010

Can Cloud standards be saved?

by William (@vambenepe on Twitter)

Then: Web services standards

One of the most frustrating aspects of how Web services standards shot themselves in the foot via unchecked complexity is that plenty of people were pointing out the problem as it happened. Mark Baker (to whom I noticed Don Box also paid tribute recently) is the poster child. I remember Tom Jordahl tirelessly arguing for keeping it simple in the WSDL working group. Amberpoint’s Fred Carter did it in WSDM (in the post announcing the recent Amberpoint acquisition, I mentioned that “their engineers brought to the [WSDM] group a unique level of experience and practical-mindedness” but I could have added “… which we, the large companies, mostly ignored.”)

The commonality between all these voices is that they didn’t come from the large companies. Instead they came from the “specialists” (independent contractors and representatives from small, specialized companies). Many of the WS-* debates were fought along alliance lines. Depending on the season it could be “IBM vs. Microsoft”, “IBM+Microsoft vs. Oracle”, “IBM+HP vs. Microsoft+Intel”, etc… They’d battle over one another’s proposal but tacitly agreed to brush off proposals from the smaller players. At least if they contained anything radically different from the content of the submission by the large companies. And simplicity is radical.

Now: Cloud standards

I do not reminisce about the WS-* standards wars just for old time sake or the joy of self-flagellation. I also hope that the current (and very important) wave of standards, related to all things Cloud, can do better than the Web services wave did with regards to involving on-the-ground experts.

Even though I still work for a large company, I’d like to see this fixed for Cloud standards. Not because I am a good guy (though I hope I am), but because I now realize that in the long run this lack of perspective even hurts the large companies themselves. We (and that includes IBM and Microsoft, the ringleaders of the WS-* effort) would be better off now if we had paid more attention then.

Here are two reasons why the necessity to involve and include specialists is even more applicable to Cloud standards than Web services.

First, there are many more individuals (or small companies) today with a lot of practical Cloud experience than there were small players with practical Web services experience when the WS-* standardization started (Shlomo Swidler, Mitch Garnaat, Randy Bias, John M. Willis, Sam Johnston, David Kavanagh, Adrian Cole, Edward M. Goldberg, Eric Hammond, Thorsten von Eicken and Guy Rosen come to mind, though this is nowhere near an exhaustive list). Which means there is even more to gain by ensuring that the Cloud standard process is open to them, should they choose to engage in some form.

Second, there is a transparency problem much larger than with Web services standards. For all their flaws, W3C and OASIS, where most of the WS-* work took place, are relatively transparent. Their processes and IP policies are clear and, most importantly, their mailing list archives are open to the public. DMTF, where VMWare, Fujitsu and others have submitted Cloud specifications, is at the other hand of the transparency spectrum. A few examples of what I mean by that:

  • I can tell you that VMWare and Fujitsu submitted specifications to DMTF, because the two companies each issued a press release to announce it. I can’t tell you which others did (and you can’t read their submissions) because these companies didn’t think it worthy of a press release. And DMTF keeps the submission confidential. That’s why I blogged about the vCloud submission and the Fujitsu submission but couldn’t provide equivalent analysis for the others.
  • The mailing lists of DMTF working groups are confidential. Even a DMTF member cannot see the message archive of a group unless he/she is a member of that specific group. The general public cannot see anything at all. And unless I missed it on the site, they cannot even know what DMTF working groups exist. It makes you wonder whether Dick Cheney decided to call his social club of energy company executives a “Task Force” because he was inspired by the secrecy of the DMTF (“Distributed Management Task Force”). Even when the work is finished and the standard published, the DMTF won’t release the mailing list archive, even though these discussions can be a great reference for people who later use the specification.
  • Working documents are also confidential. Working groups can decide to publish some intermediate work, but this needs to be an explicit decision of the group, then approved by its parent group, and in practice it happens rarely (mileage varies depending on the groups).
  • Even when a document is published, the process to provide feedback from the outside seems designed to thwart any attempt. Or at least that’s what it does in practice. Having blogged a fair amount on technical details of two DMTF standards (CMDBf and WS-Management) I often get questions and comments about these specifications from readers. I encourage them to bring their comments to the group and point them to the official feedback page. Not once have I, as a working group participant, seen the comments come out on the other end of the process.

So let’s recap. People outside of DMTF don’t know what work is going on (even if they happen to know that a working group called “Cloud this” or “Cloud that” has been started, the charter documents and therefore the precise scope and list of deliverables are also confidential). Even if they knew, they couldn’t get to see the work. And even if they did, there is no convenient way for them to provide feedback (which would probably arrive too late anyway). And joining the organization would be quite a selfless act because they then have to pay for the privilege of sharing their expertise while not being included in the real deciding circles anyway (unless there are ready to pony up for the top membership levels). That’s because of the unclear and unstable processes as well as the inordinate influence of board members and officers who all are also company representatives (in W3C, the strong staff balances the influence of the sponsors, in OASIS the bylaws limit arbitrariness by the board members).

What we are missing out on

Many in the standards community have heard me rant on this topic before. What pushed me over the edge and motivated me to write this entry was stumbling on a crystal clear illustration of what we are missing out on. I submit to you this post by Adrian Cole and the follow-up (twice)by Thorsten von Eicken. After spending two days at a face to face meeting of the DMTF Cloud incubator (in an undisclosed location) this week, I’ll just say that these posts illustrate a level of practically and a grounding in real-life Cloud usage that was not evident in all the discussions of the incubator. You don’t see Adrian and Thorsten arguing about the meaning of the word “infrastructure”, do you? I’d love to point you to the DMTF meeting minutes so you can judge for yourself, but by now you should understand why I can’t.

So instead of helping in the forum where big vendors submit their specifications, the specialists (some of them at least) go work in OGF, and produce OCCI (here is the mailing list archive). When Thorsten von Eicken blogs about his experience using Cloud APIs, they welcome the feedback and engage him to look at their work. The OCCI work is nice, but my concern is that we are now going to end up with at least two sets of standard specifications (in addition to the multitude of company-controlled specifications, like the ubiquitous EC2 API). One from the big companies and one from the specialists. And if you think that the simplest, clearest and most practical one will automatically win, well I envy your optimism. Up to a point. I don’t know if one specification will crush the other, if we’ll have a “reconciliation” process, if one is going to be used in “private Clouds” and the other in “public Clouds” or if the conflict will just make both mostly irrelevant. What I do know is that this is not what I want to see happen. Rather, the big vendors (whose imprimatur is needed) and the specialists (whose experience is indispensable) should work together to make the standard technically practical and widely adopted. I don’t care where it happens. I don’t know whether now is the right time or too early. I just know that when the time comes it needs to be done right. And I don’t like the way it’s shaping up at the moment. Well-meaning but toothless efforts like cloud-standards.org don’t make me feel better.

I know this blog post will be read both by my friends in DMTF and by my friends in Clouderati. I just want them to meet. That could be quite a party.

IBM was on to something when it produced this standards participation policy (which I commented on in a cynical-yet-supportive way – and yes I realize the same cynicism can apply to me). But I haven’t heard of any practical effect of this policy change. Has anyone seen any? Isn’t the Cloud standard wave the right time to translate it into action?

Transparency first

I realize that it takes more than transparency to convince specialists to take a look at what a working group is doing and share their thoughts. Even in a fully transparent situation, specialists will eventually give up if they are stonewalled by process lawyers or just ignored and marginalized (many working group participants have little bandwidth and typically take their cues from the big vendors even in the absence of explicit corporate alignment). And this is hard to fix. Processes serve a purpose. While they can be used against the smaller players, they also in many cases protect them. Plus, for every enlightened specialist who gets discouraged, there is a nutcase who gets neutralized by the need to put up a clear proposal and follow a process. I don’t see a good way to prevent large vendors from using the process to pressure smaller ones if that’s what they intend to do. Let’s at least prevent this from happening unintentionally. Maybe some of my colleagues  from large companies will also ask themselves whether it wouldn’t be to their own benefit to actually help qualified specialists to contribute. Some “positive discrimination” might be in order, to lighten the process burden in some way for those with practical expertise, limited resources, and the willingness to offer some could-otherwise-be-billable hours.

In any case, improving transparency is the simplest, fastest and most obvious step that needs to be taken. Not doing it because it won’t solve everything is like not doing CPR on someone on the pretext that it would only restart his heart but not cure his rheumatism.

What’s at risk if we fail to leverage the huge amount of practical Cloud expertise from smaller players in the standards work? Nothing less than an unpractical set of specifications that will fail to realize the promises of Cloud interoperability. And quite possibly even delay them. We’ve seen it before, haven’t we?

Notice how I haven’t mentioned customers? It’s a typical “feel-good” line in every lament about standards to say that “we need more customer involvement”. It’s true, but the lament is old and hasn’t, in my experience, solved anything. And today’s economical climate makes me even more dubious that direct customer involvement is going to keep us on track for this standardization wave (though I’d love to be proven wrong). Opening the door to on-the-ground-working-with-customers experts with a very neutral and pragmatic perspective has a better chance of success in my mind.

As a point of clarification, I am not asking large companies to pick a few small companies out of their partner ecosystem and give them a 10% discount on their alliance membership fee in exchange for showing up in the standards groups and supporting their friendly sponsor. This is a common trick, used to pack a committee, get the votes and create an impression of overwhelming industry support. Nobody should pick who the specialists are. We should do all we can to encourage them to come. It will be pretty clear who they are when they start to ask pointed questions about the work.

Finally, from the archives, a more humorous look at how various standards bodies compare. And the proof that my complaints about DMTF secrecy aren’t new.

28
Dec
2009

PaaS as the path to MDA?

by William (@vambenepe on Twitter)

Lots of communities think of Cloud Computing as the realization of a vision that they have been pusuing for a while (“sure we didn’t call it Cloud back then but…”). Just ask the Grid folks, the dynamic data center folks (DCML, IBM’s “Autonomic Computing”, HP’s “Adaptive Enterprise”,  Microsoft’s DSI), the ASP community, and those of us who toiled on what was going to be the SOAP-based management stack for all IT (e.g. my HP colleagues and I can selectively quote mentions of “adaptation mechanisms like resource reservation, allocation/de-allocation” and “management as a service” in this WSMF white paper from 2003 to portray WSMF as a precursor to all the Cloud APIs of today).

I thought of another such community today, as I ran into older OMG specifications: the Model-Driven Architecture (MDA) community. I have no idea what people in this community actually think of Cloud Computing, but it seems to me that PaaS is a chance to come close to part of their vision. For two reasons: PaaS makes it easier and more rewarding, all at the same time, to practice model-driven design. More bang for less buck.

Easier

My understanding of the MDA value proposition is that it would allow you to create a high-level design (at the level of something like an augmented version of UML) and have it automatically turn into executable code (e.g. that can run in a JEE or .NET container). I am probably making it sound more naive than it really is, but not by much. That’s a might wide gap to bridge, for QVT and friends, from UMLish to byte-code and it’s no surprise that the practical benefits of MDA are still to be seen (to put it kindly).

In a PaaS/SaaS world, on the other hand, you are mapping to something that is higher level than byte code. Depending on what types of PaaS containers you envision, some of the abstractions provided by these containers (e.g. business process execution, event processing) are a lot closer to the concepts manipulated in your PIM (Platform-independent model, the UMLish mentioned above). Thus a smaller gap to bridge and a better chance of it being automagical. Especially if you add a few SaaS building blocks to the mix.

More rewarding

Not only should it be easier to map a PIM to a PaaS deployment environments, the benefits you get once you are done are incommensurably greater. Rather than getting a dump of opaque auto-generated byte-code running in a regular JVM/CLR, you get an environments in which the design concepts (actors/services, process, rules, events) and the business model elements are first class citizens of the platform management infrastructure. So that you can monitor and set policies on the same things that you manipulate in you PIM. As opposed to falling down to the lowest common denominator of CPU/memory metrics. Or, god forbid, trying to diagnose/optimize machine-generated code.

We shall see

I wasn’t thinking of Microsoft SQL Server Modeling (previously known as Oslo) when I wrote this, but Doug Purdy’s tweet made the connection for me. And indeed, one can see in SQLSM+Azure the leading candidate today to realizing the MDA vision… minus the OMG MDA specifications.

[Note: I wasn't planning to blog this, but after I tweeted the basic idea ("Attempting MDA (model-driven architecture) before inventing model-driven deployment and mgmt was hopeless. Now possibly getting there.") Shlomo requested more details and I got frustrated by the difficulty to explain my point in twitterisms. In effect, this blog entry is just an expanded tweet, not something as intensely believed, fanatically researched and authoritatively supported as my usual blog posts (ah!).]

[UPDATED 2009/12/29: Some relevant presentations from OMG-land, thanks to Jean Bezivin. Though I don't see mention of any specific plan to use/adapt MOF/XMI/QVT/etc for the Cloud.]

10
Dec
2009

REST in practice for IT and Cloud management (part 3: wrap-up)

by William (@vambenepe on Twitter)

[Preface: a few months ago I shared some thoughts about how REST was (or could) be applied to IT and Cloud management. Part 1 was a comparison of the RESTful aspects of four well-known IaaS Cloud APIs and part 2 was an analysis of how REST applies to configuration management. Both of these entries received well-informed reader comments BTW, so if you read the posts but didn't come back for the comments you really owe it to yourself to do so now. At the time, I jotted down thoughts for subsequent entries in this series, but I never got around to posting them. Since the topic seems to be getting a lot of attention these days (especially in DMTF) I decided to go back to these notes and see if I could extract a few practical recommendations in the form of a wrap-up.]

The findings listed below should be relevant whether your protocol is trying to be truly RESTful, just HTTP-centric or even zen-SOAPy. Many of the issues that arise when creating a protocol that maps well to IT management use cases should transcend these variations and that’s what I try to cover.

Finding #1: Relationships (links) are first-class entities (a.k.a. “hypermedia”)

The clear conclusion of both part 1 and part 2 was that the most relevant part of REST for IT and Cloud management is the use of hypermedia. IT management enjoys a head start on this compared to other domains, because its models are already rich in explicit relationships (e.g. CIM associations), as opposed to other business domains in which relationships are more implicit (to the end user at least). But REST teaches us that just having relationships in your model is not enough. They need to be exposed in a way that maps directly to the protocol, so that following a relationship is an infrastructure-level task, not an application-level task: passing an ID as a parameter for some domain-specific function is not it.

This doesn’t violate the rule to not mix the protocol and the model because the alignment should take place in the metamodel. XML is famously weak in that respect, but that’s where Atom steps in, handling relationships in a generic way. Similarly, support for references is, in addition to its accolade to Schematron, one of the main benefits of SML (extra kudos for apparently dropping the “EPR” reference scheme between submission and standardization, in favor of just the “URI” scheme). Not to mention RDFa and friends. Or HTTP Link headers (explained) for link-challenged types.

Finding #2: Put IDs on steroids

There is little to argue about the value of clearly identifying things of interest and we didn’t wait for the Web to realize this. But it is also one of the most vexing and complex problems in many areas of computing (including IT management). Some of the long-standing questions include:

  • Use an opaque ID (some random-looking string a characters) or an ID grounded in “unique” properties of the resource (if you can find any)?
  • At what point does a thing stop being the same (typical example: if I replace each hardware component of a server one after the other, at which point is it not the same server anymore? Does it make sense for the IT guys to slap an “asset id” sticker on the plastic box around it?)
  • How do you deal with reconciling two resources (with their own IDs) when you realize they represent the same thing?

REST guidelines don’t help with these questions. There often is an assumption, which is true for many web apps, that the application “owns” the resource. My “inbox” only exists as a resource within the mail server application (e.g. Gmail or an Exchange server). Whatever URI GMail assigns for it is the URI for my inbox, period. Things are not as simple when the resources exist outside of any specific application: take a server, for example: the board management controller (or the hypervisor in the case of a VM), the OS management layer and the management agent installed on the machine all have claims to report on the machine (and therefore a need to identify it).

To some extent, Cloud computing simplifies many of these issues by providing controllers that “own” infrastructure resources and can authoritatively identify them. But it really is only pushing the problem to the next level of the stack.

Making the ID a URI doesn’t magically answer these questions. Though it helps in that it lets you leverage reconciliation mechanisms developed around URIs (such as <atom:link rel=”alternate”> or owl:sameAs). What REST does is add another constraint to this ID mechanism: Make the IDs dereferenceable URLs rather than just URIs.

I buy into this. A simple GET on a resource URI doesn’t solve everything but it has so many advantages that it should be attempted in all cases. And make this HTTP GET please (see finding #6).

In this adoption of GET, we just have to deal with small details such as:

  • What URL do I use for resources that have more than one agent/controller?
  • How close to the resource do I point this URL? If it’s too close to it then it may change as the resource evolves (e.g. network changes) or be affected by the resource performance (e.g. a crashed machine or application that does not respond to its management API). If it’s removed from the resource, then I introduce a scope (e.g. one controller) within which the resource has to remain, which may cause scalability concerns (how many VMs can/should one controller handle, what if I want to migrate a VM across the ocean…).

These are somewhat corner cases (and the more automation and virtualization you get, the fewer possible controllers you have per resource). While they need to be addressed, they don’t come close to negating the value of dereferenceable IDs. In addition, there are plenty of mechanisms to help with the issues above, from links in the representations (obviously) to RDDL-style lightweight directory to a last resort “give Saint Peter a call” mechanism (the original WSRF proposal had a sub-specification called WS-RenewableReferences that would let you ask for a new version of an expired EPR but it was never published — WS-Naming in then-GGF also touched on that with its reference resolvers — showing once again that the base challenges don’t change as fast as technology flavors).

Implicit in this is the fact that URIs are vastly superior to EPRs. The latter were only just a band-aid on a broken system (which may have started back when WSDL 1.1 decided to define “ports” as message aggregators that can have only one URL) and it’s been more debilitating to SOAP than any other interoperability issue. Web services containers internalized this assumption to the point of providing a stunted dispatch mechanism that made it very hard to assign distinct URLs to resources.

Finding #3: If REST told you to jump off a bridge, would you do it?

Adherence to REST is not required to get the benefits I describe in this series. There is a lot to be inspired by in REST, but it shouldn’t be a religion. Sure, if you squint hard enough (and poke it here and there) you can call your interface RESTful, but why bother with the contortions if some parts are not so. As long as they don’t detract from the value of REST in the other parts. As in all conversions, the most fervent adepts of RPC will likely be tempted to become its most violent denunciators once they’re born again. This is a tired scenario that we don’t need to repeat. Don’t think of it as a conversion but as a new perspective.

Look at the “RESTful with many parameters?” comment thread on Stefan Tilkov’s excellent InfoQ introduction to REST. It starts with some shared distaste for parameter-laden URIs and a search for a more RESTful approach. This gets suggested:

You could do a post on some URI like ./query/product_dep which would create a query resource. Now you “add” products to the query either by sending a product uri list with the initial post or by calling post on ./query/product_dep/{id}. With every post to the query resource the get on the query resource would change.

Yeah, you could. But how about an RPC-like query operation rather than having yet another resource lifecycle to manage just for the sake of being REST-compliant? And BTW, how do you think any sane consumer of your API is going to handle this? You guessed it, by packaging the POST/POST/GET/DELETE in one convenient client-side library function called “query”. As much as I criticize RPC-centric toolkits (see finding #5 below), it would be justified in this case.

Either you understand why/how REST principles benefit you or you don’t. If you do, then use this understanding to interpret the REST principles to best fit your needs. If you don’t, then no amount of CONTENT-TYPE-pixie-dust-spreading, GET-PUT-POST-DELETE-golden-rule-following and HATEOAS-magical-incantation-reciting will help you. That’s the whole point, for me at least, of this tree-part investigation. Stefan says essential the same, but in a converse way, in his article: “there are often reasons why one would violate a REST constraint, simply because every constraint induces some trade-off that might not be acceptable in a particular situation. But often, REST constraints are violated due to a simple lack of understanding of their benefits.” He says “understand why you violate” and I say “understand why you obey”. It is essentially the same (if you’re into stereotypes you can attribute the difference to his Germanic heritage and my Gallic blood).

Even worse than bending your interface to appear RESTful, don’t cherry-pick your use cases to only keep those that you feel you can properly address via REST, leaving the others aside. Conversely, don’t add requirements just because REST makes them easy to support (interesting how quickly “why do you force me to manage the lifecycle of yet another resource just to run a query” turns into “isn’t this great, you can share queries among users and you can handle long-running queries, I am sure we need this”).

This is not to say that you should not create a fully RESTful system. Just that you don’t necessarily have to and you can still get many benefits as long as you open your eyes to the cost/benefits trade-off involved.

Finding #4: Learn humility from REST

Beyond the technology, there is a vibe behind REST design. You can copy the technology and still miss it. I described it in 2005 as Humble Architecture, and applied to SOA at the time. But it describes REST just as well:

More practically, this means that the key things to keep in mind when creating a service, is that you are not at the center of the universe, that you don’t know who is going to consume your service, that you don’t know what they are going to do with it, that you are not necessarily the one who can make the best use of the information you have access to and that you should be willing to share it with others openly…

The SOA Manifesto recently called this “intrinsic interoperability”.

In IT management terms, it means that you can RESTify your CMDB and your event console and your asset management software and your automation engine all you want, if you see your code as the ultimate consumer and the one that knows best, as the UI that users have to go through, the “ultimate source of truth” and the “manager of managers” then it doesn’t matter how well you use HTTP.

Finding #5: Beware of tools bearing gifts

To a large extent, the great thing about REST is how few tools there are to take it away from you. So you’re pretty much forced to understand what is going on in your contract as opposed to being kept ignorant by a wsdl2java type of toolkit. Sure, Java (and .NET) have improved in that regard, but really the cultural damage is done and the expectations have been set. Contrast this to “the ‘router’ is just a big case statement over URI-matching regexps”, from Tim Bray’s post on the Sun Cloud API, one of my main inspirations for this investigation.

REST is not inherently immune to the tool-controlling-the-hand syndrome. It’s just a matter of time until such tools try to make REST “accessible” to the “normal” developer (who can supposedly prevent thread deadlocks but not parse XML). Joe Gregorio warns about this in the context of WADL (to summarize: WADL brings XSD which leads to code generation). Keep this in mind next time someone states that REST is more “loosely coupled” than SOAP. It’s how you use it that matters.

Finding #6: Use screws, not glue, so we can peer inside and then close the lid again

The “view source” option is how I and many others learned HTML. It unfortunately created a generation of HTML monsters who never went past version 3.2 (the marbled background makes me feel young again). But it also fueled the explosion of the Web. On-the-wire inspection through soapUI is what allowed me to perform this investigation and report on it (WMI has allowed this for years, but WS-Management is what made it accessible and usable for anyone on any platform). This was, of course, in the context of SOAP which is also inspectable. Still, in that respect nothing beats plain HTTP which is why I recommend HTTP GET in finding #2 (make IDs dereferenceable) even though I don’t expect that the one-page-per-resource view is going to be the only way to access it in the finished product.

These (HTML source, on-the-wire XML and resource-description pages) rarely hit the human eye and yet their presence enables the development of the more commonly used views. Making it as easy as possible to see what is going on under the covers helps with learning, with debugging, with extending and with innovating. In the same way that 99% of web users don’t look at the HTML source (and 99.99% of them don’t see the HTTP requests) but the Web would not be what it is to them if this inspectability wasn’t been there to fuel its development.

Along the same line, make as few assumptions as possible about the consumers in your interfaces. Which, in practice, often means document what goes on the wire. WSDL/WADL can be used as a format, but they are at most one small component. Human-readable semantics are much more important.

Finding #7: Nothing is free

Part of what was so attractive about SOAP is everything you were going to get “for free” by using it. Message-level security (for all these use cases where your messages starts over HTTP, then hops onto a train, then get delivered by a carrier pigeon). Reliable messaging. Transactionality. Intermediaries (they were going to be a big deal in SOAP, as you can see in vestigial form today in the Nodes/Roles left in the spec – also, do you remember WS-Routing? I do.)

And it’s true that by now there is a body of specifications that support this as composable SOAP headers. But the lack of usage of these features contrasts with how often they were bandied in the early days of SOAP.

Well, I am detecting some of the same in the REST camp. How often have you heard about how REST enables caching? Or about how content types allows an ISP to compress images on the fly to speed up delivery over dial-up? Like in the SOAP case, these are real features and sometimes useful. It doesn’t mean that they are valuable to you. And if they are not, then don’t let them be used as justifications. Especially since they are not free. If caching doesn’t help me (because of low volume, because security considerations prevent a shared cache, etc) then its presence actually adds a cost to me, since I now have to worry whether something is cached or not and deal with ETags. Or I have to consistently remember to request the cache to be bypassed.

Finding #8: Starting by sweeping you front door.

Before you agonize about how RESTful your back-end management protocol is, how about you make sure that your management application (the user front-end) is a decent Web application? One with cool URIs , where the back button works, where bookmarks work, where the data is not hidden in some over-encompassing Flash/Silverlight thingy. Just saying.

***

Now for some questions still unanswered.

Question #1: Is this a flee market?

I am highly dubious of content negotiation and yet I can see many advantages to it. Mostly along the lines of finding #6: make it easy for people to look under the hood and get hold of the data. If you let them specify how they want to see the data, it’s obviously easier.

But there is no free lunch. Even if your infrastructure takes care of generating these different views for you (“no coding, just check the box”), you are expanding the surface of your contract. This means more documentation, more testing, more interoperability problems and more friction when time comes to modify the interface.

I don’t have enough experience with format negotiation to define the sweetspot of this practice. Is it one XML representation and one HTML, period (everything else get produced by the client by transforming the XML)? But is the XML Atom-wrapped or not? What about RDF? What about JSON? Not to forget that SOAP wrapper, how hard can it be to add. But soon enough we are in legacy hell.

Question #2: Mime-types?

The second part of Joe Gregorio’s WADL entry is all about Mime types and I have a harder time following him there. For one thing, I am a bit puzzled by the different directions in which Mime types go at the same time. For example, we have image formats (e.g. “image/png”), packaging/compression formats (e.g. “application/zip”) and application formats (e.g. “application/vnd.oasis.opendocument.text” or “application/msword”). But what if I have a zip full of PNG images? And aren’t modern word processing formats basically a zip of XML files? If I don’t have the appropriate viewer, maybe I’d like them to be at least recognized as ZIP files. I don’t see support for such composition and taxonomy in these types.

And even within one type, things seem a bit messy in practice. Looking at the registered applications in the “options” menu of my Firefox browser, I see plenty of duplication:

  • application/zip vs. application/x-zip-compressed
  • application/ms-powerpoint vs. application/vnd.ms-powerpoint
  • application/sdp vs. application/x-sdp
  • audio/mpeg vs. audio/x-mpeg
  • video/x-ms-asf vs. video/x-ms-asf-plugin

I also wonder at what level of depth I want to take my Mime types. Sure I can use Atom as a package but if the items I am passing around happen to be CIM classes (serialized to XML), doesn’t it make sense to advertise this? And within these classes, can I let you know which domain (e.g. which namespace) my resources are in (virtual machines versus support tickets)?

These questions may simply be a reflection of my lack of maturity in the fine art of using Mime types as part of protocol design. My experience with them is more of the “find the type that works through trial and error and then leave it alone” kind.

[Side note: the first time I had to pay attention to Mime types was back in 1995/1996, playing with non-parsed headers and the multipart/x-mixed-replace type to bring some dynamism to web pages (that was before JavaScript or even animated GIFs). The site is still up, but the admins have messed up the Apache config so that the CGIs aren't executed anymore but return the Python code. So, here are some early Python experiments from yours truly: this script was a "pushed" countdown and this one was a "pushed" image animation. Cool stuff at the time, though not in a "get a date" kind of way.]

On the other hand, I very much agree with Joe’s point that “less is more”, i.e. that by not dictating how the semantics of a Mime type are defined the system forces you to think about the proper way to define them (e.g. an English-language RFC). As opposed to WSDL/XSD which gives the impression that once your XML validator turns green you’re done describing your interface. These syntactic validations are a complement at best, and usually not a very useful one (see “fat-bottomed specs”).

In comments on previous posts, Stu Charlton also emphasizes the value that Mime types bring. “Hypermedia advocates exposing a variety of links for such state-transitions, along with potentially unique media types to describe interfaces to those transitions.” I get the hypermedia concept, the HATEOAS approach and its very practical benefits. But I am still dubious about the role of Mime types in achieving them and I am not the only one with such qualms. I have too much respect for Joe and Stu to dismiss it entirely, but until I get an example that makes it “click” in practice for me I won’t sweat about Mime types too much.

Question #3: Riding the Zeitgeist?

That’s a practical question rather than a technical one, but as a protocol creator/promoter you are going to have to decide whether you market it as “RESTful”. If I have learned one thing in my past involvement with standards it is that marketing/positioning/impressions matter for standards as much as for products. To a large extent, for Clouds, Linked Data is a more appropriate label. But that provides little marketing/credibility humph with CIOs compared to REST (and less buzzword-compliance for the tech press). So maybe you want to write your spec based on Linked Data and then market it with a REST ribbon (the two are very compatible anyway). Just keep in mind that REST is the obvious choice for protocols in 2009 in the same way that SOAP was a few years ago.

Of course this is not an issue if you specification is truly RESTful. But none of the current Cloud “RESTful” APIs is, and I don’t expect this to change. At least if you go by Roy Fielding’s definition (or Paul’s handy summary):

A REST API must not define fixed resource names or hierarchies (an obvious coupling of client and server). Servers must have the freedom to control their own namespace. Instead, allow servers to instruct clients on how to construct appropriate URIs, such as is done in HTML forms and URI templates, by defining those instructions within media types and link relations. [Failure here implies that clients are assuming a resource structure due to out-of band information, such as a domain-specific standard, which is the data-oriented equivalent to RPC's functional coupling].

And (in a comment) Mark Baker adds:

I’ve reviewed lots of “REST APIs”, many of them privately for clients, and a common theme I’ve noticed is that most folks coming from a CORBA/DCE/DCOM/WS-* background, despite all the REST knowledge I’ve implanted into their heads, still cannot get away from the need to “specify the interface”. Sometimes this manifests itself through predefined relationships between resources, specifying URI structure, or listing the possible response codes received from different resources in response to the standard 4 methods (usually a combination of all those). I expect it’s just habit. But a second round of harping on the uniform interface – that every service has the same interface and so any service-specific interface specification only serves to increase coupling – sets them straight.

So the question of whether you want to market yourself as RESTful (rather than just as “inspired by the proper use of HTTP illustrated by REST”) is relevant, if only because you may find the father of REST throwing (POSTing?) tomatoes at you. There is always a risk in wearing clothes that look good but don’t quite fit you. The worst time for your pants to fall off is when you suddenly have to start running.

For more on this, refer to Ted Neward’s excellent Roy decoder ring where he not only explains what Roy means but more importantly clarifies that “if you’re not doing REST, it doesn’t mean that your API sucks” (to which I’d add that it is actually more likely to suck if you try to ape REST than if you allow yourself to be loosely inspired by it).

***

Wrapping up the wrap-up

There is one key topic that I had originally included in this wrap-up but decided to remove: extensibility. Mark Hapner brings it up in a comment on a previous post:

It is interesting to note that HTML does not provide namespaces but this hasn’t limited its capabilities. The reason is that links are a very effective mechanism for composing resources. Rather than composition via complicated ‘embedding’ mechanisms such as namespaces, the web composes resources via links. If HTML hadn’t provided open-ended, embeddable links there would be no web.

I am the kind of guy who would have namespace-qualified his children when naming them (had my wife not stepped in) so I don’t necessarily see “extension via links” as a negation of the need for namespaces (best example: RDF). The whole topic of embedding versus linking is a great one but this post doesn’t need another thousand words and the “REST in practice” umbrella is not necessarily the best one for this discussion. So I hereby conclude my “REST in practice for IT and Cloud management” series, with the intent to eventually start a “Linked Data in practice for IT and Cloud management” series in which extensibility will be properly handled. And we can also talk about querying (conspicuously absent from Cloud APIs, unless CMDBf is now a Cloud API) and versioning. As a teaser for the application of Linked Data to IT/Cloud, I will leave you with what Vint Cerf has to say.

[UPDATED 2010/1/27: I still haven't written the promised "Linked Data in practice for IT and Cloud management" post, but this explanation of the usage of Linked Data for data.gov.uk pretty much says it all. I may still write a post describing how what Jeni says about government data applies to Cloud management APIs, but it's almost too obvious to bother. Actually, there may be reasons why Cloud management benefits even more from Linked Data than UK government data, so it may still be worth a post. At some point. When I convince myself that it may influence things rather than be background noise.]

19
Nov
2009

Review of Fujitsu’s IaaS Cloud API submission to DMTF

by William (@vambenepe on Twitter)

Things are heating up in the DMTF Cloud incubator. Back in September, VMWare submitted its vCloud API (or rather a “reader’s digest” version of it) to the group. Last week, the group released a white paper titled “Interoperable Clouds”. And a second submission, from Fujitsu, was made last week and publicly announced today.

The Fujitsu submission is called an “API design”. What this means is that it doesn’t tell you anything about what things look like on the wire. It could materialize as another “XML over HTTP” protocol (with or without SOAP wrapper), but it could just as well be implemented as a binary RPC protocol. It’s really more of an esquisse of a resource model than a remote API. The only invocation-related aspect of the document is that it defines explicit operations on various resources (though not their input and outputs). This suggest that the most obvious mapping would be to some XML/HTTP RPC protocol (SOAPy or not). In that sense, it stands out a bit from the more recent Cloud API proposals that take a “RESTful” rather than RPC approach. But in these days of enthusiastic REST-washing I am pretty sure a determined designer could produce a RESTful-looking (but contorted) set of resources that would channel the operations in the specification as HTTP-like verbs on these resources.

Since there are few protocol aspects to this “API design”, if we are to compare it to other “Cloud APIs”, it’s really the resource model that’s worth evaluating. The obvious comparison is to the EC2 model as it provides a pretty similar set of infrastructure resources (it’s entirely focused on the IaaS layer). It lacks EC2 capabilities around availability, security and monitoring. But it adds to the EC2 resource model the notions of VDC (“virtual data center”, a container of IaaS resources), VSYS (see below) and a lightly-defined EFM (Extended Function Module) concept which intends to encompass all kinds of network/security appliances (and presumably makes up for the lack of security groups).

The heart of the specification is the VSYS and its accompanying VSYS Descriptor. We are encouraged to think of the VSYS Descriptor as an extension of OVF that lets you specify this kind of environment:

Example content for a VSYS Descriptor

Example content for a VSYS Descriptor

By forcing the initial VSYS instance to be based on a VSYS Descriptor, but then allowing the VSYS to drift away from the descriptor via direct management actions, the specification takes a middle-of-the-road approach to the “model-based versus procedural” debate. Disciples of the procedural approach will presumably start from a very generic and unconstrained VSYS Descriptor and, from there, script their way to happiness. Model geeks will look for a way to keep the system configuration in sync with a VSYS Descriptor.

How this will work is completely undefined. There is supposed to be a getVSYSConfiguration() operation which “returns the configuration information on the VSYS” but there is no format/content proposed for the response payload. Is this supposed to return every single config file, every setting (OS, MW, application) on all the servers in the VSYS? Surely not. But what then is it supposed to return? The specification defines five VSYS attributes (VSYSID, creator, createTime, description and baseDescriptor) so I know what getSYSAttributes() returns. But leaving getVSYSConfiguration() undefined is like handing someone an airplane maintenance manual that simply reads “put the right part in the right place”. A similar feature is also left as an exercise to the reader in section that sketches an “external configuration service”. We are provided with a URL convention to address the service, but zero information about the format and content of the configuration instructions provided to the VServer.

EC2 has a keypair access mechanism for Linux instances and a clumsy password-retrieval system for Windows instances. The Fujitsu proposal adopts the lowest common denominator (actually the greatest common divisor, but that’s a lost rhetorical cause): random password generation/retrieval for everyone.

I also noticed the statement that a VServer must be “implemented as a virtual machine” which is an unnecessary constraint/assumption. The opposite statement is later made for EFMs, which “can be implemented in various ways (e.g. run on virtual machines or not)”, so I don’t want to read too much into the “hypervisor-required” VServer statement which probably just needs an editorial clean-up.

From a political perspective this specification looks more like a case of “can I play with you? I brought some marbles” than a more aggressive “listen everybody, we’re playing soccer now and I am the captain”. In other words, this may not be as much an attempt to shape the outcome of the incubator as much as to contribute to its work and position Fujitsu as a respected member whose participation needs to be acknowledged.

While this is an alternative submission to the vCloud API, I don’t think VMWare will feel very challenged by it. The specification’s core (VSYS Descriptor) intends to build on OVF, which should be music to VMWare’s ears (it’s the model, not the protocol, which is strategic). And it is light enough on technical details that it will be pretty easy for vCloud to claim that it, indeed, aligns with the intent of this “design”.

All in all, it is good to see companies take the time to write down what they expect out of the DMTF work. And it’s refreshing to see genuine single-company contributions rather than pre-negotiated documents by a clique. Whether they look more like implementable specifications of position paper, they all provide good input to the DMTF Cloud incubator.

28
Oct
2009

OWL news you can use

by William (@vambenepe on Twitter)

The W3C released OWL 2 today. Most readers of this blog are IT management people (whether they call it “cloud computing” or “boring old system management”) and don’t follow RDF, OWL, SPARQL etc too closely (if at all). Yet there is a lot of potential value in using these technologies for IT management, so I thought it might be helpful to provide some practical resources on the topic. I have selected articles that cover the special (some may say “twisted”) approach of using OWL and its friends for validation rather than just inference, as this use case is very relevant to IT management.

Of course you can also go to the W3C standard itself, starting with the overview of OWL 2.

Just so you don’t feel lonely if you decide to explore this path, have a look at Elastra’s sexy technology stack. ECML, EDML and EMML are all defined as OWL ontologies.

Categories