William Vambenepe's blog

IT management in a changing IT world

Archive for the 'DMTF' Category

25
Jul
2010

The Tragedy of the Commons in Cloud standards

by William (@vambenepe on Twitter)

I wasn’t at the OSCON Cloud Summit this past week, but I’ve spent some time over the weekend trying to collect the good bits. Via Twitter, I had heard echoes of an interesting debate on Cloud standards between Sam Johnston and Benjamin Black. Today I got to see Benjamin’s slides and read reports from two audience members, Charles Engelke and Krishnan Subramanian. Sam argued that Cloud standards are needed, Benjamin that they would be premature.

Benjamin is right about what to think and Sam is right about what to do.

Let me put it differently: Benjamin is right in theory, but it doesn’t matter. Here is why.

Say I’m a vendor and Benjamin convinces me

Assume I truly believe the industry would be better served if we all waited. Does this mean I’ll stay away from Cloud standards efforts for now? Not necessarily, because nothing is stopping my competitors from doing it. In the IT standards world, your only choice is to participate or opt out. For the most part you can’t put your muscle towards stopping an effort. Case in point, Amazon has so far chosen to opt out; has that stopped VMWare and others from going to DMTF and elsewhere to ratify specifications as standards? Of course not. To the contrary, it has made the option even more attractive because when the leader stays home it is a lot easier for less popular candidates to win the prize. So as a vendor-who-was-convinced-by-Benjamin I now have the choice between letting my competitor get his specification rubberstamped (and then hit me with the competitive advantage of being “standard compliant” and even “the standard leader”) or getting involved in an effort that I know to be counterproductive for the industry. Guess what most will choose?

Even the initial sinner (who sets the wheels of premature standardization in motion) may himself be convinced that it’s too early for Cloud standards. But he has to assume that one of his competitors will make the move, and in that context why give them first-mover advantage (and the choice of the battlefield)? It’s the typical Tragedy of the Commons scenario. By acting in a rational and self-interested way, participants invariably end up creating a bad situation, one that they might all know is against everyone’s self-interest.
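The incentive structure can be sketched as a toy two-vendor game. This is a simplification (with only two players it is closer to a prisoner's dilemma than to a full commons model), and the payoff numbers are assumptions chosen purely for illustration; only their ordering matters.

```python
from itertools import product

WAIT, JOIN = "wait", "join"
CHOICES = (WAIT, JOIN)

# PAYOFF[(mine, theirs)] is my payoff. The values are hypothetical;
# they only encode the ordering described in the text.
PAYOFF = {
    (WAIT, WAIT): 3,  # nobody standardizes early: best outcome overall
    (WAIT, JOIN): 0,  # my competitor gets its specification rubberstamped
    (JOIN, WAIT): 4,  # I get first-mover advantage
    (JOIN, JOIN): 1,  # premature standard, but at least I am at the table
}


def best_response(theirs):
    """Return my payoff-maximizing choice given the competitor's choice."""
    return max(CHOICES, key=lambda mine: PAYOFF[(mine, theirs)])


def nash_equilibria():
    """Return the choice pairs where neither vendor gains by deviating alone."""
    return [
        (a, b)
        for a, b in product(CHOICES, repeat=2)
        if best_response(b) == a and best_response(a) == b
    ]


if __name__ == "__main__":
    for theirs in CHOICES:
        print(f"competitor chooses {theirs}: I choose {best_response(theirs)}")
    print("equilibria:", nash_equilibria())
```

Under these assumed payoffs, joining is the best response whatever the competitor does, so the only stable outcome is that both join, even though both would be better off if both waited.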

And it’s not just vendors.

Say I’m an officer of a Standard-setting organization and Benjamin convinces me

If you expect that I would use my position in the organization to prevent companies from starting a Cloud standard effort there, you live in fantasy-land. Standard-setting organizations compete with one another just as fiercely as companies do. If I have achieved a position of leadership in a given standard organization, the last thing I want is to see another organization lay claim to a strategic and fast-growing area of the IT landscape. It takes a lot of time and money for a company to get elected on the right board and get its employees (or other reliable allies) in the right leadership positions. Or to acquire people already in that place. You only get a return on that investment if the organization manages to be the one where the key standards get created. That’s what’s behind the landgrab reflex of many standards organizations.

And it goes beyond vendors and standards organizations.

Say I’m an IT buyer and Benjamin convinces me

Assume I really believe Cloud standards are premature. Assume they get created anyway and I have to choose between a vendor who supports them and one who doesn’t. Do I, as a matter of principle, refuse to consider the “standard-compliant” label in my purchasing decision? Even if I know that the standard shouldn’t have been created, I also know that, all other things being equal, the “standard-compliant” product will attract more tools and complementary solutions and will likely ease future integration problems.

And then there is the question of how I’ll explain this to my boss. Will Benjamin be by my side with his beautiful slides when I am called in an emergency meeting to explain to the CIO why we, unlike the competitors, didn’t pick “a standards-based solution”?

In the real world, the only way to solve problems caused by the Tragedy of the Commons is to have some overarching authority regulate the usage of the resource at risk of being ruined. This seems unlikely to be a workable solution when the resource is not a river to protect from sewer discharges but an IT domain to protect from premature standardization. If called, I’d be happy to serve as benevolent dictator for the IT industry (I could fix a few other things beyond the Cloud standards landgrab issue). But as long as neither I nor anyone else is in a dictatorial position, Benjamin’s excellent exposé has no audience for which his call to arms (or rather to lay down the arms) is actionable. I am not saying that everyone agrees with Benjamin, but that even if everyone did it still wouldn’t make a difference. Many of us in the industry share his views and rationally act as if we didn’t.

[UPDATED 2010/7/25: In a nice example of Blog/Twitter synergy, minutes after posting this I was having a conversation on Twitter with Benjamin Black about my interpretation of what he said. Based on this conversation, I realize that I should clarify that what I mean by "standards" in this post is "something that comes out of a standard-setting organization" (whether or not it gets adopted), in other words what Benjamin calls a "standard specification". He uses the word "standard" to mean "what most people use", which may or may not be a "standard specification". That's a big part of the disconnect that led to our Twitter chat. The other part is that what I presented as Benjamin's thesis in my post is actually only one of the propositions in his talk, and not even the main one. It's the proposition that it is damaging for the industry when a standard specification comes out of a standard organization too early. I wasn't at the conference where Benjamin presented but it's hard to understand anything else out of slide 61 ("standardize too soon, and you lock to the wrong thing") and 87 ("to discover the right standards, we must eschew standards"). So if I misrepresented him I believe it was in making it look like this was the focus of his talk while in fact it was only one of the points he made. As he himself clarified for me: "My _actual_ argument is that it doesn't matter what we think about cloud standards, if they are needed, they will emerge" (again, in this sentence he uses "standards" to mean "something that people have converged on").

More generally, my main point here has nothing to do with Benjamin, Sam and their OSCON debate, other than the fact that reading about it prompted me to type this blog entry. It's simply that there is a perversion in the IT standards landscape that makes it impossible for premature standardization *not* to happen. It's something I've written before, e.g. in this post:

Saying “it’s too early” in the standards world is the same as saying nothing. It puts you out of the game and has no other effect. Amazon, the clear leader in the space, has taken just this position. How has this been understood? Simply as “well I guess we’ll do it without them”. It’s sad, but all it takes is one significant (but not necessarily leader) company trying to capitalize on some market influence to force the standards train to leave the station. And it’s a hard decision for others to not engage the pursuit at that point. In the same way that it only takes one bellicose country among pacifists to start a war.

Benjamin is just a messenger; and I wasn't trying to shoot him.]

[UPDATED 2010/8/13: The video of the debate between Sam Johnston and Benjamin Black is now available, so you can see for yourself.]

18
Jul
2010

Introducing the Oracle Cloud API

by William (@vambenepe on Twitter)

Oracle recently published a Cloud management API on OTN and also submitted a subset of the API to the new DMTF Cloud Management working group. The OTN specification, titled “Oracle Cloud Resource Model API”, is available here. In typical DMTF fashion, the DMTF-submitted specification is not publicly available (if you have a DMTF account and are a member of the right group you can find it here). It is titled the “Oracle Cloud Elemental Resource Model” and is essentially the same as the OTN version, minus sections 9.2, 9.4, 9.6, 9.8, 9.9 and 9.10 (I’ll explain below why these sections have been removed from the DMTF submission). Here is also a slideset that was recently used to present the submitted specification at a DMTF meeting.

So why two documents? Because they serve different purposes. The Elemental Resource Model, submitted to DMTF, represents the technical foundation for the IaaS layer. It’s not all of IaaS, just its core. You can think of its scope as that of the base EC2 service (boot a VM from an image, attach a volume, connect to a network). It’s the part that appears in all the various IaaS APIs out there, and that looks very similar, in its model, across all of them. It’s the part that’s ripe for a simple standard, hopefully free of much of the drama of a more open-ended and speculative effort. A standard that can come out quickly and provide interoperability right out of the gate (for the simple use cases it supports), not after years of plugfests and profiles. This is the narrow scope I described in an earlier rant about Cloud standards:

I understand the pain of customers today who just want to have a bit more flexibility and portability within the limited scope of the VM/Volume/IP offering. If we really want to do a standard today, fine. Let’s do a very small and pragmatic standard that addresses this. Just a subset of the EC2 API. Don’t attempt to standardize the virtual disk format. Don’t worry about application-level features inside the VM. Don’t sweat the REST or SOA purity aspects of the interface too much either. Don’t stress about scalability of the management API and batching of actions. Just make it simple and provide a reference implementation. A few HTTP messages to provision, attach, update and delete VMs, volumes and IPs. That would be fine. Anything else (and more is indeed needed) would be vendor extensions for now.

Of course IaaS goes beyond the scope of the Elemental Resource Model. We’ll need load balancing. We’ll need tunneling to the private datacenter. We’ll need low-latency sub-networks. We’ll need the ability to map multi-tier applications to different security zones. Etc. Some Cloud platforms support some of these (e.g. Amazon has an answer to all but the last one), but there is a lot more divergence (both in the “what” and the “how”) between the various Cloud APIs on this. That part of IaaS is not ready for standardization.

Then there are the extensions that attempt to make the IaaS APIs more application-aware. These too exist in some Cloud APIs (e.g. vCloud vApp) but not others. They haven’t naturally converged between implementations. They haven’t seen nearly as much usage in the industry as the base IaaS features. It would be a mistake to overreach in the initial phase of IaaS standardization and try to tackle these questions. It would not just delay the availability of a standard for the base IaaS use cases, it would put its emergence and adoption in jeopardy.

This is why Oracle withheld these application-aware aspects from the DMTF submission, though we are sharing them in the specification published on OTN. We want to expose them and get feedback. We’re open to collaborating on them, maybe even in the scope of a standard group if that’s the best way to ensure an open IP framework for the work. But it shouldn’t make the upcoming DMTF IaaS specification more complex and speculative than it needs to be, so we are keeping them as separate extensions. Not to mention that DMTF as an organization has a lot more infrastructure expertise than middleware and application expertise.

Again, the “Elemental Resource Model” specification submitted to DMTF is the same as the “Oracle Cloud Resource Model API” on OTN except that it has a different license (a license grant to DMTF instead of the usual OTN license) and is missing some resources in the list of resource types (section 9).

Both specifications share the exact same protocol aspects. It’s pretty cleanly RESTful and uses a JSON serialization. The credit for the nice RESTful protocol goes to the folks who created the original Sun Cloud API as this is pretty much what the Oracle Cloud API adopted in its entirety. Tim Bray described the genesis and design philosophy of the Sun Cloud API last year. He also described his role and explained that “most of the heavy lifting was done by Craig McClanahan with guidance from Lew Tucker“. It’s a shame that the Oracle specification fails to credit the Sun team and I kick myself for not noticing this in my reviews. This heritage was noted from the get go in the slides and is, in my mind, a selling point for the specification. When I reviewed the main Cloud APIs available last summer (the first part in a “REST in practice for IT and Cloud management” series), I liked Sun’s protocol design the best.
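To make "cleanly RESTful with a JSON serialization" a bit more concrete, here is a rough sketch of the kind of messages the base use cases call for (provision a VM, attach a volume, update, delete). Everything specific in it is an assumption made for illustration: the host `cloud.example.com`, the paths, the JSON field names and the helper names are hypothetical and are not taken from the Oracle or Sun specifications. Nothing is sent over the network; the code only builds and prints the messages.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class HttpRequest:
    """A plain description of one HTTP message; nothing goes on the wire."""
    method: str
    path: str
    body: Optional[dict] = None

    def render(self) -> str:
        """Render the request as HTTP/1.1 text, with a JSON body if present."""
        lines = [
            f"{self.method} {self.path} HTTP/1.1",
            "Host: cloud.example.com",  # hypothetical host
        ]
        if self.body is None:
            return "\r\n".join(lines + ["", ""])
        payload = json.dumps(self.body, sort_keys=True)
        lines += [
            "Content-Type: application/json",
            f"Content-Length: {len(payload.encode('utf-8'))}",
        ]
        return "\r\n".join(lines + ["", payload])


# All paths and field names below are hypothetical.
def provision_vm(name: str, image: str) -> HttpRequest:
    return HttpRequest("POST", "/vms", {"name": name, "image": image})


def attach_volume(vm_id: str, volume_id: str) -> HttpRequest:
    return HttpRequest("POST", f"/vms/{vm_id}/attachments",
                       {"volume": f"/volumes/{volume_id}"})


def update_vm(vm_id: str, changes: dict) -> HttpRequest:
    return HttpRequest("PUT", f"/vms/{vm_id}", changes)


def delete_vm(vm_id: str) -> HttpRequest:
    return HttpRequest("DELETE", f"/vms/{vm_id}")


if __name__ == "__main__":
    for request in (
        provision_vm("web-1", "base-image"),
        attach_volume("42", "v7"),
        update_vm("42", {"name": "web-1b"}),
        delete_vm("42"),
    ):
        print(request.render(), end="\n\n")
```

The point of the sketch is how little is needed: a handful of resources, the standard HTTP verbs and a JSON body.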

The resource model, while still based on the Sun Cloud API, has seen many more changes. That’s where our tireless editor, Jack Yu, with help from Mark Carlson, has spent most of the countless hours he devoted to the specification. I won’t do a point to point comparison of the Sun model and the Oracle model, but in general most of the changes and additions are motivated by use cases that are more heavily tilted towards private clouds and compatibility with existing application infrastructure. For example, the semantics of a Zone have been relaxed to allow a private Cloud administrator to choose how to partition the Cloud (by location is an obvious option, but it could also be by security zone or by organizational ownership, as heretical as this may sound to Cloud purists).

The most important differences between the DMTF and OTN versions relate to the support for assemblies, which are groups of VMs that jointly participate in the delivery of a composite application. This goes hand-in-hand with the recently-released Oracle Virtual Assembly Builder, a framework for creating, packing, deploying and configuring multi-tier applications. To support this approach, the Cloud Resource Model (but not the Elemental Model, as explained above) adds resource types such as AssemblyTemplate, AssemblyInstance and ScalabilityGroup.

So what now? The DMTF working group has received a large number of IaaS APIs as submissions (though not the one that matters most or the one that may well soon matter a lot too). If all goes well it will succeed in delivering a simple and useful standard for the base IaaS use cases, and we’ll be down to a somewhat manageable triplet (EC2, RackSpace/OpenStack and DMTF) of IaaS specifications. If not (either because the DMTF group tries to bite off too much or because it succumbs to infighting) then DMTF will be out of the game entirely and it will be between EC2, OpenStack and a bunch of private specifications. It will be the reign of toolkits/libraries/brokers and hell on earth for all those who think that such a bridging approach is as good as a standard. And for this reason it will have to coalesce at some point.

As far as the more application-centric approach to hypervisor-based Cloud, well, the interesting things are really just starting. Let’s experiment. And let’s talk.

18
Mar
2010

Standards Disconnect at Cloud Connect

by William (@vambenepe on Twitter)

Yesterday’s panel session on the future of Cloud standards at Cloud Connect is still resonating on Twitter tonight. Many were shocked by how acrimonious the debate turned. It didn’t have to be that way but I am not surprised that it was.

The debate was set up and moderated by Bob Marcus (ET-Strategies CTO and master standards coordinator). On stage were Krishna Sankar (Cisco and DMTF Cloud incubator), Archie Reed (HP and CSA), Winston Bumpus (VMWare and DMTF), a gentleman whose name I unfortunately forgot (and who isn’t listed on the program) and me.

If the goal was to glamorize Cloud standards, it was a complete failure. If the goal was to come out with some solutions and agreements, it was also a failure. But if the goal, as I believe, was to surface the current issues, complexities, emotions and misunderstandings surrounding Cloud standards, then I’d say it was a success.

I am not going to attempt to summarize the whole discussion. Charles Babcock, who was in the audience, does a good enough job in this InformationWeek article and, unlike me, he doesn’t have a horse in the race [side note: I am not sure why my country of origin is relevant to his article, but my guess is that this is the main thing he remembered from my presentation during the Cloud Connect keynote earlier that morning, thanks to the "guillotine" slide].

Instead of reporting on what happened during the standards discussion, I’ll just make one comment and provide one take-away.

The comment: the dangers of marketing standards

Early in the session, audience member Reuven Cohen complained that standards organizations don’t do enough to market their specifications. Winston was more than happy to address this and talk about all the marketing work that DMTF does, including trade shows and PR. He added that this is one of the reasons why DMTF needs to charge membership fees, to pay for this marketing. I agree with Winston at one level. Indeed, the DMTF does what he describes and puts a fair amount of effort into marketing itself and its work. But I disagree with Reuven and Winston that this is a good thing.

First it doesn’t really help. I don’t think that distributing pens and tee-shirts to IT admins and CIO-wannabes results in higher adoption of your standard. Because the end users don’t really care what standard is used. They just want a standard. Whether it comes from DMTF, SNIA, OGF, or OASIS is the least of their concerns. Those that you have to convince to adopt your standard are the vendors and the service providers. The Amazon, Rackspace and GoGrid of the world. The Microsoft, Oracle, VMWare and smaller ones like… Enomaly (Reuven’s company). The highly-specialized consultants who work with them, like Randy. And also, very importantly, the open source developers who provide all the Cloud libraries and frameworks that are the lifeblood of many deployments. I have enough faith left in human nature to assume that all these guys make their strategic standards decisions on a bit more context than exhibit hall loot and press releases. Well, at least we do where I work.

But this traditional approach to marketing is worse than not helping. It’s actually actively harmful, for two reasons. The first is that the cost of these activities, as Winston acknowledges, creates a barrier for participation by requiring higher dues. To Winston it’s an unfortunate side effect, to me it’s a killer. Not necessarily because dropping the membership fee by 50% would bring that many more participants. But because the organizations become so dependent on dues that they are paranoid about making anything public for fear of lowering the incentive for members to keep paying. Which is the worst thing you can do if you want the experts and open source developers, who are the best chance Cloud standards have to not repeat the mistakes of the past, to engage with the standard. Not necessarily as members of the group, also from the outside. Assuming the work happens in public, which is the key issue.

The other reason why it’s harmful to have a standards organization involved in such traditional marketing is that it has a tendency to become a conduit for promoting the agenda of the board members. Promoting a given standard or organization sounds good, until you realize that it’s rarely so pure and unbiased. The trade shows in which the organization participates are often vendor-specific (e.g. Microsoft Management Summit, VMWorld…). The announcements are timed to coincide with relevant corporate announcements. The press releases contain quotes from board members who promote themselves at the same time as the organization. Officers speaking to the press on behalf of the standards organization are often also identified by their position in their company. Etc. The more a standards organization is involved in marketing, the more its low-level members are effectively subsidizing the marketing efforts of the board members. Standards have enough inherent conflicts of interest to not add more opportunities.

Just to be clear, that issue of standards marketing is not what consumed most of the time during the session. But it came up and since I didn’t get a chance to express my view on this while on the panel, I used this blog instead.

My take-away from the panel, on the other hand, is focused on the heart of the discussion that took place.

The take away: confirmation that we are going too fast, too early

Based on this discussion and other experiences, my current feeling on Cloud standards is that it is too early. If you think the practical experience we have today in Cloud Computing corresponds to what the practice of Cloud Computing will be in 10 years, then please go ahead and standardize. But let me tell you that you’re a fool.

The portion of Cloud Computing in which we have some significant experience (get a VM, attach a volume, assign an IP) will still be relevant in 10 years, but it will be a small fraction of Cloud Computing. I can tell you that much even if I can’t tell you what the whole will be. I have my ideas about what the whole will look like but it’s just a guess. Anybody who pretends to know is fooling you, themselves, or both.

I understand the pain of customers today who just want to have a bit more flexibility and portability within the limited scope of the VM/Volume/IP offering. If we really want to do a standard today, fine. Let’s do a very small and pragmatic standard that addresses this. Just a subset of the EC2 API. Don’t attempt to standardize the virtual disk format. Don’t worry about application-level features inside the VM. Don’t sweat the REST or SOA purity aspects of the interface too much either. Don’t stress about scalability of the management API and batching of actions. Just make it simple and provide a reference implementation. A few HTTP messages to provision, attach, update and delete VMs, volumes and IPs. That would be fine. Anything else (and more is indeed needed) would be vendor extensions for now.

Unfortunately, neither of these (waiting, or a limited first standard) is going to happen.

Saying “it’s too early” in the standards world is the same as saying nothing. It puts you out of the game and has no other effect. Amazon, the clear leader in the space, has taken just this position. How has this been understood? Simply as “well I guess we’ll do it without them”. It’s sad, but all it takes is one significant (but not necessarily leader) company trying to capitalize on some market influence to force the standards train to leave the station. And it’s a hard decision for others to not engage the pursuit at that point. In the same way that it only takes one bellicose country among pacifists to start a war.

Prepare yourself for some collateral damage.

While I would prefer for this not to proceed now (not speaking for my employer on this blog, remember), it doesn’t mean that one should necessarily stay on the sidelines rather than make lemonade out of lemons. But having opened the Cloud Connect panel session with somewhat of a mea culpa (at least for my portion of responsibility) with regard to the failures of the previous IT management standardization wave, it doesn’t make me too happy to see the seeds of another collective mea culpa, when we’ve made a mess of Cloud standards too. It’s not a given yet. Just a very high risk. As was made clear yesterday.

01
Mar
2010

Two versions of a protocol is one too many

by William (@vambenepe on Twitter)

There is always a temptation, when facing a hard design decision in the process of creating an interface or a protocol, to produce two (or more) versions. It’s sometimes a good idea, as a way to explore where each one takes you so you can make a more informed choice. But we know how this invariably ends up. Documents get published that arguably should not. It’s even harder in a standard working group, where someone was asked (or at least encouraged) by the group to create each of the alternative specifications. Canning one is at best socially awkward (despite the appearances, not everyone in standards is a psychopath or a sadist) and often politically impossible.

And yet, it has to be done. Compare the alternatives, then pick one and commit. Don’t confuse being accommodating with being weak.

The typical example these days is of course SOAP versus REST: the temptation is to support both rather than make a choice. This applies to standards and to proprietary interfaces. When a standard does this, it hurts rather than promotes interoperability. Vendors have a bit more of an excuse when they offer a choice (“the customer is always right”) but in reality it forces customers to play Russian roulette whether they want it or not. Because one of the alternatives will eventually be left behind (either discarded or maintained but not improved). If you balance the small immediate customer benefit of using the interface style they are most used to with the risk of redoing the integration down the road, the value proposition of offering several options crumbles.

[Pedantic disclaimer: I use the term "REST" in this post the way it is often (incorrectly) used, to mean pretty much anything that uses HTTP without a SOAP wrapper. The technical issues are a topic for other posts.]

CMDBf

CMDBf v1 is a DMTF standard. It is a SOAP-based protocol. For v2, it has been suggested that there should be a REST version. I don’t know what the CMDBf group (in which I participate) will end up doing but I’ve made my position clear: I could go either way (remain with SOAP or dump it) but I do not want to have two versions of the protocol (one SOAP, one REST). If we think we’re better off with a REST version, then let’s make v2 REST-only. Supporting both mechanisms in v2 would be stupid. They would address the same use cases and only serve to provide political ass-coverage. There is no functional need for both. The argument that we need to keep supporting SOAP for the benefit of those who implemented v1 doesn’t fly. As an implementer, nobody is saying that you need to turn off your v1 services the second you launch the v2 version.

DMTF Cloud

Between the specifications submitted directly to DMTF, the specifications developed by DMTF “partner” organizations and the existing DMTF protocols, the DMTF Cloud effort is presented with a mix of SOAP, RESTful and XML-RPC-over-HTTP options. In the process of deciding what to create or adopt I am sure that the temptation will be high to take the easy route of supporting several versions to placate everyone. But such a “consensus” would be achieved on the back of the implementers so I very much hope it won’t be the case.

When it is appropriate

There are cases where supporting alternative options is worth the cost. But it typically happens when they serve very different use cases. Think of SAX versus DOM, which have clearly differentiated sweet spots. In the Cloud world, Amazon S3 gives us interesting examples of both justified and extraneous alternatives. The extraneous one is the choice between REST and SOAP for the S3 API. I often praise AWS for its innovation and pragmatism, but this is an example of something that only looks pragmatic. On the other hand, the AWS import/export mechanism is a useful alternative. It allows you to physically ship a device with a few terabytes of data to Amazon. This is technically an alternative to the S3 programmatic interface, but one with obviously differentiated use cases. I recommend you reserve the use of “alternative APIs” for such scenarios.
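For readers who haven't used both, here is a small sketch of the SAX versus DOM contrast mentioned above, using Python's standard library. The sample document and the function names are made up for the illustration. SAX streams through the input and keeps only what the handler chooses to keep, which suits large inputs and one-pass questions; DOM loads the whole tree, which suits navigation and in-place edits.

```python
import xml.sax
from xml.dom import minidom

# Made-up sample document for the illustration.
DOC = "<cloud><vm id='a'/><vm id='b'/><volume id='v1'/></cloud>"


class VmCounter(xml.sax.ContentHandler):
    """SAX handler: sees one element at a time and keeps only a counter."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "vm":
            self.count += 1


def count_vms_sax(text):
    """Count <vm> elements in a single streaming pass, without building a tree."""
    handler = VmCounter()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.count


def rename_vm_dom(text, old_id, new_id):
    """Load the whole tree, change one attribute in place, return all vm ids."""
    dom = minidom.parseString(text)
    vms = dom.getElementsByTagName("vm")
    for element in vms:
        if element.getAttribute("id") == old_id:
            element.setAttribute("id", new_id)
    return [element.getAttribute("id") for element in vms]


if __name__ == "__main__":
    print(count_vms_sax(DOC))
    print(rename_vm_dom(DOC, "b", "c"))
```

The two interfaces answer different needs, which is what justifies having both. Two wire formats for the same operations do not pass that test.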

If it didn’t work for Tiger Woods, it won’t work for your Cloud API either. Learn to commit.

[CLARIFICATION: based on some of the early Twitter feedback on this entry, I want to clarify that it's alternative versions that I am against, not successive versions (i.e. an evolution of the interface over time). How to manage successive versions properly is a whole other debate.]

19
Feb
2010

HP has submitted a specification to the DMTF Cloud incubator

by William (@vambenepe on Twitter)

When I lamented, in a previous post, that I couldn’t tell you about recent submissions to the DMTF Cloud incubator, one of those I had in mind was a submission from HP. I can now write this, because the author of the specification, Nigel Cook, has recently blogged about it. Unfortunately he isn’t publishing the specification itself, just an announcement that it was submitted. Hopefully he is currently going through the long approval process to make the submitted document public (been there, done that, I know it takes time).

In the blog, Nigel makes a good argument for the need to go beyond a hypervisor-centric view of Cloud computing. Even at the IaaS layer there are cases of automated-but-not-virtualized deployment that have all the characteristics of Cloud computing and need to be supported by Cloud management APIs. Not to mention OS-level isolation like Solaris Containers.

Nigel also offers a spirited defense of SOAP-based protocols. I don’t necessarily agree with all his points (“one could easily map the web service definition I described to REST if that was important” suggests a “it’s just SOAP without the wrapper” view of REST), but I am glad he is launching this debate. We need to discuss this rather than assume that REST is the obvious answer. Remember, a few years ago SOAP was just as obvious an answer to any protocol question. It may well be that indeed REST comes out ahead of this discussion, but the process will force us to be explicit about what benefits of REST we are trying to achieve and will allow us to be practical in the way we approach it.

14
Feb
2010

Can Cloud standards be saved?

by William (@vambenepe on Twitter)

Then: Web services standards

One of the most frustrating aspects of how Web services standards shot themselves in the foot via unchecked complexity is that plenty of people were pointing out the problem as it happened. Mark Baker (to whom I noticed Don Box also paid tribute recently) is the poster child. I remember Tom Jordahl tirelessly arguing for keeping it simple in the WSDL working group. Amberpoint’s Fred Carter did it in WSDM (in the post announcing the recent Amberpoint acquisition, I mentioned that “their engineers brought to the [WSDM] group a unique level of experience and practical-mindedness” but I could have added “… which we, the large companies, mostly ignored.”)

The commonality between all these voices is that they didn’t come from the large companies. Instead they came from the “specialists” (independent contractors and representatives from small, specialized companies). Many of the WS-* debates were fought along alliance lines. Depending on the season it could be “IBM vs. Microsoft”, “IBM+Microsoft vs. Oracle”, “IBM+HP vs. Microsoft+Intel”, etc… They’d battle over one another’s proposals but tacitly agreed to brush off proposals from the smaller players. At least if they contained anything radically different from the content of the submission by the large companies. And simplicity is radical.

Now: Cloud standards

I do not reminisce about the WS-* standards wars just for old times’ sake or the joy of self-flagellation. I also hope that the current (and very important) wave of standards, related to all things Cloud, can do better than the Web services wave did with regard to involving on-the-ground experts.

Even though I still work for a large company, I’d like to see this fixed for Cloud standards. Not because I am a good guy (though I hope I am), but because I now realize that in the long run this lack of perspective even hurts the large companies themselves. We (and that includes IBM and Microsoft, the ringleaders of the WS-* effort) would be better off now if we had paid more attention then.

Here are two reasons why the necessity to involve and include specialists is even more applicable to Cloud standards than Web services.

First, there are many more individuals (or small companies) today with a lot of practical Cloud experience than there were small players with practical Web services experience when the WS-* standardization started (Shlomo Swidler, Mitch Garnaat, Randy Bias, John M. Willis, Sam Johnston, David Kavanagh, Adrian Cole, Edward M. Goldberg, Eric Hammond, Thorsten von Eicken and Guy Rosen come to mind, though this is nowhere near an exhaustive list). Which means there is even more to gain by ensuring that the Cloud standard process is open to them, should they choose to engage in some form.

Second, there is a transparency problem much larger than with Web services standards. For all their flaws, W3C and OASIS, where most of the WS-* work took place, are relatively transparent. Their processes and IP policies are clear and, most importantly, their mailing list archives are open to the public. DMTF, where VMWare, Fujitsu and others have submitted Cloud specifications, is at the other end of the transparency spectrum. A few examples of what I mean by that:

  • I can tell you that VMWare and Fujitsu submitted specifications to DMTF, because the two companies each issued a press release to announce it. I can’t tell you which others did (and you can’t read their submissions) because these companies didn’t think it worthy of a press release. And DMTF keeps the submission confidential. That’s why I blogged about the vCloud submission and the Fujitsu submission but couldn’t provide equivalent analysis for the others.
  • The mailing lists of DMTF working groups are confidential. Even a DMTF member cannot see the message archive of a group unless he/she is a member of that specific group. The general public cannot see anything at all. And unless I missed it on the site, they cannot even know what DMTF working groups exist. It makes you wonder whether Dick Cheney decided to call his social club of energy company executives a “Task Force” because he was inspired by the secrecy of the DMTF (“Distributed Management Task Force”). Even when the work is finished and the standard published, the DMTF won’t release the mailing list archive, even though these discussions can be a great reference for people who later use the specification.
  • Working documents are also confidential. Working groups can decide to publish some intermediate work, but this needs to be an explicit decision of the group, then approved by its parent group, and in practice it happens rarely (mileage varies depending on the groups).
  • Even when a document is published, the process to provide feedback from the outside seems designed to thwart any attempt. Or at least that’s what it does in practice. Having blogged a fair amount on technical details of two DMTF standards (CMDBf and WS-Management) I often get questions and comments about these specifications from readers. I encourage them to bring their comments to the group and point them to the official feedback page. Not once have I, as a working group participant, seen the comments come out on the other end of the process.

So let’s recap. People outside of DMTF don’t know what work is going on (even if they happen to know that a working group called “Cloud this” or “Cloud that” has been started, the charter documents and therefore the precise scope and list of deliverables are also confidential). Even if they knew, they couldn’t get to see the work. And even if they did, there is no convenient way for them to provide feedback (which would probably arrive too late anyway). And joining the organization would be quite a selfless act because they then have to pay for the privilege of sharing their expertise while not being included in the real deciding circles anyway (unless they are ready to pony up for the top membership levels). That’s because of the unclear and unstable processes as well as the inordinate influence of board members and officers who all are also company representatives (in W3C, the strong staff balances the influence of the sponsors, in OASIS the bylaws limit arbitrariness by the board members).

What we are missing out on

Many in the standards community have heard me rant on this topic before. What pushed me over the edge and motivated me to write this entry was stumbling on a crystal clear illustration of what we are missing out on. I submit to you this post by Adrian Cole and the follow-up (twice) by Thorsten von Eicken. After spending two days at a face to face meeting of the DMTF Cloud incubator (in an undisclosed location) this week, I’ll just say that these posts illustrate a level of practicality and a grounding in real-life Cloud usage that was not evident in all the discussions of the incubator. You don’t see Adrian and Thorsten arguing about the meaning of the word “infrastructure”, do you? I’d love to point you to the DMTF meeting minutes so you can judge for yourself, but by now you should understand why I can’t.

So instead of helping in the forum where big vendors submit their specifications, the specialists (some of them at least) go work in OGF, and produce OCCI (here is the mailing list archive). When Thorsten von Eicken blogs about his experience using Cloud APIs, they welcome the feedback and engage him to look at their work. The OCCI work is nice, but my concern is that we are now going to end up with at least two sets of standard specifications (in addition to the multitude of company-controlled specifications, like the ubiquitous EC2 API). One from the big companies and one from the specialists. And if you think that the simplest, clearest and most practical one will automatically win, well I envy your optimism. Up to a point. I don’t know if one specification will crush the other, if we’ll have a “reconciliation” process, if one is going to be used in “private Clouds” and the other in “public Clouds” or if the conflict will just make both mostly irrelevant. What I do know is that this is not what I want to see happen. Rather, the big vendors (whose imprimatur is needed) and the specialists (whose experience is indispensable) should work together to make the standard technically practical and widely adopted. I don’t care where it happens. I don’t know whether now is the right time or too early. I just know that when the time comes it needs to be done right. And I don’t like the way it’s shaping up at the moment. Well-meaning but toothless efforts like cloud-standards.org don’t make me feel better.

I know this blog post will be read both by my friends in DMTF and by my friends in Clouderati. I just want them to meet. That could be quite a party.

IBM was on to something when it produced this standards participation policy (which I commented on in a cynical-yet-supportive way – and yes I realize the same cynicism can apply to me). But I haven’t heard of any practical effect of this policy change. Has anyone seen any? Isn’t the Cloud standard wave the right time to translate it into action?

Transparency first

I realize that it takes more than transparency to convince specialists to take a look at what a working group is doing and share their thoughts. Even in a fully transparent situation, specialists will eventually give up if they are stonewalled by process lawyers or just ignored and marginalized (many working group participants have little bandwidth and typically take their cues from the big vendors even in the absence of explicit corporate alignment). And this is hard to fix. Processes serve a purpose. While they can be used against the smaller players, they also in many cases protect them. Plus, for every enlightened specialist who gets discouraged, there is a nutcase who gets neutralized by the need to put up a clear proposal and follow a process. I don’t see a good way to prevent large vendors from using the process to pressure smaller ones if that’s what they intend to do. Let’s at least prevent this from happening unintentionally. Maybe some of my colleagues from large companies will also ask themselves whether it wouldn’t be to their own benefit to actually help qualified specialists to contribute. Some “positive discrimination” might be in order, to lighten the process burden in some way for those with practical expertise, limited resources, and the willingness to offer some could-otherwise-be-billable hours.

In any case, improving transparency is the simplest, fastest and most obvious step that needs to be taken. Not doing it because it won’t solve everything is like not doing CPR on someone on the pretext that it would only restart his heart but not cure his rheumatism.

What’s at risk if we fail to leverage the huge amount of practical Cloud expertise from smaller players in the standards work? Nothing less than an impractical set of specifications that will fail to realize the promises of Cloud interoperability. And quite possibly even delay them. We’ve seen it before, haven’t we?

Notice how I haven’t mentioned customers? It’s a typical “feel-good” line in every lament about standards to say that “we need more customer involvement”. It’s true, but the lament is old and hasn’t, in my experience, solved anything. And today’s economic climate makes me even more dubious that direct customer involvement is going to keep us on track for this standardization wave (though I’d love to be proven wrong). Opening the door to on-the-ground-working-with-customers experts with a very neutral and pragmatic perspective has a better chance of success in my mind.

As a point of clarification, I am not asking large companies to pick a few small companies out of their partner ecosystem and give them a 10% discount on their alliance membership fee in exchange for showing up in the standards groups and supporting their friendly sponsor. This is a common trick, used to pack a committee, get the votes and create an impression of overwhelming industry support. Nobody should pick who the specialists are. We should do all we can to encourage them to come. It will be pretty clear who they are when they start to ask pointed questions about the work.

Finally, from the archives, a more humorous look at how various standards bodies compare. And the proof that my complaints about DMTF secrecy aren’t new.

19
Nov
2009

Review of Fujitsu’s IaaS Cloud API submission to DMTF

by William (@vambenepe on Twitter)

Things are heating up in the DMTF Cloud incubator. Back in September, VMWare submitted its vCloud API (or rather a “reader’s digest” version of it) to the group. Last week, the group released a white paper titled “Interoperable Clouds”. And a second submission, from Fujitsu, was made last week and publicly announced today.

The Fujitsu submission is called an “API design”. What this means is that it doesn’t tell you anything about what things look like on the wire. It could materialize as another “XML over HTTP” protocol (with or without SOAP wrapper), but it could just as well be implemented as a binary RPC protocol. It’s really more of an esquisse of a resource model than a remote API. The only invocation-related aspect of the document is that it defines explicit operations on various resources (though not their inputs and outputs). This suggests that the most obvious mapping would be to some XML/HTTP RPC protocol (SOAPy or not). In that sense, it stands out a bit from the more recent Cloud API proposals that take a “RESTful” rather than RPC approach. But in these days of enthusiastic REST-washing I am pretty sure a determined designer could produce a RESTful-looking (but contorted) set of resources that would channel the operations in the specification as HTTP-like verbs on these resources.

Since there are few protocol aspects to this “API design”, if we are to compare it to other “Cloud APIs”, it’s really the resource model that’s worth evaluating. The obvious comparison is to the EC2 model as it provides a pretty similar set of infrastructure resources (it’s entirely focused on the IaaS layer). It lacks EC2 capabilities around availability, security and monitoring. But it adds to the EC2 resource model the notions of VDC (“virtual data center”, a container of IaaS resources), VSYS (see below) and a lightly-defined EFM (Extended Function Module) concept which intends to encompass all kinds of network/security appliances (and presumably makes up for the lack of security groups).

The heart of the specification is the VSYS and its accompanying VSYS Descriptor. We are encouraged to think of the VSYS Descriptor as an extension of OVF that lets you specify this kind of environment:

Example content for a VSYS Descriptor


By forcing the initial VSYS instance to be based on a VSYS Descriptor, but then allowing the VSYS to drift away from the descriptor via direct management actions, the specification takes a middle-of-the-road approach to the “model-based versus procedural” debate. Disciples of the procedural approach will presumably start from a very generic and unconstrained VSYS Descriptor and, from there, script their way to happiness. Model geeks will look for a way to keep the system configuration in sync with a VSYS Descriptor.

How this will work is completely undefined. There is supposed to be a getVSYSConfiguration() operation which “returns the configuration information on the VSYS” but there is no format/content proposed for the response payload. Is this supposed to return every single config file, every setting (OS, MW, application) on all the servers in the VSYS? Surely not. But what then is it supposed to return? The specification defines five VSYS attributes (VSYSID, creator, createTime, description and baseDescriptor) so I know what getVSYSAttributes() returns. But leaving getVSYSConfiguration() undefined is like handing someone an airplane maintenance manual that simply reads “put the right part in the right place”. A similar feature is also left as an exercise to the reader in the section that sketches an “external configuration service”. We are provided with a URL convention to address the service, but zero information about the format and content of the configuration instructions provided to the VServer.

EC2 has a keypair access mechanism for Linux instances and a clumsy password-retrieval system for Windows instances. The Fujitsu proposal adopts the lowest common denominator (actually the greatest common divisor, but that’s a lost rhetorical cause): random password generation/retrieval for everyone.

I also noticed the statement that a VServer must be “implemented as a virtual machine” which is an unnecessary constraint/assumption. The opposite statement is later made for EFMs, which “can be implemented in various ways (e.g. run on virtual machines or not)”, so I don’t want to read too much into the “hypervisor-required” VServer statement which probably just needs an editorial clean-up.

From a political perspective this specification looks more like a case of “can I play with you? I brought some marbles” than a more aggressive “listen everybody, we’re playing soccer now and I am the captain”. In other words, this may not be as much an attempt to shape the outcome of the incubator as much as to contribute to its work and position Fujitsu as a respected member whose participation needs to be acknowledged.

While this is an alternative submission to the vCloud API, I don’t think VMWare will feel very challenged by it. The specification’s core (VSYS Descriptor) intends to build on OVF, which should be music to VMWare’s ears (it’s the model, not the protocol, which is strategic). And it is light enough on technical details that it will be pretty easy for vCloud to claim that it, indeed, aligns with the intent of this “design”.

All in all, it is good to see companies take the time to write down what they expect out of the DMTF work. And it’s refreshing to see genuine single-company contributions rather than pre-negotiated documents by a clique. Whether they look more like implementable specifications or position papers, they all provide good input to the DMTF Cloud incubator.

02
Sep
2009

VMWare publishes (and submits) vCloud API

by William (@vambenepe on Twitter)

VMWare published its vCloud API yesterday (it was previously only available to a few partners) and submitted it to the DMTF, as had been previously announced. So much for my speculations involving IBM.

It may be time to update the Cloud API comparison. After a very quick first pass, vCloud looks quite similar to the Sun Cloud API (that’s a compliment). For example, they both handle long-lived operations via a “202 Accepted” complemented by a resource that represents the progress (“status” for Sun, “task” for vCloud). A very visible (but not critical) difference is the use of JSON (Sun) versus XML (vCloud).
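The "202 Accepted plus progress resource" pattern mentioned above can be sketched with a small in-memory stand-in for a Cloud endpoint. This is only an illustration of the pattern, not code against either API: the FakeCloud class, the /task/ URIs, the Location header and the status values are all assumptions made for the example.

```python
# In-memory sketch of "202 Accepted + progress resource".
# FakeCloud, the /task/... URIs and the status strings are hypothetical;
# they are not taken from the Sun Cloud API or from vCloud.
import itertools


class FakeCloud:
    def __init__(self, polls_until_done=2):
        self._ids = itertools.count(1)
        self._tasks = {}
        self._polls_until_done = polls_until_done

    def post(self, path, body):
        """Start a long-lived operation; answer at once with 202 and a task URI."""
        task_uri = f"/task/{next(self._ids)}"
        self._tasks[task_uri] = {"remaining": self._polls_until_done, "target": path}
        return 202, {"Location": task_uri}

    def get(self, path):
        """Return the current state of the progress resource."""
        task = self._tasks.get(path)
        if task is None:
            return 404, {}
        if task["remaining"] > 0:
            task["remaining"] -= 1
            return 200, {"status": "running"}
        return 200, {"status": "success", "target": task["target"]}


def run_to_completion(cloud, path, body, max_polls=10):
    """POST the request, then poll the progress resource until it stops running."""
    status, headers = cloud.post(path, body)
    if status != 202:
        raise RuntimeError(f"expected 202, got {status}")
    task_uri = headers["Location"]
    for _ in range(max_polls):
        _, task = cloud.get(task_uri)
        if task["status"] != "running":
            return task_uri, task
    raise TimeoutError(f"{task_uri} still running after {max_polls} polls")
```

A real client would sleep between polls and handle failure states. The point here is only that the request returns immediately and the caller is handed a second resource to watch.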

As expected, OVF/OVA is central to vCloud. More once I have read the whole specification.

In any case, things are going to get interesting in the DMTF Cloud incubator. Is there a path to adoption? Assuming that Amazon keeps sitting it out, what will the other Cloud vendors with an API (Rackspace, GoGrid, Sun…) do? I doubt they ever had plans/aspirations to own or even drive the standard, but how much are they willing to let VMWare do it? How much does Citrix/Xen want to steer standards versus simply implement them in the context of the Xen Cloud project? What about OGF/OCCI with which the DMTF is supposedly collaborating? How much support is VMWare going to receive from its service provider partners? How much traction does VMWare have with Cisco, HP (server division) and IBM on this? What are the plans at Oracle and Microsoft? Speaking of Microsoft, maybe it will at some point want its standard strategy playbook back. At least when VMWare is done using it.

28
Jul
2009

REST in practice for IT and Cloud management (part 2: configuration management)

by William (@vambenepe on Twitter)

What benefits does REST provide for configuration management (in traditional data centers and in Clouds)?

Part 1 of the “REST in practice for IT and Cloud management” investigation looked at Cloud APIs from leading IaaS providers. It examined how RESTful they are and what concrete benefits derive from their RESTfulness. In part 2 we will now look at the configuration management domain. Even though it’s less trendy, it is just as useful, if not more, in understanding the practical value of REST for IT management. Plus, as long as Cloud deployments are mainly of the IaaS kind, you are still left with the problem of managing the configuration of everything that runs on top of the virtual machines (OS, middleware, DB, applications…). Or, if you are a glass-half-full person, here is another way to look at it: the great thing about IaaS (and host virtualization in general) is that you can choose to keep your existing infrastructure, applications and management tools (including configuration management) largely unchanged.

At first blush, REST is ideally suited to configuration management.

The RESTful Cloud APIs have no problem retrieving resource descriptions, but they seem somewhat hesitant in the way they deal with resource-specific actions. Tim Bray described one of the challenges in his well-considered Slow REST post. And indeed, applying REST to these “do something that may take some time and not result exactly in what was requested” scenarios is a lot less straightforward than when you’re just doing document/data retrieval. In contrast you’d think that applying REST to the task of retrieving configuration data from a CMDB or other configuration store would be a no-brainer. Especially in the IT management world, where we already have explicit resource models and a rich set of relationships defined. Let’s give each resource a URI that responds to HTTP GET requests, let’s turn the associations into hyperlinks in the resource representation, let’s mint a MIME type to represent this format and we are out of the office in time for a 4:00PM tennis game when all the courts are available (hopefully our tennis partners are as bright as us and can get out early too). This “work smarter not harder” approach would allow us to present this list of benefits in our weekly progress report:

-1- A URI-based scheme makes the protocol independent of the resource topology, unlike today’s data stores that usually struggle to represent relationships between stores.

-2- It is simpler to code against than CIM-over-HTTP or WS-Management. It is cross-platform, unlike WMI or JMX.

-3- It makes it trivial to browse the configuration data from a Web browser (the resources themselves could provide an HTML representation based on content-type negotiation, or a simple transformation could generate it for the Web browser).

-4- You get REST-induced caching and scalability.

In the shower after the tennis game, it becomes apparent that benefit #4 is largely irrelevant for IT management use cases. That the browser in #3 would not be all that useful beyond simple use cases. That #2 is good for karma but developers will demand a library that hides this benefit anyway. And that the boss is going to say that he doesn’t care about #1 either because his product is “the single source of truth” so it needs to import from the other configuration stores, not reference them.

Even if we ignore the boss (once again) it only leaves #1 as a practical benefit. Surprise, that’s also the aspect that came out on top of the analysis in part 1 (see “the API doesn’t constrain the design of the URI space” highlight, reinforced by Mark’s excellent comment on the role of hypertext). Clearly, there is something useful for IT management in this “hypermedia” thing. This will largely be the topic of part 3.
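To make benefit #1 concrete, here is a minimal sketch of a relationship that crosses two configuration stores. Everything in it is hypothetical: the host names, the "links" and "runsOn" keys, and the dictionary that stands in for real HTTP GET requests.

```python
# Two hypothetical configuration stores on different hosts. The dictionary
# stands in for the Web: a key is a URI, a value is what GET would return.
WEB = {
    "http://cmdb-a.example.com/appserver/wls1": {
        "type": "WebLogicServer",
        "links": {"runsOn": "http://cmdb-b.example.com/host/h42"},
    },
    "http://cmdb-b.example.com/host/h42": {
        "type": "Host",
        "os": "Windows",
        "links": {},
    },
}


def http_get(uri):
    """Stand-in for an HTTP GET; a real client would dereference the URI."""
    return WEB[uri]


def follow(uri, *relations):
    """Follow a chain of named links, starting from uri."""
    resource = http_get(uri)
    for rel in relations:
        uri = resource["links"][rel]
        resource = http_get(uri)
    return uri, resource
```

Calling follow("http://cmdb-a.example.com/appserver/wls1", "runsOn") lands on a resource held by the other store. The client code has no notion of which store holds what; it only follows whatever URI the representation gives it.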

There are also quite a few things that this RESTification of the configuration management store doesn’t solve:

-1- The ability to query: “show me all the WebLogic instances that run on a Windows host and don’t have patch xyz applied”. You don’t have much of a CMDB if you can’t answer this. For an analogy, remember (or imagine) a pre-1995 Web with no search engine, where you can only navigate by starting from your browser home page and clicking through static links step by step, or through bookmarks.

-2- The ability to retrieve the configuration change history and to compare configurations across resources (or to a reference configuration).

This is not to say that these two features cannot be built on top of a RESTful IT resource model. Just that they are the real meat of configuration management (rather than a simple resource-by-resource configuration browser) and that your brilliant re-architecture hasn’t really helped in addressing them. Does a RESTful foundation make these features harder to build? Not necessarily, but there are some tricky aspects to take care of:

-1- In hypermedia systems, the links are usually part of the resource representation, not resources of their own. In IT management, relationships/associations can have their own lifecycle and configuration properties.

-2- Be careful that you can really maintain the address of a resource. It’s one thing to make sure that a UUID gets maintained as a resource configuration changes, it’s another to ensure that a dereferenceable URI remains unchanged. For example, the admin server of a cluster may move over time from one node to another.

More fundamentally, the ability to deal with multiple resources at the same time and/or to use the model at different levels of granularity is often a challenge. Either you make your protocol more complex to account for this or you pollute your resource model (with a bunch of arbitrary “groups”, implicit or explicit).

We saw this in the Cloud APIs too. It typically goes something like this: you can address an individual server (called “foo”) by sending requests to http://Cloudprovider.com/server/foo. Drop the “foo” part of the URL and now you can address all the servers, for example to retrieve their configuration or possibly to reboot them. This gives me a way of dealing with multiple resources at a time, but only along the lines pre-defined by the API. What if I want to deal only with the servers that host nodes of a given cluster? Sorry, not possible. What if the servers have different hosts in their URIs (remember, “the API doesn’t constrain the design of the URI space”)? Oops.

WS-Management, in the SOAP world, takes this one step further with Selectors, through which you can embed some kind of query, the result of which is what you are addressing in your message. Or, if all you want to do is GET, you can model your entire datacenter as one giant virtual XML doc (a document which is never assembled in practice) and use WSRF/WSDM’s “QueryExpression” or WS-Management’s “FragmentTransfer” to the same effect. BTW, I have issues with the details of how these mechanisms work (and I have described an alternative under the motto “if you are going to suffer with WS-Addressing, at least get some value out of it”).

These are all non-RESTful atrocities to a RESTafarian, but in my mind the Cloud REST APIs reviewed in part 1 have opened Pandora’s box by allowing less-qualified URIs to address all instances of a class. I expect you’ll soon see more precise query parameters in these URIs and they’ll look a lot like WS-Management Selectors (e.g. http://Cloudprovider.com/server?OS=Linux&CPUType=X86). Want to take bets about when a Cloud API URI format with an embedded regex first arrives?
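The progression from one server, to all servers, to selector-like query parameters can be sketched as a single resolver. The server inventory and the resolve() function are invented for illustration; only the three URI shapes come from the text above.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical inventory; the property names mirror the example URI above.
SERVERS = {
    "foo": {"OS": "Linux", "CPUType": "X86"},
    "bar": {"OS": "Windows", "CPUType": "X86"},
    "baz": {"OS": "Linux", "CPUType": "SPARC"},
}


def resolve(uri):
    """Return the sorted names of the servers that a URI addresses."""
    parts = urlsplit(uri)
    segments = [s for s in parts.path.split("/") if s]
    if not segments or segments[0] != "server" or len(segments) > 2:
        raise ValueError(f"not a server URI: {uri}")
    if len(segments) == 2:
        # Fully qualified: /server/foo addresses one server (or nothing).
        return [segments[1]] if segments[1] in SERVERS else []
    # Less-qualified: /server addresses the whole class, narrowed by any
    # query parameters, which behave much like Selectors.
    selectors = parse_qsl(parts.query)
    return sorted(
        name
        for name, props in SERVERS.items()
        if all(props.get(key) == value for key, value in selectors)
    )
```

With this inventory, http://Cloudprovider.com/server/foo resolves to one server, http://Cloudprovider.com/server to all three, and http://Cloudprovider.com/server?OS=Linux&CPUType=X86 to "foo" alone. The cluster question raised earlier still has no answer unless cluster membership happens to be one of the exposed properties.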

When you need this, my gut feeling is that you are better off not worrying too much about trying to look RESTful. There is no shame in using an RPC pattern in the right circumstances. Don’t be the stupid skier who ends up crashing in a tree because he is just too cool for using the snowplow position.

One of the most common reasons to deal with multiple resources together is to run queries such as the “show me all the WebLogic instances that run on a Windows host and don’t have patch xyz applied” example above. Such a query mechanism recently became a DMTF standard, it’s called CMDBf. It is SOAP-based and doesn’t attempt to have anything to do with REST. Not that it didn’t cross the mind of a bunch of people, led by Michael Coté when CMDBf first emerged (read the comments too). But as James Governor rightly predicted in the first comment, Coté heard “dick” from us on this (I represented HP in CMDBf and ended up being an editor of the specification, focusing on the “query” part). I don’t remember reading the entry back then but I must have since I have been a long time Coté fan. I must have dismissed the idea so quickly that it didn’t even register with my memory. Well, it’s 2009 now, CMDBf v1 is a DMTF standard and guess what? I, and many other SOAP-the-world-till-it-shines alumni, are looking a lot more seriously into what’s in this REST thing (thus this series of posts for me). BTW in this piece Coté also correctly predicted that CMDBf would be “more about CMDB interoperation than federation” but that didn’t take as much foresight (it was pretty obvious to me from the start).

Frankly I am still not sure that there is much benefit from REST in what CMDBf does, which is mostly a query interface. Yes the CMDBf query and its response go over SOAP. Yes in this case SOAP is mostly a useless wrapper since none of the implementations will likely support any WS-* SOAP header (other than paying the WS-Addressing tax). Sure we could remove it and send plain XML over HTTP. Or replace the SOAP wrapper with an Atom wrapper. Would it be any more RESTful? Not one bit.

And I don’t see how to make it more RESTful. There are plenty of things in the periphery of the query operation that can be made RESTful, along the lines of what I described above. REST could make the discovery/reconciliation tasks of the CMDB more efficient. The CMDBf query result format could be improved so that from the returned elements I can navigate my way among resources by following hyperlinks. But the query operation itself looks fundamentally RPCish to me, just like my interaction with the Google search page is really an RPC call that happens to return a Web page full of hyperlinks. In a way, this query (whether Google or CMDBf) can at best be the transition point from RPC to REST. It can return results that open a world of RESTful requests to you, but the query invocation itself is not RESTful. And that’s OK.
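The "transition point" idea can be sketched as an RPC-style query whose results are hyperlinks. This is not the CMDBf query format, which is a SOAP/XML graph query; the items, the RUNS_ON relationship table and the function name are assumptions chosen to mirror the WebLogic example used earlier.

```python
# Hypothetical CMDB content: items keyed by URI, plus a relationship table.
BASE = "http://cmdb.example.com/item/"
ITEMS = {
    BASE + "wls1": {"type": "WebLogic", "patches": ["abc"]},
    BASE + "wls2": {"type": "WebLogic", "patches": ["xyz"]},
    BASE + "wls3": {"type": "WebLogic", "patches": []},
    BASE + "h1": {"type": "Host", "os": "Windows"},
    BASE + "h2": {"type": "Host", "os": "Linux"},
}
RUNS_ON = {
    BASE + "wls1": BASE + "h1",
    BASE + "wls2": BASE + "h1",
    BASE + "wls3": BASE + "h2",
}


def unpatched_weblogic_on_windows(patch):
    """RPC-style query: one call in, a list of hyperlinks out."""
    hits = []
    for uri, item in ITEMS.items():
        if item["type"] != "WebLogic" or patch in item["patches"]:
            continue
        host_uri = RUNS_ON[uri]
        if ITEMS[host_uri].get("os") == "Windows":
            # Return links rather than copies, so the caller can continue
            # with plain GET requests on each resource.
            hits.append({"href": uri, "runsOn": host_uri})
    return hits
```

The invocation itself is a plain function call with a parameter, which is the RPC half. The RESTful half starts with what the caller does with each "href" in the result.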

In part 3 (now available), I will try to synthesize the lessons from the Cloud APIs (part 1) and configuration management (this post) and extract specific guidance to get the best of what REST has to offer in future IT management protocols. Just so you can plan ahead, in part 4 I will reform the US health care system and in part 5 I will provide a practical roadmap for global nuclear disarmament. Suggestions for part 6 are accepted.

13
Jul
2009

YACSOE

by William (@vambenepe on Twitter)

Yet another cloud standards organization effort. This one is better than the others because it has the best domain name.

A press release to announce a Wiki. Sure. Whatever. Electrons are cheap.

Cynicism aside, it can’t hurt. But what would be really useful is if all these working groups opened up their mailing list archives and document repositories so that the Wiki can be a launching pad to actual content rather than a set of one-line descriptions of what each group is supposed to work on. With useful direct links to the most recent drafts and lists of issues under consideration. Similar to the home page of a W3C working group, but across groups. Let’s hope this is a first step in that direction.

I am also interested in where they’ll draw the line between Cloud computing and IT management. If such a line remains.

06
Jul
2009

The CMDBf specification is now a DMTF standard

by William (@vambenepe on Twitter)

The CMDBf specification has finished its trek through the DMTF standard process. The last step was board approval and finally here is the official DMTF standard. It’s called version 1.0.0 which is a bit confusing since the version submitted to DMTF was dubbed “version 1.0”. I guess it means that this standard is the first version of the DMTF specification called CMDBf.

If you have been following the process closely, then you won’t find many technical changes since the last public draft. If you last read the specification when it was submitted to DMTF, then you’ll notice several improvements but no drastic change. If you are yet to take a first look at CMDBf, now is the perfect time.

To help you in that endeavor, I plan to update the query pseudo-algorithm to conform to the standard version of the specification when I get a chance. In the meantime, the slightly-outdated one is probably still helpful in wrapping your mind around the query mechanism.

Gentle(wo)men, rev your (query) engines.

29
Jun
2009

Uploading a file to a Windows machine via WMI/WS-Management

by William (@vambenepe on Twitter)

[UPDATED 2009/6/30: Check the following post for a more practical solution.]

Here is a simple way to upload a text (i.e. not binary) file to a Windows machine. Because my interest is to be able to do it from any platform, I investigated the use of WS-Management. But the method relies on invoking WMI methods over WS-Management, so I don’t see why it would not also work in a straight WMI scenario if you prefer.

I am not a Windows management expert, so there may be a much better way to do this (e.g. BITS). But if what you’re after is the simplest possible way to drop a file on a Windows machine from a non-Windows machine, it doesn’t get much simpler than sending an XML doc over HTTP and calling it a day. Here is how.

The easiest would be if the CIM_DataFile WMI class had a “create” method to create a new file. It doesn’t. But Win32_Process does. Invoking this method creates a new process and you get to specify the command line to execute. All you need to do is come up with a command line that invokes a program that will create the file that you want to upload.

There may be alternatives, but the command line I came up with for this purpose uses the “cmd.exe” interpreter (the Windows command-line shell). By using the “/c” option, you can invoke this interpreter with its instructions as parameters directly on the command line (it gets a bit confusing because we have two “command lines” here, the one that is used to launch the “cmd.exe” shell and the one that is presented inside the “cmd.exe” shell).

Anyway, if you type the following line inside the “start/run” field in Windows

cmd /c echo 1st line > test1.txt

It will have the same effect as opening a command shell, typing “echo 1st line > test1.txt” in it and then closing it. It creates a new file called “test1.txt” with one line of content (“1st line”). If you want a second line, you can do this by adding a second command that uses “>>” (append) instead of “>”. And the two commands can be joined by “&&” to invoke them in one pass. So to create a file with three lines, we’d execute:

cmd /c echo 1st line > test1.txt && echo 2nd line >> test1.txt
&& echo 3rd line >> test1.txt

Now all we have to do is package this in a WS-Management SOAP message and post it to the WS-Management listener of the Windows machine. In the process, we have to escape each “&” in the command line to “&amp;” because of XML syntax rules. The resulting message looks like:

<s:Envelope
  xmlns:s="http://www.w3.org/2003/05/soap-envelope"
  xmlns:a="http://schemas.xmlsoap.org/ws/2004/08/addressing"
  xmlns:w="http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd">
<s:Header>
<a:To>http://localhost/wsman</a:To>
<w:ResourceURI s:mustUnderstand="true">

http://schemas.microsoft.com/wbem/wsman/1/wmi/root/cimv2/Win32_Process

</w:ResourceURI>
<a:ReplyTo>
<a:Address s:mustUnderstand="true">

http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous

</a:Address>
</a:ReplyTo>
<a:Action s:mustUnderstand="true">

http://schemas.microsoft.com/wbem/wsman/1/wmi/root/cimv2/Win32_Process/Create

</a:Action>
<a:MessageID>uuid:9A989269-283B-4624-BAC5-BC291F72E854</a:MessageID>
</s:Header>
<s:Body>
<p:Create_INPUT
  xmlns:p="http://schemas.microsoft.com/wbem/wsman/1/wmi/root/cimv2/Win32_Process">
<p:CommandLine>cmd /c echo 1st line > test1.txt &amp;&amp; echo 2nd line >>
  test1.txt &amp;&amp; echo 3rd line >> test1.txt</p:CommandLine>
<p:CurrentDirectory>C:\data\winrm-test\</p:CurrentDirectory>
</p:Create_INPUT>
</s:Body>
</s:Envelope>

You don’t even need a WS-Management toolkit to do this as the only WS-Management header is w:ResourceURI which can easily be set manually. You don’t need a WS-Addressing library either as all the headers are also static (except for the MessageID, even though nobody will care in practice if you always send the same value; I hereby authorize you to re-use the one in my example as much as you want). As a side note, this is yet another illustration of how useless this header (and more generally WS-Addressing) is in 95% of cases. And yet the Microsoft WS-Management implementation (like many others) will make a point to fault if you don’t send it. But ranting against WS-Addressing is a topic for another day (look for a future post titled “WS-IfInteroperabilityWasEasyItWouldNotBeFunWouldIt”).

I should mention that you want to set the Content-Type HTTP header to “application/soap+xml;charset=UTF-8” for this message. Or UTF-16 if that’s what you’re sending.
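For those who would rather script this than hand-edit XML, here is a minimal Python sketch of the same message, using only the standard library. The function names (build_envelope, post_envelope) are mine, not part of any toolkit. The sketch collapses the whitespace around the URIs, and it escapes “>” as well as “&”, which is equivalent once the XML is parsed. It also leaves out authentication entirely, which a real listener will almost certainly require.

```python
import urllib.request
import uuid
from xml.sax.saxutils import escape

# Resource URI of the WMI class, as used in the message above.
WMI_PROCESS = "http://schemas.microsoft.com/wbem/wsman/1/wmi/root/cimv2/Win32_Process"

ENVELOPE_TEMPLATE = """<s:Envelope
  xmlns:s="http://www.w3.org/2003/05/soap-envelope"
  xmlns:a="http://schemas.xmlsoap.org/ws/2004/08/addressing"
  xmlns:w="http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd">
<s:Header>
<a:To>{endpoint}</a:To>
<w:ResourceURI s:mustUnderstand="true">{resource}</w:ResourceURI>
<a:ReplyTo>
<a:Address s:mustUnderstand="true">http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</a:Address>
</a:ReplyTo>
<a:Action s:mustUnderstand="true">{resource}/Create</a:Action>
<a:MessageID>uuid:{message_id}</a:MessageID>
</s:Header>
<s:Body>
<p:Create_INPUT xmlns:p="{resource}">
<p:CommandLine>{command_line}</p:CommandLine>
<p:CurrentDirectory>{current_directory}</p:CurrentDirectory>
</p:Create_INPUT>
</s:Body>
</s:Envelope>"""


def build_envelope(command_line, current_directory,
                   endpoint="http://localhost/wsman", message_id=None):
    """Return the SOAP envelope as a string, with XML escaping applied."""
    if message_id is None:
        message_id = str(uuid.uuid4()).upper()
    return ENVELOPE_TEMPLATE.format(
        endpoint=escape(endpoint),
        resource=WMI_PROCESS,
        message_id=message_id,
        # escape() turns "&" into "&amp;" (and "<", ">" into entities).
        command_line=escape(command_line),
        current_directory=escape(current_directory),
    )


def post_envelope(envelope, endpoint="http://localhost/wsman"):
    """POST the envelope. Authentication is NOT handled in this sketch."""
    request = urllib.request.Request(
        endpoint,
        data=envelope.encode("utf-8"),
        headers={"Content-Type": "application/soap+xml;charset=UTF-8"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling build_envelope with the three-line command and “C:\data\winrm-test\” as the current directory gives you a body you can hand to post_envelope, or to any HTTP client you prefer.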

A few comments:

  • This obviously only works for character-based files, not binaries
  • I’ve noticed that the parsing of the wsa:Action header is pretty minimalistic. The Microsoft implementation seems to just pick up the text behind the last “/”. So you can send “blahblah/Create” and it works just as well as the correct value, “http://schemas.microsoft.com/wbem/wsman/1/wmi/root/cimv2/Win32_Process/Create” (it knows what class to apply the operation on from the Resource URI). Interestingly, there is only one URL ending in “/Create” that doesn’t work and it’s the WS-Transfer “Create” operation (“http://schemas.xmlsoap.org/ws/2004/09/transfer/Create”). That’s because the “Create” operation invoked in the message above is not the WS-Transfer “Create” operation but rather the homonymous operation on the WMI class.
  • Using the “/k” modifier on “cmd” in the command line (instead of “/c”) would also work, but the command shell would stay alive after returning so over time you’d have quite a few of them hanging out and using up memory on the remote machine. Not a good move.
  • As part of this exercise, I noticed an error in the MSDN page describing the “invoke” method of Win32_Process. In the SOAP body, the URI for the “p” namespace prefix uses “…/cim/…” instead of “…/cimv2/…”, which caused my first attempts to fail.

If the file you want to upload is large, you can break the upload over several successive messages similar to the one above. As long as you use the same file name and use “>>” instead of “>” you’ll keep appending to the end of the file until it’s complete.
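The splitting itself is easy to automate. The sketch below is one possible way to do it in Python; the function name and the lines_per_message parameter are hypothetical, and it makes no attempt to escape characters that are special to “cmd” (such as “&”, “|”, “<”, “>”, “^” or “%”), nor to handle empty lines (a bare “echo” prints the echo status instead of a blank line).

```python
def upload_commands(lines, filename, lines_per_message=20):
    """Return one cmd.exe command line per WS-Management message.

    lines_per_message is an arbitrary illustrative default, not a
    limit taken from WS-Management or from cmd.exe.
    """
    commands = []
    for start in range(0, len(lines), lines_per_message):
        parts = []
        for offset, line in enumerate(lines[start:start + lines_per_message]):
            # Only the very first line of the whole file uses ">" (create or
            # truncate); every later line, in any message, appends with ">>".
            redirect = ">" if start + offset == 0 else ">>"
            parts.append(f"echo {line} {redirect} {filename}")
        commands.append("cmd /c " + " && ".join(parts))
    return commands
```

Each returned string goes into the CommandLine element of its own message (after XML escaping), and the messages have to be sent in order.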

Of course this could be any type of text file, including XML (watch for the character-escaping rules though, both for XML and for “cmd” as you have to apply them in the right sequence). Even better, it could be a Python, Perl or PowerShell script too. And in that case (assuming the corresponding interpreter is installed on the machine) you can use the same mechanism to also invoke the script for execution. So that you use this WS-Management interface just to bootstrap into a more comfortable remote-control mechanism.

The next logical question (for extra credit) is whether WS-Management can be used to read files remotely instead of writing them. In theory yes, though in practice you’re much better off with alternate solutions, like the remote shell extension to WS-Management that I have described as “dumb SSH” previously.

But since you ask, here is the theory. My first attempt was to do a WS-Management “Get” (the Get operation from WS-Transfer) on an instance of CIM_DataFile (using the “Name” selector and setting it to “C:\data\winrm-test\test1.txt”). But this returns the properties of the file rather than its content. Whether this is kosher is an interesting theoretical question to ponder from a REST-beard-stroking perspective, but it’s useless for my file retrieval purpose. As before, one solution is to use the magical Win32_Process “Create” method to overcome the shortcomings of the CIM_DataFile class. The Windows command shell “type” command can be used to display the content of a text file. But the WMI Win32_Process “Create” operation that we use here only returns the processId and a result code, not the stdout stream (unlike the remote shell protocol that I mentioned above). We cannot therefore use it directly to return the output of the “type” command over the wire.

The solution is to use one Win32_Process “Create” operation over WS-Management to write the content of the file in a place where a subsequent WS-Management operation can read it. I can think of two examples off the top of my head: directory names and environment variables.

Here is how you’d do it with directory names. The following command takes the test1.txt file, reads it and creates nested subdirectories, one for each line in the input file. The name of the directory is the content of the corresponding line in the file.

for /f "delims=" %I in (test1.txt) do @mkdir "%I" && cd "%I"

For example, if the file content is

1st line
2nd line
3rd line

The command will generate the following three subdirectories:

1st line
  |_ 2nd line
      |_ 3rd line

What’s the point? You can use WS-Management enumeration to retrieve the names of all directories (using the Win32_Directory WMI class). Now that may be a bit overwhelming, so you want to add a WS-Enumeration filter to your WS-Management request. The Microsoft WS-Management implementation supports the WQL filter syntax that lets you do just that.

BTW, you can presumably do the same thing with files, but directories by their nesting make it easy to read the lines in the order in which they appear in the file. Though you’d quickly run into path length limitations (and characters that are not valid in file/directory names).
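To get a feel for how quickly those two limitations bite, here is a small Python sketch that computes the nested path the “for” command would end up in and rejects input that cannot work. It assumes the classic MAX_PATH value of 260 characters, which is only an approximation (the effective limit for directories is lower and depends on configuration), and the character check is not exhaustive (it ignores reserved names and trailing dots or spaces). The function name is hypothetical.

```python
from pathlib import PureWindowsPath

MAX_PATH = 260                      # classic Windows limit; treat as approximate
INVALID_CHARS = set('<>:"/\\|?*')   # characters not allowed in a directory name


def nested_directory(base, lines):
    """Return the deepest directory created for the given file lines."""
    bad = [line for line in lines if INVALID_CHARS & set(line)]
    if bad:
        raise ValueError(f"lines not usable as directory names: {bad!r}")
    path = PureWindowsPath(base).joinpath(*lines)
    if len(str(path)) >= MAX_PATH:
        raise ValueError(f"nested path is {len(str(path))} characters long")
    return path
```

With the three-line example file and “C:\data\winrm-test” as the base, this returns “C:\data\winrm-test\1st line\2nd line\3rd line”. A file with a few dozen lines of ordinary length already exceeds the limit.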

A slightly more robust approach may be to set each line of the file in an environment variable (again via the “for”, and using “set” after the “do”). You can then read these environment variables over WS-Management by doing a WS-Transfer Get on the Win32_Environment WMI class. Unlike CIM_DataFile (for which Get only returns properties, not the content), a Get on Win32_Environment includes the value of the environment variable as one of the properties. The pragmatic reasons for this dichotomy are obvious, but the architectural consequences will give a headache to anyone who still has any illusion that WS-Transfer has anything to do with REST.

As a side note, the “for” instruction can keep no more than 52 variables at a time, so if your file has more than 52 lines you’d have to send successive WS-Management requests and add a “skip” option to the “for” operation on subsequent requests (“skip=52”, “skip=104”, etc.). Again, practicality isn’t much of a concern here, we’re just playing with theory (Ed: “we”? how many people do you expect will still be reading at this point?).

That’s it for today’s episode of “Windows management for the on-the-wire-protocol guy”. Maybe next weekend I’ll take some time to look more into the remote shell over WS-Management protocol extension and how it can be misused/abused.

[UPDATE: The next post describes a more practical approach.]

Categories