Category Archives: Automation

Cloud API: what’s cooking between IBM and VMWare?

In the previous entry, I declared that I had a “guess as to why [the DMTF Cloud] incubator was created without a submission”, that I may later reveal. Well here it is: VMWare and IBM are negotiating a joint Cloud API submission to DMTF and need more time before they can submit it.

This is 100% speculation on my part. It’s not even based on rumors or leaks. I made it up. Here are the data points that influenced me. You decide what they’re worth.

  • VMWare has at numerous time announced (comments here and here) that they would submit a vCloud API to DMTF in the first half of 2009.
  • In the transcript of this VMWare webcast we learn that an important part of the vCloud API is its adoption of REST as part of a move towards more abstraction and simplicity (“this is not simply proxy-ing of VIM APIs”).
  • IBM, meanwhile, has been trying to get a SOAP-based IT management framework for a while. Unsuccessfully so far. WSDM was a first failed attempt. The WS-Management/WSDM reconciliation was another one (I was in the same boat on both of these). The WS-RA working group at W3C (where the ashes of WS-RT are smoldering) could be where the third attempt springs from. But IBM is currently very quiet about their plans (compared to all the conference talks, PowerPoint slides and white papers that that heralded the previous two attempts). They obviously haven’t given up, but they are planning the next move. And the emergence of Cloud computing in the meantime is redefining the IT automation landscape in a way that they will make sure to incorporate in their updated standards plans.
  • Then comes the DMTF Cloud incubator of which the co-chairs are from VMWare and IBM (“interim” co-chairs in theory, but we know how these things go). Which seems to imply an agreement around a proposal (this is what the incubator process is explicitly designed for: “allow vendors aligned with a certain proposal to move forward and produce an interoperability specification”). But there is no associated specification submission, which suggest that the agreed-upon proposal is still being negotiated.

VMWare has a lot of momentum in a virtualization-focused view of IT automation (the predominant view right now, though I am not sure it will always be) and IBM sees them as the right partner for their third attempt (HP was the main partner in the first, Microsoft in the second). VMWare knows that they are going against Microsoft and they need IBM’s strength to control the standard. This could justify an alliance.

It seems pretty clear that VMWare has an API specification already (they supposedly even gave it to partners). It is also pretty clear that IBM would not agree to it in a wholesale way. For technical and pride reasons. They did it for OVF because it is a narrow specification, but a more comprehensive Cloud API would touch on a lot of aspects where IBM has set ideas and existing products. Here are some of the aspects that may be in contention.

REST versus WS-* – Yes, that old rathole. Having just moved to REST, the VMWare folks probably don’t feel like turning around. IBM has invested a lot in a WS-* approach over the years. It doesn’t mean that they won’t go with the REST approach, but it would take them some time to get over it. Lots of fellows and distinguished engineers would need to be convinced. There are some very REST-friendly parts in IBM (in Rational, in WebSphere) but Tivoli has seemed a lot less so to me. The worst outcome is if they offer both options. If you see this (or if you see XPath/XQuery expressions embedded inside URLs or HTTP headers), run for the escape hatches.

While REST versus WS-* is an easy one to grab on, I don’t think it’s the most important issue. Both parties are smart enough to realize it’s not that critical (it’s the model, not the protocol, that matters).

CBE/WEF – IBM has been trying to get a standard stamp on its Common Base Event format (CBE) forever. When they did (as WEF, the WSDM Event Format) it was in a simplified form (by yours truly, among others) and part of a standard that wasn’t widely adopted. But it’s still there in Tivoli and you can expect it to resurface in some form in their next proposal.

Software packaging – I am not sure what’s up with SDD, but whether it’s this specification or something else I would expect that IBM would have a lot to say about software packaging and patching. A lot more than VMWare probably cares about. Expect IBM’s fingerprints all over that part.

Security – I have criticized IBM many times for the “security considerations” boilerplate that they stick on every specification. But this in an area in which it actually make sense to have a very focused security analysis, something that IBM could do a lot better than VMWare I suspect.

ITSM / ITIL – In addition to the technical aspect of IT management operations, there are plenty of process and human aspects. Many areas of ITSM are applicable (e.g. I have written about the role of service catalogs, or you can think about the link to CMDBf). IBM has a lot more exposure there than VMWare.

Grid – IBM’s insistence to align Grid computing and IT management is one of the things that weighted WSDM down. Will they repeat this? In a way, Cloud computing *is* that junction of IT management and Grid that they were after with WSRF. But how much of the existing GGF Grid infrastructure are they going to try to accommodate? I don’t think they’ll be too rigid on this, but it’s worth watching.

Seeing how the topics above are handled in the VMWare/IBM proposal (if such a proposal ever materializes) will tell the alert readers a lot about the balance of power between VMWare and IBM.

As a side note, there are very smart people in the EMC CTO office (starting with the CTO himself and my friend Tom Maguire) who came from IBM and are veterans of the WSDM/WSRF/OGSI efforts. These people could play an interesting role in the IBM/VMWare relationship if the corporate arrangement between EMC and VMWare allows it (my guess is it doesn’t). Another interesting side note is to ask what Microsoft would do if indeed VMWare and IBM were dancing together on this. Microsoft is listed in the members of the DMTF Cloud incubator, but I notice a certain detachment in this post from Steve Martin. For now at least.

Did I mention that this is all pure speculation on my part? We’ll see what happens. Hopefully it’s at least entertaining. And even if I am wrong, the questions raised (around the links between previous IT management efforts and the new wave of Cloud standards) are relevant anyway. I am still in “lessons learned” mode on this.

[UPDATED 2009/5/5: Here is a first-hand source for the data point that VMWare plans to submit the vCloud API (rather than second-hand reports from reporters): Winsont Bumpus (VMWare’s Director of Standards Architecture) says that “VMware announced its intention to submit its key elements of the vCloud API to an existing standards organization for the basis of developing an industry standard”.]

1 Comment

Filed under Automation, Cloud Computing, DMTF, Everything, Grid, IBM, IT Systems Mgmt, Mgmt integration, OVF, SOAP, Specs, Standards, Utility computing, Virtualization, VMware

A post-mortem on the previous IT management revolution

Before rushing to standardize “Cloud APIs”, let’s take a look back at the previous attempt to tackle the same problem, which is one of IT management integration and automation. I am referring to the definition of specifications that attempted to use the then-emerging SOAP-based Web services framework to easily integrate IT management systems and their targets.

Leaving aside the “Cloud” spin of today and the “Web services” frenzy of yesterday, the underlying problem remains to provide IT services (mostly applications) in a way that offers the best balance of performance, availability, security and economy. Concretely, it is about being able to deploy whatever IT infrastructure and application bits need to be deployed, configure them and take any required ongoing action (patch, update, scale up/down, optimize…) to keep them humming so customers don’t notice anything bothersome and you don’t break any regulation. Or rather so that any disruption a customer sees and any mandate you violate cost you less than it would have cost to avoid them.

The realization that IT systems are moving more and more towards distributed/connected applications was the primary reason that pushed us towards the definition of Web services protocols geared towards management interactions. By providing a uniform and network-friendly interface, we hoped to make it convenient to integrate management tasks vertically (between layers of the IT stack) and horizontally (across distributed applications). The latter is why we focused so much on managing new entities such as Web services, their execution environments and their conversations. I’ll refer you to the WSMF submission that my HP colleagues and I made to OASIS in 2003 for the first consistent definition of such a management framework. The overview white paper even has a use case called “management as a service” if you’re still not convinced of the alignment with today’s Cloud-talk.

Of course there are some differences between Web service management protocols and Cloud APIs. Virtualization capabilities are more advanced than when the WS effort started. The prospect of using hosted resources is more realistic (though still unproven as a mainstream business practice). Open source component are expected to play a larger role. But none of these considerations fundamentally changes the task at hand.

Let’s start with a quick round-up and update on the most relevant efforts and their status.

Protocols

WSMF (Web Services Management Framework): an HP-created set of specifications, submitted to the OASIS WSDM working group (see below). Was subsumed into WSDM. Not only a protocol BTW, it includes a basic model for Web services-related artifacts.

WS-Manageability: An IBM-led alternative to parts of WSDM, also submitted to OASIS WSDM.

WSDM (Web Services Distributed Management): An OASIS technical committee. Produced two standards (a protocol, “Management Using Web Services” and a model of Web services, “Management Of Web Services”). Makes use of WSRF (see below). Saw a few implementations but never achieved real adoption.

OGSI (Open Grid Services Infrastructure): A GGF (the organization now known as OGF) standard to provide a service-oriented resource manipulation infrastructure for Grid computing. Replaced with WSRF.

WSRF: An OASIS technical committee which produced several standards (the main one is WS-ResourceProperties). Started as an attempt to align the GGF/OGSI approach to resource access with the IT management approach (represented by WSDM). Saw some adoption and is currently quietly in use under the cover in the GGF/OGF space. Basically replaced OGSI but didn’t make it in the IT management world because its vehicle there, WSDM, didn’t.

WS-Management: A DMTF standard, based on a Microsoft-led submission. Similar to WSDM in many ways. Won the adoption battle with it. Based on WS-Transfer and WS-Enumeration.

WS-ResourceTransfer (aka WS-RT): An attempt to reconcile the underlying foundations of WSDM and WS-Management. Stalled as a private effort (IBM, Microsoft, HP, Intel). Was later submitted to the W3C WS-RA working group (see below).

WSRA (Web Services Resource Access): A W3C working group created to standardize the specifications that WS-Management is built on (WS-Transfer etc) and to add features to them in the form of WS-RT (which was also submitted there, in order to be finalized). This is (presumably) the last attempt at standardizing a SOAP-based access framework for distributed resources. Whether the window of opportunity to do so is still open is unclear. Work is ongoing.

WS-ResourceCatalog : A discovery helper companion specification to WS-Management. Started as a Microsoft document, went through the “WSDM/WS-Management reconciliation” effort, emerged as a new specification that was submitted to DMTF in May 2007. Not heard of since.

CMDBf (Configuration Management Database Federation): A DMTF working group (and soon to be standard) that mainly defines a SOAP-based protocol to query repositories of configuration information. Not linked with (or dependent on) any of the specifications listed above (it is debatable whether it belongs in this list or is part of a new breed).

Modeling

DCML (Data Center Markup Language): The first comprehensive effort to model key elements of a data center, their relationships and their policies. Led by EDS and Opsware. Never managed to attract the major management vendors. Transitioned to an OASIS member section and died of being ignored.

SDM (System Definition Model): A Microsoft specification to model an IT system in a way that includes constraints and validation, with the goal of improving automation and better linking the different phases of the application lifecycle. Was the starting point for SML.

SML (Service Modeling Language): Currently a W3C “proposed recommendation” (soon to be a recommendation, I assume) with the same goals as SDM. It was created, starting from SDM, by a consortium of companies that eventually submitted it to W3C. No known adoption other than the Eclipse COSMOS project (Microsoft was supposed to use it, but there hasn’t been any news on that front for a while). Technically, it is a combination of XSD and Schematron. It appears dead, unless it turns out that Microsoft is indeed using it (I don’t know whether System Center is still using SDM, whether they are adopting SML, whether they are moving towards M or whether they have given up on the model-centric vision).

CML (Common Model Library): An effort by the SML authors to create a set of model elements using the SML metamodel. Appears to be dead (no news in a long time and the cml-project.org domain name that was used seems abandoned).

SDD (Solution Deployment Descriptor): An OASIS standard to define a packaging mechanism meant to simplify the deployment and configuration of software units. It is to an application archive what OVF is to a virtual disk. Little adoption that I know of, but maybe I have a blind spot on this.

OVF (Open Virtualization Format): A recently released DMTF standard. Defines a packaging and descriptor format to distribute virtual machines. It does not defined a common virtual machine format, but a wrapper around it. Seems to have some momentum. Like CMDBf, it may be best thought of as part of a new breed than directly associated with WS-Management and friends.

This is not an exhaustive list. I have left aside the eventing aspects (WS-Notification, WS-Eventing, WS-EventNotification) because while relevant it is larger discussion and this entry to too long already (see here and here for some updates from late last year on the eventing front). It also does not cover the Grid work (other than OGSI/WSRF to the extent that they intersect with the IT management world), even though a lot of the work that took place there is just as relevant to Cloud computing as the IT management work listed above. Especially CDDLM/CDL an abandoned effort to port SmartFrog to the then-hot XML standards, from which there are plenty of relevant lessons to extract.

The lessons

What does this inventory tell us that’s relevant to future Cloud API standardization work? The first lesson is that protocols are easy and models are hard. WS-Management and WSDM technically get the job done. CMDBf will be a good query language. But none of the model-related efforts listed above seem to have hit the mark of “doing the job”. With the possible exception of OVF which is promising (though the current expectations on it are often beyond what it really delivers). In general, the more focused and narrow a modeling effort is, the more successful it seems to be (with OVF as the most focused of the list and CML as the other extreme). That’s lesson learned number two: models that encompass a wide range of systems are attractive, but impossible to deliver. Models that focus on a small sub-area are the way to go. The question is whether these specialized models can at least share a common metamodel or other base building blocks (a type system, a serialization, a relationship model, a constraint mechanism, etc), which would make life easier for orchestrators. SML tries (tried?) to be all that, with no luck. RDF could be all that, but hasn’t managed to get noticed in this context. The OVF and SDD examples seems to point out that the best we’ll get is XML as a shared foundation (a type system and a serialization). At this point, I am ready to throw the towel on achieving more modeling uniformity than XML provides, and ready to do the needed transformations in code instead. At least until the next window of opportunity arrives.

I wish that rather than being 80% protocols and 20% models, the effort in the WS-based wave of IT management standards had been the other way around. So we’d have a bit more to show for our work, for example a clear, complete and useful way to capture the operational configuration of application delivery services (VPN, cache, SSL, compression, DoS protection…). Even if the actual specification turns out to not make it, its content should be able to inform its successor (in the same way that even if you don’t use CIM to model your server it is interesting to see what attributes CIM has for a server).

It’s less true with protocols. Either you use them (and they’re very valuable) or you don’t (and they’re largely irrelevant). They don’t capture domain knowledge that’s intrinsically valuable. What value does WSDM provide, for example, now that’s it’s collecting dust? How much will the experience inform its successor (other than trying to avoid the WS-Addressing disaster)? The trend today seems to be that a more direct use of HTTP (“REST”) will replace these protocols. Sure. Fine. But anyone who expects this break from the past to be a vaccination against past problems is in for a nasty surprise. Because, and I am repeating myself, it’s the model, stupid. Not the protocol. Something I (hopefully) explained in my comments on the Sun Cloud API (before I knew that caring about this API might actually become part of my day job) and something on which I’ll come back in a future post.

Another lesson is the need for clear use cases. Yes, it feels silly to utter such an obvious statement. But trust me, standards groups still haven’t gotten this. It’s not until years spent on WSDM and then WS-Management that I realized that most people were not going after management integration, as I was, but rather manageability. Where “manageability” is concerned with discovering and monitoring individual resources, while “management integration” is concerned with providing a systematic view of the environment, with automation as the goal. In other words, manageability standards can allow you to get a traditional IT management console without the need for agents. Management integration standards can allow you to coordinate your management systems and automate their orchestration. WS-Management is for manageability. CMDBf is in the management integration category. Many of the (very respectful and civilized) head-butting sessions I engaged in during the WSDM effort can be traced back to the difference between these two sets of use cases. And there is plenty of room for such disconnect in the so-loosely-defined “Cloud” world.

We have also learned (or re-learned) that arbitrary non-backward compatible versioning, e.g. for political or procedural reasons as with WS-Addressing, is a crime. XML namespaces (of the XSD and WSDL types, as well as URIs used in similar ways in specifications, e.g. to identify a dialect or profile) are tricky, because they don’t have backward compatibility metadata and because of the practice to use organizations domain names in the URI (as opposed to specification-specific names that can be easily transferred, e.g. cmdbf.org versus dmtf.org/cmdbf). In the WS-based management world, we inherited these problems at the protocol level from the generic WS stack. Our hands are more or less clean, but only because we didn’t have enough success/longevity to generate our own versioning problems, at the model level. But those would have been there had these models been able to see the light of day (CML) or see adoption (DCML).

There are also practical lessons that can be learned about the tactics and strategies of the main players. Because it looks like they may not change very much, as corporations or even as individuals. Karla Norsworthy speaks for IBM on Cloud interoperability standards in this article. Andrew Layman represented Microsoft in the post-Manifestogate Cloud patch-up meeting in New York. Winston Bumpus is driving the standards strategy at VMWare. These are all veterans of the WS-Management, WSDM and related wars collaborations (and more generally the whole WS-* effort for the first two). For the details of what there is to learn from the past in that area, you’ll have to corner me in a hotel bar and buy me a few drinks though. I am pretty sure you’d get your money’s worth (I am not a heavy drinker)…

In summary, here are my recommendations for standardizing Cloud API, based on lessons from the Web services management effort. The theme is “focus on domain models”. The line items:

  • Have clear goals for each effort. E.g. is your use case to deploy and run an existing application in a Cloud-like automated environment, or is it to create new applications that efficiently take advantage of the added flexibility. Very different problems.
  • If you want to use OVF, then beef it up to better apply to Cloud situations, but keep it focused on VM packaging: don’t try to grow it into the complete model for the entire data center (e.g. a new DCML).
  • Complement OVF with similar specifications for other domains, like the application delivery systems listed above. Informally try to keep these different specifications consistent, but don’t over-engineer it by repeating the SML attempt. It is more important to have each specification map well to its domain of application than it is to have perfect consistency between them. Discrepancies can be bridged in code, or in a later incarnation.
  • As you segment by domain, as suggested in the previous two bullets, don’t segment the models any further within each domain. Handle configuration, installation and monitoring issues as a whole.
  • Don’t sweat the protocols. HTTP, plain old SOAP (don’t call it POS) or WS-* will meet your need. Pick one. You don’t have a scalability challenge as much as you have a model challenge so don’t get distracted here. If you use REST, do it in the mindset that Tim Bray describes: “If you’re going to do bits-on-the-wire, Why not use HTTP? And if you’re going to use HTTP, use it right. That’s all.” Not as something that needs to scale to Web scale or as a rebuff of WS-*.
  • Beware of versioning. Version for operational changes only, not organizational reasons. Provide metadata to assert and encourage backward compatibility.

This is not a recipe for the ideal result but it is what I see as practically achievable. And fault-tolerant, in the sense that the failure of one piece would not negate the value of the others. As much as I have constrained expectations for Cloud portability, I still want it to improve to the extent possible. If we can’t get a consistent RDF-based (or RDF-like in many ways) modeling framework, let’s at least apply ourselves to properly understanding and modeling the important areas.

In addition to these general lessons, there remains the question of what specific specifications will/should transition to the Cloud universe. Clearly not all of them, since not all of them even made it in the “regular” IT management world for which they were designed. How many then? Not surprisingly (since IBM had a big role in most of them), Karla Norsworthy, in the interview mentioned above, asserts that “infrastructure as a service, or virtualization as a paradigm for deployment, is a situation where a lot of existing interoperability work that the industry has done will surely work to allow integration of services”. And just as unsurprisingly Amazon’s Adam Selipsky, who’s company has nothing to with the previous wave but finds itself in leadership position WRT to Cloud Computing is a lot more circumspect: “whether existing standards can be transferred to this case [of cloud computing] or if it’s a new topic is [too] early to say”. OVF is an obvious candidate. WS-Management is by far the most widely implemented of the bunch, so that gives it an edge too (it is apparently already in use for Cloud monitoring, according to this press release by an “innovation leader in automated network and systems monitoring software” that I had never heard of). Then there is the question of what IBM has in mind for WS-RT (and other specifications that the WS-RA working group is toiling on). If it’s not used as part of a Cloud API then I really don’t know what it will be used for. But selling it as such is going to be an uphill battle. CMDBf is a candidate too, as a model-neutral way to manage the configuration of a distributed system. But here I am, violating two of my own recommendations (“focus on models” and “don’t isolate config from other modeling aspects”). I guess it will take another pass to really learn…

[UPDATED 2009/5/7: Senior moment! When writing this entry I forgot that I wrote an earlier entry (in late 2007) specifically to describe the difference between “manageability” and “management integration”. So here it is, if you care for more details on this topic.]

5 Comments

Filed under Automation, Cloud Computing, Everything, IT Systems Mgmt, Manageability, Mgmt integration, Modeling, People, Portability, REST, SML, SOAP, Specs, Standards, Utility computing, Virtualization, WS-Management, WS-ResourceCatalog, WS-ResourceTransfer

“Federationing”

I am glad to see that, as it inches towards standardization in the DMTF, the CMDBf specification is getting more visibility. Forrester’s Glenn O’Donnell recently wrote very positively about it on his blog, presenting it as a key enabler for a federation of MDRs (Management Data Repositories, a term introduced by the CMDBf specification so don’t look for it in ITIL). He argues this is the only way (rather than a single data store) to fulfill the ITIL-defined role of a CMS. Rob England (the IT Skeptic) has also shared his thoughts about CMDBf and they were noticeably less enthusiastic, to say the least. While Glenn calls the specification “profound”, Rob calls it “the most over-hyped vendor marketing smokescreen ever”. There is plenty of room in between them, which is where I sit. As I explained before, it does have real value (as a query language/protocol for system integration) but is nowhere near providing “federation” capabilities.

I am happy to see Glenn approve of CMDBf and I agree with him that accurate specialized MDRs are more useful than a single store that attempts to capture all the relevant data. As Glenn puts it, “pockets of the truth are far superior to unified ambiguity”. But I wasn’t very comfortable with the tone of his article, which seemed to almost encourage the proliferation of these MDRs. Maybe he was just trying to present a clean break with the “one big CMDB” approach and overreached. Or maybe I am just not reading properly.

Because while I agree that the answer is not “one and only one store” I also don’t want to loose the value of having as much unification of the IT model as possible. Both at the data level (i.e. same metamodel/model, consistent retention/roll-up policies…) and the access level (i.e. in the same physical store, with shared access control, accessible using a well-known DSL for data manipulation…). Metamodel transformation and model bridging are costly (in accuracy, maintenance, reliability). If your CMS does more than just support a  “model navigation” GUI it may then need to run large queries that go across several portions of your IT model, including multiple different domains (e.g. a compliance rule kicked-off at the app level based on the type of data it manipulates that ends up having to look at the physical location of the servers running the hypervisors for the virtual machines that power the app). Through such global queries you can apply configuration rules, do impact analysis, event correlation, provide context to your transaction tracing, etc. No consolidation means no such queries (or a very limited subset). Considering the current state of federation, there is a lot more that you can do with your CMS if you have a very small number of MDRs rather than a sea of “federated” MDRs. This is why, as Oracle acquires IT management companies, we deliberately integrate their repositories with Enterprise Manager.

[UPDATED 2009/4/8: More, along the same line, from Glenn and his co-author Carlos Casanova available here. And my CMDBf partner-in-crime Van Wiles also responded to Glenn, bringing a BMC perspective.]

1 Comment

Filed under Application Mgmt, Automation, CMDB, CMDB Federation, CMDBf, Everything, IT Systems Mgmt, ITIL, Mgmt integration, Modeling, Query, Specs, Standards

Open Cloud Manifesto, circa 2004

The mini-scandal of last week was the manifesto-gate. The mini-scandal of this week is shaping out to be the Ulitzer-gate (if you want to make sure not to miss next week’s IT scandal, subscribe to the Register feed, ferreting these out and adding a bass-heavy soundtrack is their specialty).

Turns out I am one of these Ulitzer “unaware authors” through two articles I wrote a while ago for the Web services Journal, a paper publication by Sys-con (based on a request from HP PR) and a blog post I allowed Sys-con to republish. Looks like Ulitzer and Sys-con are one and the same. Three articles, spaced two years apart. That’s enough to earn me a dedicated home page at Ulitzer and a rank of 1,000 among their more than 6,000 authors. Makes you wonder how much the 5,000 “authors” behind me have (unknowingly) produced… Whatever. At least it’s all content that I authorized Sys-con to use, not something that was lifted from my blog as apparently happened to others.

Turns out the oldest of these articles (“From Web Services Management to Utility Computing” , from 2004) is not that different from the recently-published (and amply maligned) Open Cloud Manifesto. I described my article at the time as “an attempt to explain how the different efforts going on in the industry around Web services, grid, SOA management, virtualization, utility computing, <insert your favorite buzzword>, fit together to provide organizations with the flexibility and efficiency they need from their IT in order to thrive.”

It ends with “while it would be easier to develop an end-to-end model specific to one company’s offering, standardization allows the integration of the management capabilities of all the components that compose enterprise services. We must keep the pressure on vendors to deliver modular and composable specifications (for format, function, and protocol) that expose management capabilities of infrastructure services, applications, and business processes in such a way that these capabilities can be composed by the next generation of management applications.”

Sure it has a lot more emphasis on WS-* specs than is compatible with the current zeitgeist, and it uses the now-obsolete term of “utility computing” rather than the nebulous alternative currently en vogue, but isn’t the main message there?

Just to be clear, I am not laying pretentious claims of prescience and vision (at least not in this entry). There are plenty of documents (e.g. from the Grid community) that make the same points in more eloquent terms and starting many years prior. It’s just fun to see this link from today’s scandal to the one from last week.

for old time sake, here is the content of the 2004 article:

From Web Services Management to Utility Computing
by William Vambenepe

Enterprise services are created by combining infrastructure services, applications, and business processes. To be able to adapt quickly to business changes, enterprise IT must evolve from management of individual resources to management of interrelated services. This will be achieved through the development of composable and modular standards that expose the management capabilities of the building blocks of enterprise services. The Web services platform is an enabler of this transformation: a Web services-based management infrastructure provides a channel that is appropriate for dynamic resource provisioning, allocation, and configuration – often called utility computing.

We can consider this management infrastructure as a four-layered architecture. Starting at the foundation layer, the work on the base Web services infrastructure is far from over. First, until WSDL 2.0 is widely deployed, designers have to compose around the deficiencies of WSDL 1.1, such as the lack of portType inheritance. Second, there is still no standard for referencing Web services. Finally, key specifications such as WSRF (Web Services Resource Framework) and WSN (Web Services Notification), without which people were left to reinvent Web services interfaces to access stateful resources, have only recently reached the standards community. These issues are being resolved and a set of building blocks for accessing resources through an SOA (service-oriented architecture) is shaping up. It is critical that these building blocks be modular and composable to allow incremental adoption and separation of concerns.

Moving from the foundation to the management protocol layer, the OASIS WSDM (Web Services Distributed Management) technical committee, through its MUWS (Management Using Web Services) specification, is the key articulation point between the base Web services architecture and utility computing. Both the IT management community and the Grid community rely on MUWS. It defines how to express and exercise manageability capabilities through Web services, putting in place a management channel that is more interoperable and accessible than ever before.

Next is the modeling layer. Information models need to be composed so that a service can be represented based on the services that it is assembled from, be they peer or infrastructure services. Since these will be described by different models, the management channel (MUWS) needs to be model-agnostic in order to support a model-centric architecture. For example, CIM (Common Information Model) is a model that focuses on concrete resources. The DMTF WS-CIM subgroup must now open CIM to the Web services platform by developing a standard way to expose CIM-modeled resources through MUWS. Other models provide representations for service security, service-level agreements (SLA), etc. Only by composing these models will, for example, an auction service SLA be adequately managed as it depends on a combination of the performance of the servers on which the service runs, the application server that hosts it, the other services (authentication, billing, etc.) that it makes use of, and the business process engine that controls the bidding. Once this model-centric architecture is in place, management actions can be policy-driven through explicit constraints.

Finally, at the top layer, the architecture includes a set of common services for utility computing. They are being defined collaboratively by DMTF (Utility Computing working group) and GGF (OGSA working group).

All the pieces are falling into place but much remains to be done to allow comprehensive management of enterprise services in a model-centric way through Web services standards. While it would be easier to develop an end-to-end model specific to one company’s offering, standardization allows the integration of the management capabilities of all the components that compose enterprise services. We must keep the pressure on vendors to deliver modular and composable specifications (for format, function, and protocol) that expose management capabilities of infrastructure services, applications, and business processes in such a way that these capabilities can be composed by the next generation of management applications. These applications will use this to synchronize business and IT and to capitalize on change.

Comments Off on Open Cloud Manifesto, circa 2004

Filed under Application Mgmt, Articles, Automation, Business Process, Cloud Computing, Everything, IT Systems Mgmt, Mgmt integration, Modeling, Specs, Standards, Utility computing, Virtualization

Cloud computing: would you like flexibility with your simplicity?

The recent announcement of the Sun Cloud, and more specifically its API is a good occasion to think about how much simplicity we really want in our datacenter automation mechanisms. The Sun API is very simple and its authors are proud of that fact. Indeed they should be proud of avoiding unneeded complexity. They have probably also kept out (at least so far), some needed complexity.

First, let’s focus on the important part:

It’s not REST that matters, it’s the rest

Most of the comments on the API focus on the fact that it’s RESTful. The authoritative source on this is Tim Bray’s description of the API, which he helped shape. But Tim is very down-to-earth about the reasons to use REST:

Why REST? · It’s a sensible question. The chief virtue of RESTful interfaces is massive scaling. But gimme a break, these are data-center management operations; a typical transaction frequency would be a single-digit number per week, with the single digit often being “0”, and it wouldn’t be surprising if a big multi-cluster staged-boot operation had a latency of minutes. The data-center controls are unlikely to be a bottleneck.

Why, then? Simply because we wanted a bits-on-the-wire interface. APIs, in the general case, suck; and are really hard to make portable. Bits-on-the-wire are ultimately flexible and interoperable. If you’re going to do bits-on-the-wire, Why not use HTTP? And if you’re going to use HTTP, use it right. That’s all.

The use of REST is not a fundamental characteristic of the API. In other words, if this API turns out to be useful I can rewrite it as a SOAP API and it would still be useful. Unless the SOAP API is made purposely complicated, it would only be marginally harder to use, not fundamentally less useful.

In fact, we may find out. If the rumor is confirmed and IBM decides to Tivolify (rather than kill) the Sun Cloud, the whole thing can be refactored as WS-RT/XML/XQuery (and maybe WS-ResourceCatalog) in five days, four of which would be spent capturing, sedating and restraining Tim Bray (and his “spec machete”) with the last one used for coding.

In the case of the Sun Cloud API, REST makes the API simpler in the same way that a keyless system makes a car easier to operate. You don’t have to fumble for they key, but you still need to know to parallel park, change a tire and operate the stereo.

By using REST, the Sun team has kept away some arbitrary complexity (e.g. fine-grained PUT; instead Sun decides what are the two valid sets of input parameters to create a cluster). But that’s only a small percentage of the potential complexity of the system. Not to mention that most developer will use libraries rather than on-the-wire protocols so they won’t see any difference. Instead, the real deal is:

The model

By “the model” I mean both the resource model and the capabilities of the resources. For capabilities, I don’t care whether a virtual machine can be started via an HTTP GET request on a URL that ends with ?control=start, or via a SOAP message with the wsa:Action header set to http://iloveclouds.com/vm/start or via an RPC call to a Start(…) method. I just care that the model includes the capability to start a VM. And the list of states a VM can be in.

Look at a datacenter today. Make an inventory of all the networking equipment, storage, servers, hypervisors, operating systems and infrastructure services that it contains. Consider all the configuration settings of all these resources (as they would be represented in a complete, authoritative and consistent CMDB, that most elusive creature). Add to it all the controls and APIs they expose. That’s a lot of data, even if you don’t consider the applications layer. That’s a few orders of magnitude larger than what the model in the Sun Cloud API can describe. That gap (between our CMDB model and the Sun Cloud model) is what we should look at and analyze. Why are they so far apart? How big is the ideal datacenter automation and virtualization model?

Among other things, these hundreds of configuration settings in your current datacenter are used to optimize deployments. No-one would miss the pain of dealing with the optimizations if they went away, but we would miss the performance benefits they bring. So what replaces them if the model is too simple to support any tweak? Is the infrastructure behind the API auto-optimized, based on actual application patterns? Now that would be real progress towards simplicity and may allow us to rely on an API as simple as the Sun API. But the industry has been trying to do this with little success for a long time. I expect incremental, not radical, progress on this. Alternatively, does Cloud Computing change the economics to the point where performance optimizations through configurations are no longer cost-efficient, where scaling out is the answer? Hard to make this a general statement, considering how difficult it remains for many applications to scale out. And this sounds very SUV-like in these footprint-aware times (we see how well the “stretch the hood and add two cylinders to the engine” approach worked for Detroit).

Sun might very well have this covered under the hood. But I don’t know that I want to assume that they have an auto-optimizing system just because they produced an API that would benefit from having it underneath.

Not to mention that not all configuration tweaks have to do with performance optimization. Some of them are driven by licensing, organizational, risk and compliance considerations. If auto-detecting an application performance profile is hard, try auto-detecting its regulatory requirements.

Complexity with a purpose

The right place to be, between the “omniscient CMDB model” and the “Sun Cloud model” is somewhere in the middle, with a couple of incrementally complex layers. Of course they are so far apart that saying “somewhere in the middle” is a cope-out.  The current level of complexity is very hard to manage by humans (assisted by processes and tools, e.g. ITIL) and impossible to really automate. A lot of the complexity and variability is arbitrary rather than flexibility-inducing. We need to reduce this (all-out standardization is one way, stack integration is another). But the simplicity of the model in the Sun Cloud API is too extreme. Look at Amazon EC2. Everyone lauds the simplicity of the APIs and everyone, in the same breath, asks for more options (different instance types, availability zones, reserved instances…). Amazon (and Sun too, I assume) is taking the eminently rational approach of starting from simple and adding complexity (sorry, flexibility) as needed. That’s great. Just don’t get too enamored with the initial simplicity.

[UPDATED 2009/3/20: James Governor lauds the simplicity of Amazon’s cloud offering.  If I understand him correctly, he sees simplicity as coming not just from “few options” but also from backward compatibility with current app infrastructure. That second part is what William Louth criticizes in his comment below. At the very least I like to keep the two separated: “how intrinsicly simple is it” and “how backward compatible is it” even though both can be seen as providing the benefit of simplicity.]

7 Comments

Filed under Automation, Cloud Computing, Everything, IT Systems Mgmt, Modeling, REST, Specs, Tech, Utility computing, Virtualization

It feels like an AON ago

Here is yet another reminder of the short attention span in our industry: in this week of all-Cisco-all-the-time coverage and commentary, induced by the Unified Computing announcement, not one article or blog posting mentioned AON (Application Oriented Networking). Remember AON, introduced by Cisco in 2005? If not you’re not alone. At least according to Technorati and Google News who don’t find a single mention of it in the Unified Computing coverage (until Technorati re-indexes this blog, at which point this entry will ironically make a liar of itself…).

If Cisco is not going to tell us how AON relates to Unified Computing it would be nice if some of the trade publications and analysts who covered AON at the time made an effort to update those of us who can’t remember the neighbor’s name but never forget an acronym. Wasn’t AON Cisco’s first attempt to move from the network layer to the application layer, which is what Unified Computing is also about? Is this the second step? A reset? What was learned from AON?

At least there are parts of Unified Computing I understand (Cisco selling blades. Check. Partnership with BMC for management software. Check. etc…). That’s more than I could say at the time for AON (even after moderating a panel at the IEEE ICWS 2005 conference in which a Cisco manager described it).

A search on the Cisco site seems to indicate that AON is indeed available for purchase. It looks like a DataPower-like XML network appliance (message security, routing and monitoring). If they had described it like that at the time I am fairly sure I would have understood. Especially since such appliances already existed. Let’s see if Unified Computing has more success as a bold vision for the programmable datacenter or if it too ends up as a lonely blade SKU in Cisco’s price sheet.

[UPDATED 2009/3/18: At least one analyst made the link. Congratulations to Eric Siegel from the Burton Group. But his linkage was too subtle for Technorati or Google to pick it up: he didn’t mention AON in his blog entry about Unified Computing. That post, titled “Cisco the Computer Company, Act IV” refers to “Cisco the Computer Company, Act III”, which refers to “Cisco the Computer Company, Act II” which refers to, you’ve guessed it, “Cisco the Computer Company” which covers AON. Bingo. Even though he doesn’t directly tackle the “how does Unified Computing relate to AON” question, Eric still gets the prize for follow-through. And for prescience. More Burton Group coverage of Unified Computing here and here. This is from the DCS (“Data Center Strategies”) side of the house, as opposed to the NTS (“Network and Telecom Strategies”) side where Eric lives. If nothing else Cisco is challenging one thing with this move: the organizational structure (DCS vs. NTS) of the Burton Group…]

Comments Off on It feels like an AON ago

Filed under Application Mgmt, Automation, Everything, Virtualization

Exploring “IT management in a changing IT world”

The tagline for this blog is “IT management in a changing IT world”. Of course nobody but their authors care about blog taglines. Still, in the unlikely event that I am asked to expand on the “changing IT world” part I would do it as follows.

The changes currently at work in the IT world can be organized along three axis:

  • IT infrastructure and management
  • Application development and delivery
  • Business and regulation

Each of these categories is ridiculously large. It’s only through the prism of the relationships between them that they provide any value. Think about three balls linked by coil springs.

If you give one of these balls a shake, you will start a hard-to-predict dance between them. This is similar to how the three domains above relate to one another. Changes in one (say a new focus on regulatory compliance in the “business” area, the emergence of virtualization technology in the “infrastructure” area or the appearance of Web 2.0 applications in the “application” area) start a complex movement involving all three. It takes a while to achieve a new equilibrium (and in practice it is never achieved since changes occur too often, adding stimulus to an already excited system). For a visual illustration, see this little YouTube video (but imagine that the three balls are arranged in a triangle rather than linearly and that every so often one of them gets pulled in a random direction).

This is not new of course. There have been changes in these three areas for as long as IT has existed (starting before it was called IT) and they have always driven changes in how IT is managed. To some extent they also have always influenced one another. The “new” part is that the connections are a lot tighter now, that the springs have a much higher force constant (the “k” in “F=-kx”). So here is my attempt at mapping today’s hot buzzwords on a map organized along these areas.

Before you ask: yes of course I have a very rigorous methodology, based on very precise quantitative data, to establish with certainty the exact x, y and z coordinates of each label. Buzzword topology is a precise science.

You may notice that the buzziest buzzword (at least currently), “Cloud”, does not appear on the map. It’s because it buzzes so much that it would be all over it, engulfing what currently appears as “virtualization”, “datacenter automation”, “Iaas”, “PaaS”, “SaaS” and “opex/capex”. There are two main parts in the “Cloud” buzzword: the “Technical Cloud” and the “Business Cloud”. The “Technical Cloud” is where we take virtualization and standardization (of machines, networks and application infrastructure) and turn that mind-boggling complexity into a manageable system that can be programmed to deliver applications (Cisco recently called it “Unified Computing”; HP, IBM and others have been trying to describe and brand it for a long time). Building on these technical capabilities comes the second part of “Cloud”, the “Business Cloud”. It is the ability to use infrastructure owned by a third party (presumably one able to leverage economies of scale) and all the possibilities this opens in the business realm. That’s what “Cloud” started as, back when it was known as “Utility Computing” and before it was applied to everything under the sun. A recent illustration of the relationship between the “Technical Cloud” and the “Business Cloud” is the introduction of vCloud by VMWare (their vision includes using VMotion technology, a piece of the “Technical Cloud”, not just to move machines between neighboring hypervisors but between organizations, enabling the “Business Cloud”). Anyway, that’s why “Cloud” it’s not on the map. It is actually all over it.

The system displayed on the map is vibrating very intensely right now, and I don’t see this changing anytime soon. Just for fun, here are candidates for future boxes on the map:

  • In the “IT infrastructure and management” category, maybe one day we’ll get to real metadata-driven management integration across the stack (as opposed to the more limited “application modeling” area listed above), whether through RDF or not.
  • In the “application development and delivery” category, maybe Doug Purdy’s vision “to make everyone a programmer (even if they don’t know it)” will be realized, whether through Oslo or not.
  • In the “business and regulation” category, maybe one day corporations will actually start caring about the customer data they are entrusted to (but only if mishandling it finally costs them more than “sorry about that, here is a one year credit monitoring subscription now go away”).

In summary, the evolution of IT management is driven not only by changes in IT technology but also by changes in two other fields (“application development and delivery” and “business and regulation”) with which it is tightly connected. Both of these fields are also in a very dynamic state. And they also influence one another, resulting in a complex three-way dance. You can’t understand the trajectory and moves of one dancer without seeing the others.

That’s what I mean by “IT management in a changing IT world”. Thanks for asking.

[UPDATED 2009/6/25: For more on the “technical cloud” versus “business cloud”, go read Neil Ward-Dutton’s nice explanation. He actually breaks down the “business cloud” in two (separating the economic aspect from the strategic aspect).]

1 Comment

Filed under Application Mgmt, Automation, Big picture, BPM, BSM, Business, Cloud Computing, Everything, IT Systems Mgmt, ITIL, Mgmt integration, Open source, Utility computing, Virtualization

WS-Discovery and WS-DeviceProfile public review

We are in the middle of the public review period for the three specifications coming out of the “Web Services Discovery and Web Services Devices Profile (WS-DD)” OASIS technical committee. The specifications are:

This trio has been swirling around for many years. Mostly in the USB/PnP/UPnP community (devices physically attached to Windows boxes) but they have popped up on the enterprise WS-* radar screen from time to time, for two reasons:

  • WS-DD as another specification (in addition to WS-Management and the later incarnations of WS-MeX) who uses WS-Transfer, and therefore as a (poor, in my view) justification for making it a stand-along specification
  • WS-Discovery as potentially useful for datacenter management (for resource discovery, obviously)

Looking at the members list for the technical committee seems to confirm the suspicion that many members would be more interested in the potential datacenter-related applications (of WS-Discovery, presumably) than the USB gadgets use cases: CA, IBM, Novell, Progress, RedHat, Software AG, WSO2. Well, I guess Novell and RedHat could be in the game from the perspective of plugging devices into desktops running their version of Linux rather than from a datacenter perspective. CA and IBM could be in it from an Asset Management (for devices) perspective too. So who knows (if you had no tolerance for idle speculations you wouldn’t be reading blogs, would you?).

In any case, no Apple, Palm, RIM, Google (Android) or Sun (JavaFX). I guess handheld devices aren’t the devices targeted here. Rather it’s more printers and cameras. In which case you have to wonder why HP isn’t there but that’s another question (I’ll pretend not to know).

No point dwelling on this, especially since the member list is a very imperfect representation of the actual participation. As an alternative we can scan the mailing list archive to see who is active. Judging from the last two months (feb 2009, march 2009) it’s Microsoft, Microsoft and that company in Redmond. Microsoft, as we know, created these specifications from the start and they seem to still be the engine.

In the end it remains to be seen if WS-Discovery has any role to play in datacenter infrastructure discovery. That environment is clearly becoming more dynamic, but the most frequent changes will take place at the level of the guest VM (creation of new instances) and above (applications, services…) and all these creations will presumably be orchestrated by some controller so there is little need for a “hello world” resource-initiated discovery mechanism for them.

Which leaves the physical equipment. Do we need WS-Discovery to get from the point of a server being plugged to the point where it has been provisioned? There are alternatives in this somewhat homogenous environment. And UDP multicast only takes you so far in a datacenter (at least without extensions). So my guess is no. But who knows. The drafts are here for you to review if you think it may be relevant.

Comments Off on WS-Discovery and WS-DeviceProfile public review

Filed under Automation, Everything, IT Systems Mgmt, Manageability, Microsoft, Specs, Standards

The datacenter as a programmable entity

This is an exciting time for those who want to shrink the computer. They are having a field day playing with devices powered by Android, the iPhone’s Cocoa, Palm’s new WebOS, Windows Mobile, JavaFX (maybe one day) and, to a lesser extent, the Blackberry.

But times are good too for those who want to go the other way and program larger things rather than smaller ones. If you are interested in thinking about datacenters as a programmable entity, you are in luck: for these long plane trips when you run out of battery, bring a printout of the proceedings of the research meeting organized last year in Cambridge by Microsoft and HP Labs, titled “The Rise and Rise of the Declarative Datacentre”. When you’re back on-line go check the presentations on the site.

And if you liked Paul Anderson’s “Programming the Data Centre” presentation at the Cambridge meeting, you can also read his “Programming the Virtual Infrastructure” slides from LISA 08. More LISA 08 presentations here.

I got the link to Paul Anderson’s second presentation (and maybe also the first one, some time ago) from Steve Loughran, who also adds a few comments, starting with the debate between the declarative and procedural approaches. This question has plenty of down-the-road implications. There is a lot to like about the declarative approach in terms of composition, manageability and more generally as a framework to manage complexity via encapsulation.

A simple analogy for this debate is to think about driving directions. The declarative approach is for me to give you a map with a circle on it showing where my house is and let you find your way. It’s more work for you but it’s also more resilient. The procedural approach is for me to give you a set of turn-by-turn directions, based on where you are coming from. If you miss one turn or if one road happens to be blocked at the time, then you’re in trouble.

That being said, there are enough powerful and useful PowerShell or Puppet scripts out there to give you a pause before discarding procedural approaches. While the declarative (aka “desired state”, “policy-driven” and sometimes “model-based”) approach looks a lot more elegant, at this point in time the real work usually gets done via scripts, deployment procedures or the likes.

In additin to academia, the competition between these approaches is playing out right now between all the companies and products that want to help you automate and manage your cloud deployments (public and/or private): for example, Rightscale scripts (custom scripts and Righscripts, see here and here) versus the more declarative ECML/EDML documents from Elastra. Or the very declarative approach taken by SmartFrog.

5 Comments

Filed under Automation, Cloud Computing, Conference, Desired State, Everything, Grid, Implementation, Research, Tech, Utility computing

UCI: setting RDF for failure?

I don’t get it. I just read Reuven Cohen’s description of the Unified Cloud Interface project that he recently started. It’s nothing less than using RDF to create “a Semantic Cloud Infrastructure capable of adapting to a variety of methodologies / architectures and completely agnostic to any specific API or platform being described.”

What made me fall off my chair is the methodology/architecture part of this statement. It’s hard enough (but doable) to use RDF to map philosophically similar APIs. It’s a non-starter to use it to bridge architectural and methodological differences. I have spent a fair amount of time looking at Semantic Web technologies in the context of modeling IT systems (see the “semantic tech” category of this blog). While I think they would be a great foundation I don’t see them ever coming anywhere near what Reuven describes.

But to be fair, I am not sure what he really is describing. There are a few overly ambitious proclamations like the one above and this paragraph:

The key drivers of a unified cloud interface (UCI) is “One abstraction to Rule them All” – an API for other API’s. A singular abstraction that can encompass the entire infrastructure stack as well as emerging cloud centric technologies through a unified interface. What a semantic model enables for UCI is a capability to bridge both cloud based API’s such as Amazon Web Services with existing protocols and standards, regardless of the level of adoption of the underlying API’s or technology. The goal is simple, develop your application once, deploy anywhere at anytime for any reason.

But in his piece you’ll also find CIM being cited as an example. There are good things to be said about CIM, but it certainly is not “a dynamic computing model that can, under certain conditions, be ‘trained’ to appropriately ‘learn’ the meaning of related cloud & infrastructure resources” (or, in the case of CIM, computer system resources). Good luck “training” CIM to “learn” anything. It’s CIM that’s going to train you to do it its way, period.

The CIM example (and other standards he lists) paints the picture of defining a standard API for Cloud Computing and forcing all providers to use it. That’s the conventional approach to universality. If that’s what UCI is after then it is technically achievable. And RDF might be a very good technical foundation for it. Whether anyone can pull this off politically and commercially at this stage is a different question of course. In any case, such an effort would have nothing to do with magically wrapping whatever API each provider has defined and whatever architecture/methodology they chose.

And further down we see a sketch of another, much more modest, vision, when Reuven talks about how “these web resources could just as easily be ‘cloud resources’ or API’s” which seems to represent a whole API as an RDF resource. Sure, then you can use RDF/OWL to capture versioning information between them, backward compatibility etc. Probably very useful, but that’s a very different scope.

So which is it? Reuven is a thought leader in Cloud Computing, so I want to think I am missing his point.

So far, I haven’t seen any Cloud taxonomy that is reasonably complete and has received broad support. Shouldn’t we first try to come up with a human-readable taxonomy before we try to turn it into a machine-readable ontology? In my previous post I explicitly stayed away from being pedantic about the difference between the terms, but the confusion between a taxonomy and an ontology seems to be part of what’s going on here.

The sad thing is that they (you know, them) will point to this as a proof that Semantic Web technologies don’t work.

Or maybe I’ve just set myself up for a generous portion of humble pie on April 2nd (when Reuven says an “initial functional draft UCI implementation, taxonomy and ontology” will be unveiled). I’d love to be surprised. And my ego has taken worse hits before.

[UPDATED 2009/2/10: You should read Steve Oberlin’s take on this overall taxonomy/ontology discussion. He knows the topic, carefully reads the posts that he comments on, packs a healthy dose of skepticism and takes the time to explain what taxonomies and ontologies are, which was overdue. Plus, I just love sites that don’t feel the need to use decorative pictures. His doesn’t have a single image file which means that even if he didn’t have superb credentials (which he does) he’d get my respect by default. A blog to watch.]

1 Comment

Filed under Automation, Cloud Computing, Everything, Modeling, Portability, RDF, Semantic tech, Standards, Tech, Utility computing

Moving towards utility/cloud computing standards?

This Forbes article (via John) channels 3Tera’s Bert Armijo’s call for standardization of utility computing. He calls it “Open Cloud” and it would “allow a company’s IT systems to be shared between different cloud computing services and moved freely between them“. Bert talks a bit more about it on his blog and, while he doesn’t reference the Forbes interview (too modest?), he points to Cloudscape as the vision.

A few early thoughts on all this:

  • No offense to Forbes but I wouldn’t read too much into the article. Being Forbes, they get quotes from a list of well-known people/companies (Google and Amazon spokespeople, Forrester analyst, Nick Carr). But these quotes all address the generic idea of utility computing standards, not the specifics of Bert’s project.
  • Saying that “several small cloud-computing firms including Elastra and Rightscale are already on board with 3Tera’s standards group” is ambiguous. Are they on-board with specific goals and a candidate specification? Or are they on board with the general idea that it might be time to talk about some kind of standard in the general area of utility computing?
  • IEEE and W3C are listed as possible hosts for the effort, but they don’t seem like a very good match for this area. I would have thought of DMTF, OASIS or even OGF first. On the face of it, DMTF might be the best place but I fear that companies like 3Tera, Rightscale and Elastra would be eaten alive by the board member companies there. It would be almost impossible for them to drive their vision to completion, unlike what they can do in an OASIS working group.
  • A new consortium might be an option, but a risky and expensive one. I have sometimes wondered (after seeing sad episodes of well-meaning and capable start-ups being ripped apart by entrenched large vendors in standards groups) why VCs don’t play a more active role in standards. Standards sound like the kind of thing VCs should be helping their companies with. VC firms are pretty used to working together, jointly investing in companies. Creating a new standard consortium might be too hard for 3Tera, but if the VCs behind 3Tera, Elastra and Rightscale got together and looked at the utility computing companies in their portfolios, it might make sense to join forces on some well-scoped standardization effort that may not otherwise be given a chance in existing groups.
  • I hope Bert will look into the history of DCML, a similar effort (it was about data center automation, which utility computing is not that far from once you peel away the glossy pictures) spearheaded by a few best-of-bread companies but ignored by the big boys. It didn’t really take off. If it had, utility computing standards might now be built as an update/extension of that specification. Of course DCML started as a new consortium and ended as an OASIS “member section” (a glorified working group), so this puts a grain of salt on my “create a new consortium and/or OASIS group” suggestion above.
  • The effort can’t afford to be disconnected from other standards in the virtualization and IT management domains. How does the effort relate to OVF? To WS-Management? To existing modeling frameworks? That’s the main draw towards DMTF as a host.
  • What’s the open source side of this effort? As John mentions during the latest Redmonk/Willis IT management podcast (starting around minute 24), there needs to a open source side to this. Actually, John thinks all you need is the open source side. Coté brings up Eucalyptus. BTW, if you want an existing combination of standards and open source, have a look at CDDLM (standard) and SmartFrog (implementation, now with EC2/S3 deployment)
  • There seems to be some solid technical raw material to start from. 3Tera’s ADL, combined with Elastra’s ECML/EDML, presumably captures a fair amount of field expertise already. But when you think of them as a starting point to standardization, the mindset needs to switch from “what does my product need to work” to “what will the market adopt that also helps my product to work”.
  • One big question (at least from my perspective) is that of the line between infrastructure and applications. Call me biased, but I think this effort should focus on the infrastructure layer. And provide hooks to allow application-level automation to drive it.
  • The other question is with regards to the management aspect of the resulting system and the role management plays in whatever standard specification comes out of Bert’s effort.

Bottom line: I applaud Bert’s efforts but I couldn’t sleep well tonight if I didn’t also warn him that “there be dragons”.

And for those who haven’t seen it yet, here is a very good document on the topic (but it is focused on big vendors, not on how smaller companies can play the standards game).

[UPDATED 2008/6/30: A couple hours after posting this, I see that Coté has just published a blog post that elaborates on his view of cloud standards. As an addition to the podcast I mentioned earlier.]

[UPDATED 2008/7/2: If you read this in your feed viewer (rather than directly on vambenepe.com) and you don’t see the comments, you should go have a look. There are many clarifications and some additional insight from the best authorities on the topic. Thanks a lot to all the commenters.]

20 Comments

Filed under Amazon, Automation, Business, DMTF, Everything, Google, Google App Engine, Grid, HP, IBM, IT Systems Mgmt, Mgmt integration, Modeling, OVF, Portability, Specs, Standards, Utility computing, Virtualization

More clues on the Oslo/SCA/SML trail: it’s “D”

I just found out that I completly missed some interesting information about Oslo-related efforts at Microsoft. Back in February, Mary-Jo Foley reported on a new modeling language (code-name “D”, apparently) that is part of this initiative. And more recently she reported that David Chappell gave a presentation about Oslo (and more generally Microsoft’s SOA plans) at TechEd. He reportedly said that we should expect a new “schema language” (which Mary-Jo thinks is “D”). What I want to know is what its relationship is with SML/SDM and SCA.

Mary-Jo might not know about SCA and SML but I know that David does. He wrote this white paper about SCA and an article arguing that “Microsoft Should Not Support SCA” (based on an a questionable assessment that SCA is only about portability). He and I also had a little back-and-forth about SCA, SML and Microsoft in the comments section of his post. Unfortunately, David hasn’t blogged about Microsoft’s SOA strategy for a while for us non-TechEd people.

In addition to Mary-Jo’s report, the only information I was about to quickly dig out about David’s presentation is this blog post on Microsoft’s Israel site. Looks like David gave the same presentation at TechEd Israel 2008. Anyone who understands Hebrew cares to translate the blog? Fortunately there is a two-minutes video (also available here) in which we can hear David talk (in English). During the second of the two minutes you’ll hear and see something that could come straight out of a SCA presentation…

For some reason, David’s TechEd Israel presentation doesn’t seem to be listed here and TechEd online tells me that “Featured videos are unavailable at this time”. That’s both for IT Professionals and Developers. But of course they forced me to install Silverlight before telling me that.

[UPDATED 2008/8/11: Here is a 14 minutes video interview of David Chappell providing an update on Oslo.]

3 Comments

Filed under Application Mgmt, Automation, Conference, Desired State, Everything, IT Systems Mgmt, Mgmt integration, Microsoft, Modeling, Oslo, SCA, SML, Tech

I have seen the future of CMDBf

I got a sneak peak at CMDBf v2 today.

I am calling it v2 based on the assumption that the one being currently standardized in DMTF will end up being called 1.0 (because it’s the first one out of DMTF) or 1.1 (to prevent confusion with the submitted version).

At the Semantic Technology Conference, David Booth from HP presented his work (along with his partner, Steve Battle from HP Labs) to provide a SPARQL front-end to HP’s Universal CMDB (the engine under what was the Mercury MAM product). Here are the slides.

The mapping from SPARQL to TQL (the native query interface for UCMDB) was made pretty easy by the fact that TQL is a graph-oriented query language. How much harder would it be to similarly transform a CMDBf (v1) query interface into a SPARQL query interface (and vice-versa)? Not much. The only added difficulty would come from the CMDBf XPath constraints. TQL has a property value mechanism that is very similar to CMDBf’s “propertyValue” constraint and maps well to SPARQL functions. The introduction of XPath as a constraint language in CMDBf makes things harder. It could be handled by adding XPath support to the SPARQL engine using function extensibility. Or by turning the entire XML into RDF and emulating XPath in SPARQL. But in either case, you’ll have impedence mismatch at some point because concepts such as element order that exist in XPath have no native equivalent in RDF.

The use of XPath in selectors on the other hand is not a problem. HP’s prototype uses Gloze (available as a Jena package) to turn the XML returned by UCMDB into RDF. An XSLT transform could turn that same XML into a CMDBf-valid XML response instead and that XSLT could easily handle the XPath selectors from the query request. This is another reason why constraints and selectors should remain separate in CMDBf (fortunately the specification is back to doing this properly).

Here is why I call this prototype CMDBf v2: The CMDBf effort (v1 or 1.1), in its current form of re-inventing a graph query, can succeed. Let’s assume the working group strikes a reasonable balance between completeness and complexity, and vendors choose to compete on innovation and execution rather than lock-in (insert cynical comment here). CMDBf may then end up being supported by the main CMDB vendors. It wouldn’t provide federation capabilities, but having a common CMDB query interface supported by the Big Four would help with management integration. And yet, while the value would be real, it would only provide a little help to solve a larger problem:

  • As a technology limited to IT systems management, it would be unlikely to see widely available tools (e.g. user consoles and language-specific libraries).
  • It wouldn’t get the kind of robustness and interoperability that comes from wide adoption. While pretty similar, there might be some minor differences in the various implementations. Once your implementation has been tweaked to work with the implementations from the Big Four, you’ll call it done. Just like SNMP, another technology that is specific to IT systems management (see it happen here).
  • Even if it works perfectly at the query level, it will just hasten the time when developers run into the real problem, model interoperability. CMDBf doesn’t help at all with this. In fact, it makes it harder by hard-coding some dependencies on an XML back-end (the XPath constraints).

In the long run, IT management has to become more automated and integrated. That’s a given. The way it happens may or may not go through CMDB-like configuration stores. But if it does, we’ll have to eventually move beyond CMDBf (v1) towards something that addresses the three requirements above. And federation. I don’t know if it will be called CMDBf v2, and/or if it will come from the DMTF (by then, the CMDBf brand might be an asset or a liability depending on developer experience with the specification). But I strongly suspect (“probability 0.8” as a Gartner analyst might put it) that it will use semantic technologies. Because the real, hard, underlying problem is a problem of semantic integration. In that sense, David and Steve’s prototype is a sneak peek at what will come after CMDBf v1/1.1.

Pretty much since the beginning of CMDBf I have been pushing for it to ideally embrace SPARQL (with no success) or to at least stay close to it conceptually in order to make the eventual mapping/evolution smooth (with a bit more success). This includes pushing for a topological query language, trying to keep XML idiosyncrasies at bay and keeping constraints and selectors cleanly separated. Rather than working within the CMDBf group, David took the alternative approach of simply doing it. Hopefully this will help convince people of the value of re-using semantic web technology for IT systems management. Yes semantic technologies have been designed for a much more general use case. But the use cases that CMDB systems address are a subset of the use cases addressed by semantic technologies. It’s hard for domain experts to see their domain as just a subset of a larger problem, but this is the case here. Isn’t HTTP serving the IT management community better than a systems management-specific alternative would?

By the way, there is no inferencing taking place in the HP prototype. We are just talking about re-using an existing, well though-through graph query language. Sure OWL inferencing and some rules could be seamless layered on top of this. But this is in no way required to do (better) what CMDBf v1 tries to do.

And then there is the “federation” question. Who do you trust more to deliver this? A bunch of IT system management architects in DMTF or the web and query experts at W3C, HP Labs etc who designed and implemented SPARQL over many years? BTW, it sounds like SPQARL federation was discussed at WWW 2008, based on these meeting notes (search for “federation”).

2 Comments

Filed under Automation, CMDB, CMDB Federation, CMDBf, Conference, DMTF, Everything, Graph query, HP, IT Systems Mgmt, Query, RDF, Semantic tech, SPARQL, Standards, W3C, XPath

Oracle acquires e-TEST from Empirix

Somewhat lost in the news about Oracle’s recent earning report is the announcement that Oracle just purchased e-TEST suite from Empirix. We are not purchasing the company, just some of their products (they also sell VoIP testing tools, for example, which will stay with Empirix). Most importantly, we are also getting the people who made the product, not just a code dump. They’ll join the Enterprise Manager team (my group). Welcome aboard!

The e-TEST suite is made of three integrated components (I am describing the current e-TEST suite, not necessarily the resulting Oracle offering):

  • e-Manager Enterprise is a process management application for application testing.
  • e-Tester lets you easily create sophisticated tests for functional and regression testing.
  • e-Load is a load and performance testing framework.

(these product names make me feel like I am back in HP’s e-speak team)

This is a mature product suite that will increases the scope and depth of EM’s application testing capabilities. It extends the existing EM recorder/beacon infrastructure. It offers a sophisticated test transaction model (remember VBA?). It offers load testing capabilities. Not to mention the process management capabilities around test cases.

My toddler daughter loves her book about the solar system. She has learned to say “hot!” whenever we look at Mercury. Tonight I’ll have to teach her to say “feeling the heat” instead.

More info here. That is probably also where specific product plans will be released.

If you’ve ever been to an Oracle Open World presentation, you won’t be surprised to see that this post ends with a disclaimer that:

It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality and should not be relied upon in making a purchasing decision. The development, release and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

3 Comments

Filed under Automation, Everything, IT Systems Mgmt, Oracle, Testing

Elastra and data center configuration formats

I heard tonight for the first time of a company called Elastra. It sounds like they are trying to address a variation of the data center automation use cases covered by Opsware (now HP) and Bladelogic (now BMC). Elastra seems to be in an awareness-building phase and as far as I can tell it’s working (since I heard about them). They got to me through John’s blog. They are also using the more conventional PR channel (and in that context they follow all the cheesy conventions: you get to “unlock the value”, with “the leading provider” who gives you “a new product that revolutionizes…” etc, all before the end of the first paragraph). And while I am making fun of the PR-talk I can’t help zeroing on this quote from the CEO, who “wanted to pick up where utility computing left off – to go beyond the VM and toward virtualizing complex applications that span many machines and networks”. Does he feels the need to narrowly redefine “utility computing” (who knew that all that time “utility computing” was just referring to a single hypervisor?) as a way to justify the need for the new “cloud” buzzword (you’ll notice that I haven’t quite given up yet, this post is in the “utility computing” category and I still do not have a “cloud” category)?

The implied difference with Opsware and Bladelogic seems to be that while these incumbent (hey Bladelogic, how does it feel to be an “incumbent”?) automate data center management tasks in old boring data centers, Elastra does it in clouds. More specifically “public and private compute clouds”. I think I know roughly what a public cloud is supposed to be (e.g. EC2), but a private cloud? How is that different from a data center? Is a private cloud a data center that has the Elastra management software deployed? In that case, how is automating private clouds with Elastra different from automating data centers with Elastra? Basically it sounds like they don’t want to be seen as competing with Opsware and Bladelogic so they try to redefine the category. Which makes it easier to claim (see above) to be “the leading provider of software for designing, deploying, and managing applications in public and private compute clouds” without having the discovery or change management capabilities of Opsware (or anywhere near the same number of customers).

John seems impressed by their “public cloud” capabilities (I don’t think he has actually tested them yet though) and I trust him on that. Knowing the complexities of internal data centers, I am a lot more doubtful of the “private cloud” claims (at least if I interpret them correctly).

Anyway, I am getting carried away with some easy nitpicking on the PR-talk, but in truth it uses a pretty standard level of obfuscation/hype for this type of press release. Sad, I know.

The interesting thing (and the reason I started this blog entry in the first place) is that they seem to have created structures to capture system design (ECML) and deployment (EDML) rules. From John’s blog:

“At the core of Elastra’s architecture are the system design specifications called ECML and EDML. ECML is an XML markup language to specify a cloud design (i.e., multiple system design of firewalls, load balancers, app servers, db servers, etc…). The EDML markup provides the provisioning instructions.”

John generously adds “Elastra seems to be the first to have designed their autonomics into a standards language” which seems to assume that anything in XML is a standard. Leaving the “standard” debate aside, an XML format does tend improve interoperability and that’s a good thing.

So where are the specifications for these ECML and EDML formats? I would be very interested in reading them, but they don’t appear to be available anywhere. Maybe that would be a good first step towards making them industry standards.

I would be especially interested in comparing this to what the SML/CML effort is going after. Here are some propositions that need to be validated or disproved. Comparing SML/CML to ECML/EDML could help shade light on them:

  • SML/CML encompasses important and useful datacenter automation use cases.
  • Some level of standardization of cross-domain system design/deployment/management is needed.
  • SML/CML will be too late.
  • SML/CML will try to do too many things at once.

You can perform the same exercise with OVF. Why isn’t OVF based on SML? If you look at the benefits that could be theoretically be derived by that approach (hardware, VM, network and application configuration all in the same metamodel) it tells you about all that is attractive about SML. At the same time, if you look at the fact that OVF is happening while CML doesn’t seem to go anywhere, it tells you that the “from the very top all the way down to the very bottom” approach that SML is going after is very difficult to pull off. Especially with so many cooks in the kitchen.

And BTW, what is the relationship between ECML/EDML and OVF? I’d like to find out where the Elastra specifications land in all this. In the worst case, they are just an XML rendering of the internals of the Elastra application, mixing all domains of the IT stack. The OOXML of data center automation if you want. In the best case, it is a supple connective tissue that links stiffer domain-specific formats.

[UPDATED 2008/3/26: Elastra’s “introduction to elastic programing” white paper has a few words about the relationship between OVF and EDML: “EDML builds on the foundation laid by Open Virtual Machine Format (OVF) and extends that language’s capabilities to specify ways in which applications are deployed onto a Virtual Machine system”. Encouraging, if still vague.]

[UPDATED 2008/3/31: A week ago I hadn’t heard of Elastra and now I learn that I had been tracking the blog of its lead-architect-to-be all along! Maybe Stu will one day explain what a “private cloud” is. His description of his new company seems to confirm my impression that they are really focused (for now at least) on “public clouds” and not the Opsware-like “private clouds” automation capabilities. Maybe the “private clouds” are just in the business plan (and marketing literature) to be able to show a huge potential markets to VCs so they pony up the funds. Or maybe they really plan to go after this too. Being able to seamlessly integrate both (for mixed deployments) is the holly grail, I am just dubious that focusing on this rather than doing one or the other “right” is the best starting point for a new company. My guess is that despite the “private cloud” talk, they are really focusing on “public clouds” for now. That’s what I would do anyway.]

[UPDATED on 2008/6/25: Stephen O’Grady has an interesting post about the role of standards in Cloud computing. But he only looks at it from the perspective of possible standardization of the interfaces used by today’s Cloud providers. A full analysis also needs to include the role, in Cloud Computing, of standards (app runtime standards, IT management standards, system modeling standards, etc…) that started before Cloud computing was big. Not everything in Cloud computing is new. And even less is new about how it will be used. Especially if, as I expect, utility computing and on-premise computing are going to become more and more intertwined, resulting in the need to manage them as a whole. If my app is deployed at Amazon, why doesn’t it (and its hosts) show up in my CMDB and in my monitoring panel? As Coté recently wrote, “as the use of cloud computing for an extension of data centers evolves, you could see a stronger linking between Hyperic’s main product, HQ and something like Cloud Status”.]

9 Comments

Filed under Automation, CML, Everything, IT Systems Mgmt, OVF, SML, Tech, Utility computing, Virtualization