Category Archives: IT Systems Mgmt

Steve Ballmer gets Cloud

Steve Ballmer wants devops

Devops? What’s devops? See these articles:


Filed under Cloud Computing, DevOps, Everything, IT Systems Mgmt, Microsoft, People

Enterprise application integration patterns for IT management: a blast from the past or from the future?

In a recent blog post, Don Ferguson (CTO at CA) describes CA Catalyst, a major architectural overhaul which “applies enterprise application integration patterns to the problem of integrating IT management systems”. Reading this was fascinating to me. Not because the content was some kind of revelation, but exactly for the opposite reason. Because it is so familiar.

For the better part of the last decade, I tried to build just this at HP. In the process, I worked with (and sometimes against) Don’s colleagues at IBM, who were on the same mission. Both companies wanted a flexible and reliable integration platform for all aspects of IT management. We had decided to use Web services and SOA to achieve it. The Web services management protocols that I worked on (WSMF, WSDM, WS-Management and the “reconciliation stack”) were meant for this. We were after management integration more than manageability. Then came CMDBf, another piece of the puzzle. From what I could tell, the focus on SOA and Web services had made Don (who was then Mr. WebSphere) the spiritual father of this effort at IBM, even though he wasn’t at the time focused on IT management.

As far as I know, neither IBM nor HP got there. I covered some of the reasons in this post-mortem. The standards bickering. The focus on protocols rather than models. The confusion between the CMDB as a tool for process/service management versus a tool for software integration. Within HP, the turmoil from the many software acquisitions didn’t help, and there were other reasons. I am not sure at this point whether either company is still aiming for this vision or if they are taking a different approach.

But apparently CA is still on this path, and got somewhere. At least according to Don’s post. I have no insight into what was built beyond what’s in the post. I am not endorsing CA Catalyst, just agreeing with the design goals listed by Don. If indeed they have built it, and the integration framework stands the test of time, that’s impressive. And exciting. It apparently even uses some of the same pieces we were planning to use, namely WS-Management and CMDBf (I am reluctantly associated with the first and proudly with the second).

While most readers might not share my historical connection with this work, this is still relevant and important to anyone who cares about IT management in the enterprise. If you’re planning to be at CA World, go listen to Don. Web services may have a bad name, but the technical problems of IT management integration remain. There are only a few routes to IT management automation (I count seven, the one taken by CA is #2). You can throw away SOAP if you want, you still need to deal with protocol compatibility, model alignment and instance reconciliation. You need to centralize or orchestrate the management operations performed. You need to be able to integrate with complementary products or at the very least to effectively incorporate your acquisitions. It’s hard stuff.

Bonus point to Don for not forcing a “Cloud” angle for extra sparkle. This is core IT management.


Filed under Automation, CA, CMDB, CMDB Federation, CMDBf, Everything, IT Systems Mgmt, Mgmt integration, Modeling, People, Protocols, SOAP, Specs, Standards, Tech, Web services, WS-Management

Smoothing a discrete world

For the short term (until we sell one) there are three cars in my household. A manual transmission, an automatic and a CVT (continuously variable transmission). This makes me uniquely qualified to write about Cloud Computing.

That’s because Cloud Computing is yet another area in which the manual/automatic transmission analogy can be put to good use. We can even stretch it to a 4-layer analogy (now that’s elasticity):

Manual transmission

That’s traditional IT. Scaling up or down is done manually, by a skilled operator. It’s usually not rocket science but it takes practice to do it well. At least if you want it to be reliable, smooth and mostly unnoticed by the passengers.

Manumatic transmission (a.k.a. Tiptronic)

The driver still decides when to shift up or down, but only gives the command. The actual process of shifting is automated. This is how many Cloud-hosted applications work. The scale-up/down action is automated but still contingent on being triggered by an administrator. Which is what most IaaS-deployed apps should probably aspire to at this point in time, despite the glossy brochures about everything being entirely automated.

Automatic transmission

That’s when the scale up/down process is not just automated in its execution but also triggered automatically, based on some metrics (e.g. load, response time) and some policies. The scenario described in the aforementioned glossy brochures.

Continuously variable transmission

That’s when the notion of discrete gears goes away. You don’t think in terms of what gear you’re in but how much torque you want. On the IT side, you’re in PaaS territory. You don’t measure the number of servers, but rather a continuously variable application capacity metric. At least in theory (PaaS implementations often betray the underlying work, e.g. via a spike in application response time when the app is not-so-transparently deployed to a new node).
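To make the difference between the manumatic and automatic layers concrete, here is a minimal Python sketch. The policy thresholds, the function names and the one-server-at-a-time step are all assumptions made for illustration; they do not describe any particular Cloud product.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical policy expressed as average load per server (0.0 to 1.0)."""
    scale_up_above: float = 0.75
    scale_down_below: float = 0.25
    min_servers: int = 1
    max_servers: int = 10


def manumatic(servers: int, command: str, policy: ScalingPolicy) -> int:
    """Manumatic: an administrator gives the 'up' or 'down' command; execution is automated."""
    step = {"up": 1, "down": -1}[command]
    return max(policy.min_servers, min(policy.max_servers, servers + step))


def automatic(servers: int, load: float, policy: ScalingPolicy) -> int:
    """Automatic: the same action, but triggered by a metric and a policy."""
    if load > policy.scale_up_above and servers < policy.max_servers:
        return servers + 1
    if load < policy.scale_down_below and servers > policy.min_servers:
        return servers - 1
    return servers
```

In both functions the shifting itself is automated; the only difference is who pulls the trigger, an operator or a metric.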


OK, that’s the analogy. There are many more of the same kind. Would you like to hear how hybrid Cloud deployments (private+public) are like hybrid cars (gas+electric)? How virtualization is like carpooling (including how you can also be inconvenienced by the BO of a co-hosted VM)? Do you want to know why painting flames on the side of your servers doesn’t make them go faster?

Driving and IT management have a lot in common, including bringing out the foul-mouth in us when things go wrong.

So, anyone want to buy a manual VW Golf Turbo? Low mileage. Cloud-checked.


Filed under Application Mgmt, Automation, Cloud Computing, Everything, IT Systems Mgmt, Utility computing

Two versions of a protocol is one too many

There is always a temptation, when facing a hard design decision in the process of creating an interface or a protocol, to produce two (or more) versions. It’s sometimes a good idea, as a way to explore where each one takes you so you can make a more informed choice. But we know how this invariably ends up. Documents get published that arguably should not. It’s even harder in a standard working group, where someone was asked (or at least encouraged) by the group to create each of the alternative specifications. Canning one is at best socially awkward (despite the appearances, not everyone in standards is a psychopath or a sadist) and often politically impossible.

And yet, it has to be done. Compare the alternatives, then pick one and commit. Don’t confuse being accommodating with being weak.

The typical example these days is of course SOAP versus REST: the temptation is to support both rather than make a choice. This applies to standards and to proprietary interfaces. When a standard does this, it hurts rather than promotes interoperability. Vendors have a bit more of an excuse when they offer a choice (“the customer is always right”) but in reality it forces customers to play Russian roulette whether they want it or not. Because one of the alternatives will eventually be left behind (either discarded or maintained but not improved). If you balance the small immediate customer benefit of using the interface style they are most used to with the risk of redoing the integration down the road, the value proposition of offering several options crumbles.

[Pedantic disclaimer: I use the term “REST” in this post the way it is often (incorrectly) used, to mean pretty much anything that uses HTTP without a SOAP wrapper. The technical issues are a topic for other posts.]


CMDBf v1 is a DMTF standard. It is a SOAP-based protocol. For v2, it has been suggested that there should be a REST version. I don’t know what the CMDBf group (in which I participate) will end up doing but I’ve made my position clear: I could go either way (remain with SOAP or dump it) but I do not want to have two versions of the protocol (one SOAP, one REST). If we think we’re better off with a REST version, then let’s make v2 REST-only. Supporting both mechanisms in v2 would be stupid. They would address the same use cases and only serve to provide political ass-coverage. There is no functional need for both. The argument that we need to keep supporting SOAP for the benefit of those who implemented v1 doesn’t fly. As an implementer, nobody is saying that you need to turn off your v1 services the second you launch the v2 version.

DMTF Cloud

Between the specifications submitted directly to DMTF, the specifications developed by DMTF “partner” organizations and the existing DMTF protocols, the DMTF Cloud effort is presented with a mix of SOAP, RESTful and XML-RPC-over-HTTP options. In the process of deciding what to create or adopt I am sure that the temptation will be high to take the easy route of supporting several versions to placate everyone. But such a “consensus” would be achieved on the back of the implementers so I very much hope it won’t be the case.

When it is appropriate

There are cases where supporting alternative options is worth the cost. But it typically happens when they serve very different use cases. Think of SAX versus DOM, which have clearly differentiated sweetspots. In the Cloud world, Amazon S3 gives us interesting examples of both justified and extraneous alternatives. The extraneous one is the choice between REST and SOAP for the S3 API. I often praise AWS for its innovation and pragmatism, but this is an example of something that only looks pragmatic. On the other hand, the AWS import/export mechanism is a useful alternative. It allows you to physically ship a device with a few terabytes of data to Amazon. This is technically an alternative to the S3 programmatic interface, but one with obviously differentiated use cases. I recommend you reserve the use of “alternative APIs” for such scenarios.

If it didn’t work for Tiger Woods, it won’t work for your Cloud API either. Learn to commit.

[CLARIFICATION: based on some of the early Twitter feedback on this entry, I want to clarify that it’s alternative versions that I am against, not successive versions (i.e. an evolution of the interface over time). How to manage successive versions properly is a whole other debate.]


Filed under Amazon, API, Cloud Computing, CMDB, CMDBf, DMTF, Everything, IT Systems Mgmt, Protocols, REST, SOAP, Specs, Standards, Utility computing, Web services

HP has submitted a specification to the DMTF Cloud incubator

When I lamented, in a previous post, that I couldn’t tell you about recent submissions to the DMTF Cloud incubator, one of those I had in mind was a submission from HP. I can now write this, because the author of the specification, Nigel Cook, has recently blogged about it. Unfortunately he isn’t publishing the specification itself, just an announcement that it was submitted. Hopefully he is currently going through the long approval process to make the submitted document public (been there, done that, I know it takes time).

In the blog, Nigel makes a good argument for the need to go beyond a hypervisor-centric view of Cloud computing. Even at the IaaS layer there are cases of automated-but-not-virtualized deployment that have all the characteristics of Cloud computing and need to be supported by Cloud management APIs. Not to mention OS-level isolation like Solaris Containers.

Nigel also offers a spirited defense of SOAP-based protocols. I don’t necessarily agree with all his points (“one could easily map the web service definition I described to REST if that was important” suggests an “it’s just SOAP without the wrapper” view of REST), but I am glad he is launching this debate. We need to discuss this rather than assume that REST is the obvious answer. Remember, a few years ago SOAP was just as obvious an answer to any protocol question. It may well be that REST indeed comes out ahead in this discussion, but the process will force us to be explicit about what benefits of REST we are trying to achieve and will allow us to be practical in the way we approach it.


Filed under Automation, Cloud Computing, DMTF, Everything, HP, IT Systems Mgmt, Mgmt integration, Specs, Standards, Utility computing, Virtualization

Waiting for events (in Cloud APIs)

Events/alerts/notifications have been a central concept in IT management at least since the first SNMP trap was emitted, and probably even long before that. And yet they are curiously absent from all the Cloud management APIs/protocols. If you think that’s because “THE CLOUD CHANGES EVERYTHING” then you may have to think again. Over the last few days, two of the most experienced practitioners of Cloud computing pointed out that this omission is a real pain in the neck. RightScale’s Thorsten von Eicken was first to request “an event based interface instead of a request-reply based interface”, pointing out that “we run a good number of machines that do nothing but chew up 100% cpu polling EC2 to detect changes”. George Reese seconded and started to sketch a solution. And while these blog posts gave the issue increased visibility recently, it has been a recurring topic on the AWS Forum and other similar discussion boards for quite some time. For example, in this thread going back to 2006, an Amazon employee wrote that “this is a feature we’ve discussed recently and we’re looking at options” (incidentally, I see a post by Thorsten in that old thread). We’re still waiting.

Let’s look at what it would take to define such a feature.

I have some experience with events for IT management, having been involved in the WS-Notification family of specifications and having co-chaired the OASIS technical committee that standardized them. This post is not about foisting WS-Notification on Cloud APIs, but just about surfacing some of the questions that come up when you try to standardize such a mechanism. While the main use cases for WS-Notification came from IT (and Grid) management, it was supposed to be a generic mechanism. A Cloud-centric eventing protocol can be made simpler by focusing on fewer use cases (Cloud scenarios only). In addition, WS-Notification was marred by the complexity-is-a-sign-of-greatness spirit of the time. On this too, a Cloud eventing protocol could improve things by keeping simplicity in mind.

Types of event

When you pull the state of a resource to see if anything changed, you don’t have to tell the provider what kind of change you are interested in. If, on the other hand, you want the provider to notify you, then they need to know what you care about. You may not want to be notified on every single change in the resource state. How do you describe the changes you care about? Is there an agreed-upon set of states for the resource and you are only notified on state transitions? Can you indicate the minimum severity level for an event to be emitted? Who determines the severity of an event? Or do you get to specify what fields in the resource state you want to watch? What about numeric values for which you may not want to be notified of every change but only when a threshold is crossed? Do you get to specify a query and get notified whenever the query result changes? In WS-Notification some of this is handled by WS-Topics, which I still like conceptually (I co-edited it) but which is too complex for the task at hand.
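As one illustration of the threshold question, a filter could emit an event only when a numeric value crosses a limit rather than on every change. This is a sketch under assumed names (ThresholdWatch, observe and the event labels are hypothetical), not something taken from WS-Topics or any Cloud API.

```python
from typing import Optional


class ThresholdWatch:
    """Report a change only when the watched value crosses the threshold."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.above: Optional[bool] = None  # unknown until the first sample

    def observe(self, value: float) -> Optional[str]:
        now_above = value > self.threshold
        previous, self.above = self.above, now_above
        if previous is None or previous == now_above:
            return None  # first sample, or no crossing: stay silent
        return "crossed_above" if now_above else "crossed_below"
```

Fed the samples 50, 70, 85, 90, 60 with a threshold of 80, this watch stays silent except at 85 and at 60.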

Event formats

What format are the events serialized in? How is the event metadata captured (e.g. time stamp of observation, which may not be the same as the time at which the notification message was sent)? If the event payload is a representation of the new state of the resource, does it indicate which fields changed (and what the old values were)? How do you keep event payloads consistent with the resource representation in the request/response interactions? If many events occur near the same time, can you group them in one notification message for better scalability?
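One possible shape for such events, sketched in Python. Every name here (Event, Notification, diff and their fields) is an assumption for illustration; the point is only to keep the observation time separate from the send time, to carry old and new values for changed fields, and to allow several events per message.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List, Tuple


@dataclass
class Event:
    resource_id: str
    observed_at: datetime                # when the change was observed
    changes: Dict[str, Tuple[Any, Any]]  # field name -> (old value, new value)


@dataclass
class Notification:
    sent_at: datetime                    # may be later than any observed_at
    events: List[Event] = field(default_factory=list)  # batched events


def diff(old: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return only the fields whose values differ between two state snapshots."""
    keys = sorted(old.keys() | new.keys())
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}
```

Building the payload from a diff of two snapshots of the regular resource representation is one way to keep events consistent with what the request/response interface returns.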

Subscription creation

Presumably you need a subscription mechanism. Is the subscription set in stone when the resource is created? Or can you come later and subscribe? If subscription is an operation on the resource itself, how do you subscribe for events on something that doesn’t exist yet (e.g. “create a VM and notify me once it’s started”)? Do you get to set subscriptions on a per-resource basis? Or is this a global setting for all the resources that you own? Can you have two different subscriptions on the same resource (e.g. a “critical events only” subscription that exists throughout the life of the resource, plus a “lots of events please” subscription that you keep for a few hours while troubleshooting)?

Subscription management

Do you get to come back and update/pause/delete a subscription? Do you get to change what filter the subscription carries? Or is it set in stone until the subscription expires? Can you change the delivery endpoint? What if events fail to be delivered? Does the provider cancel your subscription? After how many failures? Does it just pause it for a few hours? Keep trying?

Subscription expiration

Who sets the expiration period? The subscriber? Can the provider set a max duration? Do you get a warning message before the subscription expires? Can you renew a subscription or do you have to create a new one? Do you get a message telling you that it has expired? Where are these subscription-lifecycle messages sent? To the same endpoint as the regular messages? What if your subscription is being killed because your delivery endpoint is down? Clearly it makes no sense to send the warning message to that same endpoint. Do you provide a separate “subscription management” endpoint (different from the event delivery endpoint) when you subscribe? Alternatively, does an email message get sent to the registered user who set the subscription?
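The questions on subscription creation, management and expiration can be made concrete with a small sketch. It picks one answer to each question purely for illustration: the subscriber requests a duration, the provider caps it, renewal is allowed, and lifecycle messages go to a separate management endpoint. The names and the 24-hour cap are assumptions, not taken from any existing API.

```python
from dataclasses import dataclass

MAX_DURATION_S = 24 * 3600  # assumed provider-imposed maximum duration


@dataclass
class Subscription:
    resource_id: str
    delivery_endpoint: str    # where events are sent
    management_endpoint: str  # where lifecycle messages (warnings, expiry) are sent
    expires_at: float         # seconds since some epoch
    paused: bool = False

    def is_active(self, now: float) -> bool:
        return not self.paused and now < self.expires_at

    def renew(self, now: float, requested_s: float) -> float:
        """Extend the subscription; the provider caps the granted duration."""
        self.expires_at = now + min(requested_s, MAX_DURATION_S)
        return self.expires_at


def subscribe(resource_id: str, delivery_endpoint: str, management_endpoint: str,
              now: float, requested_s: float) -> Subscription:
    """Create a subscription whose lifetime is the requested one, capped by the provider."""
    granted = min(requested_s, MAX_DURATION_S)
    return Subscription(resource_id, delivery_endpoint, management_endpoint, now + granted)
```

Even this toy version has to take a position on most of the questions above, which is a fair measure of how much a real specification would need to pin down.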

Delivery reliability

How reliable do you want the notifications to be? Should the emitter retry until they’ve received a confirmation? How long do they keep messages that can’t be delivered? Some may have a very short shelf life while others are still useful weeks later. If you don’t have a reliable mechanism but you really “need to know about a lost server within a minute of it disappearing” (the example George gives) then in reality you may still have to poll just to make sure that an event wasn’t lost. If you haven’t received an event in a while, how can you test if the subscription is still working? Should subscriptions send a heartbeat message once in a while?
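A sketch of one possible emitter-side policy: retry a bounded number of times, but drop the event once it is older than its shelf life. The deliver function, its parameters and its return labels are hypothetical; a real emitter would also wait between attempts (for example with exponential backoff), which is left out here to keep the sketch short.

```python
from typing import Callable


def deliver(send: Callable[[], bool], observed_at: float, shelf_life_s: float,
            now: Callable[[], float], max_attempts: int = 5) -> str:
    """Try to deliver one event; give up when it goes stale or attempts run out.

    `send` returns True once the receiver has confirmed the message.
    """
    for _ in range(max_attempts):
        if now() - observed_at > shelf_life_s:
            return "expired"    # too old to be useful, stop retrying
        if send():
            return "delivered"  # confirmation received
    return "failed"             # attempts exhausted without confirmation
```

Passing the clock in as a function is only a convenience for testing; the design question remains how the subscriber learns about the "expired" and "failed" outcomes.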

Delivery mechanism

How do you deliver notifications? Do you keep HTTP connections open through tricks similar to how self-updating web pages work (e.g. COMET, long polling and soon WebSockets)? Or do you just provide a listener endpoint to which the notifier tries to connect (which, in the case of public cloud deployments, means you need to have a publicly-addressable listener, but hopefully not on the same Cloud infrastructure)? Do you use XMPP? AMQP? Email? Can I have you hold my events and let me come pull them?
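The last option, having the provider hold events until the subscriber comes to pull them, could look roughly like this. EventMailbox and its cursor-based pull are assumptions for illustration, and the in-memory list stands in for whatever storage a provider would really use.

```python
from typing import Any, List, Tuple


class EventMailbox:
    """Provider-side buffer: events are held until the subscriber pulls them."""

    def __init__(self) -> None:
        self._events: List[Any] = []

    def publish(self, event: Any) -> None:
        self._events.append(event)

    def pull(self, cursor: int = 0, limit: int = 100) -> Tuple[List[Any], int]:
        """Return up to `limit` events after `cursor`, plus the cursor for the next pull."""
        batch = self._events[cursor:cursor + limit]
        return batch, cursor + len(batch)
```

This removes the need for a publicly-addressable listener, at the price of reintroducing polling, though polling a single mailbox is cheaper than polling every resource.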


Do you need to verify the origin of the events you receive? Or do you assume they may be forged and always initiate a connection to the provider to double-check? And on the other side, what are the security requirements for event delivery? If a user loses some of their privileges, do you have to go and cancel the still-active subscriptions that they created?


Is there a maximum event rate? Do you get charged for the events the Cloud provider sends you? How do you make sure that someone doesn’t create a subscription pointing to the wrong endpoint (either erroneously or maliciously, e.g. DoS)? Do you send a test message at registration asking the delivery endpoint to acknowledge that they indeed want to receive these notifications?
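The registration-time test message could take the form of a simple challenge: the provider sends a random token to the delivery endpoint and activates the subscription only if the endpoint echoes it back. In this sketch, post_challenge is a hypothetical callable standing in for the actual network call; only the standard-library secrets module is real.

```python
import secrets
from typing import Callable, Optional


def confirm_endpoint(post_challenge: Callable[[str], Optional[str]]) -> bool:
    """Return True only if the endpoint echoes back the random token it was sent.

    `post_challenge` (hypothetical) delivers the token to the endpoint and
    returns the endpoint's reply, or None if there was no reply.
    """
    token = secrets.token_urlsafe(16)
    reply = post_challenge(token)
    return reply is not None and secrets.compare_digest(reply, token)
```

An endpoint that never asked for the notifications will not know to echo the token, so a subscription pointed at it (by mistake or by malice) never becomes active.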


My goal is not to argue that we cannot have a simple yet good enough notification system or to scare anyone from attempting to define it. It’s just to show that it’s not as simple as it may seem at first blush. But there probably is a sweetspot and people like Thorsten and George are very well qualified to find it.

[UPDATED 2010/4/7: Amazon releases AWS Simple Notification Service. Not just as an eventing feature for the Cloud API, but as a generic notification service. Which can, of course, also carry Cloud management events. Though at this point you’re on your own to publish them from your instances; it doesn’t look like the AWS infrastructure can do it for you. Which means, for example, that you’re not going to be able to publish an event for a sudden crash.]


Filed under API, Application Mgmt, Automation, Cloud Computing, Desired State, Everything, IT Systems Mgmt, Manageability, Mgmt integration, Protocols, Specs, Standards, Tech, Utility computing

Oracle acquires Amberpoint

Oracle just announced that it has purchased Amberpoint. If you have ever been interested in Web services management, then you surely know about Amberpoint. The company has long led the pack of best-of-breed vendors for Web services and SOA Management. My history with them goes back to the old days of the OASIS WSDM technical committee, where their engineers brought to the group a unique level of experience and practical-mindedness.

The official page has more details. In short, Amberpoint is going to reinforce Oracle Enterprise Manager, especially in these areas:

  • Business Transaction Management
  • SOA Management
  • Application Performance Management
  • SOA Governance (BTW, Oracle Enterprise Repository 11g was released just over a week ago)

I am looking forward to working with my new colleagues from Amberpoint.


Filed under Application Mgmt, Everything, Governance, IT Systems Mgmt, Manageability, Mgmt integration, Middleware, Web services

Generalizing the Cloud vs. SOA Governance debate

There have been some interesting discussions recently about the relationship between Cloud management and SOA management/governance (run-time and design-time). My only regret is that they are a bit too focused on determining winners and losers rather than defining what victory looks like (a bit like arguing whether the smartphone is the triumph of the phone over the computer or of the computer over the phone instead of discussing what makes a good smartphone).

To define victory, we need to answer this seemingly simple question: in what ways is the relationship between a VM and its hypervisor different from the relationship between two communicating applications?

More generally, there are three broad categories of relationships between the “active” elements of an IT system (by “active” I am excluding configuration, organization, management and security artifacts, like patch, department, ticket and user, respectively, to concentrate instead on the elements that are on the invocation path at runtime). We need to understand if/how/why these categories differ in how we manage them:

  • Deployment relationships: a machine (or VM) in a physical host (or hypervisor), a JEE application in an application server, a business process in a process engine, etc…
  • Infrastructure dependency relationships (other than containment): from an application to the DB that persists its data, from an application tier to the web server that fronts it, from a batch job to the scheduler that launches it, etc…
  • Application dependency relationships: from an application to a web service it invokes, from a mash-up to an Atom feed it pulls, from a portal to a remote portlet, etc…

In the old days, the lines between these categories seemed pretty clear and we rarely even thought of them in the same terms. They were created and managed in different ways, by different people, at different times. Some were established as part of a process, others in a more ad-hoc way. Some took place by walking around with a CD, others via a console, others via a centralized repository. Some of these relationships were inventoried in spreadsheets, others on white boards, some in CMDBs, others just in code and in someone’s head. Some involved senior IT staff, others were up to developers and others were left to whoever was manning the controls when stuff broke.

It was a bit like the relationships you have with the taxi that takes you to the airport, the TSA agent who scans you and the pilot who flies you to your destination. You know they are all involved in your travel, but they are very distinct in how you experience and approach them.

It all changes with the Cloud (used as a shorthand for virtualization, management automation, on-demand provisioning, 3rd-party hosting, metered usage, etc…). The advent of the hypervisor is the most obvious source of change: relationships that were mostly static become dynamic; also, where you used to manage just the parts (the host and the OS, often even mixed as one), you now manage not just the parts but the relationship between them (the deployment of a VM in a hypervisor). But it’s not just hypervisors. It’s frameworks, APIs, models, protocols, tools. Put them all together and you realize that:

  • the IT resources involved in all three categories of relationships can all be thought of as services being consumed (an “X86+ethernet emulation” service exposed by the hypervisor, a “JEE-compatible platform” service exposed by the application server, an “RDB service” exposed by the database, a Web service exposed via SOAP or XML/JSON over HTTP, etc…),
  • they can also be set up as services, by simply sending a request to the API of the service provider,
  • not only can they be set up as services, they are also invoked as such, via well-documented (and often standard) interfaces,
  • they can also all be managed in a similar service-centric way, via performance metrics, SLAs, policies, etc,
  • your orchestration code may have to deal with all three categories, (e.g. an application slowdown might be addressed either by modifying its application dependencies, reconfiguring its infrastructure or initiating a new deployment),
  • the relationships in all these categories now have the potential to cross organization boundaries and involve external providers, possibly with usage-based billing,
  • as a result of all this, your IT automation system really needs a simple, consistent, standard way to handle all these relationships. Automation works best when you’ve simplified and standardized the environment to which it is applied.
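As a rough illustration of what handling all three categories in one consistent way might mean for automation code, here is a minimal Python sketch. The Relationship record, the Category names and the sample data are all hypothetical; the point is only that a single query can span deployment, infrastructure and application dependencies once they share a representation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class Category(Enum):
    DEPLOYMENT = "deployment"
    INFRASTRUCTURE = "infrastructure dependency"
    APPLICATION = "application dependency"


@dataclass(frozen=True)
class Relationship:
    consumer: str
    provider: str
    service: str        # what the provider exposes to the consumer
    category: Category


def providers_of(consumer: str, relationships: Iterable[Relationship]) -> List[str]:
    """List every provider a consumer depends on, whatever the category."""
    return [r.provider for r in relationships if r.consumer == consumer]


# Made-up sample data: one relationship per category for the same application.
RELATIONSHIPS = [
    Relationship("order-app", "hypervisor-7", "x86 emulation", Category.DEPLOYMENT),
    Relationship("order-app", "orders-db", "RDB service", Category.INFRASTRUCTURE),
    Relationship("order-app", "payment-ws", "Web service", Category.APPLICATION),
]
```

With such a model, orchestration code reacting to a slowdown of order-app can enumerate all three kinds of dependencies in one pass before deciding which one to act on.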

If you’re a SOA person, your mental model for this is SOA++ and you pull out your SOA management and governance (config and runtime) tools. If you are in the WS-* obedience of SOA, you go back to WS-Management, try to see what it would take to slap a WSDL on a hypervisor and start dreaming of OVF over MTOM/XOP. If you’re into middleware modeling you might start to have visions of SCA models that extend all the way down to the hardware, or at least of getting SCA and OSGi to ally and conquer the world. If you’re a CMDB person, you may tell yourself that now is the time for the CMDB to do what you’ve been pretending it was doing all along and actually extend all the way into the application. Then you may have that “single source of truth” on which the automation code can reliably work. Or if you see the world through the “Cloud API” goggles, then this “consistent and standard” way to manage relationships at all three layers looks like what your Cloud API of choice will eventually do, as it grows from IaaS to PaaS and SaaS.

Your background may shape your reference model for this unified service-centric approach to IT management, but the bottom line is that we’d all like a nice, clear conceptual model to bridge and unify Cloud (provisioning and containment), application configuration and SOA relationships. A model in which we have services/containers with well-defined operational contracts (and on-demand provisioning interfaces). Consumers/components with well-defined requirements. APIs to connect the two, with predictable results (both in functional and non-functional terms). Policies and SLAs to fine-tune the quality of service. A management framework that monitors these policies and SLAs. A common security infrastructure that gets out of the way. A metering/billing framework that spans all these interactions. All this while keeping out of sight all the resource-specific work needed behind the scenes, so that the automation code can look as Zen as a Japanese garden.

It doesn’t mean that there won’t be separations, roles, processes. We may still want to partition the IT management tasks, but we should first have a chance to rejigger what’s in each category. It might, for example, make sense to handle provider relationships in a consistent way whether they are “deployment relationships” (e.g. EC2 or your private IaaS Cloud) or “application dependency relationships” (e.g. SOA, internal or external). On the other hand, some of the relationships currently lumped in the “infrastructure dependency relationships” category because they are “config files stuff” may find different homes depending on whether they remain low-level and resource-specific or they are absorbed in a higher-level platform contract. Any fracture in the management of this overall IT infrastructure should be voluntary, based on legal, financial or human requirements. And not based on protocol, model, security and tool disconnect, on legacy approaches, on myopic metering, that we later rationalize as “the way we’d want things to be anyway because that’s what we are used to”.

In the application configuration management universe, there is a planetary collision scheduled between the hypervisor-centric view of the world (where virtual disk formats wrap themselves in OVF, then something like OVA to address, at least at launch time, application and infrastructure dependency relationships) and the application-model view of the world (SOA, SCA, Microsoft Oslo at least as it was initially defined, various application frameworks…). Microsoft Azure will have an answer, VMware/SpringSource will have one, Oracle will too (though I can’t talk about it), Amazon might (especially as it keeps adding to its PaaS portfolio) or it might let its ecosystem sort it out, IBM probably has Rational, WebSphere and Tivoli distinguished engineers locked into a room, discussing and over-engineering it at this very minute, etc.

There is a lot at stake, and it would be nice if this was driven (industry-wide or at least within each of the contenders) by a clear understanding of what we are aiming for rather than a race to cobble together partial solutions based on existing control points and products (e.g. the hypervisor-centric party).

[UPDATED 2010/1/25: For an illustration of my statement that “if you’re a SOA person, your mental model for this is SOA++”, see Joe McKendrick’s “SOA’s Seven Greatest Mysteries Unveiled” (bullet #6: “When you get right down to it, cloud is the acquisition or provisioning of reusable services that cross enterprise walls. (…)  They are service oriented architecture, and they rely on SOA-based principles to function.”)]


Filed under Application Mgmt, Automation, Cloud Computing, CMDB, Everything, Governance, IT Systems Mgmt, ITIL, Mgmt integration, Middleware, Modeling, OSGi, SCA, Utility computing, Virtualization, WS-Management

Backward-compatible vs. forward-compatible: a tale of two clouds

There is the Cloud that provides value by requiring as few changes as possible. And there is the Cloud that provides value by raising the abstraction and operation level. The backward-compatible Cloud versus the forward-compatible Cloud.

The main selling point of the backward-compatible Cloud is that you can take your existing applications, tools, configurations, customizations, processes etc and transition them more or less as they are. It’s what allowed hypervisors to spread so quickly in the enterprise.

The main selling point of the forward-compatible Cloud is that you are more productive and focused. Fewer configuration items to worry about, fewer stack components to install/monitor/update, you can focus on your application and your business goals. You develop and manage at the level of application concepts, not systems. Bottom line, you write and deploy applications more quickly, cheaply and reliably.

To a large extent this maps to the distinction between IaaS and PaaS, but it’s not that simple. For example, a PaaS that endeavors to be a complete JEE environment is mainly aiming for the backward-compatible value proposition. On the other hand, EC2 spot instances, while part of the IaaS layer, are of the forward-compatible kind: not meant to run your current applications unchanged, but rather to give you ways to create applications that better align with your business goals.

Part of the confusion is that it’s sometimes unclear whether a given environment is aiming for forward-compatibility (and voluntary simplification) or whether its goal is backward-compatibility but it hasn’t yet achieved it. Take EC2 for example. At first it didn’t look much like a traditional datacenter, beyond the ability to create hosts. Then we got fixed IP, EBS, boot from EBS, etc and it got more and more realistic to run applications unchanged. But not quite, as this recent complaint by Hoff illustrates. He wants a lot more control over the network setup so he can deploy existing n-tier applications that have specific network topology/config requirements without re-engineering them. It’s a perfectly reasonable request, in the context of the backward-compatible Cloud value proposition. But one that will never be granted by a Cloud that aims for forward-compatibility.

Similarly, the forward-compatible Cloud doesn’t always successfully abstract away lower-level concerns. It’s one thing to say you don’t have to worry about backup and security but it means that you now have to make sure that your Cloud provider handles them at an acceptable level. And even on technical grounds, abstractions still leak. Take Google App Engine, for example. In theory you only deal with requests and don’t even think about the servers that process them (you have no idea how many servers are used). That’s nice, but once in a while your Java application gets a DeadlineExceededException. That’s because the GAE platform had to start using a new JVM to serve this request (for example, your traffic is growing or the JVM previously used went down) and it took too long for the application to load in the new JVM, resulting in this loading request being killed. So you, as the developer, have to take special steps to mitigate a problem that originates at a lower level of the stack than you’re supposed to be concerned about.

All in all, the distinction between backward-compatible and forward-compatible Clouds is not a classification (most Cloud environments are a mix). Rather, it’s another mental axis on which to project your Cloud plans. It’s another way to think about the benefits that you expect from your use of the Cloud. Both providers and consumers should understand what they are aiming for on that axis. Hopefully this can help prevent shouting matches of the “it’s a bug, no it’s a feature” variety.

[UPDATED 2010/3/4: Apparently, Steve Ballmer thinks along the same lines. Though the way he sees it, Azure is forward-compatible, while Amazon is backward-compatible: “I think Amazon has done a nice job of helping you take the server-based programming model – the programming model of yesterday, that is not scale-agnostic – and then bringing it into the cloud. On the other hand, what we’re trying to do with Azure is let you write a different kind of application.”]

[UPDATED 2010/3/5: I now have the quasi-proof that indeed Steve Ballmer stole the idea from my blog. Look at this entry in my HTTP log. This visitor came the evening before Steve’s “Cloud” talk at the University of Washington. I guess I am not the only one to procrastinate until the 11th hour when I have a deadline. Every piece of information in this log entry points at Steve Ballmer. How can it be anyone other than him? - - [03/Mar/2010:23:51:52 -0800] "GET /archives/1198 HTTP/1.0" 200 4820 "" "Mozilla/1.22 (compatible; MSIE 2.0; Microsoft Bob)"

(in case you are not fluent in the syntax and semantics of HTTP log files, this is a joke)]


Filed under Amazon, Application Mgmt, Cloud Computing, Everything, Google App Engine, IT Systems Mgmt, Utility computing, Virtualization

Taxonomy of Cloud Computing Benefits

One of the heavily discussed Cloud topics in early 2009 was a  Cloud Computing taxonomy. Now that this theme has died down (with limited results), and to start 2010 in a similar form, here is a proposal for a taxonomy of the benefits of Cloud Computing.

Just like the original Cloud Computing taxonomy only had three layers (IaaS/PaaS/SaaS), so does this taxonomy of Cloud benefits. The point of this post is to promote the third layer. I describe layers 1 and 2  mainly to better call out what’s specific about layer 3.

Layer 1 (infrastructure: “let someone else do it”)

This is the bare-bottom, inherent benefit of Cloud Computing: you don’t have to deal with the hardware. In practice, it means:

  • no need to worry about power/cooling,
  • on-demand provisioning/deprovisioning (machines appear/disappear in a way physical machines do not),
  • not responsible for physical security (though responsible for ensuring that the provider has an acceptable security level),
  • economies of scale (for equipment purchase and operations),
  • potential environmental benefits,
  • etc…

Layer 2 (management: “let a program do it”)

More specifically, more automated IT management. This does not require Cloud Computing (you can have a highly automated IT management environment on premise), but the move to Cloud Computing is the trigger that is making it really happen. While this capability is not an inherent benefit of Cloud Computing, the Cloud makes it:

  • Needed: You don’t get to put color tags on machines, you don’t get to bring a DVD to install a new application, you don’t get to open a machine to insert more memory, you don’t get to go retrieve a backup tape, label it and put it in a safe, etc. Of course losing these “privileges” doesn’t sound bad considering that they are mostly chores, but it means that you have to design alternative (and mostly programmatic) ways to perform the functions that these tasks addressed.
  • Easier: Cloud environments are highly API-driven. Many IT tools from the previous generation were console-centric (people would go out and buy “a network/event/system management console”) with APIs/protocols as a secondary thought. In Cloud environments, tools are a lot more API-centric with the console as an adjunct (does anyone have stats about the ratio of EC2 instances provisioned via the AWS console versus the APIs?). This is also why even though a lot of people wanted standard management protocols (of the WSDM/WS-Management generation), there wasn’t as much of a realization of their importance in the old environment (and not as much pressure to create them and eagerness to adopt them). The stakes and visibility are a lot higher in the Cloud environments and that’s why this second wave of protocols will have to succeed where the previous one came up short.
  • More beneficial: Once you have automated IT management in a traditional data center, what you get is fewer employees needed and somewhat better utilization. But you are still gated by the time/process to purchase/install new machines and the cost of unused machines (at least with automation you don’t have to pay their power/cooling). You don’t get the “just what I need” level of infrastructure usage that the same automation work allows in a Cloud setting.

Layer 3 (applications: “do it right”)

In short, use the move to the Cloud as an opportunity to fix some of the key issues of today’s applications. Think of the Cloud switch as a second Y2K, 10 years later: like in 2000, not only are there things that the transition requires you to fix, there are also many things that aren’t exactly required to fix but still make sense to fix as part of the larger modernization effort. Of course the Cloud move is missing that ever-so-valuable project management motivator of a firm deadline, but hopefully competitive pressure can play a similar role.

What are these issues? Here is a partial list:

  • Security: at least authentication and authorization. We have SSO/Federation systems, both enterprise-type and Web-centric and they often suck in practice. Whether it’s because of the protocols, the implementations, the tools or the mindset. Plus, there are too many of them. As applications gained mouths and ears and started to communicate with one another, the problem became obvious. If, in the Cloud, you also want them to grow legs and be able to move around (wholly or in parts) then it really really has to get fixed. Not to mention the “all or nothing” delegation model that I am surprised hasn’t yet created a major disaster (let’s see what 2010 has in store). I suggested a band-aid fix earlier, but this needs a real solution (the Cloud Security Alliance provides some guidance in this document, see “domain 12” for IAM).
  • Get remote application interfaces right. It’s been discussed, manifesto’ed, buried and lampooned many times before (this was my humble take on it). Whether it’s because of WS-* or, more likely, java2wsdl we have been delayed in this but it simply has to happen. Call it SOAP, zenSOAP, REST, practical REST or whatever you want. Just make sure that all important functions and data are accessible via clear, documented, consistent, easy-to-use, on-the-wire interfaces. Once we have these interfaces, and only then, we can worry about reliably composing/orchestrating applications that cross organizational boundaries.
  • Related to the previous point, clean up the incestuous relationship between an application and its data. Actually, it’s not “its” data. It’s the data it works on.
  • Deliver application-centric IT management. Quit losing and (badly) re-creating information: for example, an application deployment followed by a black-box discovery (“what did I just do”?). Or after-the-fact re-establishing correlations between events on different servers (“what was this about”?). Application management too often looks like a day in the life of a senile person.
  • Fault-tolerance and disaster recovery. It is too often lacking (or untested, which is the same) for applications that are just below the perceived threshold of requiring it to be done right. That threshold needs to be lowered and the move to the Cloud can be used to make this possible.

[You should also read Tim Bray’s perspective (and Stefan Tilkov’s comment) on the process/methodology/tools for enterprise applications, an orthogonal (but related) area of improvement. More fundamental.]

As I mentioned above, these are mostly not Cloud specific (though it is possible to create a Cloud connection for each). They are things that we have known about and tried to fix for a while. But the pace has been pretty slow and there is an opportunity for the Cloud transition to do more than just hand out the keys of the datacenter.

What kinds of benefits are you aiming for in your Cloud plans?

[UPDATED 2010/01/11: An interesting take on a similar topic by Brenda Michelson: 5 Enduring Aspects of Cloud Computing]

[UPDATED 2010/01/14: Along the same lines (but looking at it in the other direction), an interesting graph from Alistair Croll of Bitcurrent.]


Filed under Application Mgmt, Automation, Cloud Computing, Ecology, Everything, IT Systems Mgmt, Mgmt integration, Security, Utility computing

Book on Middleware Management with Oracle Enterprise Manager

My colleagues (and Enterprise Manager experts) Debu Panda and Arvind Maheshwari have a very handy book out, titled Middleware Management with Oracle Enterprise Manager Grid Control 10gR5 (that’s the latest release of Enterprise Manager). The publisher sent me a copy of the book. It illustrates well that Enterprise Manager does a lot more than just database management; it also provides coverage of most of the Oracle middleware stack (and some non-Oracle middleware components).

I am happy to provide an outline of the book, because it shows both how complete the book is and how wide the coverage of Enterprise Manager is for the Oracle middleware stack.

  • Chapter 1 provides an overview of the base Enterprise Manager product and its various packs.
  • Chapter 2 describes the installation process.
  • Chapter 3 describes the key concepts of the different subsystems of Enterprise Manager.
  • Chapter 4 covers management of WebLogic server, the centerpiece of Oracle Fusion Middleware.
  • Chapter 5 covers management of the core of the pre-BEA Oracle Application Server (OC4J, OHS and WebCache).
  • Chapter 6 is about managing Oracle Forms and Reports (used by EBS and many client-server applications).
  • Chapter 7 is about managing the BPEL server, a major component of the SOA Suite.
  • Chapter 8 (available as a free download) covers management of another part of the SOA Suite, namely Oracle Service Bus (previously AquaLogic Service Bus).
  • Chapter 9 addresses management of Oracle Identity Manager.
  • Chapter 10 covers management of Coherence (a distributed in-memory cache) clusters.
  • Chapter 11 describes the capability to manage non-Oracle middleware for these youthful errors you committed before seeing the (red) light.
  • Chapter 12 introduces some of the cool new application management features: Composite Application Modeler and Monitor (CAMM) to manage a distributed application across all its components, and Application Diagnostic for Java (AD4J) to drill down into a specific JVM.
  • Chapter 13 invites you to roll up your sleeves and write your own plug-in so that Enterprise Manager can manage new types of targets.
  • Chapter 14 ends the book by sharing some best practices from customer experience.

All in all, this is the most user-friendly and accessible way to learn and become familiar with the scope of what Enterprise Manager has to offer for middleware management. The gory details (e.g. the complete list of target types, metrics and their definitions) are not in the book but available from the on-line documentation.

To end on a ludic note, you can use this table of contents to test your knowledge of some Oracle acquisitions. Can you associate the following acquired companies with the corresponding chapter? Auptyma, Oblix, BEA, ClearApp, Collaxa, Tangosol.

The ROT-13-encoded answer is: ORN: 4&8 – Pbyynkn: 7 – Boyvk: 9 – Gnatbfby:10 – Nhcglzn: 12 – PyrneNcc: 12
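If you would rather not decode the answer by hand, here is a minimal sketch using the `rot_13` codec from Python's standard library. The helper name `rot13` is just an illustrative choice, and the sample string is not the quiz answer, so nothing is spoiled.

```python
import codecs


def rot13(text):
    """ROT-13 shifts each ASCII letter by 13 places; other characters pass through.

    Applying it twice gives back the original text.
    """
    return codecs.decode(text, "rot_13")


if __name__ == "__main__":
    sample = "Uryyb"  # illustrative input, not taken from the quiz
    print(rot13(sample))
```

Pasting the encoded line above into `rot13(...)` should reveal the chapter numbers.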


Filed under Application Mgmt, Book review, BPEL, Everything, IT Systems Mgmt, Middleware, Oracle

REST in practice for IT and Cloud management (part 3: wrap-up)

[Preface: a few months ago I shared some thoughts about how REST was (or could be) applied to IT and Cloud management. Part 1 was a comparison of the RESTful aspects of four well-known IaaS Cloud APIs and part 2 was an analysis of how REST applies to configuration management. Both of these entries received well-informed reader comments BTW, so if you read the posts but didn’t come back for the comments you really owe it to yourself to do so now. At the time, I jotted down thoughts for subsequent entries in this series, but I never got around to posting them. Since the topic seems to be getting a lot of attention these days (especially in DMTF) I decided to go back to these notes and see if I could extract a few practical recommendations in the form of a wrap-up.]

The findings listed below should be relevant whether your protocol is trying to be truly RESTful, just HTTP-centric or even zen-SOAPy. Many of the issues that arise when creating a protocol that maps well to IT management use cases should transcend these variations and that’s what I try to cover.

Finding #1: Relationships (links) are first-class entities (a.k.a. “hypermedia”)

The clear conclusion of both part 1 and part 2 was that the most relevant part of REST for IT and Cloud management is the use of hypermedia. IT management enjoys a head start on this compared to other domains, because its models are already rich in explicit relationships (e.g. CIM associations), as opposed to other business domains in which relationships are more implicit (to the end user at least). But REST teaches us that just having relationships in your model is not enough. They need to be exposed in a way that maps directly to the protocol, so that following a relationship is an infrastructure-level task, not an application-level task: passing an ID as a parameter for some domain-specific function is not it.

This doesn’t violate the rule to not mix the protocol and the model because the alignment should take place in the metamodel. XML is famously weak in that respect, but that’s where Atom steps in, handling relationships in a generic way. Similarly, support for references is, in addition to its accolade to Schematron, one of the main benefits of SML (extra kudos for apparently dropping the “EPR” reference scheme between submission and standardization, in favor of just the “URI” scheme). Not to mention RDFa and friends. Or HTTP Link headers (explained) for link-challenged types.
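To make the "infrastructure-level task" point concrete, here is a minimal sketch, assuming a made-up representation in which every resource carries a generic `links` list of `rel`/`href` pairs (loosely in the spirit of Atom links). The resource URLs, the `get` stand-in for an HTTP GET and the `follow` helper are all hypothetical, not taken from any of the APIs discussed in this series.

```python
# Hypothetical in-memory "server": URL -> representation.
# Relationships live in one generic "links" list, not in domain-specific fields.
RESOURCES = {
    "/vms/42": {
        "name": "web-frontend",
        "links": [
            {"rel": "host", "href": "/hosts/7"},
            {"rel": "volume", "href": "/volumes/3"},
        ],
    },
    "/hosts/7": {
        "name": "rack12-blade4",
        "links": [{"rel": "datacenter", "href": "/dcs/1"}],
    },
    "/volumes/3": {"name": "data-disk", "links": []},
    "/dcs/1": {"name": "dublin", "links": []},
}


def get(url):
    """Stand-in for an HTTP GET on a resource URL."""
    return RESOURCES[url]


def follow(representation, rel):
    """Generic traversal: it knows about links, not about VMs, hosts or volumes."""
    return [get(link["href"]) for link in representation["links"] if link["rel"] == rel]


vm = get("/vms/42")
host = follow(vm, "host")[0]
datacenter = follow(host, "datacenter")[0]
print(datacenter["name"])
```

The same `follow` function works for any relationship on any resource type, which is the difference with a domain-specific call that takes an ID as a parameter.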

Finding #2: Put IDs on steroids

There is little to argue about the value of clearly identifying things of interest and we didn’t wait for the Web to realize this. But it is also one of the most vexing and complex problems in many areas of computing (including IT management). Some of the long-standing questions include:

  • Use an opaque ID (some random-looking string of characters) or an ID grounded in “unique” properties of the resource (if you can find any)?
  • At what point does a thing stop being the same (typical example: if I replace each hardware component of a server one after the other, at which point is it not the same server anymore? Does it make sense for the IT guys to slap an “asset id” sticker on the plastic box around it?)
  • How do you deal with reconciling two resources (with their own IDs) when you realize they represent the same thing?

REST guidelines don’t help with these questions. There often is an assumption, which is true for many web apps, that the application “owns” the resource. My “inbox” only exists as a resource within the mail server application (e.g. Gmail or an Exchange server). Whatever URI Gmail assigns for it is the URI for my inbox, period. Things are not as simple when the resources exist outside of any specific application. Take a server, for example: the board management controller (or the hypervisor in the case of a VM), the OS management layer and the management agent installed on the machine all have claims to report on the machine (and therefore a need to identify it).

To some extent, Cloud computing simplifies many of these issues by providing controllers that “own” infrastructure resources and can authoritatively identify them. But it really is only pushing the problem to the next level of the stack.

Making the ID a URI doesn’t magically answer these questions. Though it helps in that it lets you leverage reconciliation mechanisms developed around URIs (such as <atom:link rel="alternate"> or owl:sameAs). What REST does is add another constraint to this ID mechanism: Make the IDs dereferenceable URLs rather than just URIs.

I buy into this. A simple GET on a resource URI doesn’t solve everything but it has so many advantages that it should be attempted in all cases. And make this HTTP GET please (see finding #6).

In this adoption of GET, we just have to deal with small details such as:

  • What URL do I use for resources that have more than one agent/controller?
  • How close to the resource do I point this URL? If it’s too close to it then it may change as the resource evolves (e.g. network changes) or be affected by the resource performance (e.g. a crashed machine or application that does not respond to its management API). If it’s removed from the resource, then I introduce a scope (e.g. one controller) within which the resource has to remain, which may cause scalability concerns (how many VMs can/should one controller handle, what if I want to migrate a VM across the ocean…).

These are somewhat corner cases (and the more automation and virtualization you get, the fewer possible controllers you have per resource). While they need to be addressed, they don’t come close to negating the value of dereferenceable IDs. In addition, there are plenty of mechanisms to help with the issues above, from links in the representations (obviously) to RDDL-style lightweight directory to a last resort “give Saint Peter a call” mechanism (the original WSRF proposal had a sub-specification called WS-RenewableReferences that would let you ask for a new version of an expired EPR but it was never published — WS-Naming in then-GGF also touched on that with its reference resolvers — showing once again that the base challenges don’t change as fast as technology flavors).

Implicit in this is the fact that URIs are vastly superior to EPRs. The latter were only just a band-aid on a broken system (which may have started back when WSDL 1.1 decided to define “ports” as message aggregators that can have only one URL) and it’s been more debilitating to SOAP than any other interoperability issue. Web services containers internalized this assumption to the point of providing a stunted dispatch mechanism that made it very hard to assign distinct URLs to resources.
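The reconciliation question raised earlier in this finding (several agents each assigning their own URI to the same machine) can be sketched as follows. This is only an illustration, assuming that "same thing" assertions arrive as plain pairs of URIs, the way owl:sameAs or alternate links would express them. The `reconcile` function and all the example URIs are hypothetical, and the grouping uses a standard union-find, which is my choice of technique rather than anything the mechanisms above prescribe.

```python
def reconcile(same_as_pairs):
    """Group URIs into equivalence classes from pairwise 'same resource' assertions."""
    parent = {}

    def find(uri):
        # Register unseen URIs as their own root, then walk up to the root.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving keeps trees shallow
            uri = parent[uri]
        return uri

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for uri in parent:
        groups.setdefault(find(uri), set()).add(uri)
    return list(groups.values())


# Hypothetical assertions from three different management sources.
pairs = [
    ("https://bmc.example/servers/a1", "https://os-agent.example/hosts/web01"),
    ("https://os-agent.example/hosts/web01", "https://cmdb.example/ci/991"),
    ("https://cmdb.example/ci/17", "https://hypervisor.example/vms/5"),
]
for group in reconcile(pairs):
    print(sorted(group))
```

Note that this only answers "which IDs point at the same thing". It says nothing about which URL should be the one you dereference, which is the harder question discussed above.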

Finding #3: If REST told you to jump off a bridge, would you do it?

Adherence to REST is not required to get the benefits I describe in this series. There is a lot to be inspired by in REST, but it shouldn’t be a religion. Sure, if you squint hard enough (and poke it here and there) you can call your interface RESTful, but why bother with the contortions if some parts are not so, as long as they don’t detract from the value of REST in the other parts? As in all conversions, the most fervent adepts of RPC will likely be tempted to become its most violent denunciators once they’re born again. This is a tired scenario that we don’t need to repeat. Don’t think of it as a conversion but as a new perspective.

Look at the “RESTful with many parameters?” comment thread on Stefan Tilkov’s excellent InfoQ introduction to REST. It starts with some shared distaste for parameter-laden URIs and a search for a more RESTful approach. This gets suggested:

You could do a post on some URI like ./query/product_dep which would create a query resource. Now you “add” products to the query either by sending a product uri list with the initial post or by calling post on ./query/product_dep/{id}. With every post to the query resource the get on the query resource would change.

Yeah, you could. But how about an RPC-like query operation rather than having yet another resource lifecycle to manage just for the sake of being REST-compliant? And BTW, how do you think any sane consumer of your API is going to handle this? You guessed it, by packaging the POST/POST/GET/DELETE in one convenient client-side library function called “query”. As much as I criticize RPC-centric toolkits (see finding #5 below), it would be justified in this case.
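The client-side wrapper predicted above could look something like this minimal sketch. Everything here is hypothetical: `FakeQueryService` is an in-memory stand-in for a server that exposes queries as resources along the lines of the quoted suggestion, the dependency data is invented, and `query` is the convenience function a consumer would likely write.

```python
import itertools


class FakeQueryService:
    """Hypothetical server exposing queries as resources (not a real API)."""

    # Invented data: product URI -> URIs of the products it depends on.
    DEPENDENCIES = {
        "/products/a": {"/products/lib1"},
        "/products/b": {"/products/lib1", "/products/lib2"},
    }

    def __init__(self):
        self._queries = {}
        self._ids = itertools.count(1)

    def post(self, path, product_uri=None):
        if path == "/query/product_dep":
            # POST on the collection creates a new, empty query resource.
            query_uri = f"/query/product_dep/{next(self._ids)}"
            self._queries[query_uri] = []
            return query_uri
        # POST on an existing query resource adds a product to it.
        self._queries[path].append(product_uri)
        return path

    def get(self, query_uri):
        result = set()
        for product in self._queries[query_uri]:
            result |= self.DEPENDENCIES.get(product, set())
        return sorted(result)

    def delete(self, query_uri):
        del self._queries[query_uri]


def query(service, product_uris):
    """Hide the POST/POST/GET/DELETE dance behind one RPC-looking call."""
    query_uri = service.post("/query/product_dep")
    try:
        for uri in product_uris:
            service.post(query_uri, uri)
        return service.get(query_uri)
    finally:
        service.delete(query_uri)  # clean up the query resource in all cases


service = FakeQueryService()
print(query(service, ["/products/a", "/products/b"]))
```

From the caller's point of view this is an RPC-like query operation; the resource lifecycle is still there, it has just moved into a library.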

Either you understand why/how REST principles benefit you or you don’t. If you do, then use this understanding to interpret the REST principles to best fit your needs. If you don’t, then no amount of CONTENT-TYPE-pixie-dust-spreading, GET-PUT-POST-DELETE-golden-rule-following and HATEOAS-magical-incantation-reciting will help you. That’s the whole point, for me at least, of this three-part investigation. Stefan says essentially the same, but in a converse way, in his article: “there are often reasons why one would violate a REST constraint, simply because every constraint induces some trade-off that might not be acceptable in a particular situation. But often, REST constraints are violated due to a simple lack of understanding of their benefits.” He says “understand why you violate” and I say “understand why you obey”. It is essentially the same (if you’re into stereotypes you can attribute the difference to his Germanic heritage and my Gallic blood).

Even worse than bending your interface to appear RESTful, don’t cherry-pick your use cases to only keep those that you feel you can properly address via REST, leaving the others aside. Conversely, don’t add requirements just because REST makes them easy to support (interesting how quickly “why do you force me to manage the lifecycle of yet another resource just to run a query” turns into “isn’t this great, you can share queries among users and you can handle long-running queries, I am sure we need this”).

This is not to say that you should not create a fully RESTful system. Just that you don’t necessarily have to and you can still get many benefits as long as you open your eyes to the cost/benefits trade-off involved.

Finding #4: Learn humility from REST

Beyond the technology, there is a vibe behind REST design. You can copy the technology and still miss it. I described it in 2005 as Humble Architecture, and applied to SOA at the time. But it describes REST just as well:

More practically, this means that the key things to keep in mind when creating a service, is that you are not at the center of the universe, that you don’t know who is going to consume your service, that you don’t know what they are going to do with it, that you are not necessarily the one who can make the best use of the information you have access to and that you should be willing to share it with others openly…

The SOA Manifesto recently called this “intrinsic interoperability”.

In IT management terms, it means that you can RESTify your CMDB and your event console and your asset management software and your automation engine all you want; if you see your code as the ultimate consumer and the one that knows best, as the UI that users have to go through, the “ultimate source of truth” and the “manager of managers”, then it doesn’t matter how well you use HTTP.

Finding #5: Beware of tools bearing gifts

To a large extent, the great thing about REST is how few tools there are to take it away from you. So you’re pretty much forced to understand what is going on in your contract as opposed to being kept ignorant by a wsdl2java type of toolkit. Sure, Java (and .NET) have improved in that regard, but really the cultural damage is done and the expectations have been set. Contrast this to “the ‘router’ is just a big case statement over URI-matching regexps”, from Tim Bray’s post on the Sun Cloud API, one of my main inspirations for this investigation.
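For readers who have only seen toolkit-generated dispatch, here is a minimal sketch of what a "big case statement over URI-matching regexps" can look like. It is not the Sun Cloud API code; the route patterns, the handler names and the `route` function are invented for illustration.

```python
import re

# Ordered list of (pattern, handler name); the first match wins, like a case statement.
ROUTES = [
    (re.compile(r"^/vms/?$"), "list_vms"),
    (re.compile(r"^/vms/(?P<vm_id>[^/]+)$"), "show_vm"),
    (re.compile(r"^/vms/(?P<vm_id>[^/]+)/volumes$"), "list_vm_volumes"),
]


def route(path):
    """Return the handler name and the parameters captured from the URI."""
    for pattern, handler_name in ROUTES:
        match = pattern.match(path)
        if match:
            return handler_name, match.groupdict()
    return "not_found", {}


for path in ("/vms", "/vms/42", "/vms/42/volumes", "/tickets/9"):
    print(path, "->", route(path))
```

There is nothing hidden here: the contract is the list of patterns, and reading it tells you what is going on.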

REST is not inherently immune to the tool-controlling-the-hand syndrome. It’s just a matter of time until such tools try to make REST “accessible” to the “normal” developer (who can supposedly prevent thread deadlocks but not parse XML). Joe Gregorio warns about this in the context of WADL (to summarize: WADL brings XSD which leads to code generation). Keep this in mind next time someone states that REST is more “loosely coupled” than SOAP. It’s how you use it that matters.

Finding #6: Use screws, not glue, so we can peer inside and then close the lid again

The “view source” option is how I and many others learned HTML. It unfortunately created a generation of HTML monsters who never went past version 3.2 (the marbled background makes me feel young again). But it also fueled the explosion of the Web. On-the-wire inspection through soapUI is what allowed me to perform this investigation and report on it (WMI has allowed this for years, but WS-Management is what made it accessible and usable for anyone on any platform). This was, of course, in the context of SOAP which is also inspectable. Still, in that respect nothing beats plain HTTP which is why I recommend HTTP GET in finding #2 (make IDs dereferenceable) even though I don’t expect that the one-page-per-resource view is going to be the only way to access it in the finished product.

These (HTML source, on-the-wire XML and resource-description pages) rarely hit the human eye and yet their presence enables the development of the more commonly used views. Making it as easy as possible to see what is going on under the covers helps with learning, with debugging, with extending and with innovating. In the same way, 99% of web users don’t look at the HTML source (and 99.99% of them don’t see the HTTP requests), but the Web would not be what it is to them if this inspectability hadn’t been there to fuel its development.

Along the same line, make as few assumptions as possible about the consumers in your interfaces. Which, in practice, often means document what goes on the wire. WSDL/WADL can be used as a format, but they are at most one small component. Human-readable semantics are much more important.

Finding #7: Nothing is free

Part of what was so attractive about SOAP is everything you were going to get “for free” by using it. Message-level security (for all these use cases where your message starts over HTTP, then hops onto a train, then gets delivered by a carrier pigeon). Reliable messaging. Transactionality. Intermediaries (they were going to be a big deal in SOAP, as you can see in vestigial form today in the Nodes/Roles left in the spec – also, do you remember WS-Routing? I do.)

And it’s true that by now there is a body of specifications that support this as composable SOAP headers. But the lack of usage of these features contrasts with how often they were bandied about in the early days of SOAP.

Well, I am detecting some of the same in the REST camp. How often have you heard about how REST enables caching? Or about how content types allow an ISP to compress images on the fly to speed up delivery over dial-up? Like in the SOAP case, these are real features and sometimes useful. It doesn’t mean that they are valuable to you. And if they are not, then don’t let them be used as justifications. Especially since they are not free. If caching doesn’t help me (because of low volume, because security considerations prevent a shared cache, etc) then its presence actually adds a cost to me, since I now have to worry whether something is cached or not and deal with ETags. Or I have to consistently remember to request the cache to be bypassed.
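To give an idea of what "dealing with ETags" involves on the server side, here is a minimal sketch of a conditional GET. It is deliberately simplified: a real `If-None-Match` header can carry several values and weak validators, which this ignores. The function names and the choice of a truncated SHA-256 as the tag are assumptions for illustration only.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive a quoted entity tag from the representation bytes (illustrative scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(body: bytes, if_none_match: Optional[str] = None):
    """Return (status, headers, payload) for a GET, honoring a single If-None-Match value."""
    etag = make_etag(body)
    if if_none_match is not None and if_none_match == etag:
        # The client's copy is current: no body is sent back.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


representation = b"<vm><state>running</state></vm>"
status, headers, payload = conditional_get(representation)
print(status, headers["ETag"])

# A second request that presents the tag it received the first time.
status, _, payload = conditional_get(representation, headers["ETag"])
print(status, len(payload))
```

Even in this toy form there is state to track on both sides, which is the cost mentioned above when the cache brings you no benefit.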

Finding #8: Start by sweeping your front door

Before you agonize about how RESTful your back-end management protocol is, how about you make sure that your management application (the user front-end) is a decent Web application? One with cool URIs, where the back button works, where bookmarks work, where the data is not hidden in some over-encompassing Flash/Silverlight thingy. Just saying.


Now for some questions still unanswered.

Question #1: Is this a flea market?

I am highly dubious of content negotiation and yet I can see many advantages to it. Mostly along the lines of finding #6: make it easy for people to look under the hood and get hold of the data. If you let them specify how they want to see the data, it’s obviously easier.

But there is no free lunch. Even if your infrastructure takes care of generating these different views for you (“no coding, just check the box”), you are expanding the surface of your contract. This means more documentation, more testing, more interoperability problems and more friction when time comes to modify the interface.

I don’t have enough experience with format negotiation to define the sweet spot of this practice. Is it one XML representation and one HTML, period (everything else gets produced by the client by transforming the XML)? But is the XML Atom-wrapped or not? What about RDF? What about JSON? Not to forget that SOAP wrapper, how hard can it be to add? But soon enough we are in legacy hell.
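For reference, the mechanics of format negotiation are not the hard part. Here is a minimal sketch of picking a representation from an `Accept` header, under simplifying assumptions: only the `q` parameter is read, ranges are ranked by `q` alone (real HTTP also gives precedence to more specific ranges), and the list of supported types is invented. The function names are hypothetical.

```python
def parse_accept(header):
    """Return [(media_range, q)] sorted by descending q; ties keep header order."""
    ranges = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_range = pieces[0].lower()
        if not media_range:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        ranges.append((media_range, q))
    return sorted(ranges, key=lambda item: item[1], reverse=True)


def matches(media_range, media_type):
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return media_type.startswith(media_range[:-1])
    return media_range == media_type


def negotiate(accept_header, supported):
    """Pick the first supported type matching the client's most preferred range."""
    for media_range, q in parse_accept(accept_header):
        if q <= 0:
            continue
        for media_type in supported:
            if matches(media_range, media_type):
                return media_type
    return None


# Hypothetical set of representations a management API might offer.
SUPPORTED = ["application/xml", "text/html", "application/json"]
print(negotiate("application/json;q=0.9, text/html", SUPPORTED))
```

Every entry added to `SUPPORTED` is one more representation to document, test and keep stable, which is where the contract surface grows.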

Question #2: Mime-types?

The second part of Joe Gregorio’s WADL entry is all about Mime types and I have a harder time following him there. For one thing, I am a bit puzzled by the different directions in which Mime types go at the same time. For example, we have image formats (e.g. “image/png”), packaging/compression formats (e.g. “application/zip”) and application formats (e.g. “application/vnd.oasis.opendocument.text” or “application/msword”). But what if I have a zip full of PNG images? And aren’t modern word processing formats basically a zip of XML files? If I don’t have the appropriate viewer, maybe I’d like them to be at least recognized as ZIP files. I don’t see support for such composition and taxonomy in these types.

And even within one type, things seem a bit messy in practice. Looking at the registered applications in the “options” menu of my Firefox browser, I see plenty of duplication:

  • application/zip vs. application/x-zip-compressed
  • application/ms-powerpoint vs. application/
  • application/sdp vs. application/x-sdp
  • audio/mpeg vs. audio/x-mpeg
  • video/x-ms-asf vs. video/x-ms-asf-plugin

I also wonder at what level of depth I want to take my Mime types. Sure I can use Atom as a package but if the items I am passing around happen to be CIM classes (serialized to XML), doesn’t it make sense to advertise this? And within these classes, can I let you know which domain (e.g. which namespace) my resources are in (virtual machines versus support tickets)?
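One conceivable way to carry that extra depth is through media type parameters. The sketch below only shows how such a value would be taken apart on the receiving side, using Python's standard `email` package. The `type=entry` parameter is defined for Atom entries; the `profile` parameter and its URI are purely hypothetical here, and I am not claiming this is the right answer to the question.

```python
from email.message import Message
from typing import Dict, Tuple


def describe_media_type(value: str) -> Tuple[str, Dict[str, str]]:
    """Split a Content-Type value into its base type and its parameters."""
    msg = Message()
    msg["Content-Type"] = value
    # get_params() returns the base type first, then (name, value) pairs.
    params = {name.lower(): val for name, val in msg.get_params()[1:]}
    return msg.get_content_type(), params


# Hypothetical value: an Atom entry that advertises a CIM-flavored payload.
EXAMPLE = ('application/atom+xml; type=entry; '
           'profile="http://example.org/cim/virtual-machine"')
```

Calling `describe_media_type(EXAMPLE)` yields the base type plus a dictionary of parameters that a client could dispatch on. Whether anybody would agree on the parameter names is, of course, the real problem.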

These questions may simply be a reflection of my lack of maturity in the fine art of using Mime types as part of protocol design. My experience with them is more of the “find the type that works through trial and error and then leave it alone” kind.

[Side note: the first time I had to pay attention to Mime types was back in 1995/1996, playing with non-parsed headers and the multipart/x-mixed-replace type to bring some dynamism to web pages (that was before JavaScript or even animated GIFs). The site is still up, but the admins have messed up the Apache config so that the CGIs aren’t executed anymore but return the Python code. So, here are some early Python experiments from yours truly: this script was a “pushed” countdown and this one was a “pushed” image animation. Cool stuff at the time, though not in a “get a date” kind of way.]
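For the curious, the general shape of a server-push countdown looks something like the sketch below. This is a reconstruction of the technique, not the original scripts: the boundary string and function names are arbitrary, and a real non-parsed-headers CGI would write each part to standard output, flush, and sleep before sending the next one.

```python
from typing import Iterator

BOUNDARY = "frame"  # arbitrary boundary token chosen for this sketch


def response_head() -> bytes:
    """Status line and headers; an NPH script emits these itself."""
    return (
        "HTTP/1.0 200 OK\r\n"
        f"Content-Type: multipart/x-mixed-replace;boundary={BOUNDARY}\r\n"
        "\r\n"
    ).encode("ascii")


def countdown_parts(start: int) -> Iterator[bytes]:
    """Yield one multipart section per number; each one replaces the previous page."""
    for n in range(start, -1, -1):
        body = f"<html><body><h1>{n}</h1></body></html>".encode("ascii")
        yield (
            f"--{BOUNDARY}\r\n"
            "Content-Type: text/html\r\n"
            f"Content-Length: {len(body)}\r\n"
            "\r\n"
        ).encode("ascii") + body + b"\r\n"
    # Closing boundary tells the browser that no more parts will follow.
    yield f"--{BOUNDARY}--\r\n".encode("ascii")
```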

On the other hand, I very much agree with Joe’s point that “less is more”, i.e. that by not dictating how the semantics of a Mime type are defined the system forces you to think about the proper way to define them (e.g. an English-language RFC). As opposed to WSDL/XSD which gives the impression that once your XML validator turns green you’re done describing your interface. These syntactic validations are a complement at best, and usually not a very useful one (see “fat-bottomed specs”).

In comments on previous posts, Stu Charlton also emphasizes the value that Mime types bring. “Hypermedia advocates exposing a variety of links for such state-transitions, along with potentially unique media types to describe interfaces to those transitions.” I get the hypermedia concept, the HATEOAS approach and its very practical benefits. But I am still dubious about the role of Mime types in achieving them and I am not the only one with such qualms. I have too much respect for Joe and Stu to dismiss it entirely, but until I get an example that makes it “click” in practice for me I won’t sweat about Mime types too much.

Question #3: Riding the Zeitgeist?

That’s a practical question rather than a technical one, but as a protocol creator/promoter you are going to have to decide whether you market it as “RESTful”. If I have learned one thing in my past involvement with standards it is that marketing/positioning/impressions matter for standards as much as for products. To a large extent, for Clouds, Linked Data is a more appropriate label. But that provides little marketing/credibility oomph with CIOs compared to REST (and less buzzword-compliance for the tech press). So maybe you want to write your spec based on Linked Data and then market it with a REST ribbon (the two are very compatible anyway). Just keep in mind that REST is the obvious choice for protocols in 2009 in the same way that SOAP was a few years ago.

Of course this is not an issue if your specification is truly RESTful. But none of the current Cloud “RESTful” APIs is, and I don’t expect this to change. At least if you go by Roy Fielding’s definition (or Paul’s handy summary):

A REST API must not define fixed resource names or hierarchies (an obvious coupling of client and server). Servers must have the freedom to control their own namespace. Instead, allow servers to instruct clients on how to construct appropriate URIs, such as is done in HTML forms and URI templates, by defining those instructions within media types and link relations. [Failure here implies that clients are assuming a resource structure due to out-of band information, such as a domain-specific standard, which is the data-oriented equivalent to RPC’s functional coupling].

And (in a comment) Mark Baker adds:

I’ve reviewed lots of “REST APIs”, many of them privately for clients, and a common theme I’ve noticed is that most folks coming from a CORBA/DCE/DCOM/WS-* background, despite all the REST knowledge I’ve implanted into their heads, still cannot get away from the need to “specify the interface”. Sometimes this manifests itself through predefined relationships between resources, specifying URI structure, or listing the possible response codes received from different resources in response to the standard 4 methods (usually a combination of all those). I expect it’s just habit. But a second round of harping on the uniform interface – that every service has the same interface and so any service-specific interface specification only serves to increase coupling – sets them straight.
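The coupling that both quotes warn about is easy to show in a few lines. In this sketch the resource layout, the `links` array and the `reboot` relation name are all invented for illustration; they are not taken from any actual Cloud API.

```python
from typing import Any, Dict


def coupled_reboot_uri(base: str, server_id: str) -> str:
    """Client-side URI construction: the client has baked in the server's namespace."""
    return f"{base}/servers/{server_id}/reboot"


def find_link(representation: Dict[str, Any], rel: str) -> str:
    """Hypermedia style: follow whatever URI the server advertised for this relation."""
    for link in representation.get("links", []):
        if link.get("rel") == rel:
            return link["href"]
    raise LookupError(f"no link with rel={rel!r}")


# A made-up representation of a server resource, as a client might receive it.
SERVER = {
    "name": "web-01",
    "links": [
        {"rel": "self", "href": "http://example.org/r/9c1e"},
        {"rel": "reboot", "href": "http://example.org/actions/7f3a"},
    ],
}
```

With `find_link(SERVER, "reboot")` the server stays free to reorganize its URIs. With `coupled_reboot_uri` it cannot do so without breaking clients. Note that the client still needs out-of-band knowledge of what `reboot` means, which is where the media type and link relation definitions come in.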

So the question of whether you want to market yourself as RESTful (rather than just as “inspired by the proper use of HTTP illustrated by REST”) is relevant, if only because you may find the father of REST throwing (POSTing?) tomatoes at you. There is always a risk in wearing clothes that look good but don’t quite fit you. The worst time for your pants to fall off is when you suddenly have to start running.

For more on this, refer to Ted Neward’s excellent Roy decoder ring where he not only explains what Roy means but more importantly clarifies that “if you’re not doing REST, it doesn’t mean that your API sucks” (to which I’d add that it is actually more likely to suck if you try to ape REST than if you allow yourself to be loosely inspired by it).


Wrapping up the wrap-up

There is one key topic that I had originally included in this wrap-up but decided to remove: extensibility. Mark Hapner brings it up in a comment on a previous post:

It is interesting to note that HTML does not provide namespaces but this hasn’t limited its capabilities. The reason is that links are a very effective mechanism for composing resources. Rather than composition via complicated ‘embedding’ mechanisms such as namespaces, the web composes resources via links. If HTML hadn’t provided open-ended, embeddable links there would be no web.

I am the kind of guy who would have namespace-qualified his children when naming them (had my wife not stepped in) so I don’t necessarily see “extension via links” as a negation of the need for namespaces (best example: RDF). The whole topic of embedding versus linking is a great one but this post doesn’t need another thousand words and the “REST in practice” umbrella is not necessarily the best one for this discussion. So I hereby conclude my “REST in practice for IT and Cloud management” series, with the intent to eventually start a “Linked Data in practice for IT and Cloud management” series in which extensibility will be properly handled. And we can also talk about querying (conspicuously absent from Cloud APIs, unless CMDBf is now a Cloud API) and versioning. As a teaser for the application of Linked Data to IT/Cloud, I will leave you with what Vint Cerf has to say.

[UPDATED 2010/1/27: I still haven’t written the promised “Linked Data in practice for IT and Cloud management” post, but this explanation of the usage of Linked Data for pretty much says it all. I may still write a post describing how what Jeni says about government data applies to Cloud management APIs, but it’s almost too obvious to bother. Actually, there may be reasons why Cloud management benefits even more from Linked Data than UK government data, so it may still be worth a post. At some point. When I convince myself that it may influence things rather than be background noise.]


Filed under API, Application Mgmt, Automation, Cloud Computing, Everything, IT Systems Mgmt, Manageability, Mgmt integration, Modeling, Protocols, REST, Semantic tech, SOA, SOAP, Specs, Utility computing

Can I get a price check on this AMI?

I almost titled this entry “Cloud + Tivoli = $” in reference to the previous one (“Cloud + proprietary software = ♥”). In that earlier entry, I described the opportunity for Cloud providers to benefit themselves, their customers and software vendors by drastically reducing the frictions involved in using proprietary software (rather than open source software). The example I used was Windows EC2 instances. But it’s not the best example because there is a very tight relationship between Amazon and Microsoft on this. In many ways, these Windows instances are “hard-coded” in EC2: they have a special credential retrieval mechanism, their price appears in the main EC2 price list, etc. This cannot scale as a generic Amazon-mediated payment service for many software vendors.

Rather than the special case of Windows instances, the more interesting situation to look at is the availability of vendor-provided EC2 instances at a higher price. So I went to look a bit more into this, and I came out… very confused and $20 poorer.

Earlier in the week, I had noticed an announcement of IBM Tivoli on EC2 that explained that “the hourly price for Tivoli on EC2 includes an IBM license”. This seemed like a perfect place for me to start. My first question was “how much does it cost”? The blog entry doesn’t say. It links to a Tivoli on EC2 FAQ on the IBM site, which doesn’t say either (apparently IBM’s target customers work in recession-proof industries and do not “frequently ask” about prices). I then followed the link to the overall IBM and AWS FAQ but it just states that “charges will be announced by Amazon Web Services in the coming months”. Both FAQs explain how to use your traditional IBM license on EC2, but that’s not what I am after. At this point, I feel like a third-world tourist who entered a high-end jewelry store in Paris where no price is displayed. Call me plebeian, but I am more accustomed to Target-like stores with price-check scanners in the aisles…

I hypothesized that the AWS console might show me the price when I select the Tivoli AMIs. But no such luck. Tired of searching, and since I was already in the console, I figured I’d just launch an instance and see the hourly cost in my account usage. Since it comes in three versions (depending on how many targets you want Tivoli to manage), I launched one of each. Additionally, for one of them I launched instances of two different sizes so I could verify that the price difference is equal to the base EC2 price difference between such instance sizes. Here is what I got:


Of course, by the time my account usage page was updated (it took a few hours) I had found the price list which in retrospect wasn’t that hard to find (from Amazon, not IBM).

So maybe I am not the brightest droplet in the cloud, but for 20 bucks I consider that at least I bought the right to make a point: these prices should not be just on some web page. They should be accessible at the time of launch, in the console. And also in the EC2 API, so that the various EC2 tools can retrieve them. Whether it’s just for human display or to use as part of some automation logic, this should be available in an authoritative manner, without the need to scrape a page.

The other thing that bothers me is the need to decide upfront whether I want to launch a Tivoli instance to manage 50 virtual cores, 200 virtual cores or 600 virtual cores. That feels very inelastic for an EC2 deployment. I want to be charged for the actual number of virtual cores I am managing at any point in time. I realize the difficulty in metering this way (the need for Tivoli to report this to AWS, the issue of trust…) but hopefully it will eventually get there.
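A few lines of code show how lumpy this is. The three tier sizes are the ones mentioned above; treating each tier as a hard ceiling on the number of managed virtual cores is my assumption, and no prices are involved.

```python
from typing import Optional

TIERS = (50, 200, 600)  # virtual-core tiers of the Tivoli instances


def smallest_tier(cores: int) -> Optional[int]:
    """Smallest tier that covers the given number of managed virtual cores."""
    for tier in TIERS:
        if cores <= tier:
            return tier
    return None  # beyond the largest tier on offer


def unused_capacity(cores: int) -> Optional[int]:
    """Cores paid for but not managed, under the chosen tier."""
    tier = smallest_tier(cores)
    return None if tier is None else tier - cores
```

Managing 51 virtual cores lands you in the 200 tier, with 149 cores of headroom you pay for whether you use them or not. Metering on the actual count would make that headroom disappear.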

While I am talking about future improvements, another limitation is that there can currently only be one vendor per AMI. What if someone wants to write an application that runs on top of Oracle Middleware and package this as a paid AMI? It would be nice if Amazon eventually allowed the price of the instance to be split three-ways (Amazon, Oracle, application vendor).

In any case, now you know why this investigation left me poorer. The confused part comes from the fact that I had earlier experimented with Amazon Paid AMIs and it was an entirely different experience. Better in some ways: you get a clear price list upfront such as this.


But not as good in other ways: you have to purchase the paid AMI in a way that is somewhat disconnected from the launch of the instance. And for some reason you pay for this directly out of your credit card as opposed to it going to your AWS usage account along with all your other charges. I would expect that many customers will use these paid AMIs as part of a larger EC2 deployment and as such it seems awkward to have it billed separately.

But overall, it’s the disconnect between the two that confuses me. Are there two different types of paid AMIs (three if you include the Windows EC2 instances)? What am I missing?

The next step in my investigation should probably be to create an AMI and set a price on it, so I get the vendor’s view in addition to the consumer’s view. And maybe I can earn my $20 back in the process…


Filed under Amazon, Application Mgmt, Business, Cloud Computing, Everything, IBM, IT Systems Mgmt, Utility computing

Oracle Real User Experience Insight 6.0

Oracle just released version 6.0 of Real User Experience Insight (friends call it RUEI), which is part of the Enterprise Manager portfolio. As the name indicates, it captures and presents in great detail the experience of actual users interacting with your application. This is real traffic, not synthetic probes. It is a mature product, which originally came from the Moniforce acquisition two years ago.

One way to classify the improvements in this version is to sort them based on who they are exciting for:

Exciting for techies

The ability to link in context from RUEI to diagnostic tools in Enterprise Manager. For example, going from a slow JSP in RUEI to a view of its role in the overall composite application. Or to a deep-dive in Java diagnostics.

Exciting for Oracle applications administrators

Many improvements (in the form of updated “Accelerators”) for using RUEI to manage Oracle EBS, PeopleSoft, Siebel and JD Edwards. Including EBS Forms support in socket mode without Chronos (those who know what this means rejoice, others can safely ignore).

Exciting for business and marketing people

The full capture and replay of user sessions. The ease of reproducing errors and seeing exactly what your users do and experience. Terrifyingly edifying at times.


Filed under Application Mgmt, Everything, IT Systems Mgmt, Oracle

Can your hypervisor radio for air support?

As I was reading about Microsoft Azure recently, a military analogy came to my mind. Hypervisors are tanks. Application development and runtime platforms compose the air force.

Tanks (and more generally the mechanization of ground forces) transformed war in the 20th century. They multiplied the fighting capabilities of individuals and changed the way war was fought. A traditional army didn’t stand a chance against a mechanized one. More importantly, a mechanized army that used the new tools with the old mindset didn’t stand a chance against a similarly equipped army that had rethought its strategy to take advantage of the new capabilities. Consider France at the beginning of WWII, where tanks were just cannons on wheels, spread evenly along the front line to support ground troops. Contrast this with how Germany, as part of the Blitzkrieg, used tanks and radios to create highly mobile – and yet coordinated – units that caused havoc in the linear French defense.

Exercise for the reader who wants to push the analogy further:

  • Describe how Blitzkrieg-style mobility of troops (based on tanks and motorized troop transports) compares to Live Migration of virtual machines.
  • Describe how the use of radios by these troops compares to the use of monitoring and control protocols to frame IT management actions.

Tanks (hypervisor) were a game-changer in a world of foot soldiers (dedicated servers).

But no matter how good your tanks are, you are at a disadvantage if the other party achieves air superiority. A less sophisticated/numerous ground force that benefits from strong air support is likely to prevail over a stronger ground force with no such support. That’s what came to my mind as I read about how Azure plans to cover the IaaS layer, but in the context of an application-and-data-centric approach. Where hypervisors are not left to fend for themselves based on the limited view of the horizon from the periscope of their turrets but rather orchestrated, supported (and even deployed) from the air, from the application platform.

C-130 tank airdrop

(Yes, I am referring to the Azure vision as it was presented at PDC09, not necessarily the currently available bits.)

Does your Cloud vendor/provider need an air force?

Exercise for the reader who wants to push the analogy to the stratosphere:

  • Describe how business logic/process, business transaction management and business intelligence are equivalent to satellites, surveying the battlefield and providing actionable intelligence.

The new Cloud stack (“military-cloud complex” version):


[Note: I have no expertise in military history (or strategy) beyond high school classes about WWI and WWII, plus a couple of history books and a few war movies. My goal here is less to be accurate on military concerns (though I hope to be) than to draw an analogy which may be meaningful to fellow IT management geeks who share my level of (in)expertise in military matters. This is just yet another way in which I try to explain that, for Clouds as for plain old IT management, “it’s the application, stupid”.]


Filed under Application Mgmt, Azure, Cloud Computing, Everything, IT Systems Mgmt, Mgmt integration, Utility computing, Virtualization

Review of Fujitsu’s IaaS Cloud API submission to DMTF

Things are heating up in the DMTF Cloud incubator. Back in September, VMWare submitted its vCloud API (or rather a “reader’s digest” version of it) to the group. Last week, the group released a white paper titled “Interoperable Clouds”. And a second submission, from Fujitsu, was made last week and publicly announced today.

The Fujitsu submission is called an “API design”. What this means is that it doesn’t tell you anything about what things look like on the wire. It could materialize as another “XML over HTTP” protocol (with or without SOAP wrapper), but it could just as well be implemented as a binary RPC protocol. It’s really more of an esquisse of a resource model than a remote API. The only invocation-related aspect of the document is that it defines explicit operations on various resources (though not their inputs and outputs). This suggests that the most obvious mapping would be to some XML/HTTP RPC protocol (SOAPy or not). In that sense, it stands out a bit from the more recent Cloud API proposals that take a “RESTful” rather than RPC approach. But in these days of enthusiastic REST-washing I am pretty sure a determined designer could produce a RESTful-looking (but contorted) set of resources that would channel the operations in the specification as HTTP-like verbs on these resources.

Since there are few protocol aspects to this “API design”, if we are to compare it to other “Cloud APIs”, it’s really the resource model that’s worth evaluating. The obvious comparison is to the EC2 model as it provides a pretty similar set of infrastructure resources (it’s entirely focused on the IaaS layer). It lacks EC2 capabilities around availability, security and monitoring. But it adds to the EC2 resource model the notions of VDC (“virtual data center”, a container of IaaS resources), VSYS (see below) and a lightly-defined EFM (Extended Function Module) concept which intends to encompass all kinds of network/security appliances (and presumably makes up for the lack of security groups).

The heart of the specification is the VSYS and its accompanying VSYS Descriptor. We are encouraged to think of the VSYS Descriptor as an extension of OVF that lets you specify this kind of environment:

Example content for a VSYS Descriptor

By forcing the initial VSYS instance to be based on a VSYS Descriptor, but then allowing the VSYS to drift away from the descriptor via direct management actions, the specification takes a middle-of-the-road approach to the “model-based versus procedural” debate. Disciples of the procedural approach will presumably start from a very generic and unconstrained VSYS Descriptor and, from there, script their way to happiness. Model geeks will look for a way to keep the system configuration in sync with a VSYS Descriptor.

How this will work is completely undefined. There is supposed to be a getVSYSConfiguration() operation which “returns the configuration information on the VSYS” but there is no format/content proposed for the response payload. Is this supposed to return every single config file, every setting (OS, MW, application) on all the servers in the VSYS? Surely not. But what then is it supposed to return? The specification defines five VSYS attributes (VSYSID, creator, createTime, description and baseDescriptor) so I know what getVSYSAttributes() returns. But leaving getVSYSConfiguration() undefined is like handing someone an airplane maintenance manual that simply reads “put the right part in the right place”. A similar feature is also left as an exercise to the reader in the section that sketches an “external configuration service”. We are provided with a URL convention to address the service, but zero information about the format and content of the configuration instructions provided to the VServer.
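To illustrate the gap, here is roughly everything the submission lets you write down, next to what a model geek would need. The five attribute names come from the specification; their Python types are my guess. The `drift` function and the flat dictionaries it compares are entirely hypothetical, since the specification defines neither the configuration payload nor any way to compare it to a descriptor.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, List


@dataclass(frozen=True)
class VSYSAttributes:
    """The five VSYS attributes named in the submission (types are assumed)."""
    vsys_id: str           # VSYSID
    creator: str           # creator
    create_time: datetime  # createTime
    description: str       # description
    base_descriptor: str   # baseDescriptor


def drift(descriptor: Dict[str, Any], observed: Dict[str, Any]) -> List[str]:
    """Hypothetical comparison of a descriptor with the observed configuration."""
    changes = []
    for key in sorted(set(descriptor) | set(observed)):
        if key not in observed:
            changes.append(f"missing: {key}")
        elif key not in descriptor:
            changes.append(f"unexpected: {key}")
        elif descriptor[key] != observed[key]:
            changes.append(f"changed: {key}")
    return changes
```

The first half is trivial because the specification defines it. The second half only works once somebody decides what the keys and values are, which is precisely the part left blank.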

EC2 has a keypair access mechanism for Linux instances and a clumsy password-retrieval system for Windows instances. The Fujitsu proposal adopts the lowest common denominator (actually the greatest common divisor, but that’s a lost rhetorical cause): random password generation/retrieval for everyone.

I also noticed the statement that a VServer must be “implemented as a virtual machine” which is an unnecessary constraint/assumption. The opposite statement is later made for EFMs, which “can be implemented in various ways (e.g. run on virtual machines or not)”, so I don’t want to read too much into the “hypervisor-required” VServer statement which probably just needs an editorial clean-up.

From a political perspective this specification looks more like a case of “can I play with you? I brought some marbles” than a more aggressive “listen everybody, we’re playing soccer now and I am the captain”. In other words, this may not be so much an attempt to shape the outcome of the incubator as an effort to contribute to its work and position Fujitsu as a respected member whose participation needs to be acknowledged.

While this is an alternative submission to the vCloud API, I don’t think VMWare will feel very challenged by it. The specification’s core (VSYS Descriptor) intends to build on OVF, which should be music to VMWare’s ears (it’s the model, not the protocol, which is strategic). And it is light enough on technical details that it will be pretty easy for vCloud to claim that it, indeed, aligns with the intent of this “design”.

All in all, it is good to see companies take the time to write down what they expect out of the DMTF work. And it’s refreshing to see genuine single-company contributions rather than pre-negotiated documents by a clique. Whether they look more like implementable specifications or position papers, they all provide good input to the DMTF Cloud incubator.


Filed under Automation, Cloud Computing, DMTF, Everything, IT Systems Mgmt, Mgmt integration, Modeling, Specs, Standards, Utility computing, Virtualization

Desirable technical characteristics of PaaS

PaaS can most dramatically improve the IT experience in four areas:

  • Hosting/operations efficiency
  • Application-centric management
  • Development productivity
  • Security

To do so, there are technical characteristics that PaaS frameworks should eventually exhibit. These are not technical characteristics of a given PaaS container, they are shared characteristics that go across all container types, no matter what the operational capabilities of the containers are.

Here is a rough and unorganized list of the desirable characteristics (meta-capabilities) of PaaS Cloud containers:

  • An application component model that supports deployment/configuration across all PaaS container types.
  • Explicit interactions/invocations between application components (resilient connections between components: infrastructure-level retry/reroute)
  • Uniform and consistent request tracking across all components. Ability to intercept component-to-component communication.
  • Short-term (or externally persisted) state so that all instances can be quickly redirected out of any one node.
  • Subset of platform management interface exposed to consumer, along with out of the box application management. Application metrics consolidated at application level rather than node level.
  • Consistent, model-based application management interface across all container types. Hooks for component code to provide its manageability in the same framework.
  • Minimal footprint of any container node for limited patching requirements.
  • Assistance for debugging platform-hosted code (see this entry).
  • No encroachment of container technology on application contract (e.g. no forced URL structure).
  • Application uniformly scalable to the limit of the underlying hardware (no imposed partitioning).
  • Shared authentication / authorization / auditing across containers.
  • Minimum contract/interface exposed by each container.
  • Governance of application services, aligned (in model/protocols) with the container management interfaces.
  • [UPDATE: need to add metering+billing as William Louth pointed out in a comment]
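As one small illustration of the list above, here is what “metrics consolidated at application level rather than node level” could look like. The `Sample` record and its fields are invented for this sketch and do not correspond to any particular PaaS framework.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Sample(NamedTuple):
    """One per-node measurement for one application (hypothetical shape)."""
    application: str
    node: str
    requests: int
    errors: int


def consolidate(samples: Iterable[Sample]) -> Dict[str, Dict[str, float]]:
    """Roll node-level samples up into one record per application."""
    totals = defaultdict(lambda: {"requests": 0, "errors": 0})
    for s in samples:
        totals[s.application]["requests"] += s.requests
        totals[s.application]["errors"] += s.errors
    result = {}
    for app, t in totals.items():
        # Guard against division by zero for idle applications.
        rate = t["errors"] / t["requests"] if t["requests"] else 0.0
        result[app] = {
            "requests": t["requests"],
            "errors": t["errors"],
            "error_rate": rate,
        }
    return result
```

The point is that the consumer of the platform sees the application-level record and never has to know which nodes produced the samples.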

This applies across the board to public, private and hybrid PaaS. The distinctions between these delivery models are real but at a different level. The important thing is that the PaaS administrator is different from the application administrator in all cases. On the other hand, most of these technical characteristics are not achievable for lower-level Cloud resources (like virtual hosts and low-level storage) which is why the IaaS form of Cloud leaves the Cloud promise only partially fulfilled.


Filed under Application Mgmt, Cloud Computing, Everything, IT Systems Mgmt, Manageability, Mgmt integration, Middleware, PaaS, Utility computing

Would you like some management with that appliance?

Andi Mann recently wrote an interesting post about virtual appliances. He uses the domain name for his blog so I figured I’d do just that. More specifically, I have three comments on his article.

Opaque or transparent appliance

Andi’s concerns about the security and management problems posed by virtual appliances are real, but he seems to assume that the content of the appliance is necessarily opaque to the customer and under the responsibility of the appliance provider. Why can’t a virtual appliance be transparent in the sense that the customer is able to efficiently manage at least some aspects of the software installed on it? “You can’t put agents on most virtual appliances, they don’t come with WMI, and most have only a GUI for management” says Andi. Why can’t an appliance come with an agent (especially in these days of consolidation where many vendors provide many layers of the stack – hypervisor / OS / application container / application / management tools – including their agent)? Why can’t it implement a standard management API (most servers nowadays implement WBEM, WS-Management and/or IPMI pre-boot – on the motherboard – which is a lot more challenging to do than supporting a similar protocol in a virtual appliance)? Andi is really criticizing the current offering more than the virtual appliance model per se and in this I can join him.

Let me put it differently, since this is probably just a question of definition: what would Andi call a virtual appliance that does expose management APIs for its infrastructure (e.g. WS-Management for the OS, JMX for the Java stack) or that comes with an agent (HP, IBM, BMC, Oracle…) installed on it?

Such an appliance (let’s call it a “transparent virtual appliance” for now) doesn’t provide all the commonly claimed benefits of an appliance (zero config/admin) but as Andi points out these benefits come with major intrinsic drawbacks. A transparent virtual appliance still drastically simplifies installation (especially useful for test/dev/demo/POC). It doesn’t entirely free you of monitoring and configuration but at least it provides you with a very consistent and controlled starting point, manageable from the start (no need to subsequently install an agent). In addition, it can be made “just enough” (just enough OS, just enough app server…) to require a lot less maintenance than an application stack that you assemble yourself out of generic parts. We’ll always have trade-offs between how optimized/customized it is versus how uniform your overall environment can be, but I don’t see the use of an appliance as a delivery mechanism as necessarily cornering you into a completely opaque situation, from a management perspective.

Those who attended Oracle Open World a few weeks ago were treated to an example of such an appliance, if they attended any of the sessions that covered Oracle’s Appliance Builder (the main one was, I believe, Virtualizing Oracle Fusion Middleware in the Modern Data Center, in case you have access to the Open World On Demand replay and slides). I believe it’s probably the same content that @jayfry3 was shown when he tweeted about “Oracle is demoing their private cloud self-service app”. These appliances are not at all opaque from a management perspective. To the contrary, they are highly manageable, coming with an Enterprise Manager agent installed that can manage everything in the appliance (and when that “everything” doesn’t include the OS, it’s because there isn’t one thanks to JRockit Virtual Edition, making things slimmer, faster, safer and more manageable). And of course the OVM-based environment in which you deploy these appliances is also managed by Enterprise Manager. OK, my point here wasn’t to go into marketing mode, but this is cool stuff and an example of what virtual appliances should be. BTW, this was also demonstrated during Hasan Rizvi’s keynote at OpenWorld, including the management of these systems through Enterprise Manager.

In the long run it’s irrelevant

As with all things computer-related, the issue is going to get blurrier and then irrelevant. The great thing about software is that there is no solid line. In this case, we will eventually get more customized appliances (via appliance builders or model-driven appliance generation) blurring the line between installed software and appliance-based software.

Waiting for PaaS

Towards the end of his post, Andi paints an optimistic vision of the future: “I also think that virtual appliances have a bright future – but in some ways I continue to see them as a beta version of what could (or should) come next.  By adding in capabilities for responsible and accountable management, they could form the basis of more fully-functional virtual service management containers. These in turn could form the basis of elastic, mobile, network-deployed, responsible cloud appliances that deliver complete end-to-end service management without regard to physical location or domain of control.”

I mostly agree with this vision, though when I describe it, it is in the guise of a PaaS platform, where your appliance (which today goes from the OS all the way to the app) has shrunk to an application template that you deploy in the PaaS environment (rather than in a hypervisor). If/when the underlying PaaS environment has reached the right level of management automation, you get all the benefits of an appliance while maintaining the consistency of your environment and its adherence to your management policies (because the environment is the PaaS platform and its management is driven from your policies).

[As is often the case, this started as a comment (on Andi’s blog) and quickly outgrew that environment, leading to this new post. Plus, Andi’s blog is brand new and seems to be well worth spreading the word about (Andi himself is under-marketing it).]


Filed under Application Mgmt, Automation, Desired State, Everything, IT Systems Mgmt, Manageability, Oracle, Oracle Open World, OVM, PaaS, Virtual appliance, WS-Management

OWL news you can use

The W3C released OWL 2 today. Most readers of this blog are IT management people (whether they call it “cloud computing” or “boring old system management”) and don’t follow RDF, OWL, SPARQL, etc. too closely (if at all). Yet there is a lot of potential value in using these technologies for IT management, so I thought it might be helpful to provide some practical resources on the topic. I have selected articles that cover the special (some may say “twisted”) approach of using OWL and its friends for validation rather than just inference, as this use case is very relevant to IT management.

Of course you can also go to the W3C standard itself, starting with the overview of OWL 2.

Just so you don’t feel lonely if you decide to explore this path, have a look at Elastra’s sexy technology stack. ECML, EDML and EMML are all defined as OWL ontologies.

Filed under Application Mgmt, Everything, IT Systems Mgmt, Mgmt integration, Modeling, OWL, RDF, Semantic tech, Specs, Standards

The future (2006 version), has arrived

Remember 2006? Things were starting to fall into place for IT management integration and automation:

  • SDD was already on its way to cleanly describe/package/manage the lifecycle of simple and composite applications alike,
  • the first version of SML came out to capture all the relevant constraints of complex and composite systems and open the door to “desired-state management”,
  • the CMDBf effort was started to seamlessly integrate all sources of configuration and provide a bird’s-eye view of your entire IT infrastructure, and
  • the WSDM/WS-Management convergence/reconciliation was announced and promised to free management consoles from supporting many resource discovery, collection and control mechanisms and from having platform/library dependencies between the manager and its targets.

It looked like we were a year or two from standardization on all these and another year or two from shipping implementations. Things were looking good.

Good news: the schedule was respected. SDD, SML and CMDBf are now all standards (at OASIS, W3C and DMTF respectively). And today the Eclipse COSMOS project announced the release of COSMOS 1.1, which implements them all. The WSDM/WS-Management convergence is the only one that didn’t quite go according to plan, but it is about to come out as a standard too (in a pared-down form).

Bad news: nobody cares. We’ve moved on to “private clouds”.

Having been involved with these specifications to various degrees (a little bit on SDD, a fair amount on SML and a lot on CMDBf and WSDM/WS-Management) I am not as detached as my sarcastic tone may suggest. But as they say in action movies, “don’t let sentiments get in the way of the mission”.

There is still a chance to reuse parts of this stack (e.g. the CMDBf query language) and there are lessons to learn from our errors: the over-promising, the technical misjudgments, the political bickering, the lack of concrete customer validation, etc. To some extent this work was also a victim of collateral damage from the excesses of WS-* (I am looking at you, WS-Addressing). We also failed to notice the rise of the hypervisor in our peripheral vision.

I tried to capture some important lessons in this post-mortem. For the edification of the cloud generation. I also see a pendulum in action. Where we over-engineered, I now see some under-engineering (overly granular interaction models, overemphasis on the virtual machine as the unit of everything, simplistic constraint models, underestimation of config/patching issues…). Things will come around and may eventually look familiar (suggested exercise: compare PubSubHubbub with WS-Notification).

As long as each iteration gets us closer to the goal things are good.

See you in 2012. Same place, same day, same time.


Filed under Application Mgmt, Automation, Cloud Computing, CMDB, CMDB Federation, CMDBf, Desired State, Everything, IT Systems Mgmt, Manageability, Mgmt integration, Modeling, Protocols, SML, Specs, Standards, Utility computing, WS-Management