Archive for the ‘Application Mgmt’ Category

Post

Cloud + proprietary software = ♥

In Amazon,Application Mgmt,Automation,Business,Cloud Computing,Everything,Open source,Oracle,Utility computing,Virtual appliance on December 2, 2009 by @vambenepe

When I left HP for Oracle, in the summer of 2007, a friend made the argument that Cloud Computing (I think we were still saying “Utility Computing” at the time) would be the death of proprietary software.

There was a lot to support this view. EC2 was one year old and its usage was overwhelmingly based on open source software (OSS). For proprietary software, there was no clear understanding of how licensing terms applied to Cloud deployments. Not only would most sales reps not know the answer, back then they probably wouldn’t have comprehended the question.

Two and a half years later… well it doesn’t look all that different at first blush. EC2 seems to still be a largely OSS-dominated playground. Especially since the most obvious (though not necessarily the most accurate) measure is to peruse the description/content of public AMIs. They are (predictably since you can’t generally redistribute proprietary software) almost entirely OSS-based, with the exception of the public AMIs provided by software vendors themselves (Oracle, IBM…).

And yet the situation has improved for usage of proprietary software in the Cloud. Most software vendors have embraced Cloud deployments and clarified licensing issues (taking the example of Oracle, here is its Cloud Computing Center page, an overview of the licensing policy for Cloud deployments and the AWS/Oracle partnership page to encourage the use of Oracle software on EC2; none of this existed in 2007).

But these can be called reactive adaptations. At best (depending on the variability of your load and whether you use an Unlimited License Agreement), these moves have brought the “proprietary vs. OSS” debate in the Cloud back to the same parameters as for on-premise deployments. They have simply corrected the initial challenge of not even knowing how to license proprietary software for Cloud deployments.

But it’s not stopping here. What we are seeing is some of the parameters for the “proprietary vs. OSS” debate become more friendly towards proprietary software in the Cloud than on-premise. I am referring to the emergence of EC2 instances for which the software licenses are included in the per-hour rate for server instances. This pretty much removes license management as a concern altogether. The main example is Windows EC2 instances. As a user, you don’t have to deal with any Windows license consideration. You just request a machine and use it. Which of course doesn’t mean you are not paying for Windows. These Windows instances are more expensive than comparable Linux instances (e.g. $0.34 versus $0.48 per hour in the case of a “standard/large” instance) and Amazon takes care of paying Microsoft.

The removal of license management as a concern may not make a big difference to large corporations that have an Unlimited License Agreement, but for smaller companies it may take a chunk out of the reasons to use open source software. Not only do you not have to track license usage (and renewal), you never have to spend time with a sales rep. You don’t have to ask yourself at what point in your beta program you’ve moved from a legitimate use of the (often free) development license to a situation in which you need a production license. Including the software license directly in the cost of the base Cloud resource (e.g. the virtual machine instance) makes planning (and auto-scaling) easier: you use the same algorithm as in the “free license” situation, just with a different per hour cost. The trade-off becomes quantitative rather than qualitative. You can trade a given software stack against a faster CPU or more memory or more storage, depending on which combination serves your needs better. It doesn’t matter if the value you get from the instance comes from the software in the image or the hardware. This moves IaaS closer to PaaS (force.com doesn’t itemize your bill between hardware cost and software cost). Anything that makes IaaS more like PaaS is good for IaaS.

From an earlier post, about virtual appliances:

As with all things computer-related, the issue is going to get blurrier and then irrelevant . The great thing about software is that there is no solid line. In this case, we will eventually get more customized appliances (via appliance builders or model-driven appliance generation) blurring the line between installed software and appliance-based software.

I was referring to a blurring line in terms of how software is managed, but it’s also true in terms of how it is licensed.

There are of course many other reasons (than license management) why people use open source software rather than proprietary. The most obvious being the cost of the license (which, as we have seen, doesn’t go away but just gets incorporated in the base Cloud instance rate). Or they may simply prefer a given open source product independently of any licensing aspect. Some need access to the underlying code, to customize/improve it for their purpose. Or they may be leery of depending on one entity for the long-term viability of their platform. There may even be some who suspect any software that they don’t examine/compile themselves to contain backdoors (though these people are presumably not candidates for Cloud deployments in he first place). Etc. These reasons remain pretty much unchanged in the Cloud. But, anecdotally at least, removing license management concerns from both manual and automated tasks is a big improvement.

If Cloud providers get it right (which would require being smarter than wireless service providers, a low bar) and software vendors play ball, the “proprietary vs. OSS” debate may become more favorable to proprietary software in the Cloud than it is on-premise. For the benefit of customers, software vendors and Cloud providers. Hopefully Amazon will succeed where telcos mostly failed, in providing a convenient application metering/billing service for 3rd-party software offered on top of their infrastructural services (without otherwise getting in the way). Anybody remembers the Minitel? Today we’d call that a “Terminal as a Service” offering and if it did one thing well (beyond displaying green characters) it was billing. Which reminds me, I probably still have one in my parent’s basement.

[Note: every page of this blog mentions, at the bottom, that "the statements and opinions expressed here are my own and do not necessarily represent those of Oracle Corporation" but in this case I will repeat it here, in the body of the post.]

[UPDATED 2009/12/29: The Register seems to agree. In fact, they come close to paraphrasing this blog entry:

"It's proprietary applications offered by enterprise mainstays such as Oracle, IBM, and other big vendors that may turn out to be the big winners. The big vendors simply manipulated and corrected their licensing strategies to offer their applications in an on-demand or subscription manner.

Amazonian middlemen

AWS, for example, now offers EC2 instances for which the software licenses are included in the per-hour rate for server instances. This means that users who want to run Windows applications don't have to deal with dreaded Windows licensing - instead, they simply request a machine and use it while Amazon deals with paying Microsoft."]

[UPDATED 2010/1/25: I think this "Cloud as Monetization Strategy for Open Source" post by Geva Perry (based on an earlier post by Savio Rodrigues ) confirms that in the Cloud the line between open source and proprietary software is thinning.]

[UPDATED 2010/11/12: Related and interesting post on the AWS blog: Cloud Licensing Models That Exist Today]

Post

Oracle Real User Experience Insight 6.0

In Application Mgmt,Everything,IT Systems Mgmt,Oracle on December 1, 2009 by @vambenepe

Oracle just released version 6.0 of Real User Experience Insight (friends call it RUEI), which is part of the Enterprise Manager portfolio. As the name indicates, it captures and presents in great details the experience of actual users interacting with your application. This is real traffic, not synthetic probes. It’s is a mature product, which originally came from the Moniforce acquisition two years ago.

One way to classify the improvements in this version is to sort them based on who they are exciting for:

Exciting for techies

The ability to link in context from RUEI to diagnostic tools in Enterprise Manager. For example, going from a slow JSP in RUEI to a view of its role in the overall composite application. Or to a deep-dive in Java diagnostic.

Exciting for Oracle applications administrators

Many improvements (in the form of updated “Accelerators”) for using RUEI to manage Oracle EBS, PeopleSoft, Siebel and JD Edwards. Including EBS Forms support in socket mode without Chronos (those who know what this means rejoice, others can safely ignore).

Exciting for business and marketing people

The full capture and replay of user sessions. The ease of reproducing errors and seeing exactly what your users do and experience. Terrifyingly edifying at times.

Post

Can your hypervisor radio for air support?

In Application Mgmt,Azure,Cloud Computing,Everything,IT Systems Mgmt,Mgmt integration,Utility computing,Virtualization on November 25, 2009 by @vambenepe

As I was reading about Microsoft Azure recently, a military analogy came to my mind. Hypervisors are tanks. Application development and runtime platforms compose the air force.

Tanks (and more generally the mechanization of ground forces) transformed war in the 20th century. They multiplied the fighting capabilities of individuals and changed the way war was fought. A traditional army didn’t stand a chance against a mechanized one. More importantly, a mechanized army that used the new tools with the old mindset didn’t stand a chance against a similarly equipped army that had rethought its strategy to take advantage of the new capabilities. Consider France at the beginning of WWII, where tanks were just canons on wheels, spread evenly along the front line to support ground troops. Contrast this with how Germany, as part of the Blitzkrieg, used tanks and radios to create highly mobile – and yet coordinated – units that caused havoc in the linear French defense.

Exercise for the reader who wants to push the analogy further:

  • Describe how Blitzkrieg-style mobility of troops (based on tanks and motorized troop transports) compares to Live Migration of virtual machines.
  • Describe how the use of radios by these troops compares to the use of monitoring and control protocols to frame IT management actions.

Tanks (hypervisor) were a game-changer in a world of foot soldiers (dedicated servers).

But no matter how good your tanks are, you are at a disadvantage if the other party achieves air superiority. A less sophisticated/numerous ground force that benefits from strong air support is likely to prevail over a stronger ground force with no such support. That’s what came to my mind as I read about how Azure plans to cover the IaaS layer, but in the context of an application-and-data-centric approach. Where hypervisors are not left to fend for themselves based on the limited view of the horizion from the periscope of their turrets but rather orchestrated, supported (and even deployed) from the air, from the application platform.

C-130 tank airdrop

(Yes, I am referring to the Azure vision as it was presented at PDC09, not necessarily the currently available bits.)

Does your Cloud vendor/provider need an air force?

Exercise for the reader who wants to push the analogy to the stratosphere:

  • Describe how business logic/process, business transaction management and business intelligence are equivalent to satellites, surveying the battlefield and providing actionable intelligence.

The new Cloud stack (“military-cloud complex” version):

cloud-military-stack

[Note: I have no expertise in military history (or strategy) beyond high school classes about WWI and WWII, plus a couple of history books and a few war movies. My goal here is less to be accurate on military concerns (though I hope to be) than to draw an analogy which may be meaningful to fellow IT management geeks who share my level of (in)expertise in military matters. This is just yet another way in which I try to explain that, for Clouds as for plain old IT management, "it's the application, stupid".]

Post

Desirable technical characteristics of PaaS

In Application Mgmt,Cloud Computing,Everything,IT Systems Mgmt,Manageability,Mgmt integration,Middleware,PaaS,Utility computing on November 10, 2009 by @vambenepe

PaaS can most dramatically improve the IT experience in four areas:

  • Hosting/operations efficiency
  • Application-centric management
  • Development productivity
  • Security

To do so, there are technical characteristics that PaaS frameworks should eventually exhibit. These are not technical characteristics of a given PaaS container, they are shared characteristics that go across all container types, no matter what the operational capabilities of the containers are.

Here is a rough and unorganized list of the desirable characteristics (meta-capabilities) of PaaS Cloud containers:

  • An application component model that supports deployment/configuration across all PaaS container types.
  • Explicit interactions/invocations between application components (resilient connections between component: infrastructure-level retry/reroute)
  • Uniform and consistent request tracking across all components. Ability to intercept component-to-component communication.
  • Short-term (or externally persisted) state so that all instances can be quickly redirected out of any one node.
  • Subset of platform management interface exposed to consumer, along with out of the box application management. Application metrics consolidated at application level rather than node level.
  • Consistent, model-based application management interface across all container types. Hooks for component code to provide its manageability in the same framework.
  • Minimal footprint of any container node for limited patching requirements.
  • Assistance for debugging platform-hosted code (see this entry).
  • No encroachment of container technology on application contract (e.g. no forced URL structure).
  • Application uniformly scalable to the limit of the underlying hardware (no imposed partitioning).
  • Shared authentication / authorization / auditing across containers.
  • Minimum contract/interface exposed by each container.
  • Governance of application services, aligned (in model/protocols) with the container management interfaces.
  • [UPDATE: need to add metering+billing as William Louth pointed out in a comment]

This applies across the board to public, private and hybrid PaaS. The distinctions between these delivery models are real but at a different level. The important thing is that the PaaS administrator is different from the application administrator in all cases. On the other hand, most of these technical characteristics are not achievable for lower-level Cloud resources (like virtual hosts and low-level storage) which is why the IaaS form of Cloud leaves the Cloud promise only partially fulfilled.

Post

Enumeration of PaaS container types

In Application Mgmt,Cloud Computing,Everything,Middleware,PaaS,Utility computing on November 9, 2009 by @vambenepe

What do we need from an on-demand application platform for enterprise software? Here is a proposed classification of container types on which you can create/scale/manage applications. Your application is made of modules that run on these containers (e.g. one app may have two “synchronous web request” modules, one “structured data service” module and one “scheduled job” module). All tied together using an application component model that dovetails with the capabilities of the different containers. The proposed containers map to the following capabilities:

  • Synchronous web request processing
  • Long-lived (persisted) process execution with introspectable/declarative flow
  • Event processing
  • Persistence (a few different types: structured vs. buckets/files vs. object cache)
  • Batch/background/scheduled/queued processing
  • Possibly some advanced Web/mobile UI (portals, human task flows…)
  • User management (self-contained or as an abstraction layer for other systems)
  • [UPDATED 2009/11/19: I should have included pluggable protocol/format adapters.]

While new applications may be built to run purely on such a platform, you will not be able to run your IT solely on it anytime in the foreseeable future. So if you are thinking about it from the perspective of the entire universe of virtualization containers needed to support your IT system, you’ll need to add lower-level container types, such as:

  • Guest hosts (typically from hypervisors)
  • Low-level (e.g. block) storage
  • Networking between them

Another way to think about it is that Cloud/Utility computing is about having pools of resources dynamically allocated. There needs to be few enough pools that each pool is large enough and used across enough consumers to derive efficiencies from the act of sharing. But there needs to be enough pool types that applications have a complete infrastructure to run on that lets them abstracts away what is not business logic. This list is an attempt at this middle ground.

This is a classification of containers, not a description. There are many different ways to realize each of them. The synchronous  web request processing could look like a servlet engine, a Python handler or a PHP page. The persisted execution could be a BPEL engine or some other state machine definition/execution engine. The structured data interface could look like SQL, XQuery or SPARQL, etc… In addition, more specialized application infrastructure elements (e.g. video streaming, analytics) might enter the picture for some applications.

I hope to see the discussion move beyond “IaaS vs. PaaS” towards talking about more specific container types that are needed/supported by the different virtualization stacks. My application doesn’t care if the file container is considered IaaS or PaaS.

The next post will list desired characteristics of the PaaS environment (meta-capabilities that go across all the container types in the first list but may not be available from the lower-level containers in the second list).

Post

Would you like some management with that appliance?

In Application Mgmt,Automation,Desired State,Everything,IT Systems Mgmt,Manageability,Oracle,Oracle Open World,OVM,PaaS,Virtual appliance,WS-Management on November 2, 2009 by @vambenepe

Andi Mann recently wrote an interesting post about virtual appliances . He uses the domain name pleasediscuss.com for his blog so I figured I’d do just that. More specifically, I have three comments on his article.

Opaque or transparent appliance

Andi’s concerns about the security and management problems posed by virtual appliances are real, but he seems to assume that the content of the appliance is necessarily opaque to the customer and under the responsibility of the appliance provider. Why can’t a virtual appliance be transparent in the sense that the customer is able to efficiently manage at least some aspects of the software installed on it? “You can’t put agents on most virtual appliances, they don’t come with WMI, and most have only a GUI for management” says Andi. Why can’t an appliance come with an agent (especially in these days of consolidation where many vendors provide many layers of the stack – hypervisor / OS / application container / application / management tools – including their agent)? Why can’t it implement a standard management API (most servers nowadays implement WBEM, WS-Management and/or IPMI pre-boot – on the motherboard – which is a lot more challenging to do than supporting a similar protocol in a virtual appliance). Andi is really criticizing the current offering more than the virtual appliance model per se and in this I can join him.

Let me put it differently, since this is probably just a question of definition: what would Andi call a virtual appliance that does expose management APIs for its infrastructure (e.g. WS-Management for the OS, JMX for the java stack) or that comes with an agent (HP, IBM, BMC, Oracle…) installed on it?

Such an appliance (let’s call it a “transparent virtual appliance” for now) doesn’t provide all the commonly claimed benefits of an appliance (zero config/admin) but as Andi points out these benefits come with major intrinsic drawbacks. A transparent virtual appliance still drastically simplifies installation (especially useful for test/dev/demo/POC). It doesn’t entirely free you of monitoring and configuration but at least it provides you with a very consistent and controlled starting point, manageable from the start (no need to subsequently install an agent). In addition, it can be made “just enough” (just enough OS, just enough app server…) to require a lot less maintenance than an application stack that you assemble yourself out of generic parts. We’ll always have trade offs between how optimized/customized it is versus how uniform your overall environment can be, but I don’t see the use of an appliance as a delivery mechanism as necessarily cornering you into a completely opaque situation, from a management perspective.

Those who attended Oracle Open World a few weeks ago were treated to an example of such an appliance, if they attended any of the sessions that covered Oracle’s Appliance Builder (the main one was, I believe, Virtualizing Oracle Fusion Middleware in the Modern Data Center, in case you have access to the Open World On Demand replay and slides). I believe it’s probably the same content that @jayfry3 was shown when he tweeted about “Oracle is demoing their private cloud self-service app”. These appliances are not at all opaque from a management perspective. To the contrary, they are highly manageable, coming with an Enterprise Manager agent installed that can manage everything in the appliance (and when that “everything” doesn’t include the OS, it’s because there isn’t one thanks to JRockit Virtual Edition, making things slimmer, faster, safer and more manageable). And of course the OVM-based environment in which you deploy these appliances is also managed by Enterprise Manager. OK, my point here wasn’t to go into marketing mode, but this is cool stuff and an example of what virtual appliances should be. BTW, this was also demonstrated during Hasan Rizvi’s keynote at OpenWorld, including the management of these systems through Enterprise Manager.

In the long run it’s irrelevant

As with all things computer-related, the issue is going to get blurrier and then irrelevant . The great thing about software is that there is no solid line. In this case, we will eventually get more customized appliances (via appliance builders or model-driven appliance generation) blurring the line between installed software and appliance-based software.

Waiting for PaaS

Towards the end of his post, Andi paints an optimistic vision of the future: “I also think that virtual appliances have a bright future – but in some ways I continue to see them as a beta version of what could (or should) come next.  By adding in capabilities for responsible and accountable management, they could form the basis of more fully-functional virtual service management containers. These in turn could form the basis of elastic, mobile, network-deployed, responsible cloud appliances that deliver complete end-to-end service management without regard to physical location or domain of control.”

I mostly agree with this vision, though when I describe it it is in the guise of a PaaS platform. Where your appliance (which today goes from the OS all the way to the app) has shrunk to an application template that you deploy in the PaaS environment (rather than in a hypervisor). If/when the underlying PaaS environment has reached the right level of management automation you get all the benefits of an appliance while maintaining the consistency of your environment and its adherence to your management policies (because the environment is the PaaS platform and its management is driven from your policies).

[As is often the case, this started as a comment (on Andi's blog) and quickly outgrew that environment, leading to this new post. Plus, Andi's blog is brand new and seems to be well worth spreading the word about (Andi himself is under-marketing it).]

Post

OWL news you can use

In Application Mgmt,Everything,IT Systems Mgmt,Mgmt integration,Modeling,OWL,RDF,Semantic tech,Specs,Standards on October 28, 2009 by @vambenepe

The W3C released OWL 2 today. Most readers of this blog are IT management people (whether they call it “cloud computing” or “boring old system management”) and don’t follow RDF, OWL, SPARQL etc too closely (if at all). Yet there is a lot of potential value in using these technologies for IT management, so I thought it might be helpful to provide some practical resources on the topic. I have selected articles that cover the special (some may say “twisted”) approach of using OWL and its friends for validation rather than just inference, as this use case is very relevant to IT management.

Of course you can also go to the W3C standard itself, starting with the overview of OWL 2.

Just so you don’t feel lonely if you decide to explore this path, have a look at Elastra’s sexy technology stack. ECML, EDML and EMML are all defined as OWL ontologies.

Post

Yes you can read the OSGi specification

In Application Mgmt,Everything,Off-topic,OSGi on October 27, 2009 by @vambenepe

You know what I like the best about OSGi? That it doesn’t put the bar too high for architects. At first I was a bit intimidated by the size  (338 pages for the “core specification”, 862 pages for the “service compendium”) and the fact that I had to look up “compendium”. But then they put me right at ease:

“Architects should focus on the introduction of each subject. This introduction contains a general overview of the subject, the requirements that influenced its design, and a short description of its operation as well as the entities that are used. The introductory sections require knowledge of Java concepts like classes and interfaces, but should not require coding experience.”

I am like so totally overqualified for my job. Hell, I even know what packages are.

(from the recently released OSGi version 4.2.)

Post

Cloud platform patching conundrum: PaaS has it much worse than IaaS and SaaS

In Application Mgmt,Cloud Computing,Everything,Google App Engine,Governance,ITIL,Manageability,Mgmt integration,PaaS,SaaS,Utility computing,Virtualization on October 15, 2009 by @vambenepe

The potential user impact of changes (e.g. patches or config changes) made on the Cloud infrastructure (by the Cloud provider) is a sore point in the Cloud value proposition (see Hoff’s take for example). You have no control over patching/config actions taken by the provider, any of which could potentially affect you. In a traditional data center, you can test the various changes on specific applications; you don’t have to apply them at the same time on all servers; and you can even decide to skip some infrastructure patches not relevant to your application (“if it aint’ broken…”). Not so in a Cloud environment, where you may not even know about a change until after the fact. And you have no control over the timing and the roll-out of the patch, so that some of your instances may be running on patched nodes and others may not (good luck with troubleshooting that).

Unfortunately, this is even worse for PaaS than IaaS. Simply because you seat on a lot more infrastructure that is opaque to you. In a IaaS environment, the only thing that can change is the hardware (rarely a cause of problem) and the hypervisor (or equivalent Cloud OS). In a PaaS environment, it’s all that plus whatever flavor of OS and application container is used. Depending on how streamlined this all is (just enough OS/AS versus a traditional deployment), that’s potentially a lot of code and configuration. Troubleshooting is also somewhat easier in a IaaS setup because the error logs are localized (or localizable) to a specific instance. Not necessarily so with PaaS (and even if you could localize the error, you couldn’t guarantee that your troubleshooting test runs on the same node anyway).

In a way, PaaS is squeezed between IaaS and SaaS on this. IaaS gets away with a manageable problem because the opaque infrastructure is not too thick. For SaaS it’s manageable too because the consumer is typically either a human (who is a lot more resilient to change) or a very simple and well-understood interface (e.g. IMAP or some Web services). Contrast this with PaaS where the contract is that of an application container (e.g. JEE, RoR, Django).There are all kinds of subtle behaviors (e.g, timing/ordering issues) that are not part of the contract and can surface after a patch: for example, a bug in the application that was never found because before the patch things always happened in a certain order that the application implicitly – and erroneously – relied on. That’s exactly why you always test your key applications today even if the OS/AS patch should, in theory, not change anything for the application. And it’s not just patches that can do that. For example, network upgrades can introduce timing changes that surface new issues in the application.

And it goes both ways. Just like you can be hurt by the Cloud provider patching things, you can be hurt by them not patching things. What if there is an obscure bug in their infrastructure that only affects your application. First you have to convince them to troubleshoot with you. Then you have to convince them to produce (or get their software vendor to produce) and deploy a patch.

So what are the solutions? Is PaaS doomed to never go beyond hobbyists? Of course not. The possible solutions are:

  • Write a bug-free and high-performance PaaS infrastructure from the start, one that never needs to be changed in any way. How hard could it be? ;-)
  • More realistically, narrowly define container types to reduce both the contract and the size of the underlying implementation of each instance. For example, rather than deploying a full JEE+SOA container componentize the application so that each component can deploy in a small container (e.g. a servlet engine, a process management engine, a rule engine, etc). As a result, the interface exposed by each container type can be more easily and fully tested. And because each instance is slimmer, it requires fewer patches over time.
  • PaaS providers may give their users some amount of visibility and control over this. For example, by announcing upgrades ahead of time, providing updated nodes to test on early and allowing users to specify “freeze” periods where nothing changes (unless an urgent security patch is needed, presumably). Time for a Cloud “refresh” in ITIL/ITSM-land?
  • The PaaS providers may also be able to facilitate debugging of infrastructure-related problem. For example by stamping the logs with a version ID for the infrastructure on the node that generated the log entry. And the ability to request that a test runs on a node with the same version. Keeping in mind that in a SOA / Composite world, the root cause of a problem found on one node may be a configuration change on a different node…

Some closing notes:

  • Another incarnation of this problem is likely to show up in the form of PaaS certification. We should not assume that just because you use a PaaS you are the developer of the application. Why can’t I license an ISV app that runs on GAE? But then, what does the ISV certify against? A given PaaS provider, e.g. Google? A given version of the PaaS infrastructure (if there is such a thing… Google advertises versions of the GAE SDK, but not of the actual GAE runtime)? Or maybe a given PaaS software stack, e.g. the Oracle/Microsoft/IBM/VMWare/JBoss/etc, meaning that any Cloud provider who uses this software stack is certified?
  • I have only discussed here changes to the underlying platform that do not change the contract (or at least only introduce backward-compatible changes, i.e. add APIs but don’t remove any). The matter of non-compatible platform updates (and version coexistence) is also a whole other ball of wax, one that comes with echoes of SOA governance discussions (because in PaaS we are talking about pure software contracts, not hardware or hardware-like contracts). Another area in which PaaS has larger challenges than IaaS.
  • Finally, for an illustration of how a highly focused and specialized container cuts down on the need for config changes, look at this photo from earlier today during the presentation of JRockit Virtual Edition at Oracle Open World. This slide shows (in font size 3, don’t worry you’re not supposed to be able to read), the list of configuration files present on a normal Linux instance, versus a stripped-down (“JeOS”) Linux, versus JRockit VE.


By the way, JRockit VE is very interesting and the environment today is much more favorable than when BEA first did it, but that’s a topic for another post.

[UPDATED 2009/10/22: For more on this (in an EC2-centric context) see section 4 ("service problem resolution") of this IBM paper. It ends with "another possible direction is to develop new mechanisms or APIs to enable cloud users to directly and automatically query and correlate application level events with lower level hardware information to better identify the root cause of the problem".]

Post

The future (2006 version), has arrived

In Application Mgmt,Automation,Cloud Computing,CMDB,CMDB Federation,CMDBf,Desired State,Everything,IT Systems Mgmt,Manageability,Mgmt integration,Modeling,Protocols,SML,Specs,Standards,Utility computing,WS-Management on September 24, 2009 by @vambenepe

Remember 2006? Things were starting to fall into place for IT management integration and automation:

  • SDD was already on its way to cleanly describe/package/manage the lifecycle of simple and composite applications alike,
  • the first version of SML came out to capture all the relevant constraints of complex and composite systems and open the door to “desired-state management”,
  • the CMDBf effort was started to seamlessly integrate all sources of configuration and provide a bird-eye view of your entire IT infrastructure, and
  • the WSDM/WS-Management convergence/reconciliation was announced and promised to free management consoles from supporting many resource discovery, collection and control mechanisms and from having platform/library dependencies between the manager and its targets.

It looked like we were a year or two from standardization on all these and another year or two from shipping implementations. Things were looking good.

Good news: the schedule was respected. SDD, SML and CMDBf are now all standards (at OASIS, W3C and DMTF respectively). And today the Eclipse COSMOS project announced the release of COSMOS 1.1 which implements them all. The WSDM/WS-Management convergence is the only one that didn’t quite go according to the plan but it is about to come out as a standard too (in a pared-down form).

Bad news: nobody cares. We’ve moved on to “private clouds”.

Having been involved with these specifications in various degrees (a little bit on SDD, a fair amount on SML and a lot on CMDBf and WSDM/WS-Management) I am not as detached as my sarcastic tone may suggest. But as they say in action movies, “don’t let sentiments get in the way of the mission”.

There is still a chance to reuse parts of this stack (e.g. the CMDBf query language) and there are lessons to learn from our errors. The over-promising, the technical misjudgments, the political bickering, the lack of concrete customer validation, etc. To some extent this work was also victim of collateral damages from the excesses of WS-* (I am looking at you WS-Addressing). We also failed to notice the rise of the hypervisor in our peripheral vision.

I tried to capture some important lessons in this post-mortem. For the edification of the cloud generation. I also see a pendulum in action. Where we over-engineered I now see some under-engineering (overly granular interaction models, overemphasis on the virtual machine as the unit of everything, simplistic constraint models, underestimation of config/patching issues…). Things will come around and may eventually look familiar (suggested exercise: compare PubSubHubBub with WS-Notification).

As long as each iteration gets us closer to the goal things are good.

See you in 2012. Same place, same day, same time.

Post

Thoughts on the “Simple Cloud API”

In Application Mgmt,Cloud Computing,Everything,IBM,Manageability,Mgmt integration,Microsoft,Middleware,Open source,Portability,Utility computing on September 22, 2009 by @vambenepe

PHP developers with Cloud aspirations rejoice! Zend has announced a PHP toolkit (called the Simple Cloud API project) to abstract and access application-level Cloud services. This is not just YACA (yet another Cloud API), as there are interesting differences between this and all the other Cloud toolkits out there.

First it’s PHP, which was not covered by the existing toolkits. Considering how many web applications are written in PHP (including the one that serves this very blog) this may seem strange, until you realize that most Cloud toolkits out there are focused on provisioning/managing low-level compute resources of the IaaS kind. Something that is far out of PHP’s sweetspot and much more practically handled with Java, Python, Ruby or some .NET language accessible via PowerShell.

Which takes us to the second, and arguably most interesting, characteristic of this toolkit: it is focused on application-level Cloud services (files, documents and queues for now) rather than infrastructure-level. In other word, it’s the first (to my knowledge) PaaS toolkit.

I also notice that Zend has gotten endorsements from IBM, Microsoft, Nirvanix, Rackspace and GoGrid. The first two especially seem to have impressed InfoWorld. Let’s keep in mind that at this point all we are talking about are canned quotes in a press release. Which rank only above politician campaign promises as predictor of behavior. In any case that can’t be the full extent of IBM and Microsoft’s response to the VMWare/Cisco push on IaaS standards. But it may suggest that their response will move the battlefield to include PaaS, which would be a smart move.

Now for a few more acerbic comments:

  • It has “simple” in its name, like SOAP (as Pete Lacey famously lampooned). In the long term this tends to negatively correlate with simplicity, just like the presence of “democratic” in the official name of a country does not bode well for actual democracy.
  • Please, don’t shorten “Simple Cloud API” to SCA which is already claimed in a (potentially) closely related field.
  • Reuven Cohen is technically correct to see it as “a way to create other higher level programmatic API interfaces such as REST or SOAP using an easy, yet portable PHP programming environment”. But pay attention to how many turtles are on this pile: the native provider API, the adapter to the “simple cloud API”, the SOAP or REST remote API and the consuming application’s native API. How much real isolation are you getting when you build your house on such a wobbly foundation

[UPDATE: Comments from someone in the know:  a programmer working on adding Azure support for this Simple Cloud API project.]

Post

Look Ma, no hypervisor!

In Application Mgmt,Cloud Computing,Everything,Google App Engine,Implementation,IT Systems Mgmt,Middleware,Utility computing,Virtualization,VMware,XenSource on September 21, 2009 by @vambenepe

Encouraged by hypervisor vendors, the confusion between virtualization and Cloud Computing is rampant. In the industry, the term “virtualization” (and its corollary, “virtual machine”) is used in so many different ways that it has lost all usefulness. For a recent example, read the introduction of this SNIA/OGF white paper (on Cloud Storage) which asserts that “the new technology underlying this is the system virtual machine that allows multiple instances of an operating system and associated applications to run on single physical machine. Delivering this over the network, on demand, is termed Infrastructure as a Service (IaaS)”.

In fact, even IaaS-type Cloud services don’t imply the use of hypervisors.

We need to decouple the Cloud interface/contract (e.g. “what are the types of resources that can I provision on demand? hosts, app servers, storage capacity, app services…”) from the underlying implementation (e.g. “are hypervisors used by the Cloud provider?”). At the risk of spelling out things that may be obvious to many readers of this blog, here is a simplified matrix of Cloud Computing systems, designed to illustrate that all combinations of interface and implementation are possible and in many cases even reasonable.

IaaS interface PaaS interface
Hypervisor used Yes! (see #1) Yes! (see #2)
Hypervisor not used Yes! (see #3) Yes! (see #4)

#1: IaaS interface, hypervisor-based implementation

This is a very common approach these days, both in public Clouds (EC2, Rackspace and presumably at some point the VMWare vCloud Express service providers) and private Clouds (Citrix, Sun, Oracle, Eucalyptus, VMWare…). Basically, you take a bunch of servers, put hypervisors on all of them and make VMs running on these hypervisors available to the Cloud customers.

But despite its predominance, this is not the only path to a Cloud, not even to an IaaS (e.g. “x86 hosts on demand”) Cloud. The following three other scenarios are all valid too.

#2: PaaS interface, hypervisor-based implementation

This is the road SpringSource has been on, first with Cloud Foundry (using AWS EC2 which is based on the Xen hypervisor) and presumably soon on top of VMWare.

#3: IaaS interface, no hypervisor in the implementation

Let’s remember that the utility computing vision (before the term fell in desuetude in favor of “cloud”) has been around before x86 hypervisors were so common. Take Loudcloud as an illustration. They were building what is now called a “public Cloud” starting back in 1999 and not using any hypervisor. Just bare metal provisioning and advanced provisioning automation software. Then they sold the hosting part to EDS (now HP) and only kept the software, under the name Opsware (now HP too, incidentally). That software was meant to create what we now call a “private Cloud”. See this old DCML announcement as one example of the Opsware vision. And no hypervisor was harmed in the making of this movie.

At the current point in time, the hardware (e.g. multiple cores, shared memory) and software (hypervisors, legacy apps) environment is such that hypervisor-based solutions seem to have an edge over those based on automated provisioning/configuration alone. But these things tend to change quickly in our industry… Especially if you factor in non-technical considerations like compliance, fear of data leakage and the risk of having the hardware underlying your application seized because of an investigation involving another tenant…

And this is not going into finner techno-philosophical points about the different types of hypervisors. Not to mention mainframe LPARs… One could build a hypervisor-free IaaS solution on these.

To some extent, you may even put the “pwned” machines (in a botnet) in this “IaaS with no hypervisor” category (with the small difference that what’s being made available is an x86 with an OS, typically Windows, already installed). If you factor out externalities (like the FBI breaking down your front door at 6:00AM) this approach has claims as the most cost-effective form of Cloud computing available today… Solaris zones are another example of possible foundation for a hypervisor-free IaaS-like offering (here too, with an OS rather than a “raw host” as the interface).

#4: PaaS interface, no hypervisor in the implementation

In the public sphere, this corresponds to Google App Engine.

In the private sphere, several companies have built it themselves on top of WebLogic, by adding some level of “on-demand” application provisioning in order to streamline the relationship between the IT group running the servers and the business groups who want to deploy applications on them. Something that one should ideally be able to buy rather than build.

Waiting for the question to become irrelevant

Like most deeply-ingrained confusions, the conflation of virtualization and Cloud Computing won’t be dispelled as much as made irrelevant. The four categories enumerated in this post are a point-in-time view of a continuously evolving system. What may start today as a bundle of a hypervisor, an OS and an app server may become a somewhat monolithic “PaaS engine” over time as the components are more tightly integrated. That “engine” may have memory isolation mechanisms that look a lot like a hypervisor. But it may not be able to host a generic OS. In the same way that whales don’t have fingers and toes and yet they are still very much apparent in their skeleton.

[UPDATED 2009/10/8: A real-life example of #3! On-demand servers via bare metal provisioning (via Sam). No hypervisor in the picture. See also here.]

[UPDATED 2009/12/29: Another non-hypervisor Cloud provider! NewServers. Here is their API. And a Q&A.]

Post

VMWare publishes (and submits) vCloud API

In Application Mgmt,Automation,Cloud Computing,DMTF,Everything,IT Systems Mgmt,Mgmt integration,Protocols,REST,Specs,Standards,Utility computing,Virtualization,VMware on September 2, 2009 by @vambenepe

VMWare published its vCloud API yesterday (it was previously only available to a few partners) and submitted it to the DMTF, as had been previously announced. So much for my speculations involving IBM.

It may be time to update the Cloud API comparison. After a very quick first pass, vCloud looks quite similar to the Sun Cloud API (that’s a compliment). For example, they both handle long-lived operations via a “202 Accepted” complemented by a resource that represents the progress (“status” for Sun, “task” for vCloud). A very visible (but not critical) difference is the use of JSON (Sun) versus XML (vCloud).

As expected, OVF/OVA is central to vCloud. More once I have read the whole specification.

In any case, things are going to get interesting in the DMTF Cloud incubator. I there a path to adoption?Assuming that Amazon keeps sitting it out, what will the other Cloud vendors with an API (Rackspace, GoGrid, Sun…) do? I doubt they ever had plans/aspirations to own or even drive the standard, but how much are they willing to let VMWare do it? How much does Citrix/Xen want to steer standards versus simply implement them in the context of the Xen Cloud project? What about OGF/OCCI with which the DMTF is supposedly collaborating?How much support is VMWare going to receive from its service provider partners? How much traction does VMWare have with Cisco, HP (server division) and IBM on this? What are the plans at Oracle and Microsoft? Speaking of Microsoft, maybe it will at some point want its standard strategy playbook back. At least when VMWare is done using it.

Post

Thoughts on VMWare, SpringSource and PaaS

In Application Mgmt,Cloud Computing,Everything,Middleware,Spring,Utility computing,Virtualization,VMware on August 26, 2009 by @vambenepe

I am late to the party for  commenting on the upstream and downstream acquisitions involving SpringSource. I was away on vacations, but Rod Johnson obviously didn’t have too many holiday plans of his own in August.

First came the acquisition by VMWare. Then the acquisition of Cloud Foundry and the launch of its SpringSource reincarnation.

You’ve all read a lot about this already, so I’ll limit myself to a few bullet-point comments.

  • I was wrong to think at the time of the Hyperic acquisition that SpringSource would focus on app-centric management, BTM and transaction tracing more than Cloud computing and automation.
  • This move by VMWare helps me make some of the points I have been trying to make internally about Cloud computing.
  • This is a step in the progress from “fake machines” to true “virtual machines” (note to self: I may have to stop referring to “fake machines” as “VMWare-style virtual machines”).
  • Savio Rodrigues makes some interesting points, especially on the difference between a framework and a runtime.
  • Many people have hypervisors, a management console and middleware bits. If you are industry-darlings VMWare/SpringSource people seem more willing to assume that you can put them all in a bag, shake it and out comes a PaaS platform than if you are boring old Oracle, Microsoft or IBM. Fine. But let’s see how the (very real) potential gets delivered. Kudos to Adrian Colyer for taking a shot at describing it in a reasonable way, though there is still a fair amount of hand-waving… and already a drift towards the “I don’t need no cluster, I have a hypervisor and everything is a VM” reflex.
  • The “what does it mean for RedHat” angle seems to miss the point to me and be a byproduct of over-focusing on the “open source” aspect which is not all that relevant. This is more about Oracle, Microsoft and IBM than RedHat in my view.
  • Won’t it be fun when Cisco, VMWare and BMC are all one company and little SpringSource calls the shots from within (I have seen this happen more than once during my days at HP Software)?
  • I have no opinion on the question of whether VMWare over-paid or not. I’ll tell you in two years… :-)

Post

REST in practice for IT and Cloud management (part 2: configuration management)

In Application Mgmt,Automation,Cloud Computing,CMDB,CMDB Federation,CMDBf,DMTF,Everything,IT Systems Mgmt,Mashup,Mgmt integration,Modeling,REST,SOAP,SOAP header,Specs,Standards,Utility computing on July 28, 2009 by @vambenepe

What benefits does REST provide for configuration management (in traditional data centers and in Clouds)?

Part 1 of the “REST in practice for IT and Cloud management” investigation looked at Cloud APIs from leading IaaS providers. It examined how RESTful they are and what concrete benefits derive from their RESTfulness. In part 2 we will now look at the configuration management domain. Even though it’s less trendy, it is just as useful, if not more, in understanding the practical value of REST for IT management. Plus, as long as Cloud deployments are mainly of the IaaS kind, you are still left with the problem of managing the configuration of everything that runs of top the virtual machines (OS, middleware, DB, applications…). Or, if you are a glass-half-full person, here is another way to look at it: the great thing about IaaS (and host virtualization in general) is that you can choose to keep your existing infrastructure, applications and management tools (including configuration management) largely unchanged.

At first blush, REST is ideally suited to configuration management.

The RESTful Cloud APIs have no problem retrieving resource descriptions, but they seem somewhat hesitant in the way they deal with resource-specific actions. Tim Bray described one of the challenges in his well-considered Slow REST post. And indeed, applying REST to these “do something that may take some time and not result exactly in what was requested” scenarios is a lot less straightforward than when you’re just doing document/data retrieval. In contrast you’d think that applying REST to the task of retrieving configuration data from a CMDB or other configuration store would be a no-brainer. Especially in the IT management world, where we already have explicit resource models and a rich set of relationships defined. Let’s give each resource a URI that responds to HTTP GET requests, let’s turn the associations into hyperlinks in the resource presentation, let’s mint a MIME type to represent this format and we are out of the office in time for a 4:00PM tennis game when all the courts are available (hopefully our tennis partners are as bright as us and can get out early too). This “work smarter not harder” approach would allow us to present this list of benefits in our weekly progress report:

-1- A URI-based scheme makes the protocol independent of the resource topology, unlike today’s data stores that usually struggle to represent relationships between stores.

-2- It is simpler to code against than CIM-over-HTTP or WS-Management. It is cross-platform, unlike WMI or JMX.

-3- It makes it trivial to browse the configuration data from a Web browser (the resources themselves could provide an HTML representation based on content-type negotiation, or a simple transformation could generate it for the Web browser).

-4- You get REST-induced caching and scalability.

In the shower after the tennis game, it becomes apparent that benefit #4 is largely irrelevant for IT management use cases. That the browser in #3 would not be all that useful beyond simple use cases. That #2 is good for karma but developers will demand a library that hides this benefit anyway. And that the boss is going to say that he doesn’t care about #1either because his product is “the single source of truth” so it needs to import from the other configuration store, not reference them.

Even if we ignore the boss (once again) it only  leaves #1 as a practical benefit. Surprise, that’s also the aspect that came out on top of the analysis in part 1 (see “the API doesn’t constrain the design of the URI space” highlight, reinforced by Mark’s excellent comment on the role of hypertext). Clearly, there is something useful for IT management in this “hypermedia” thing. This will largely be the topic of part 3.

There are also quite a few things that this RESTification of the configuration management store doesn’t solve:

-1- The ability to query: “show me all the WebLogic instances that run on a Windows host and don’t have patch xyz applied”. You don’t have much of a CMDB if you can’t answer this. For an analogy, remember (or imagine) a pre-1995 Web with no search engine, where you can only navigate by starting from your browser home page and clicking through static links step by step, or through bookmarks.

-2- The ability to retrieve the configuration change history and to compare configurations across resources (or to a reference configuration).

This is not to say that these two features cannot be built on top of a RESTful IT resource model. Just that they are the real meat of configuration management (rather than a simple resource-by-resource configuration browser) and that your brilliant re-architecture hasn’t really helped in addressing them. Does a RESTful foundation make these features harder to build? Not necessarily, but there are some tricky aspects to take care of:

-1- In hypermedia systems, the links are usually part of the resource representation, not resources of their own. In IT management, relationships/associations can have their own lifecycle and configuration properties.

-2- Be careful that you can really maintain the address of a resource. It’s one thing to make sure that a UUID gets maintained as a resource configuration changes, it’s another to ensure that a dereferenceable URI remains unchanged. For example, the admin server of a cluster may move over time from one node to another.

More fundamentally, the ability to deal with multiple resources at the same time and/or to use the model at different levels of granularity is often a challenge. Either you make your protocol more complex to account for this or your pollute your resource model (with a bunch of arbitrary “groups”, implicit or explicit).

We saw this in the Cloud APIs too. It typically goes something like this: you can address an individual server (called “foo”) by sending requests to http://Cloudprovider.com/server/foo. Drop the “foo” part of the URL and now you can address all the servers, for example to retrieve their configuration or possibly to reboot them. This gives me a way of dealing with multiple resources at time, but only along the lines pre-defined by the API. What if I want to deal only with the servers that host nodes of a given cluster. Sorry, not possible. What if the servers have different hosts in their URIs (remember, “the API doesn’t constrain the design of the URI space”)? Oops.

WS-Management, in the SOAP world, takes this one step further with Selectors, through which you can embed some kind of query, the result of which is what you are addressing in your message. Or, if all you want to do is GET, you can model you entire datacenter as one giant virtual XML doc (a document which is never assembled in practice) and use WSRF/WSDM’s “QueryExpression” or WS-Management’s “FragmentTransfer” to the same effect. BTW, I have issues with the details of how these mechanisms work (and I have described an alternative under the motto “if you are going to suffer with WS-Addressing, at least get some value out of it”).

These are all non-RESTful atrocities to a RESTafarian, but in my mind the Cloud REST API reviewed in part 1 have open Pandora’s box by allowing less-qualified URIs to address all instances of a class. I expect you’ll soon see more precise query parameters in these URIs and they’ll look a lot like WS-Management Selectors (e.g. http://Cloudprovider.com/server?OS=Linux&CPUType=X86). Want to take bets about when a Cloud API URI format with an embedded regex first arrives?

When you need this, my gut feeling is that you are better off not worrying too much about trying to look RESTful. There is no shame to using an RPC pattern in the right circumstances. Don’t be the stupid skier who ends up crashing in a tree because he is just too cool for the using snowplow position.

One of the most common reasons to deal with multiple resources together is to run queries such as the “show me all the WebLogic instances that run on a Windows host and don’t have patch xyz applied” example above. Such a query mechanism recently became a DMTF standard, it’s called CMDBf. It is SOAP-based and doesn’t attempt to have anything to do with REST. Not that it didn’t cross the mind of a bunch of people, lead by Michael Coté when CMDBf first emerged (read the comments too). But as James Governor rightly predicted in the first comment, Coté heard “dick” from us on this (I represented HP in CMDBf and ended up being an editor of the specification, focusing on the “query” part). I don’t remember reading the entry back then but I must have since I have been a long time Coté fan. I must have dismissed the idea so quickly that it didn’t even register with my memory. Well, it’s 2009 now, CMDBf v1 is a DMTF standard and guess what? I, and many other SOAP-the-world-till-it-shines alumni, are looking a lot more seriously into what’s in this REST thing (thus this series of posts for me). BTW in this piece Coté also correctly predicted that CMDBf would be “more about CMDB interoperation than federation” but that didn’t take as much foresight (it was pretty obvious to me from the start).

Frankly I am still not sure that there is much benefit from REST in what CMDBf does, which is mostly a query interface. Yes the CMDBf query and its response go over SOAP. Yes in this case SOAP is mostly a useless wrapper since none of the implementations will likely support any WS-* SOAP header (other than paying the WS-Addressing tax). Sure we could remove it and send plain XML over HTTP. Or replace the SOAP wrapper with an Atom wrapper. Would it be anymore RESTful? Not one bit.

And I don’t see how to make it more RESTful. There are plenty of things in the periphery the query operation that can be made RESTful, along the lines of what I described above. REST could make the discovery/reconciliation tasks of the CMDB more efficient. The CMDBf query result format could be improved so that from the returned elements I can navigate my way among resources by following hyperlinks. But the query operation itself looks fundamentally RPCish to me, just like my interaction with the Google search page is really an RPC call that happens to return a Web page full of hyperlinks. In a way, this query (whether Google or CMDBf) can at best be the transition point from RPC to REST. It can return results that open a world of RESTful requests to you, but the query invocation itself is not RESTful. And that’s OK.

In part 3 (now available), I will try to synthesize the lessons from the Cloud APIs (part 1) and configuration management (this post) and extract specific guidance to get the best of what REST has to offer in future IT management protocols. Just so you can plan ahead, in part 4 I will reform the US health care system and in part 5 I will provide a practical roadmap for global nuclear disarmament. Suggestions for part 6 are accepted.

Post

Cloud catalog catalyst or cloud catalog cataclysm?

In Application Mgmt,Automation,Business,Cloud Computing,Everything,Governance,Manageability,Mgmt integration,Portability,Specs,Utility computing on July 27, 2009 by @vambenepe

Like librarians, we IT wonks tend to like things cataloged. To date, the last instance of this has been SOA governance and its various registries and repositories. With UDDI limping along as some kind of organizing standard for the effort. One issue I have with UDDI  is that its technical awkwardness is preventing us from learning from its failure to realize its ambitious goals (“e-business heaven”). It would be too easy to attribute the UDDI disappointment to UDDI. Rather, it should be laid at the feet of unreasonable initial expectations.

The SOA governance saga is still ongoing, now away from the spotlight and mostly from an implementation perspective rather than a standard perspective (by the way, what’s up with GIF?). Instead, the spotlight has turned to Cloud computing and that’s what we are supposedly going to control through cataloging next.

Earlier this year, I commented on the release of an ITSM catalog product for Cloud computing (though I was addressing the convergence of ITSM and Cloud computing more than catalogs per se).

More recently, Lori MacVittie related SOA governance to the need for Cloud catalogs. She makes some good points, but I also see some familiar-looking “irrational exuberance”. The idea of dynamically discovering and invoking a Cloud service reminds me too much of the initial “yellow pages” scenarios for UDDI (which quickly got dropped in favor of a more modest internal governance focus).

I am not convinced by the reason Lori gives for why things are different this time around (“one of the interesting things virtualization brings to the table that SOA did not is the ability to abstract management of services”). She argues that SOA governance only gave you access to the operational WSDL of a Web service, while Cloud catalogs will give you access to their management API. But if your service is an IT service, then your so-called management API (launch/configure/control VMs) is really its operational interface. The real management interface is the one Amazon uses under the cover and they are not going to expose it to you anymore than your bank is going to expose its application server administration console to you (if they do, move your money somewhere else before someone does it for you).

After all, isn’t SOA governance pretty close to a SaaS catalog which is itself a small part of the overall Cloud (IaaS+PaaS+SaaS) catalog question? If we still haven’t succeeded in the smaller scope, what are the odds of striking gold quickly in the larger effort?

Some analysts take a more pragmatic view, involving active brokers rather than simply a new DNS record type. I am doubtful about these brokers (0.2 probability, as Gartner would put it) but at least this moves the question onto business terms (leverage, control) rather than technical terms. Which is where the battle will be fought.

When it comes to Cloud catalogs, I think they are needed (if only for the categorization of Cloud services that they require) but will only play a supporting role, if any, in any move towards dynamic Cloud provisioning. As with SOA governance it’s as an internal tool, supported by strong processes, that they will be most useful.

Throughout human history, catalogs have been substitutes for control more often than instruments of control. Think of astronomy, zoology and… nephology for example. What kind will IT Cloud catalogs be?

Post

A small step for SCA, a giant leap for BSM

In Application Mgmt,BPEL,BSM,Business,Business Process,Everything,IT Systems Mgmt,Mgmt integration,Middleware,Modeling,Oracle,SCA,Standards on July 24, 2009 by @vambenepe

In a very short post, Khanderao Kand describes how configuration properties for BPEL processes in Oracle SOA Suite 11G are attached to SCA components. Here is the example he provides:

<component name="myBPELServiecComponent">
  ...
  <property name="bpel.config.inMemoryOptimization">true</property>
</component>

It doesn’t look like much. But it’s an major step for application-driven IT management (and eventually BSM).

Take a SCA component. Follow the SCA-defined component-to-composite and service-to-reference relationships upwards and eventually you’ll get to top level application services that have a decent chance of mapping well to business-relevant activities (e.g. order processing). Which means that the metrics of these services (e.g. availability, response time) are likely to be meaningful and important to the line of business. Follow the same SCA relationships downward and you’ll end up (in a SCA-based infrastructure like Oracle SOA Suite 11G), with target components that are meaningful to the IT administrator. Which means that their metrics and configuration settings (like “inMemoryOptimization”) are tracked and controlled by IT. You now have a direct string of connections between this configuration setting and a business relevant metric. You can navigate the connection in both directions: downward/reactive (“my service just went down, what changed in the infrastructure”) versus upward/proactive (“my service is always slow, what can I do to optimize the execution”).

Of course these examples are over-simplistic (and the title of this post is a bit too lyrical, on account of this). Following these SCA relationships in brute-force fashion will yield tens of thousands of low-level configuration settings for any top-level service, with widely differing importance and impact (not to mention that they interact). You need rules to make sense of this. Plus, configuration-based models are a complement to runtime transaction discovery, not a replacement (unless your model of the application includes every single line of code). But it’s not that often that you can see a missing link snap into place that clearly.

What this shows is the emergence of a common set of entities between the developer’s model and the IT admin model. And if the application was developed correctly, some of the entities in the developer’s model correspond to entities in the mental model of the application user and the line of business manager. SCA is the skeleton for this. Attaching configuration to SCA components puts muscle on the bone.

The road to BSM is paved with small improvements in the semantic alignment between IT infrastructure and application services. A couple of years ago, I tried to explain why SCA is very relevant for IT management. Now we can see it.

Post

With M (Oslo), is Microsoft on the path to reinventing RDF?

In Application Mgmt,Articles,Everything,Graph query,Mgmt integration,Microsoft,Modeling,RDF,Semantic tech on June 12, 2009 by @vambenepe

I have given up, at least for now, on understanding what Microsoft wants Oslo (and more specifically the “M” part) to be. I used to pull my hair reading inconsistent articles and interviews about what M tries to be (graphical programming! DSL! IT models! generic parser! application components! workflow! SOA framework! generic data layer! SQL/T-SQL for dummies! JSON replacement! all of the above!). Douglas Purdy makes a valiant 4-part effort (1, 2, 3, 4) but it’s still not crisp enough for my small brain. Even David Chapell, explainer extraordinaire, seems to throw up his hands (“a modeling platform that can be applied in lots of different ways”, which BTW is the most exact, if vague, description I’ve heard). Rather than articles, I now mainly look at the base specifications and technical documents that show what it actually is. That’s what I did  when the Oslo SDK first came out last year. A new technical document came out recently, an update to the MGraph Object Model so I took another a look.

And it turns out that MGraph is… RDF. Or rather, “RDF minus entailment”. And with turtle as the base representation rather than an add-on.

Look at section 3 (“RDF concepts”) in this table of content from W3C. It describes the core RDF concepts. Keep the first five concepts (sections 3.1 to 3.5) and drop the last one (“3.6: Entailment”). You have MGraph, a graph-oriented object model.

On top of this, the RDF community adds reasoning capabilities with RDF entailment, RDFS, OWL, SWRL, SPIN, etc and a variety of engines that implement these different levels of reasoning.

Microsoft, on the other hand, seems to ignore that direction. Instead, it focuses on creating a good mapping from this graph object model to programming languages. In two directions:

  • from programming languages to the graph model: they make it easy for you to create a domain-specific language (DSL) that can easily be turned into M instances.
  • from the graph model to programming languages: they make it easy for you to work on these M instances (including storing them) using the .NET technology stack.

So, if Microsoft is indeed reinventing RDF as the title of this entry provocatively suggests, then they are taking an interesting detour on the way. Rather than going straight to “model-based inferencing”, they are first focusing on mapping the core MGraph concepts to programming (by regular developers) and user interactions (with regular users). Something that for a long time had not gotten much attention in the RDF world beyond pointing developers to Jena (though it seemed to have improved over the last few years with companies like TopQuadrant; ironically, the Oslo model browser/editor is code-named “Quadrant”).

Whether the Oslo team sees the inferencing fun as a later addition or something that’s not needed is another question, on which I don’t see any hint at this point.

I hope they eventually get to it. But I like the fact that they cleanly separate the ability to represent and manipulate the graph model from the question of whether instances can be inferred. We could use such a reusable graph representation mechanism. Did CMDBf, for example, really have to create a new graph-oriented metamodel and query language? I failed to convince the group to adopt RDF/SPARQL, but I may have been more successful if there had been a cleanly-separated “static” version of RDF/SPARQL, a way to represent and query a graph independently of whether the edges and nodes in the graph (and their types) are declared or inferred. Instead, the RDF stack has entailment deeply embedded and that’s very scary to many.

But as much as I like this separation, I can’t help squirming when I see the first example in the MGraph document:

// Populate a small village with some people
Villagers => {
  Jenn => Person { Name => 'Jennifer', Age => 28, Spouse => Rich },
  Rich => Person { Name => 'Richard', Age => 26, Spouse => Jenn },
  Charly => Person { Name => 'Charlotte', Age => 12 }
},
HaveSpouses => { Villagers.Rich, Villagers.Jenn }

That last line is an eyesore to anyone who has been anywhere near RDF. I have just declared that Rich and Jenn are one another’s spouse, why do I have to add a line that says that they have spouses? What I want is to say that participation in a “Spouse” relationship entails membership in the “hasSpouse” class. And BTW, I also want to mark the “Spouse” relationship as symmetric so I only have to declare it one way and the inverse can be inferred.

So maybe I don’t really know what I want on this. I want the graph model to be separated from the inference logic and yet I want the syntactic simplicity that derives from base entailments like the example above. Is June too early to start a Christmas wish list?

While I am at it, can we please stop putting people’s ages in the model rather than their dates of birth? I know it’s just an example, but I see it over and over in so many modeling examples. And it’s so wrong in 99% of cases. It just hurts.

There are other things about MGraph edges that look strange if you are used to RDF. For example, edges can be labeled or not, as illustrated on this first example of the graph model:

In this example, “Age” is a labeled edge that points to the atomic node “42″, while the credit score is modeled as a non-atomic node linked from the person via an unlabeled edge. Presumably the “credit score” node is also linked to an atomic node (not shown) that contains the actual score value (e.g. “800″). I can see why one would want to call out the credit score as a node rather than having an edge (labeled “credit score”) that goes to an atomic node containing the actual credit score value (similar to how “age” is handled). For one thing, you may want to attach additional data to that “credit score” node (when was it calculated, which reporting agency provided it, etc) so it helps to have it be a node. But making this edge unlabeled worries me. Originally you may only think of one possible relationship type between a person and a credit score (the person has a credit score). But other may pop up further down the road, e.g. the person could be a loan agent who orders the credit score but the score is about a customer. So now you create a new edge label (“orders”) to link the loan agent person to the credit score. But what happens to all the code that was written previously and navigates the relationship from the person to the score with the expectation that the score is about the person. Do you think that code was careful to only navigate “unlabeled” edges? Unlikely. Most likely it just grabbed whatever credit score was linked to the person. If that code is applied to a person who happens to also be a loan agent, it might well grab a credit score about other people which happened to be ordered by the loan agent. These unlabeled edges remind me of the practice of not bothering with a “version” field in the first version of your work because, hey, there is only one version so far.

The restriction that a node can have at most one edge with a given label coming out of it is another one that puzzles me. Though it may explain why an unlabeled edge is used for the credit score (since you can get several credit scores for the same person, if you ask different rating agencies). But if unlabeled edges are just a way to free yourself from this restriction then it would be better to remove the restriction rather than work around it. Let’s take the “Spouse” label as an example. For one thing in some countries/cultures having more than one such edge might be possible. And having several ex-spouses is possible in many places. Why would the “ex-spouse” relationship have to be defined differently from “spouse”? What about children? How is this modeled? Would we be forced to have a chain of edges from parent to 1st child to next sibling to next sibling, etc? Good luck dealing with half-siblings. And my model may not care so much about capturing the order (especially if the date of birth is already captured anyway). This reminds me of how most XML document formats force element order in places where it is not semantically meaningful, just because of XSD’s bias towards “sequence”.

Having started this entry by declaring that I don’t understand what M tries to be, I really shouldn’t be criticizing its design choices. The “weird” aspects I point out are only weird in the context of a certain usage but they may make perfect sense in the usage that the Oslo team has in mind. So I’ll stop here. The bottom line is that there are traces, in M, of a nice, reusable, graph-oriented data model with strong bridges (in both directions) to programming languages and user interfaces. That is appealing to me. There are also some strange restrictions that puzzle me. We’ll see where this goes (hopefully this article, “Designing Domains and Models Using M” will soon contain more than “to be submitted” and I can better understand the M approach). In any case, kuddos to the team for being so open about their work and the evolution of their design.

Post

Interesting links

In Amazon,Application Mgmt,Automation,CA,Cloud Computing,Everything,HP,IT Systems Mgmt,Manageability,Mgmt integration,Microsoft,Middleware,Oracle,Utility computing,Virtualization on June 3, 2009 by @vambenepe

A few interesting links I noticed tonight.

HP Delivers Industry-first Management Capabilities for Microsoft System Center

That’s not going to improve the relationship between the Insight Control group (part of the server hardware group, of Compaq heritage) and the BTO group (part of HP Software, of HP heritage plus many acquisitions) in HP.  The Microsoft relationship was already a point of tension when they were still called SIM and OpenView, respectively.

CA Acquires Cassatt

Constructive destruction at work.

Setting up a load-balanced Oracle Weblogic cluster in Amazon EC2

It’s got to become easier, whether Oracle or somebody else does it. In the meantime, this is a good reference.

[UPDATED 2009/07/12: If you liked the "WebLogic on EC2" article, check out the follow-up: "Full Weblogic Load-Balancing in EC2 with Amazon ELB".]

Full Weblogic Load-Balancing in EC2 with Amazon ELB

Post

Cloud APIs need to be complemented by Cloud processes

In Amazon,Application Mgmt,Business Process,Cloud Computing,Everything,IT Systems Mgmt,ITIL,Standards,Utility computing on May 21, 2009 by @vambenepe

A lot of attention has been focused on technical standards for Cloud computing, especially over the last month (e.g. DMTF incubator announcement). That’s fine, but before we go crazy with detailed technical standards let’s realize that for Cloud computing (of the public variety at least) to take off we’ll need just as much standardization of non-technical interactions. Namely processes.

This, to me, is one of the most interesting angles on the recent announcement by Amazon AWS that they now support (in limited beta) the ability to load data from storage that is physically shipped to them. Have a look at this announcement and you’ll notice that it spends more time describing a logistical process (how to pack, how to ship…) than technical interfaces (storage device requirements, how to create a manifest…). It is still part of the “AWS Developer Guide” but clearly these instructions are not just for developers.

Many more such processes need to be “standardized” (or at least documented) for companies to efficiently be able to use public Clouds (and to some extent even private Clouds). Let’s take SLAs as an example. It sounds good when a Cloud provider says “we offer SLAs”. But what does it mean? Does it mean “we advertise some SLA numbers, you’re responsible for contacting us (trough a phone number hidden somewhere on our site) when you think we’ve violated them; if we agree with your measurements then you may get a check in the mail at some point in the future”? Not so useful. If, on the other hand, there is a clear definition of the metric that the SLA applies to, a clear definition of how it gets measured (do we trust provider performance reports, customer measurement, a third party monitor…), a clear process to claim refund, a clear process to actually provide the refund (credit for future service or direct payment, when/how is the payment made…), then it becomes more useful.

I picked the SLA enforcement example because it happens to be an area that the TMF (TeleManagement Forum) has made partially available as a teaser for its eTOM business process framework (aimed at telco providers). The full list of eTOM processes is only available to paying subscribers. One of the goals of the eTOM process framework is “to simplify procurement, serving as a common language between service providers and suppliers”. Another way to say it is that eTOM “recognizes that the enterprise interacts with external parties, and that the enterprise may need to interact with process flows defined by external parties, as in ebusiness interactions”. Exactly what we are talking about when it comes to making public Clouds easily consumable by enterprises. SLA management is just one small part of the overall eTOM framework (if you look for it in this eTOM overview poster it’s in purple, under “assurance”, in the first row).

My point is not to assert that Cloud providers should adopt eTOM. Nobody adopts eTOM directly as a blueprint anyway. But, while the cultures and maturity levels are sometimes different, it is also hard to argue that Cloud providers have nothing to learn form telco providers (many of which are becoming Cloud providers themselves). I shudder at the idea of AT&T teaching another company how to handle customer service, but have you ever tried to call Google?

Readers of this blog are likely to be more familiar with ITIL than eTOM (who, incidentally, incorporates parts of ITIL in its latest version, 8.0). For those who don’t know about either, one way to think about it is that Cloud providers would implement processes that look somewhat like eTOM processes, that Cloud consumers implement IT management processes that follow to some extent ITIL best practices and that these two sets of processes need to meet for public Clouds to work. I touched on this a few months ago, when I commented on the incorporation of Cloud services in an IT service catalog.

My main point is not about ITIL or eTOM. It’s simply that there are important process aspects to delivering/consuming Cloud services and that they have so far been overshadowed by the technical aspects. The processes sketched in the AWS import/export capability represent the first drop of an upcoming shower.

[UPDATED 2009/5/22: More on telcos becoming Cloud providers from EMC's Chuck Hollis, with a retort by James Governor. Just listing these as FYI but my main point in this post is not about telcos, it's about the need to clarify processes, independently of whether the provider is Amazon or AT&T. It's just that the telcos have been working on such process standardization for a long time. Hoff provides another example of where process standardization is needed in Cloud relationships: right to audit.]