Category Archives: Virtualization

Moving towards utility/cloud computing standards?

This Forbes article (via John) channels 3Tera’s Bert Armijo’s call for standardization of utility computing. He calls it “Open Cloud” and it would “allow a company’s IT systems to be shared between different cloud computing services and moved freely between them”. Bert talks a bit more about it on his blog and, while he doesn’t reference the Forbes interview (too modest?), he points to Cloudscape as the vision.

A few early thoughts on all this:

  • No offense to Forbes but I wouldn’t read too much into the article. Being Forbes, they get quotes from a list of well-known people/companies (Google and Amazon spokespeople, Forrester analyst, Nick Carr). But these quotes all address the generic idea of utility computing standards, not the specifics of Bert’s project.
  • Saying that “several small cloud-computing firms including Elastra and Rightscale are already on board with 3Tera’s standards group” is ambiguous. Are they on board with specific goals and a candidate specification? Or are they on board with the general idea that it might be time to talk about some kind of standard in the general area of utility computing?
  • IEEE and W3C are listed as possible hosts for the effort, but they don’t seem like a very good match for this area. I would have thought of DMTF, OASIS or even OGF first. On the face of it, DMTF might be the best place but I fear that companies like 3Tera, Rightscale and Elastra would be eaten alive by the board member companies there. It would be almost impossible for them to drive their vision to completion, unlike what they can do in an OASIS working group.
  • A new consortium might be an option, but a risky and expensive one. I have sometimes wondered (after seeing sad episodes of well-meaning and capable start-ups being ripped apart by entrenched large vendors in standards groups) why VCs don’t play a more active role in standards. Standards sound like the kind of thing VCs should be helping their companies with. VC firms are pretty used to working together, jointly investing in companies. Creating a new standard consortium might be too hard for 3Tera, but if the VCs behind 3Tera, Elastra and Rightscale got together and looked at the utility computing companies in their portfolios, it might make sense to join forces on some well-scoped standardization effort that may not otherwise be given a chance in existing groups.
  • I hope Bert will look into the history of DCML, a similar effort (it was about data center automation, which utility computing is not that far from once you peel away the glossy pictures) spearheaded by a few best-of-breed companies but ignored by the big boys. It didn’t really take off. If it had, utility computing standards might now be built as an update/extension of that specification. Of course DCML started as a new consortium and ended as an OASIS “member section” (a glorified working group), which is reason to take my “create a new consortium and/or OASIS group” suggestion above with a grain of salt.
  • The effort can’t afford to be disconnected from other standards in the virtualization and IT management domains. How does the effort relate to OVF? To WS-Management? To existing modeling frameworks? That’s the main draw towards DMTF as a host.
  • What’s the open source side of this effort? As John mentions during the latest Redmonk/Willis IT management podcast (starting around minute 24), there needs to be an open source side to this. Actually, John thinks all you need is the open source side. Coté brings up Eucalyptus. BTW, if you want an existing combination of standards and open source, have a look at CDDLM (standard) and SmartFrog (implementation, now with EC2/S3 deployment).
  • There seems to be some solid technical raw material to start from. 3Tera’s ADL, combined with Elastra’s ECML/EDML, presumably captures a fair amount of field expertise already. But when you think of them as a starting point to standardization, the mindset needs to switch from “what does my product need to work” to “what will the market adopt that also helps my product to work”.
  • One big question (at least from my perspective) is that of the line between infrastructure and applications. Call me biased, but I think this effort should focus on the infrastructure layer. And provide hooks to allow application-level automation to drive it.
  • The other question is with regard to the management aspect of the resulting system and the role management plays in whatever standard specification comes out of Bert’s effort.

Bottom line: I applaud Bert’s efforts but I couldn’t sleep well tonight if I didn’t also warn him that “there be dragons”.

And for those who haven’t seen it yet, here is a very good document on the topic (but it is focused on big vendors, not on how smaller companies can play the standards game).

[UPDATED 2008/6/30: A couple hours after posting this, I see that Coté has just published a blog post that elaborates on his view of cloud standards. As an addition to the podcast I mentioned earlier.]

[UPDATED 2008/7/2: If you read this in your feed viewer (rather than directly on vambenepe.com) and you don’t see the comments, you should go have a look. There are many clarifications and some additional insight from the best authorities on the topic. Thanks a lot to all the commenters.]

20 Comments

Filed under Amazon, Automation, Business, DMTF, Everything, Google, Google App Engine, Grid, HP, IBM, IT Systems Mgmt, Mgmt integration, Modeling, OVF, Portability, Specs, Standards, Utility computing, Virtualization

OVF in action: Kensho

Simon Crosby recently wrote about an upcoming Citrix product (I think that’s what it is, since he doesn’t mention open source anywhere) called Kensho. The post is mostly a teaser (the Wikipedia link in his post will improve your knowledge of oriental philosophy but not your IT management expertise) but it makes interesting claims of virtualization infrastructure interoperability.

OVF gets a lot of credit in Simon’s story. But, unless things have changed a lot since the specification was submitted to DMTF, it is still a wrapper around proprietary virtual disk formats (as previously explained). That wrapper alone can provide a lot of value. But when Simon explains that Kensho can “create VMs from VMware, Hyper-V & XenServer in the OVF format” and when he talks about “OVF virtual appliances” it tends to create the impression that you can deploy any OVF-wrapped VM into any OVF-compliant virtualization platform. Which, AFAIK, is not the case.

For the purpose of a demo, you may be able to make this look like a detail by having a couple of equivalent images and picking one or the other depending on the target hypervisor. But from the perspective of the complete lifecycle management of your virtual machines, having a couple of “equivalent” images in different formats is a bit more than a detail.
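To make the “equivalent images” trick concrete, here is a tiny Python sketch of what such a demo-grade selector amounts to (everything in it is made up for illustration; an OVF descriptor references disks in its own XML format, not like this):

    # Hypothetical sketch: pick the pre-built disk image that matches the
    # target hypervisor. The file names and format labels are invented.
    DISK_IMAGES = {
        "vmware": "appliance-disk.vmdk",   # VMware virtual disk
        "hyper-v": "appliance-disk.vhd",   # Microsoft virtual disk
        "xen": "appliance-disk.img",       # raw image for Xen
    }

    def pick_disk(target_hypervisor):
        """Return the pre-built image for the target platform, if there is one."""
        try:
            return DISK_IMAGES[target_hypervisor]
        except KeyError:
            raise ValueError("no equivalent image for %s: you would have to "
                             "convert the disk, not just re-wrap it" % target_hypervisor)

    print(pick_disk("xen"))   # -> appliance-disk.img

The point of the sketch is that the “portability” lives in having pre-built every variant ahead of time, not in the wrapper itself.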

All in all, this is an interesting announcement and I take it as a sign that things are progressing well with OVF at DMTF.

[UPDATED 2008/6/29: Chris Wolf (whose firm, the Burton Group, organized the Catalyst conference at which Simon Crosby introduced Kensho) has a nice write-up about what took place there. Plenty of OVF-love in his post too, and actually he gives higher marks to VMWare and Novell than Citrix on that front. Chris makes an interesting forecast: “Look for OVF to start its transition from a standardized metadata format for importing VM appliances to the industry standard format for VM runtime metadata. There’s no technical reason why this cannot happen, so to me runtime metadata seems like OVF’s next step in its logical evolution. So it’s foreseeable that proprietary VM metadata file formats such as .vmc (Microsoft) and .vmx (VMware) could be replaced with a .ovf file”. That would be very nice indeed.]

[2008/7/15: Citrix has hit the “PR” button on Kensho, so we get a couple of articles describing it in a bit more detail: Infoworld and Sysmannews (slightly more detailed, including dangling the EC2 carrot).]

Comments Off on OVF in action: Kensho

Filed under DMTF, Everything, IT Systems Mgmt, Manageability, Mgmt integration, OVF, Standards, Virtualization, Xen, XenSource

Recent IT management announcements

There were a few announcements relevant to the evolution of IT management over the last week. The most interesting is VMware’s release of the open-source (BSD license) VI SDK, a Java API to manage a host system and the virtual machines that run on it. Interesting that they went the way of a language-specific API. The alternatives, to complement/improve their existing web services SDK, would have been: define CIM classes and implement a WBEM provider (using CIM-HTTP and/or WS-Management), use WS-Management but without the CIM part (define the model as native XML, not XML-from-CIM), use a RESTful HTTP-driven interface to that same native XML model or, on the more sci-fi side, go the MDA way with a controller from which you retrieve the observed state and to which you specify the desired state. The Java API approach is the easiest one for developers to use, as long as they can access the Java ecosystem and they are mainly concerned with controlling the VMWare entities. If the management application also deals with many other resources (like the OS that runs in the guest machines or the hardware under the host, both of which are likely to have CIM models), a more model-centric approach could be handier. The Java API of course has an underlying model (described here), but the interface itself is not model-centric. So what about all the DMTF-love that VMWare has been displaying lately (OVF submission, board membership, hiring of the DMTF president…)? Should we expect a more model-friendly version of this API in the future? How does this relate to the DMTF SVPC working group that recently released some preliminary profiles? The choice to focus on beefing-up the Java-centric management story (which includes Jython, as VMWare was quick to point out) rather than the platform-agnostic, on-the-wire-interop side might be seen by the more twisted minds as a way to not facilitate Microsoft’s “manage VMWare today to replace it tomorrow” plan any more than necessary.

Speaking of Microsoft, in unrelated news we also got a heartbeat from them on the Oslo project: a tech preview of some of the components is scheduled for October. When Oslo was announced, there was a mix of “next gen BizTalk” aspects and “developer-driven DSI” aspects. From this report, the BizTalk part seems to be dominating. No word on use of SML.

And finally, SOA Software (who was previously called Digital Evolution and who acquired Blue Titan, Flamenco and LogicLibrary, in case you’re trying to keep track) has released a “SOA Development Governance Product”. Nothing too exciting from what I can see on InfoQ about it, but that’s a pretty superficial evaluation so don’t let me stop you. Am I the only one who twitches whenever “federation” is used to mean at worst “import” or at best “synchronization”? Did CMDBf start that trend? BTW, is it just an impression or did SOA Software give InfoQ a list of the questions they wanted to be asked?

4 Comments

Filed under DMTF, Everything, IT Systems Mgmt, Manageability, Mgmt integration, Open source, Oslo, OVF, SML, Standards, Tech, Virtualization, VMware, WS-Management

Google App Engine: less is more

“If you have a stove, a saucepan and a bottle of cold water, how can you make boiling water?”

If you ask this question to a mathematician, they’ll think about it a while, and finally tell you to pour the water in the saucepan, light up the stove and put the saucepan on it until the water boils. Makes sense. Then ask them a slightly different question: “if you have a stove and a saucepan filled with cold water, how can you make boiling water?”. They’ll look at you and ask “can I also have a bottle”? If you agree to that request they’ll triumphantly announce: “pour the water from the saucepan into the bottle and we are back to the previous problem, which is already solved.”

In addition to making fun of mathematicians, this is a good illustration of the “fake machine” approach to utility computing embodied by Amazon’s EC2. There is plenty of practical value in emulating physical machines (either in your data center, using VMWare/Xen/OVM or at a utility provider’s site, e.g. EC2). They are all rooted in the fact that there is a huge amount of code written with the assumption that it is running on an identified physical machine (or set of machines), and you want to keep using that code. This will remain true for many many years to come, but is it the future of utility computing?

Google’s App Engine is a clear break from this set of assumptions. From this perspective, the App Engine is more interesting for what it doesn’t provide than for what it provides. As the description of the Sandbox explains:

“An App Engine application runs on many web servers simultaneously. Any web request can go to any web server, and multiple requests from the same user may be handled by different web servers. Distribution across multiple web servers is how App Engine ensures your application stays available while serving many simultaneous users [not to mention that this is also how they keep their costs low — William]. To allow App Engine to distribute your application in this way, the application runs in a restricted ‘sandbox’ environment.”

The page then goes on to succinctly list the limitations of the sandbox (no filesystem, limited networking, no threads, no long-lived requests, no low-level OS functions). The limitations are better described and commented upon here but even that article misses one major limitation, mentioned here: the lack of a scheduler/cron.
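For those who haven’t looked at the SDK, this is roughly what a minimal handler looked like in the Python runtime at the time, using the bundled webapp framework (take it as an illustrative sketch rather than a reference; the comments point at the sandbox restrictions listed above):

    # Minimal App Engine (Python runtime) request handler sketch.
    # Inside the sandbox there is no writable filesystem, no threads, no raw
    # sockets and no long-running background work; you go through Google's
    # APIs (datastore, urlfetch, memcache, mail) instead.
    from google.appengine.ext import webapp
    from google.appengine.ext.webapp.util import run_wsgi_app

    class MainPage(webapp.RequestHandler):
        def get(self):
            self.response.headers['Content-Type'] = 'text/plain'
            # open('/tmp/state', 'w') or threading.Thread(...) would be
            # rejected by the sandbox; persistent state belongs in the datastore.
            self.response.out.write('Hello from a sandboxed web server')

    application = webapp.WSGIApplication([('/', MainPage)], debug=True)

    def main():
        run_wsgi_app(application)

    if __name__ == '__main__':
        main()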

Rather than a feature-by-feature comparison between the App Engine and EC2 (which Amazon would win handily at this point), what is interesting is to compare the underlying philosophies. Even with Amazon EC2, you don’t get every single feature your local hardware can deliver. For example, in its initial release EC2 didn’t offer a filesystem, only a storage-as-a-service interface (S3 and then SimpleDB). But Amazon worked hard to fix this as quickly as possible in order to appear as similar to a physical infrastructure as possible. In this entry, announcing persistent storage for EC2, Amazon’s CTO takes pains to highlight this achievement:

“Persistent storage for Amazon EC2 will be offered in the form of storage volumes which you can mount into your EC2 instance as a raw block storage device. It basically looks like an unformatted hard disk. Once you have the volume mounted for the first time you can format it with any file system you want or if you have advanced applications such as high-end database engines, you could use it directly.”

and

“And the great thing is it that it is all done with using standard technologies such that you can use this with any kind of application, middleware or any infrastructure software, whether it is legacy or brand new.”

Amazon works hard to hide (from the application code) the fact that the infrastructure is a huge, shared, distributed system. The beauty (and business value) of their offering is that while the legacy code thinks it is running in a good old data center, the paying customer derives benefits from the fact that this is not the case (e.g. fast/easy/cheap provisioning and reduced management responsibilities).
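For illustration, here is roughly what that “raw block storage” workflow looks like from Python with the boto library (a hedged sketch: I am assuming boto is installed and configured with AWS credentials, and the instance id, zone and device name below are made up):

    # Sketch of the persistent storage workflow quoted above, using boto.
    import boto

    conn = boto.connect_ec2()                      # credentials come from the environment
    volume = conn.create_volume(10, 'us-east-1a')  # a 10 GB persistent volume
    conn.attach_volume(volume.id, 'i-12345678', '/dev/sdf')
    # From here on, the guest OS sees /dev/sdf as an unformatted disk: format
    # it with whatever filesystem you like and mount it, exactly as you would
    # on a physical box.

Which is exactly the point: the code and the administrator get to keep their physical-machine assumptions.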

Google, on the other hand, embraces the change in underlying infrastructure and requires your code to use new abstractions that are optimized for that infrastructure.

To use an automotive analogy, Amazon is offering car drivers a gas/electric hybrid that refuels at today’s gas stations while Google is pushing for a direct jump to hydrogen fuel cells.

History is rarely kind to promoters of radical departures. The software industry is especially fond of layering the new on top of the old (a practice that has been enabled by the constant increase in underlying computing capacity). If you are wondering why your command prompt, shell terminal or text editor opens with a default width of 80 characters, take a trip back to 1928, when IBM defined its 80-column punch card format. Will Google beat the odds or be forced to be more accommodating of existing code?

It’s not the idea of moving to a more abstracted development framework that worries me about Google’s offering (JEE, Spring and Ruby on Rails show that developers want this move anyway, for productivity reasons, even if there is no change in the underlying infrastructure to further motivate it). It’s the fact that by defining their offering at the level of this framework (as opposed to one level below, like Amazon), Google puts itself in the position of having to select the right framework. Sure, they can support more than one. But the speed of evolution in that area of the software industry shows that it’s not mature enough (yet?) for any party to guess where application frameworks are going. Community experimentation has been driving application frameworks, and Google App Engine can’t support this. It can only select and freeze a few frameworks.

Time will tell which approach works best, whether they should exist side by side or whether they slowly merge into a “best of both worlds” offering (Amazon already offers many features, like snapshots, that aim for this “best of both worlds”). Unmanaged code (e.g. C/C++ compiled programs) and managed code (JVM or CLR) have been coexisting for a while now. Traditional applications and utility-enabled applications may do so in the future. For all I know, Google may decide that it makes business sense for them too to offer a Xen-based solution like EC2 and Amazon may decide to offer a more abstracted utility computing environment along the lines of the App Engine. But at this point, I am glad that the leaders in utility computing have taken different paths as this will allow the whole industry to experiment and progress more quickly.

The comparison is somewhat blurred by the fact that the Google offering has not reached the same maturity level as Amazon’s. It has restrictions that are not directly related to the requirements of the underlying infrastructure. For example, I don’t see how the distributed infrastructure prevents the existence of a scheduling service for background jobs. I expect this to be fixed soon. Also, Amazon has a full commercial offering, with a price list and an ecosystem of tools, while Google only offers a very limited beta environment for which you can’t buy extra capacity (but this too is changing).

2 Comments

Filed under Amazon, Everything, Google, Google App Engine, OVM, Portability, Tech, Utility computing, Virtualization, VMware

Oedipus meets IT management?

Having received John’s approval to reclaim the “mighty” adjective, I am going to have a bit of fun with it. More specifically, I am toying with adding VMWare to the list. Clearly, VMWare doesn’t want to go the way Sun did with Solaris (nice technology, right place at the right time, but commoditized in the long term). They have supposedly surrounded themselves with a pretty good patent minefield to slow the commoditization trend, but it will happen anyway and they know it. Especially with improved virtualization support in hardware making some of these patents less relevant. For this reason, they are putting a lot of effort into developing the IT management side of their portfolio.

One illustration of this is the fact that VMWare recently recruited the Senior VP of systems management at Oracle to become its Executive VP of R&D (incidentally, this happened a couple months after I joined his team at Oracle; maybe the knowledge that he wouldn’t have to deal with my bad sense of humor for too long made it easier for him to approve my hiring). I don’t think it’s a coincidence that they chose someone who is not a virtualization expert but an enterprise infrastructure expert (namely database performance and management software).

So, do we have the “Mighty Four” (Oracle, Microsoft, EMC and VMWare) for a nice symmetry with the “Big Four” (HP, IBM, BMC and CA)? Or does the fact that EMC owns most of VMWare make us pause here? Might a mighty mother a mighty? How do you run an 85%-owned company whose strategic direction takes it toward direct competition with its corporate owner? EMC and VMWare are attacking IT management from different directions (EMC is actually going at it from several directions at the same time, based on its historical storage products, plus new software from acquisitions, plus hiring a few smart people away from IBM to put the whole thing together), so on paper their portfolios look pretty complementary. But while aligning and collaborating more closely may make sense from a product engineering perspective, it doesn’t make sense from a financial engineering perspective. At least as long as investors are so hungry for the few VMWare shares available on the open market (as a side issue, I wonder if they like it so much because of the virtualization market per se or because they see VMWare’s position in that market as a beachhead for the larger enterprise IT infrastructure software market). And, as should not be surprising, the financial view is likely to prevail, which will keep the companies at arm’s length. But if both VMWare and EMC are successful in assembling a comprehensive enterprise infrastructure management system, things will get interesting.

[UPDATED 2008/5/28: The day after I write this, VMWare buys application performance management vendor B-hive. I am pretty lucky with my timing on this one.]

2 Comments

Filed under Everything, IT Systems Mgmt, Patents, People, Virtualization, VMware

Where will you be when the Semantic Web gets Grid’ed?

I see the tide rising for semantic technologies. On the other hand, I wonder if they don’t need to fail in order to succeed.

Let’s use the Grid effort as an example. By “Grid effort” I mean the work that took place in and around OGF (or GGF as it was known before its merger w/ EGA). That community, mostly made of researchers and academics, was defining “utility computing” and creating related technology (e.g. OGSA, OGSI, GridFTP, JSDL, SAGA as specs, Globus and Platform as implementations) when Amazon was still a bookstore. There was an expectation that, as large-scale, flexible, distributed computing became a more pressing need for the industry at large, the Grid vision and technology would find their way into the broader market. That’s probably why IBM (and to a lesser extent HP) invested in the effort. Instead, what we are seeing is a new approach to utility computing (marketed as “cloud computing”), delivered by Amazon and others. It addresses utility computing with a different technology than Grid. With X86 virtualization as a catalyst, “cloud computing” delivers flexible, large-scale computing capabilities in a way that, to the users, looks a lot like their current environment. They still have servers with operating systems and applications on them. It’s not as elegant and optimized as service factories, service references (GSR), service handle (GSH), etc but it maps a lot better to administrators’ skills and tools (and to running the current code unchanged). Incremental changes with quick ROI beat paradigm shifts 9 times out of 10.

Is this indicative of what is going to happen with semantic technologies? Let’s break it down chronologically:

  1. Trailblazers (often faced with larger/harder problems than the rest of us) come up with a vision and a different way to think about what computers can do (e.g. the “computers -> compute grid” transition).
  2. They develop innovative technology, with a strong theoretical underpinning (OGSA-BES and those listed above).
  3. There are some successful deployments, but the adoption is mostly limited to a few niches. It is seen as too complex and too different from current practices for broad adoption.
  4. Outsiders use incremental technology to deliver 80% of the vision with 20% of the complexity. Hype and adoption ensue.

If we are lucky, the end result will look more like the nicely abstracted utility computing vision than the “did you patch your EC2 Xen images today” cloud computing landscape. But that’s a necessary step that Grid computing failed to leapfrog.

Semantic web technologies can easily be mapped to the first three bullets. Replace “computers -> computer grid” with “documents/data -> information” in the first one. Fill in RDF, RDFS, OWL (with all its flavors), SPARQL etc as counterparts to OGSA-BES and friends in the second. For the third, consider life sciences and defense as niche markets in which semantic technologies are seeing practical adoption. What form will bullet #4 take for semantic technology (e.g. who is going to be the EC2 of semantic technology)? Or is this where it diverges from Grid and instead gets adopted in its “original” form?

1 Comment

Filed under Everything, Grid, HP, IBM, RDF, Research, Semantic tech, Specs, Standards, Tech, Utility computing, Virtualization

Elastra and data center configuration formats

I heard tonight for the first time of a company called Elastra. It sounds like they are trying to address a variation of the data center automation use cases covered by Opsware (now HP) and Bladelogic (now BMC). Elastra seems to be in an awareness-building phase and as far as I can tell it’s working (since I heard about them). They got to me through John’s blog. They are also using the more conventional PR channel (and in that context they follow all the cheesy conventions: you get to “unlock the value”, with “the leading provider” who gives you “a new product that revolutionizes…” etc, all before the end of the first paragraph). And while I am making fun of the PR-talk I can’t help zeroing in on this quote from the CEO, who “wanted to pick up where utility computing left off – to go beyond the VM and toward virtualizing complex applications that span many machines and networks”. Does he feel the need to narrowly redefine “utility computing” (who knew that all that time “utility computing” was just referring to a single hypervisor?) as a way to justify the need for the new “cloud” buzzword (you’ll notice that I haven’t quite given up yet, this post is in the “utility computing” category and I still do not have a “cloud” category)?

The implied difference with Opsware and Bladelogic seems to be that while these incumbents (hey Bladelogic, how does it feel to be an “incumbent”?) automate data center management tasks in old boring data centers, Elastra does it in clouds. More specifically “public and private compute clouds”. I think I know roughly what a public cloud is supposed to be (e.g. EC2), but a private cloud? How is that different from a data center? Is a private cloud a data center that has the Elastra management software deployed? In that case, how is automating private clouds with Elastra different from automating data centers with Elastra? Basically it sounds like they don’t want to be seen as competing with Opsware and Bladelogic so they try to redefine the category. Which makes it easier to claim (see above) to be “the leading provider of software for designing, deploying, and managing applications in public and private compute clouds” without having the discovery or change management capabilities of Opsware (or anywhere near the same number of customers).

John seems impressed by their “public cloud” capabilities (I don’t think he has actually tested them yet though) and I trust him on that. Knowing the complexities of internal data centers, I am a lot more doubtful of the “private cloud” claims (at least if I interpret them correctly).

Anyway, I am getting carried away with some easy nitpicking on the PR-talk, but in truth it uses a pretty standard level of obfuscation/hype for this type of press release. Sad, I know.

The interesting thing (and the reason I started this blog entry in the first place) is that they seem to have created structures to capture system design (ECML) and deployment (EDML) rules. From John’s blog:

“At the core of Elastra’s architecture are the system design specifications called ECML and EDML. ECML is an XML markup language to specify a cloud design (i.e., multiple system design of firewalls, load balancers, app servers, db servers, etc…). The EDML markup provides the provisioning instructions.”

John generously adds “Elastra seems to be the first to have designed their autonomics into a standards language” which seems to assume that anything in XML is a standard. Leaving the “standard” debate aside, an XML format does tend to improve interoperability and that’s a good thing.

So where are the specifications for these ECML and EDML formats? I would be very interested in reading them, but they don’t appear to be available anywhere. Maybe that would be a good first step towards making them industry standards.

I would be especially interested in comparing this to what the SML/CML effort is going after. Here are some propositions that need to be validated or disproved. Comparing SML/CML to ECML/EDML could help shed light on them:

  • SML/CML encompasses important and useful datacenter automation use cases.
  • Some level of standardization of cross-domain system design/deployment/management is needed.
  • SML/CML will be too late.
  • SML/CML will try to do too many things at once.

You can perform the same exercise with OVF. Why isn’t OVF based on SML? If you look at the benefits that could theoretically be derived from that approach (hardware, VM, network and application configuration all in the same metamodel) it tells you about all that is attractive about SML. At the same time, if you look at the fact that OVF is happening while CML doesn’t seem to go anywhere, it tells you that the “from the very top all the way down to the very bottom” approach that SML is going after is very difficult to pull off. Especially with so many cooks in the kitchen.

And BTW, what is the relationship between ECML/EDML and OVF? I’d like to find out where the Elastra specifications land in all this. In the worst case, they are just an XML rendering of the internals of the Elastra application, mixing all domains of the IT stack. The OOXML of data center automation, if you will. In the best case, it is a supple connective tissue that links stiffer domain-specific formats.

[UPDATED 2008/3/26: Elastra’s “introduction to elastic programing” white paper has a few words about the relationship between OVF and EDML: “EDML builds on the foundation laid by Open Virtual Machine Format (OVF) and extends that language’s capabilities to specify ways in which applications are deployed onto a Virtual Machine system”. Encouraging, if still vague.]

[UPDATED 2008/3/31: A week ago I hadn’t heard of Elastra and now I learn that I had been tracking the blog of its lead-architect-to-be all along! Maybe Stu will one day explain what a “private cloud” is. His description of his new company seems to confirm my impression that they are really focused (for now at least) on “public clouds” and not the Opsware-like “private clouds” automation capabilities. Maybe the “private clouds” are just in the business plan (and marketing literature) to be able to show a huge potential market to VCs so they pony up the funds. Or maybe they really plan to go after this too. Being able to seamlessly integrate both (for mixed deployments) is the holy grail, I am just dubious that focusing on this rather than doing one or the other “right” is the best starting point for a new company. My guess is that despite the “private cloud” talk, they are really focusing on “public clouds” for now. That’s what I would do anyway.]

[UPDATED on 2008/6/25: Stephen O’Grady has an interesting post about the role of standards in Cloud computing. But he only looks at it from the perspective of possible standardization of the interfaces used by today’s Cloud providers. A full analysis also needs to include the role, in Cloud Computing, of standards (app runtime standards, IT management standards, system modeling standards, etc…) that started before Cloud computing was big. Not everything in Cloud computing is new. And even less is new about how it will be used. Especially if, as I expect, utility computing and on-premise computing are going to become more and more intertwined, resulting in the need to manage them as a whole. If my app is deployed at Amazon, why doesn’t it (and its hosts) show up in my CMDB and in my monitoring panel? As Coté recently wrote, “as the use of cloud computing for an extension of data centers evolves, you could see a stronger linking between Hyperic’s main product, HQ and something like Cloud Status”.]

9 Comments

Filed under Automation, CML, Everything, IT Systems Mgmt, OVF, SML, Tech, Utility computing, Virtualization

My web apps and me

Registering a domain name: $10 per year
Hosting it with all the features you may need: $80 per year
Controlling your on-line life: priceless

To be frank, the main reason that I do not use Facebook or MySpace is that I am not very social to start with. But, believe it or not, I have a few friends and family members with whom I share photos and personal stories. Not to mention this blog for different kinds of friends and different kinds of stories (you are missing out on the cute toddler photos).

Rather than doing so on Facebook, MySpace, BlogSpot, Flickr, Picasa or whatever the Microsoft copies of these sites are, I maintain a couple of blogs and on-line photo albums on vambenepe.com. They all provide user access control and RSS-based syndication so no-one has to come to vambenepe.com just to check on them. No annoying advertising, no selling out of privacy and no risk of being jerked around by bait-and-switch (or simply directionless) business strategies (“in order to serve you better, we have decided that you will no longer be able to download the high-resolution version of your photos, but you can use them to print with our approved print-by-mail partners”). Have you noticed how people usually do not say “I use Facebook” but rather “I am on Facebook” as if riding a mechanical bull?

The interesting thing is that it doesn’t take a computer genius to set things up in such a way. I use Dreamhost and it, like similar hosting providers, gives you all you need. From the super-easy (e.g. they run WordPress for you) to the slightly more personal (they provide a one-click install of your own WordPress instance backed by your own database) to the do-it-yourself (they give you a PHP or RoR environment to create/deploy whatever app you want). Sure you can further upgrade to a dedicated server if you want to install a servlet container or a CodeGears environment, but my point is that you don’t need to come anywhere near this to own and run your own on-line life. You never need to see a Unix shell, unless you want to.

This is not replacing Facebook lock-in with Dreamhost lock-in. We are talking about an open-source application (WordPress) backed by a MySQL database. I can move it to any other hosting provider. And of course it’s not just blogging (WordPress) but also wiki (MediaWiki), forum (phpBB), etc.

Not that every shiny new on-line service can be replaced with a self-hosted application. You may have to wait a bit. For example, there is more to Facebook than a blog plus photo hosting. But guess what. Sounds like Bob Bickel is on the case. I very much hope that Bob and the ex-Bluestone gang aren’t just going to give us a “Facebook in a box” but also something more innovative, that makes it easy for people to run and own their side of a Facebook-like presence, with the ability to connect with other implementations for the social interactions.

We have always been able to run our own web applications, but it used to be a lot of work. My college nights were soothed by the hum of an always-running Linux server (actually a desktop used as a server) under my desk on which I ran my own SMTP server and HTTPd. My daughter’s “soothing ocean waves” baby toy sounds just the same. There were no turnkey web apps available at the time. I wrote and ran my own Web-based calendar management application in Python. When I left campus, I could have bought some co-locating service but it was a hassle and not cheap, so I didn’t bother [*].

I have a lot less time (and Linux administration skills) now than when I left university, so how come it is now attractive for me to run my own web apps again? What changed in the environment?

The main driver is the rise of the LAMP stack and especially PHP. For all the flaws of the platform and the ugliness of the code, PHP has sparked a huge ecosystem. Not just in terms of developers but also of administrators: most hosting providers are now very comfortable offering and managing PHP services.

The other driver is the rise of virtualization. Amazon hosts Xen images for you. But it’s not just the hypervisor version of virtualization. My Dreamhost server, for example, is not a Xen or VMWare virtual machine. It’s just a regular server that I share with other users but Dreamhost has created an environment that provides enough isolation from other users to meet my needs as an individual. The poor man’s virtualization if you will. Good enough.

These two trends (PHP and virtualization) have allowed Dreamhost and others to create an easy-to-use environment in which people can run and deploy web applications. And it becomes easier every day for someone to compete with Dreamhost on this. Their value to me is not in the hardware they run. It’s in the environment they provide that prevents me from having to do low-level LAMP administration that I don’t have time for. Someone could create such an environment and run it on top of Amazon’s utility computing offering. Which is why I am convinced that such environments will be around for the foreseeable future, Dreamhost or no Dreamhost. Running your own web applications won’t be just for geeks anymore, just like using a GPS is not just for the geeks anymore.

Of course this is not a panacea and it won’t allow you to capture all aspects of your on-line life. You can’t host your eBay ratings. You can’t host your Amazon rank as a reviewer. It takes more than just technology to break free, but technology has underpinned many business changes before. In addition to the rise of LAMP and virtualization already mentioned, I am watching with interest the different efforts around data portability: dataportability.org, OpenID, OpenSocial, Facebook API… Except for OpenID, these efforts are driven by Web service providers hoping to channel the demand for integration. But if they are successful, they should give rise to open source applications you can host on your own to enjoy these services without the lock-in. One should also watch tools like WSO2’s Mashup Server and JackBe Presto for their potential to rescue captive data and exploit freed data. On the “social networks” side, the RDF community has been buzzing recently with news that Google is now indexing FOAF documents and exposing the content through its OpenSocial interface.
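As a small illustration of what “indexing FOAF documents” gives you, here is a hedged rdflib sketch (the document URL is a placeholder, not a real profile) that walks the foaf:knows links in a FOAF file:

    # Sketch: read a FOAF document and list who its owner claims to know.
    # Assumes the rdflib library; the URL below is made up.
    from rdflib import Graph, Namespace

    FOAF = Namespace("http://xmlns.com/foaf/0.1/")

    g = Graph()
    g.parse("http://example.org/people/william/foaf.rdf")

    for person, _, friend in g.triples((None, FOAF.knows, None)):
        print(g.value(person, FOAF.name), "knows", g.value(friend, FOAF.name))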

Bottom line, when you are invited to create a page, account or URL that will represent you or your data, take a second to ask yourself what it would take to do the same thing under your domain name. You don’t need to be a survivalist freak hiding in a mountain cabin in Montana (“it’s been eight years now, I wonder if they’ve started to rebuild cities after the Y2K apocalypse…”) to see value in more self-reliance on the web, especially when it can be easily achieved.

Yes, there is a connection between this post and the topic of this blog, IT management. It will be revealed in the next post (note to self: work on your cliffhangers).

[*] Some of my graduating colleagues took their machines to the dorm basement and plugged them into a switch there. Those Linux Slackware machines had amazing uptimes of months and years. Their demise didn’t come from bugs, hacking or component failures (even when cats made their litter inside a running computer with an open case) but the fire marshal, and only after a couple of years (the network admins had agreed to turn a blind eye).

[UPDATED 2008/7/7: Oh, yeah, another reason to run your own apps is that you won’t end up threatened with jail time for violating the terms of service. You can still end up in trouble if you misbehave, but they’ll have to charge you with something more real, not a whatever-sticks approach.]

[UPDATED 2009/12/30: Ringside (the Bob Bickel endeavor that I mention above), closed a few months after this post. Too bad. We still need what they were working on.]

2 Comments

Filed under Everything, Portability, Tech, Virtualization

IT management in a world of utility IT

A cynic might call it “could computing” rather than “cloud computing”. What if you could get rid of your data center. What if you could pay only for what you use. What if you could ramp up your capacity on the fly. We’ve been hearing these promising pitches for a while now and recently the intensity has increased, fueled by some real advances.

As an IT management architect who is unfortunately unlikely to be in a position to retire anytime soon (donations accepted for the send-William-to-retirement-on-a-beach fund), I can’t help wondering what IT management would look like in a world in which utility computing is a common reality.

First, these utility computing providers themselves will need plenty of IT management, if not necessarily the exact same kind that is being sold to enterprises today. You still need provisioning (automated of course). You definitely need access measuring and billing. Disaster recovery. You still have to deal with change planning, asset management and maybe portfolio management. You need processes and tools to support them. Of course you still have to monitor, manage SLAs, and pinpoint problems and opportunities for improvement. Etc. Are all of these a source of competitive advantage? Google is well-known for writing its infrastructure software (and of course also its applications) in house but there is no reason it should be that way, especially as the industry matures. Even when your business is to run a data center, not all aspects of IT management provide competitive differentiation. It is also very unclear at this point what the mix will be of utility providers that offer raw infrastructure (like EC2/S3) versus applications (like CRM as a service), a difference that may change the scope of what they would consider their crown jewels.

An important variable in determining the market for IT management software directed at utility providers is the number of these providers. Will there be a handful or hundreds? Many people seem to assume a small number, but my intuition goes the other way. The two main reasons for being only a handful would be regulation and infrastructure limitations. But, unlike with today’s utilities, I don’t see either taking place for utility computing (unless you assume that the network infrastructure is going to get vertically integrated in the utility data center offering). The more independent utility computing providers there are, the more it makes sense for them to pool resources (either explicitly through projects like the Collaborative Software Initiative or implicitly by buying from the same set of vendors) which creates a market for IT management products for utility providers. And conversely, the more of a market offering there is for the software and hardware building blocks of a utility computing provider, the lower the economies of scale (e.g. in software development costs) that would tend to concentrate the industry.

Oracle for one is already selling to utility providers (SaaS-type more than EC2-type at this point) with solutions that address scalability, SLA and multi-tenancy. Those solutions go beyond the scope of this article (they include not just IT management software but also databases and applications) but Oracle Enterprise Manager for IT management is also part of the solution. According to this Aberdeen report the company is doing very well in that market.

The other side of the equation is the IT management software that is needed by the consumers of utility computing. Network management becomes even more important. Identity/security management. Desktop management of some sort (depending on whether and what kind of desktop virtualization you use). And, as Microsoft reminds us with S+S, you will most likely still be running some software on-premises that needs to be managed (Carr agrees). The new, interesting thing is going to be the IT infrastructure to manage your usage of utility computing services as well as their interactions with your in-house software. Which sounds eerily familiar. In the early days of WSMF, one of the scenarios we were attempting to address (arguably ahead of the times) was service management across business partners (that is, the protocols and models were supposed to allow companies to expose some amount of manageability along with the operational services, so that service consumers would be able to optimize their IT management decision by taking into account management aspects of the consumed services). You can see this in the fact that the WSMF-WSM specification (that I co-authored and edited many years ago at HP) contains a model of a “conversation” that represents “set of related messages exchanged with other Web services” (a decentralized view of a BPEL instance, one that represents just one service’s view of its participation in the instance). Well, replace “business partner” with “SaaS provider” and you’re in a very similar situation. If my business application calls a mix of internal services, SaaS-type services and possibly some business partner services, managing SLAs and doing impact/root cause analysis works a lot better if you get some management information from these other services. Whether it is offered by the service owner directly, by a proxy/adapter that you put on your end or by a neutral third party in charge of measuring/enforcing SLAs. There are aspects of this that are “regular” SOA management challenges (i.e. that apply whenever you compose services, whether you host them yourself or not) and there are aspects (security, billing, SLA, compliance, selection of partners, negotiation) that are handled differently in the situation where the service is consumed from a third party. But by and large, it remains a problem of management integration in a word of composed, orchestrated and/or distributed applications. Which is where it connects with my day job at Oracle.

Depending on the usage type and the level of industry standardization, switching from one utility computing provider to the other may be relatively painless and easy (modify some registry entries or some policy or even let it happen automatically based on automated policies triggered by a price change for example) or a major task (transferring huge amounts of data, translating virtual machines from one VM format to another, performing in-depth security analysis…). Market realities will impact the IT tools that get developed and the available IT tools will in return shape the market.

Another intriguing opportunity, if you assume a mix of on-premises computing and utility-based computing, is that of selling back your spare capacity on the grid. That too would require plenty of supporting IT management software for provisioning, securing, monitoring and policing (coming soon to an SEC filing: “our business was hurt by weak sales of our flagship Pepsi cola drink, partially offset by revenue from renting computing power from our data center to the Coca cola company to handle their exploding ERP application volume”). I believe my neighbors with solar panels on their roofs are able to run their electric meter backward and sell power to PG&E when they generate more than they use. But I’ll stop here with the electric grid analogy because it is already overused. I haven’t read Carr’s book so the comment may be unfair, but based on extracts he posted and reviews he seems to have a hard time letting go of that analogy. It does a good job of making the initial point but gets tiresome after a while. Having personally experienced the Silicon Valley summer rolling black-outs, I very much hope the economics of utility computing won’t be as warped. For example, I hope that the telcos will only act as technical, not commercial intermediaries. One of the many problems in California is that consumers don’t buy from the producers but from a distributor (PG&E in the Bay Area) who sells at a fixed price and then has to buy at pretty much any price from the producers and brokers who made a killing manipulating the supply during these summers. Utility computing is another area in which economics and technology are intrinsically and dynamically linked in a way that makes predictions very difficult.

For those not yet bored of this topic (or in search of a more insightful analysis), Redmonk’s Coté has taken a crack at that same question, but unlike me he stays clear of any amateurish attempt at an economic analysis. You may also want to read Ian Foster’s analysis (interweaving pieces of technology, standards, economy, marketing, computer history and even some movie trivia) on how these “clouds” line up with the “grids” that he and others have been working on for a while now. Some will see his post as a welcome reminder that the only thing really new in “cloud” computing is the name and others will say that the other new thing is that it is actually happening in a way that matters to more than a few academics and that Ian is just trying to hitch his jalopy to the express train that’s passing him. For once I am in the “less cynical” camp on this and I think a lot of the “traditional” Grid work is still very relevant. Did I hear “EC2 components for SmartFrog”?

[UPDATED 2008/6/30: For a comparison of “cloud” and “grid”, see here.]

[UPDATED 2008/9/22: More on the Cloud vs. Grid debate: a paper critical of Grid (in the OGF sense of the term) efforts and Ian Foster’s reply (read the comments too).]

11 Comments

Filed under Business, Everything, IT Systems Mgmt, Utility computing, Virtualization

Microsoft’s Bob Muglia opens the virtualized kimono

In a recently published “executive e-mail”, Microsoft’s Bob Muglia describes the company’s view of virtualization. You won’t be surprised to learn that he thinks it’s a big deal. Being an IT management geek, I fast-forwarded to the part about management and of course I fully agree with him on the “the importance of integrated management”. But his definition of “integrated” is slightly different from mine as becomes clear when he further qualifies it as the use of “a single set of management tools”. Sure, that makes for easier integration, but I am still of the school of thought (despite the current sorry state of management integration) that we can and must find ways to integrate heterogeneous management tools.

“Although virtualization has been around for more than four decades, the software industry is just beginning to understand the full implications of this important technology” says Bob Muglia. I am tempted to slightly re-write the second part of the sentence as “the software marketing industry is just beginning to understand the full potential of this important buzzword”. To illustrate this, look no further than that same executive e-mail, in which we learn that Terminal Server actually provides “presentation virtualization”. Soon we’ll hear that the Windows TCP/IP stack provides “geographic virtualization” and that solitaire.exe provides “card deck virtualization”.

Then there is SoftGrid (or rather, “Microsoft SoftGrid Application Virtualization”). I like the technology behind SoftGrid but when Microsoft announced this acquisition my initial thought was that coming from the company that owns the OS and the development/deployment environment on top of it, this acquisition was quite an admission of failure. And I am still very puzzled by the relevance of the SoftGrid approach in the current environment. Here is my proposed motto for SoftGrid: “can’t AJAX please go away”. Yes, I know, CAD, Photoshop, blah, blah, but what proportion of the users of these applications want desktop virtualization? And of those, what proportion can’t be satisfied with “regular” desktop virtualization (like Virtual PC, especially when reinforced with the graphical rendering capabilities from Calista which Microsoft just acquired)?

In an inspirational statement, Bob Muglia asks us to “imagine, for example, if your employees could access their personalized desktop, with all of their settings and preferences intact, on any machine, from any location”. Yes, imagine that. We’d call it the Web.

In tangentially related news, David Chappell recently released a Microsoft-sponsored white paper that describes what Microsoft calls “Software + Service”. As usual, David does a good job of explaining what Microsoft means, using clearly-defined terms (e.g. “on-premises” is used as an organizational, not geographical concept) and by making the obvious connections with existing practices such as invoking partner/supplier services and SOA. There isn’t a ton of meat behind the concept of S+S once you’ve gotten the point that even in a “cloud computing” world there is still some software that you’ll run in your organization. But since, like Microsoft, my employer (Oracle) also makes most of its money from licenses today, I can’t disagree with bringing that up…

And like Microsoft, Oracle is also very aware of the move towards SaaS and engaged in it. In that respect, figure 11 of the white paper is where a pro-Microsoft bias appears (even though I understand that the names in the figure are simply supposed to be “representative examples”). Going by it, there are the SaaS companies (that would be the cool cats of Amazon, Salesforce.com and Google plus of course Microsoft) and there are the on-premises companies (where Microsoft is joined by Oracle, SAP and IBM). Which tends to ignore the fact that Oracle is arguably more advanced than Microsoft both in terms of delivering infrastructure to SaaS providers and being a SaaS provider itself. And SAP and IBM would also probably want to have a word with you on this. But then again, they can sponsor their own white paper.

Comments Off on Microsoft’s Bob Muglia opens the virtualized kimono

Filed under Everything, Mgmt integration, Microsoft, Virtualization

Book review: Xen Virtualization

Someone from Packt Publishing asked me if I was interested in reviewing the Xen Virtualization book by Prabhakar Chaganti that they recently published. I said yes and it was in my mailbox a few days later.

The sub-title is “a fast and practical guide to supporting multiple operating systems with the Xen hypervisor” and it turns out that the operating word is “fast”. It’s a short book (approx 130 pages, many filled with screen captures and console output listings). It is best used as an introduction to Xen for people who understand computer administration (especially Linux) but are new to virtualization.

The book contains a brief overview of virtualization, followed by a description of the most common tasks:

  • the Xen install process (from binary and source) on Fedora core 6
  • creating virtual machines (using NetBSD plus three different flavors of Linux)
  • basic management of Xen using the xm command line or the XenMan and virt-manager tools
  • setting up simple networking
  • setting up simple storage
  • encrypting partitions used by virtual machines
  • simple migration of virtual machines (stopped and live)

For all of these tasks, what we get is a step by step process that corresponds to the simple case and does not cover any troubleshooting. It is likely that anyone who embarks on the task described will need options that are not covered in the book. That’s why I write that it is an introduction that shows the kind of thing you need to do, rather than a reference that will give you the information you need in your deployment. You’ll probably need to read additional documentation, but the book will give you an idea of what stage you are in the process and what comes next.

Even with this limited scope, it is pretty light on explanations. It’s mostly a set of commands followed by a display of the result. Since it’s closer to my background I’ll take the “managing Xen” chapter as an example. There is nothing more basic to management than understanding the state of a resource. The book shows how to retrieve it (“xm list”) and very briefly describes the different states (“running”, “blocked”, “paused”, “shutdown”, “crashed”) but you would expect a bit more precision and details. For example, “blocked” is supposed to correspond to “waiting for an external event” but what does “external” mean? Sure the machine could be waiting on I/O, but it could also be on a timer (e.g. “sleep(1000)”) or simply have run out of things to do. I don’t think of a cron job as an “external event”. Also, when running “xm list” you should expect to always see dom0 in the “running” state (since dom0 is busy running your xm command) and on a one-core single-CPU machine (as is the case in the book) that means that none of the other domains can be in that state. That’s the kind of clarification (obvious in retrospect) that goes one step beyond the basic command description and saves some head scratching but the book doesn’t really go there. As another example, we are told in the “encryption” section that LUKS helps prevent “low-entropy attacks” but if you’re the kind of person who already knows what that means you probably don’t have much to learn from the “encryption” chapter of the book. In case you care, it is a class of attacks that take advantage of poor sources of random numbers and you can read all the details of how entropy is defined in this classic 1948 paper (it doesn’t have much to do with how the term is defined in physics).
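Coming back to the “xm list” example, here is a small sketch (not from the book) that shells out to xm and maps the one-letter state flags to the state names used above; the flag letters are from memory, so treat the mapping as an assumption to verify against your Xen version:

    # Sketch: summarize Xen domain states by parsing "xm list" output
    # (run on dom0, typically as root).
    import subprocess

    STATE_NAMES = {
        'r': 'running',
        'b': 'blocked',   # waiting on I/O, a timer, or simply idle
        'p': 'paused',
        's': 'shutdown',
        'c': 'crashed',
        'd': 'dying',
    }

    def domain_states():
        output = subprocess.check_output(['xm', 'list']).decode()
        states = {}
        for line in output.splitlines()[1:]:      # skip the header row
            fields = line.split()
            if len(fields) < 6:
                continue
            name, flags = fields[0], fields[4]    # Name and State columns
            states[name] = [STATE_NAMES[f] for f in flags if f in STATE_NAMES]
        return states

    if __name__ == '__main__':
        for name, state in domain_states().items():
            print(name, ':', ', '.join(state) or 'no flag set')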

Among the many more advanced topics that are not covered I can think of: advanced networking, clustering, advanced storage, Windows guests (even though it’s not Xen’s strong point), migration between physical and virtual, relationship to other IT management tasks (e.g. server and OS management), performance aspects, partitioning I/O so domains play well together, security considerations (beyond simply encrypting the file system), new challenges introduced by virtualization…

Xen documentation on the web is pretty poor at this point and the book provides more than most of the simple “how-to” guides on installing/configuring Xen that you can Google for. And it brings a consistent sequence of such “how-to” guides together in one package. If that’s worth it to you then get the book. But don’t expect this to cover all your documentation needs for anything beyond the simplest (and luckiest) deployment. I would be pleased to see the book on the desk of an IT manager in a shop that is considering using virtualization; I would be scared to see it on the desk of an IT administrator in a shop that is actually using Xen.

[UPDATED on 2008/02/01: Dan Magenheimer, a Xen expert who works on the Oracle VM, highly recommends another Xen book that just came out: Professional Xen Virtualization by William von Hagen. I haven’t seen that book but I trust Dan on this topic.]

Comments Off on Book review: Xen Virtualization

Filed under Book review, Everything, Virtualization, Xen

Top 10 lists and virtualization management

Over the last few months, I have seen two “top 10” lists with almost the same title and nearly zero overlap in content. One is Network World’s “10 virtualization companies to watch” published in August 2007. The other is CIO’s “10 Virtualization Vendors to Watch in 2008” published three months later. To be precise, there is only one company present in both lists, Marathon Technologies. Congratulations to them (note to self: hire their PR firm when I start my own company). Things are happening quickly in that field, but I doubt the landscape changed drastically in these three months (even though the announcement of Oracle’s Virtual Machine product came during that period). So what is this discrepancy telling us?

If anything, this is a sign of the immaturity of the emerging ecosystem around virtualization technologies. That being said, it could well be that all this really reflects is the superficiality of these “top 10” lists and the fact that they measure PR efforts more than any market/technology fact (note to self: try to become less cynical in 2008) (note to self: actually, don’t).

So let’s not read too much into the discrepancy. Less striking but more interesting is the fact that these lists are focused on management tools rather than hypervisors. It is as if the competitive landscape for hypervisors was already defined. And, unsurprisingly, it is defined in a way that closely mirrors the operating system landscape, with Xen as Linux (the various Xen-based offerings correspond to the Linux distributions), VMware as Solaris (good luck) and Microsoft as, well, Microsoft.

In the case of Windows and Hyper-V, it is actually bundled as one product. We’ll see this happen more and more on the Linux/Xen side as well, as illustrated by Oracle’s offering. I wouldn’t be surprised to see this bundling become so common that people start to refer to it as “LinuX” with a capital X.

Side note: I tried to see if the word “LinuX” is already being used but neither Google nor Yahoo nor MSN seems to support case-sensitive searching. From the pre-Google days I remember that Altavista supported it (a lower-case search term meant “any capitalization”, any upper-case letter in the search term meant “this exact capitalization”) but they seem to have dropped it too. Is this too computationally demanding at this scale? Is there no way to do a case-sensitive search on the Web?

With regards to management tools for virtualized environments, I feel pretty safe in predicting that the focus will move from niche products (like those on these lists) that deal specifically with managing virtualization technology to the effort of managing virtual entities in the context of the overall IT management effort. Just as happened with security management and SOA management. And of course that will involve the acquisition of some of the niche players, for which they are already positioning themselves. The only way I could be proven wrong on such a prediction is by forecasting a date, so I’ll leave it safely open-ended…

As another side note, since I mention Network World maybe I should disclose that I wrote a couple of articles for them (on topics like model-based management) in the past. But when filtering for bias on this blog it’s probably a lot more relevant to keep in mind that I am currently employed by Oracle than to know what journal/magazine I’ve been published in.

Comments Off on Top 10 lists and virtualization management

Filed under Everything, IT Systems Mgmt, Linux, Microsoft, Oracle, OVM, Tech, Virtualization, VMware, XenSource

Virtual machine or fake machine?

In yesterday’s post I wrote a bit about the recently-announced Oracle Virtual Machine. But in the larger scheme, I have always been uncomfortable with the focus on VMware-style virtual machines as the embodiment of “virtualization”. If a VMware VM is a virtual machine, does that mean a Java Virtual Machine (JVM) is not a virtual machine? They are pretty different. When you get a fresh JVM, the first thing you do is not to install an OS on it. To help distinguish them, I think of the VMware style as a “fake machine” and the JVM style as an “abstract machine”. A “fake machine” behaves as similarly as possible to a physical machine and that is a critical part of its value proposition: you can run all the applications that were developed for physical machines and they shouldn’t behave any differently, while at the same time you get some added benefits in terms of saving images, moving images around, more efficiently using your hardware, etc. An “abstract machine”, on the other hand, provides value by defining and implementing a level of abstraction different from that of a physical machine: developing to this level provides you with increased productivity, portability, runtime management capabilities, etc. And then, in addition to these “fake machines” and “abstract machines”, there is the virtualization approach that makes many machines appear as one, often referred to as grid computing. That’s three candidates already for carrying the “virtualization” torch. You can also add Amazon-style storage/computing services (e.g. S3 and EC2) as an even more drastic level of virtualization.

The goal here is not to collect as many buzzwords as possible within one post, but to show how all these efforts represent different ways to attack similar issues of flexibility and scalability for IT. There is plenty of overlap as well. JSRs 121 and 284, for example, can be seen as paving the way for more easily moving JVMs around, VMware-style. Something like Oracle Coherence lives at the junction of JVM-style “abstract machines” and grid computing to deliver data services. And as always, these technologies are backed by a management infrastructure that makes them usable in the way that best serves the applications running on top of the “virtualized” (by one of the definitions above) runtime infrastructure. There is a lot more to virtualization than VMware or OVM.

[UPDATED 2007/03/17: Toutvirtual has a nice explanation of the preponderance of “hypervisor based platforms” (what I call “fake machines” above) due to, among other things, failures of operating systems (especially Windows).]

[UPDATED 2009/5/1: For some reason this entry is attracting a lot of comment spam, so I am disabling comments. Contact me if you’d like to comment.]

1 Comment

Filed under Everything, IT Systems Mgmt, OVM, Virtualization, VMware

Oracle has joined the VM party

On the occasion of the introduction of the Oracle Virtual Machine (OVM) at Oracle OpenWorld a couple of weeks ago, here are a few thoughts about virtual machines in general. As usual when talking about virtualization (see the OVF review), I come to this mainly from a systems management perspective.

Many of the commonly listed benefits of VMware-style (I guess I can also now say OVM-style) virtualization make perfect sense. It obviously makes it easier to test on different platforms/configurations and it is a convenient (modulo disk space availability) way to distribute ready-to-use prototypes and demos. And those were, not surprisingly, the places where the technology was first used when it appeared on x86 platforms many years ago (I’ll note that the Oracle VM won’t be very useful for the second application because it only runs on bare metal, while in the demo scenario you usually want to be able to run it on the host OS that normally runs your laptop). And then there is the server consolidation argument (and associated hardware/power/cooling/space savings) which is where virtualization enters the data center, where it becomes relevant to Oracle, and where its relationship with IT management becomes clear. But the value goes beyond the direct benefits of server consolidation. It also lies in the additional flexibility in the management of the infrastructure and the potential for increased automation of management tasks.

Sentences that contain both the words “challenge” and “opportunity” are usually so corny they make me cringe, but I’ll have to give in this one time: virtualization is both a challenge and an opportunity for IT management. Most of today’s users of virtualization in data centers probably feel that the technology has made IT management harder for them. It introduces many new considerations, at the same time technical (e.g. the performance of virtual machines sharing a host is not independent), compliance-related (e.g. virtualization can create de-facto super-users) and financial (e.g. application licensing). And many management tools have not yet incorporated these new requirements, or at least not in a way that is fully integrated with the rest of the management infrastructure. But in the longer run the increased uniformity and flexibility provided by a virtualized infrastructure raise the ability to automate and optimize management tasks. We will go from a situation where virtualization is justified by statements such as “the savings from consolidation justify the increased management complexity” to a situation where the justification is “we’re doing this for the increased flexibility (through the more automated management that virtualization enables), and server consolidation is icing on the cake”.

As a side note, having so many pieces of the stack (one more now with OVM) at Oracle is very interesting from a technical/architectural point of view. Not that Oracle would want to restrict itself to managing scenarios that utilize its VM, its OS, its App Server, its DB, etc. But having the whole stack in-house provides plenty of opportunity for integration and innovation in the management space. These capabilities also need to be delivered in heterogeneous environments but are a lot easier to develop and mature when you can openly collaborate with engineers in all these domains. Having done this through standards and partnerships in the past, I am pleased to be in a position to have these discussions inside the same company for a change.

1 Comment

Filed under Everything, IT Systems Mgmt, Oracle, Oracle Open World, OVM, Tech, Virtualization, VMware

A review of OVF from a systems management perspective

I finally took a look at OVF, the virtual machine distribution specification that was recently submitted to DMTF. The document is authored by VMware and XenSource, but they are joined in the submission to DMTF by some other biggies, namely Microsoft, HP, IBM and Dell.

Overall, the specification does a good job of going after the low-hanging fruit of VM distribution/portability. And the white paper is very good. I wish I could say that all the specifications I have been associated with came accompanied by such a clear description of what they are about.

I am not a virtualization, operating system or hardware expert. I am mostly looking at this specification from the systems management perspective. More specifically I see virtualization and standardization as two of the many threads that create a great opportunity for increased automation of IT management and more focus on the application rather than the infrastructure (which is part of why I am now at Oracle). Since OVF falls in both the “virtualization” and “standardization” buckets, it got my attention. And the stated goal of the specification (“facilitate the automated, secure management not only of virtual machines but the appliance as a functional unit”, see section 3.1) seems to fit very well with this perspective.

On the other hand, the authors explicitly state that in the first version of the specification they are addressing the package/distribution stage and the deployment stage, not the earlier stage (development) or the later ones (management and retirement). This sidesteps many of the harder issues, which is part of why I write that the specification goes after the low-hanging fruit (nothing wrong with starting that way, BTW).

The other reason for the “low-hanging fruit” statement is that OVF is just a wrapper around proprietary virtual disk formats. It is not a common virtual disk format. I’ve read in several news reports that this specification provides portability across VM platforms. It’s sad but almost expected that the IT press would get this important nuance wrong; it’s more disappointing when analysts (who should know better) do, as for example the Burton Group, which writes in its analysis “so when OVF is supported on Xen and VMware virtualization platforms for example, a VM packaged on a VMware hypervisor can run on a Xen hypervisor, and vice-versa”. That’s only if someone at some point in the chain translates from the Xen virtual disk format to the VMware one. OVF will provide deployment metadata and will allow you to package both virtual disks in a TAR if you so desire, but it will not do the translation for you. And the OVF authors are pretty up front about this (for example, the white paper states that “the act of packaging a virtual machine into an OVF package does not guarantee universal portability or install-ability across all hypervisors”). On a side note, this reminds me a bit of how the Sun/Microsoft Web SSO MEX and Web SSO Interop Profile specifications were supposed to bridge Passport with WS-Federation, which was a huge overstatement. Except that in that case the vendors were encouraging the misconception (which the IT press happily picked up) while in the OVF case it seems like the vendors are upfront about the limitations.
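If you want to see for yourself that the packaging does not touch the disk formats, a quick look inside a TAR-packaged appliance makes the point. This is only an illustrative sketch: the file names are made up and the exact layout depends on whatever tool produced the archive.

import tarfile

# List what a TAR-packaged OVF appliance actually contains: a descriptor plus
# disk images that are still in whatever vendor-specific format they started in.
with tarfile.open("myappliance.tar") as package:   # hypothetical file name
    for member in package.getmembers():
        print(member.name, member.size)

# Expected kind of output (illustrative only):
#   myappliance.ovf       12480        <- deployment metadata
#   webtier-disk1.vmdk    734003200    <- still a VMware virtual disk
#   dbtier-disk1.img      1073741824   <- still a Xen-style disk image
# Nothing in the package translates the .vmdk into something Xen can boot.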

There is no rocket science in OVF and even as a non-virtualization expert it makes sense to me. I was very intrigued by the promise that the specification “directly supports the configuration of multi-tier applications and the composition of virtual machines to deliver composed services” but this turns out to be a bit of an overstatement. Basically, you can distribute the VMs across networks by specifying a network name for each VM. I can easily understand the simple case, where all the VMs are on the same network and talking to one another. But there is no way (that I can see) to specify the network topology that joins different networks together (e.g. saying that there is a firewall between networks “blue” and “red” that only allows traffic on port 80). So why would I create an OVF file that composes several virtual machines if they are going to be deployed on networks that have no relationships to one another? I guess the one use case I can think of would be if one of the virtual machines was assigned to two networks and acted as a gateway/firewall between them. But that’s not a very common and scalable way to run your networks. There is a reason why Cisco sells $30 billion worth of networking gear every year. So what’s the point of this lightweight distributed deployment? Is it just for that use case where the network gear is also virtualized, in the expectation of future progress in that domain? Is this just a common anchor point to be later extended with more advanced network topology descriptions? This looks to me like an attempt to pick a low-hanging fruit that wasn’t there.

Departing from usual practice, this submission doesn’t seem to come with any license grant, which must have greatly facilitated its release and the recruitment of supporters for the submission. But it should be a red flag for adopters. It’s worth keeping track of its IP status as the work progresses. Unless things have changed recently, DMTF’s IP policy is pretty weak, so the fact that work happens there doesn’t guarantee much protection per se to the adopters. Interestingly, there are two sections (6.2 about the virtual disk format and 11.3 about the communication between the guest software and the deployment platform) where the choice of words suggests the intervention of patent lawyers: phrases like “unencumbered specification” (presumably unencumbered with licensing requirements) and “someone skilled in the art”. Which is not surprising since this is the part where the VMware-specific, Xen-specific or Microsoft-specific specifications would plug in.

Speaking of lawyers, the section that allows the EULA to be shipped with the virtual appliance is very simplistic. It’s just a human-readable piece of text in the OVF file. The specification somewhat naively mentions that “if unattended installs are allowed, all embedded license sections are implicitly accepted”. Great, thanks, enterprises love to implicitly accept licensing terms. I would hope that the next version will provide, at least, a way to have a URI to identify the EULA so that I can maintain a list of pre-approved EULAs for which unattended deployment is possible. Automation of IT management is supposed to make things faster and cheaper. Having a busy and expensive lawyer read a EULA as part of my deployment process goes against both objectives.
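As an illustration of the kind of automation I have in mind (this is not part of OVF), here is a minimal sketch of an unattended deployer that only proceeds when every embedded license matches one that has already been reviewed. Since the specification only ships the EULA as free text, hashing that text is about as good as it gets; a URI identifying the license would make this much less brittle.

import hashlib

# Hypothetical allowlist of SHA-256 digests of EULA texts that the legal
# team has already reviewed and approved.
APPROVED_EULA_DIGESTS = {
    "placeholder-digest-of-an-already-reviewed-eula",
}

def eula_approved(license_text):
    digest = hashlib.sha256(license_text.strip().encode("utf-8")).hexdigest()
    return digest in APPROVED_EULA_DIGESTS

def deploy_unattended(embedded_licenses):
    if all(eula_approved(text) for text in embedded_licenses):
        print("all embedded licenses pre-approved, proceeding unattended")
    else:
        print("unknown license text, stopping for (expensive) human review")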

It’s nice of the authors to do the work of formatting the specification using the DMTF-approved DSPxxxx format before submitting it to the organization. But using a targetNamespace in the dmtf.org domain when the specification is just a submission seems pretty tacky to me, unless they got a green light from the DMTF ahead of time. Also, it looks a little crass on the part of VMware to wrap the specification inside their corporate white paper template (cover page and back page) if this is a joint publication. See the links at http://www.vmware.com/appliances/learn/ovf.html. Even though for all I know VMware might have done most of the actual work. That’s why the links that I used to the white paper and the specification are those at XenSource, which offers the plain version. But then again, this specification is pretty much a wrapper around a virtual disk file, so graphically wrapping it may have seemed appropriate…

OK, now for some XML nitpicking.

I am not a fan of leaving elementFormDefault set to “unqualified” but it’s their right to do so. But then they qualify all the attributes in the specification examples. That looks a little awkward to me (I tend to do the opposite and qualify the elements but not the attributes) and, more importantly, it violates the schema in the appendix since the schema leaves attributeFormDefault at its default value (unqualified). I would rather run a validation before making this accusation, but where are the stand-alone XSD files? The white paper states that “it is the intention of the authors to ensure that the first version of the specification is implemented in their products, and so the vendors of virtual appliances and other ISV enablement, can develop to this version of the specification” but do you really expect us to copy/paste from the PDF and then manually remove the line numbers and header/footer content that comes along? Sorry, I have better things to do (like whine about it on this blog) so I haven’t run the validation to verify that the examples are indeed in violation. But that’s at least how they look to me.
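For what it’s worth, the validation run itself would be trivial if the XSD were available as a stand-alone file. Here is a minimal sketch (using lxml, and assuming you have already done the copy/paste chore to produce ovf.xsd and example.ovf, both hypothetical file names) of what I would have run:

from lxml import etree

# Validate one of the specification examples against the schema extracted
# from the appendix.
schema = etree.XMLSchema(etree.parse("ovf.xsd"))
doc = etree.parse("example.ovf")

if schema.validate(doc):
    print("example validates")
else:
    for error in schema.error_log:
        # With attributeFormDefault left at "unqualified", qualified attributes
        # in the examples should show up here as validation errors.
        print(error.line, error.message)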

I also have a problem with the Section and Content elements that are just shells defined by the value of their xsi:type attribute. The authors claim it’s for extensibility (“the use of xsi:type is a core part of making the OVF extensible, since additional type definitions for sections can be added”) but there are better ways to do extensibility in XML (remember, that’s what the X stands for). It would be better to define an element per type (disk, network…). They could possibly be based on the same generic type in XSD. And this way you get more syntactic flexibility and you get the option to have sub-types of sub-types rather than a flat list. Interestingly, there is a comment inside the XSD that defines the Section type that reads “the base class for a section. Subclassing this is the most common form of extensibility”. That’s the right approach, but somehow it got dropped at some point.

Finally, the specification seems to have been formatted based on WS-Management (which is the first specification that mixed the traditional WS-spec conventions with the DMTF DSPxxxx format), which may explain why WS-Management is listed as a reference at the end even though it is not used anywhere in the specification. That’s fine but it shows in a few places where more editing is needed. For example, requirement R1.5-1 states that “conformant services of this specification MUST use this XML namespace Universal Resource Identifier (URI): http://schemas.dmtf.org/ovf”. I know what a conformant service is for WS-Management but I don’t know what it is for this specification. Also, the namespace that this requirement uses is actually not defined or used by this specification, so this requirement is pretty meaningless. The table of namespaces that follows just after is missing some namespaces. For example, the prefix “xsi” is used on line 457 (xsi:any and xsi:AnyAttribute) and I want to say it’s the wrong one, as xsi is usually assigned to “http://www.w3.org/2001/XMLSchema-instance” and not “http://www.w3.org/2001/XMLSchema”, but since the prefix is not in the table I guess it’s anyone’s guess (and BTW, it’s “anyAttribute”, not “AnyAttribute”).

By this point I may sound like I don’t like the specification. Not at all. I still stand by what I wrote in the second paragraph. It’s a good specification and the subset of problems that it addresses is a useful subset. There are a few things to fix in the current content and several more specifications to write to complement it, but it’s a very good first step and I am glad to see VMware and XenSource collaborating on this. Microsoft is nominally in support at this point, but it remains to be seen to what extent. I haven’t seen them in the past be very interested in standards efforts that they are not driving, and so far this doesn’t appear to be something they are driving.

6 Comments

Filed under DMTF, Everything, IT Systems Mgmt, OVF, Specs, Standards, Tech, Virtualization, VMware, XenSource