William Vambenepe's blog

IT management in a changing IT world

viagra no prescriptionlevitra versus viagra

Archive for the 'OVM' Category

31
May
2008

Google App Engine: less is more

by William Vambenepe

“If you have a stove, a saucepan and a bottle of cold water, how can you make boiling water?”

If you ask this question to a mathematician, they’ll think about it a while, and finally tell you to pour the water in the saucepan, light up the stove and put the saucepan on it until the water boils. Makes sense. Then ask them a slightly different question: “if you have a stove and a saucepan filled with cold water, how can you make boiling water?”. They’ll look at you and ask “can I also have a bottle”? If you agree to that request they’ll triumphantly announce: “pour the water from the saucepan into the bottle and we are back to the previous problem, which is already solved.”

In addition to making fun of mathematicians, this is a good illustration of the “fake machine” approach to utility computing embodied by Amazon’s EC2. There is plenty of practical value in emulating physical machines (either in your data center, using VMWare/Xen/OVM or at a utility provider’s site, e.g. EC2). They are all rooted in the fact that there is a huge amount of code written with the assumption that it is running on an identified physical machine (or set of machines), and you want to keep using that code. This will remain true for many many years to come, but is it the future of utility computing?

Google’s App Engine is a clear break from this set of assumptions. From this perspective, the App Engine is more interesting for what it doesn’t provide than for what it provides. As the description of the Sandbox explains:

“An App Engine application runs on many web servers simultaneously. Any web request can go to any web server, and multiple requests from the same user may be handled by different web servers. Distribution across multiple web servers is how App Engine ensures your application stays available while serving many simultaneous users [not to mention that this is also how they keep their costs low -- William]. To allow App Engine to distribute your application in this way, the application runs in a restricted ’sandbox’ environment.”

The page then goes on to succinctly list the limitations of the sandbox (no filesystem, limited networking, no threads, no long-lived requests, no low-level OS functions). The limitations are better described and commented upon here but even that article misses one major limitation, mentioned here: the lack of scheduler/cron.

Rather than a feature-by-feature comparison between the App Engine and EC2 (which Amazon would won handily at this point), what is interesting is to compare the underlying philosophies. Even with Amazon EC2, you don’t get every single feature your local hardware can deliver. For example, in its initial release EC2 didn’t offer a filesystem, only a storage-as-a-service interface (S3 and then SimpleDB). But Amazon worked hard to fix this as quickly as possible in order to be appear as similar to a physical infrastructure as possible. In this entry, announcing persistent storage for EC2, Amazon’s CTO takes pain to highlight this achievement:

“Persistent storage for Amazon EC2 will be offered in the form of storage volumes which you can mount into your EC2 instance as a raw block storage device. It basically looks like an unformatted hard disk. Once you have the volume mounted for the first time you can format it with any file system you want or if you have advanced applications such as high-end database engines, you could use it directly.”

and

“And the great thing is it that it is all done with using standard technologies such that you can use this with any kind of application, middleware or any infrastructure software, whether it is legacy or brand new.”

Amazon works hard to hide (from the application code) the fact that the infrastructure is a huge, shared, distributed system. The beauty (and business value) of their offering is that while the legacy code thinks it is running in a good old data center, the paying customer derives benefits from the fact that this is not the case (e.g. fast/easy/cheap provisioning and reduced management responsibilities).

Google, on the other hand, embraces the change in underlying infrastructure and requires your code to use new abstractions that are optimized for that infrastructure.

To use an automotive analogy, Amazon is offering car drivers to switch to a gas/electric hybrid that refuels in today’s gas stations while Google is pushing for a direct jump to hydrogen fuel cells.

History is rarely kind to promoters of radical departures. The software industry is especially fond of layering the new on top of the old (a practice that has been enabled by the constant increase in underlying computing capacity). If you are wondering why your command prompt, shell terminal or text editor opens with a default width of 80 characters, take a trip back to 1928, when IBM defined its 80-columns punch card format. Will Google beat the odds or be forced to be more accommodating of existing code?

It’s not the idea of moving to a more abstracted development framework that worries me about Google’s offering (JEE, Spring and Ruby on Rails show that developers want this move anyway, for productivity reasons, even if there is no change in the underlying infrastructure to further motivate it). It’s the fact that by defining their offering at the level of this framework (as opposed to one level below, like Amazon), Google puts itself in the position of having to select the right framework. Sure, they can support more than one. But the speed of evolution in that area of the software industry shows that it’s not mature enough (yet?) for any party to guess where application frameworks are going. Community experimentation has been driving application frameworks, and Google App Engine can’t support this. It can only select and freeze a few framework.

Time will tell which approach works best, whether they should exist side by side or whether they slowly merge into a “best of both worlds” offering (Amazon already offers many features, like snapshots, that aim for this “best of both worlds”). Unmanaged code (e.g. C/C++ compiled programs) and managed code (JVM or CLR) have been coexisting for a while now. Traditional applications and utility-enabled applications may do so in the future. For all I know, Google may decide that it makes business sense for them too to offer a Xen-based solution like EC2 and Amazon may decide to offer a more abstracted utility computing environment along the lines of the App Engine. But at this point, I am glad that the leaders in utility computing have taken different paths as this will allow the whole industry to experiment and progress more quickly.

The comparison is somewhat blurred by the fact that the Google offering has not reached the same maturity level as Amazon’s. It has restrictions that are not directly related to the requirements of the underlying infrastructure. For example, I don’t see how the distributed infrastructure prevents the existence of a scheduling service for background jobs. I expect this to be fixed soon. Also, Amazon has a full commercial offering, with a price list and an ecosystem of tools, why Google only offers a very limited beta environment for which you can’t buy extra capacity (but this too is changing).

25
Dec
2007

Top 10 lists and virtualization management

by William Vambenepe

Over the last few months, I have seen two “top 10″ lists with almost the same title and nearly zero overlap in content. One is Network World’s “10 virtualization companies to watch” published in August 2007. The other is CIO’s “10 Virtualization Vendors to Watch in 2008″ published three months later. To be precise, there is only one company present in both lists, Marathon Technologies. Congratulations to them (note to self: hire their PR firm when I start my own company). Things are happening quickly in that field, but I doubt the landscape changed drastically in these three months (even though the announcement of Oracle’s Virtual Machine product came during that period). So what is this discrepancy telling us?

If anything, this is a sign of the immaturity of the emerging ecosystem around virtualization technologies. That being said, it could well be that all this really reflects is the superficiality of these “top 10″ lists and the fact that they measure PR efforts more than any market/technology fact (note to self: try to become less cynical in 2008) (note to self: actually, don’t).

So let’s not read too much into the discrepancy. Less striking but more interesting is the fact that these lists are focused on management tools rather than hypervisors. It is as if the competitive landscape for hypervisors was already defined. And, as shouldn’t be a surprise, it is defined in a way that closely mirrors the operating system landscape, with Xen as Linux (the various Xen-based offerings correspond to the Linux distributions), VMWare as Solaris (good luck) and Microsoft as, well Microsoft.

In the case of Windows and Hyper-V, it is actually bundled as one product. We’ll see this happen more and more on the Linux/Xen side as well, as illustrated by Oracle’s offering. I wouldn’t be surprised to see this bundling so common that people start to refer to it as “LinuX” with a capital X.

Side note: I tried to see if the word “LinuX” is already being used but neither Google nor Yahoo nor MSN seems to support case-sensitive searching. From the pre-Google days I remember that Altavista supported it (a lower-case search term meant “any capitalization”, any upper-case letter in the search term meant “this exact capitalization”) but they seem to have dropped it too. Is this too computationally demanding at this scale? Is there no way to do a case-sensitive search on the Web?

With regards to management tools for virtualized environments, I feel pretty safe in predicting that the focus will move from niche products (like those on these lists) that deal specifically with managing virtualization technology to the effort of managing virtual entities in the context of the overall IT management effort. Just like happened with security management and SOA management. And of course that will involve the acquisition of some of the niche players, for which they are already positioning themselves. The only way I could be proven wrong on such a prediction is by forecasting a date, so I’ll leave it safely open ended…

As another side note, since I mention Network World maybe I should disclose that I wrote a couple of articles for them (on topics like model-based management) in the past. But when filtering for bias on this blog it’s probably a lot more relevant to keep in mind that I am currently employed by Oracle than to know what journal/magazine I’ve been published in.

03
Dec
2007

Virtual machine or fake machine?

by William Vambenepe

In yesterday’s post I wrote a bit about the recently-announced Oracle Virtual Machine. But in the larger scheme, I have always been uncomfortable with the focus on VMWare-style virtual machines as the embodiement of “virtualization”. If a VMWare VM is a virtual machine does that mean a Java Virtual Machine (JVM) is not a virtual machine? They are pretty different. When you get a fresh JVM, the first thing you do is not to install an OS on it. To help distinguish them, I think of the VMWare style as a “fake machine” and the JVM style as an “abstract machine”. A “fake machine” behaves as similarly as possible to a physical machine and that is a critical part of its value proposition: you can run all the applications that were developed for physical machines and they shouldn’t behave any differently while at the same time you get some added benefits in terms of saving images, moving images around, more efficiently using your hardware, etc. An “abstract machine”, on the other hand, provides value by defining and implementing a level of abstraction different from that of a physical machine: developing to this level provides you with increased productivity, portability, runtime management capabilities, etc. And then, in addition to these “fake machines” and “abstract machines”, there is the virtualization approach that makes many machines appear as one, often refered to as grid computing. That’s three candidates already for carrying the “virtualization” torch. You can also add Amazon-style storage/computing services (e.g. S3 and EC2) as an even more drastic level of virtualization.

The goal here is not to collect as many buzzwords as possible within one post, but to show how all these efforts represent different ways to attack similar issues of flexibility and scalability for IT. There is plenty of overlap as well. JSRs 121 and 284, for example, can be seen as paving the way for more easily moving JVMs around, WMWare-style. Something like Oracle Coherence lives at the junction of JVM-style “abstract machines” and grid computing to deliver data services. And as always, these technologies are backed by a management infrastructure that makes them usable in the way that best serves the applications running on top of the “virtualized” (by one of the definitions above) runtime infrastructure. There is a lot more to virtualization than VMWare or OVM.

[UPDATED 2007/03/17: Toutvirtual has a nice explanation of the preponderance of "hypervisor based platforms" (what I call "fake machines" above) due to, among other things, failures of operating systems (especially Windows).]

02
Dec
2007

Oracle has joined the VM party

by William Vambenepe

On the occasion of the introduction of the Oracle Virtual Machine (OVM) at Oracle World a couple of weeks ago, here are a few thoughts about virtual machines in general. As usual when talking about virtualization (see the OVF review), I come to this mainly from a systems management perspective.

Many of the commonly listed benefits of VMWare-style (I guess I can also now say OVM-style) virtualization make perfect sense. It obviously makes it easier to test on different platforms/configurations and it is a convenient (modulo disk space availability) way to distribute ready-to-use prototypes and demos. And those were, not surprisingly, the places where the technology was first used when it appeared on X86 platforms many years ago (I’ll note that the Orale VM won’t be very useful for the second application because it only runs on bare metal while in the demo scenario you usually want to be able to run it on the host OS that normally runs you laptop). And then there is the server consolidation argument (and associated hardware/power/cooling/space savings) which is where virtualization enters the data center, where it becomes relevant to Oracle, and where its relationship with IT management becomes clear. But the value goes beyond the direct benefits of server consolidation. It also lies in the additional flexibility in the management of the infrastructure and the potential for increased automation of management tasks.

Sentences that contains both the words “challenge” and “opportunity” are usually so corny they make me cringe, but I’ll have to give in this one time: virtualization is both a challenge and an opportunity for IT management. Most of today’s users of virtualization in data centers probably feel that the technology has made IT management harder for them. It introduces many new considerations, at the same time technical (e.g. performance of virtual machines on the same host are not independent), compliance-related (e.g. virtualization can create de-facto super-users) and financial (e.g. application licensing). And many management tools have not yet incorporated these new requirements, or at least not in a way that is fully integrated with the rest of the management infrastructure. But in the longer run the increased uniformity and flexibility provided by a virtualized infrastructure raise the ability to automate and optimize management tasks. We will get from a situation where virtualization is justified by statements such as “the savings from consolidation justify the increased management complexity” to a situation where the justification is “we’re doing this for the increased flexibility (through more automated management that virtualization enables), and server consolidation is icing on the cake”.

As a side note, having so many pieces of the stack (one more now with OVM) at Oracle is very interesting from a technical/architectural point of view. Not that Oracle would want to restrict itself to managing scenarios that utilize its VM, its OS, its App Server, its DB, etc. But having the whole stack in-house provides plenty of opportunity for integration and innovation in the management space. These capabilities also need to be delivered in heterogeneous environments but are a lot easier to develop and mature when you can openly collaborate with engineers in all these domains. Having done this through standards and partnerships in the past, I am pleased to be in a position to have these discussions inside the same company for a change.