Category Archives: Portability

Moving towards utility/cloud computing standards?

This Forbes article (via John) channels 3Tera’s Bert Armijo’s call for standardization of utility computing. He calls it “Open Cloud” and it would “allow a company’s IT systems to be shared between different cloud computing services and moved freely between them”. Bert talks a bit more about it on his blog and, while he doesn’t reference the Forbes interview (too modest?), he points to Cloudscape as the vision.

A few early thoughts on all this:

  • No offense to Forbes but I wouldn’t read too much into the article. Being Forbes, they get quotes from a list of well-known people/companies (Google and Amazon spokespeople, a Forrester analyst, Nick Carr). But these quotes all address the generic idea of utility computing standards, not the specifics of Bert’s project.
  • Saying that “several small cloud-computing firms including Elastra and Rightscale are already on board with 3Tera’s standards group” is ambiguous. Are they on board with specific goals and a candidate specification? Or are they on board with the general idea that it might be time to talk about some kind of standard in the general area of utility computing?
  • IEEE and W3C are listed as possible hosts for the effort, but they don’t seem like a very good match for this area. I would have thought of DMTF, OASIS or even OGF first. On the face of it, DMTF might be the best place but I fear that companies like 3Tera, Rightscale and Elastra would be eaten alive by the board member companies there. It would be almost impossible for them to drive their vision to completion, unlike what they can do in an OASIS working group.
  • A new consortium might be an option, but a risky and expensive one. I have sometimes wondered (after seeing sad episodes of well-meaning and capable start-ups being ripped apart by entrenched large vendors in standards groups) why VCs don’t play a more active role in standards. Standards sound like the kind of thing VCs should be helping their companies with. VC firms are pretty used to working together, jointly investing in companies. Creating a new standard consortium might be too hard for 3Tera, but if the VCs behind 3Tera, Elastra and Rightscale got together and looked at the utility computing companies in their portfolios, it might make sense to join forces on some well-scoped standardization effort that may not otherwise be given a chance in existing groups.
  • I hope Bert will look into the history of DCML, a similar effort (it was about data center automation, which utility computing is not that far from once you peel away the glossy pictures) spearheaded by a few best-of-breed companies but ignored by the big boys. It didn’t really take off. If it had, utility computing standards might now be built as an update/extension of that specification. Of course DCML started as a new consortium and ended as an OASIS “member section” (a glorified working group), so take my “create a new consortium and/or OASIS group” suggestion above with a grain of salt.
  • The effort can’t afford to be disconnected from other standards in the virtualization and IT management domains. How does the effort relate to OVF? To WS-Management? To existing modeling frameworks? That’s the main draw towards DMTF as a host.
  • What’s the open source side of this effort? As John mentions during the latest Redmonk/Willis IT management podcast (starting around minute 24), there needs to be an open source side to this. Actually, John thinks all you need is the open source side. Coté brings up Eucalyptus. BTW, if you want an existing combination of standards and open source, have a look at CDDLM (standard) and SmartFrog (implementation, now with EC2/S3 deployment).
  • There seems to be some solid technical raw material to start from. 3Tera’s ADL, combined with Elastra’s ECML/EDML, presumably captures a fair amount of field expertise already. But when you think of them as a starting point to standardization, the mindset needs to switch from “what does my product need to work” to “what will the market adopt that also helps my product to work”.
  • One big question (at least from my perspective) is that of the line between infrastructure and applications. Call me biased, but I think this effort should focus on the infrastructure layer. And provide hooks to allow application-level automation to drive it.
  • The other question is with regard to the management aspects of the resulting system and the role management plays in whatever standard specification comes out of Bert’s effort.

Bottom line: I applaud Bert’s efforts but I couldn’t sleep well tonight if I didn’t also warn him that “there be dragons”.

And for those who haven’t seen it yet, here is a very good document on the topic (but it is focused on big vendors, not on how smaller companies can play the standards game).

[UPDATED 2008/6/30: A couple hours after posting this, I see that Coté has just published a blog post that elaborates on his view of cloud standards. As an addition to the podcast I mentioned earlier.]

[UPDATED 2008/7/2: If you read this in your feed viewer (rather than directly on vambenepe.com) and you don’t see the comments, you should go have a look. There are many clarifications and some additional insight from the best authorities on the topic. Thanks a lot to all the commenters.]

20 Comments

Filed under Amazon, Automation, Business, DMTF, Everything, Google, Google App Engine, Grid, HP, IBM, IT Systems Mgmt, Mgmt integration, Modeling, OVF, Portability, Specs, Standards, Utility computing, Virtualization

Google App Engine: less is more

“If you have a stove, a saucepan and a bottle of cold water, how can you make boiling water?”

If you ask this question to a mathematician, they’ll think about it a while, and finally tell you to pour the water in the saucepan, light up the stove and put the saucepan on it until the water boils. Makes sense. Then ask them a slightly different question: “if you have a stove and a saucepan filled with cold water, how can you make boiling water?”. They’ll look at you and ask: “can I also have a bottle?” If you agree to that request they’ll triumphantly announce: “pour the water from the saucepan into the bottle and we are back to the previous problem, which is already solved.”

In addition to making fun of mathematicians, this is a good illustration of the “fake machine” approach to utility computing embodied by Amazon’s EC2. There is plenty of practical value in emulating physical machines (either in your data center, using VMware/Xen/OVM, or at a utility provider’s site, e.g. EC2). They are all rooted in the fact that there is a huge amount of code written with the assumption that it is running on an identified physical machine (or set of machines), and you want to keep using that code. This will remain true for many, many years to come, but is it the future of utility computing?

Google’s App Engine is a clear break from this set of assumptions. From this perspective, the App Engine is more interesting for what it doesn’t provide than for what it provides. As the description of the Sandbox explains:

“An App Engine application runs on many web servers simultaneously. Any web request can go to any web server, and multiple requests from the same user may be handled by different web servers. Distribution across multiple web servers is how App Engine ensures your application stays available while serving many simultaneous users [not to mention that this is also how they keep their costs low — William]. To allow App Engine to distribute your application in this way, the application runs in a restricted ‘sandbox’ environment.”

The page then goes on to succinctly list the limitations of the sandbox (no filesystem, limited networking, no threads, no long-lived requests, no low-level OS functions). The limitations are better described and commented upon here, but even that article misses one major limitation, mentioned here: the lack of a scheduler/cron.
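
To make the sandbox concrete, here is a minimal sketch of what a request handler looked like with the Python SDK of the time (the Greeting model and the URL mapping are made up for illustration). Note what is absent: no file writes, no background threads, no server affinity. State goes to the distributed datastore, so any of the many web servers can handle the next request.

```python
from google.appengine.ext import db, webapp
from google.appengine.ext.webapp.util import run_wsgi_app

# State lives in the distributed datastore, not on a local disk:
# any of the web servers running the app can read and write it.
class Greeting(db.Model):
    content = db.StringProperty()
    date = db.DateTimeProperty(auto_now_add=True)

class MainPage(webapp.RequestHandler):
    def get(self):
        # No filesystem, no threads, no long-lived requests:
        # fetch from the datastore and return quickly.
        greetings = Greeting.all().order('-date').fetch(10)
        for g in greetings:
            self.response.out.write('<p>%s</p>' % g.content)

    def post(self):
        Greeting(content=self.request.get('content')).put()
        self.redirect('/')

application = webapp.WSGIApplication([('/', MainPage)])

def main():
    run_wsgi_app(application)

if __name__ == '__main__':
    main()
```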

Rather than a feature-by-feature comparison between the App Engine and EC2 (which Amazon would win handily at this point), what is interesting is to compare the underlying philosophies. Even with Amazon EC2, you don’t get every single feature your local hardware can deliver. For example, in its initial release EC2 didn’t offer a filesystem, only a storage-as-a-service interface (S3 and then SimpleDB). But Amazon worked hard to fix this as quickly as possible in order to appear as similar to a physical infrastructure as possible. In this entry, announcing persistent storage for EC2, Amazon’s CTO takes pains to highlight this achievement:

“Persistent storage for Amazon EC2 will be offered in the form of storage volumes which you can mount into your EC2 instance as a raw block storage device. It basically looks like an unformatted hard disk. Once you have the volume mounted for the first time you can format it with any file system you want or if you have advanced applications such as high-end database engines, you could use it directly.”

and

“And the great thing is it that it is all done with using standard technologies such that you can use this with any kind of application, middleware or any infrastructure software, whether it is legacy or brand new.”

Amazon works hard to hide (from the application code) the fact that the infrastructure is a huge, shared, distributed system. The beauty (and business value) of their offering is that while the legacy code thinks it is running in a good old data center, the paying customer derives benefits from the fact that this is not the case (e.g. fast/easy/cheap provisioning and reduced management responsibilities).
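
For a sense of how far this “look like a regular machine” philosophy goes, here is a rough sketch of provisioning such a volume with the boto Python library (the instance ID, size and zone are placeholders; formatting and mounting then happen inside the instance with ordinary OS tools):

```python
import boto

# Hypothetical example: carve a raw block device out of Amazon's huge
# shared storage system and hand it to one specific instance.
conn = boto.connect_ec2()  # AWS credentials come from the environment

volume = conn.create_volume(10, 'us-east-1a')  # 10 GB, same zone as the instance
volume.attach('i-12345678', '/dev/sdh')        # placeholder instance ID

# From here, inside the instance, it "basically looks like an
# unformatted hard disk": mkfs it, mount it, or give it raw to a
# high-end database engine, just as on physical hardware.
```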

Google, on the other hand, embraces the change in underlying infrastructure and requires your code to use new abstractions that are optimized for that infrastructure.

To use an automotive analogy, Amazon is offering car drivers the chance to switch to a gas/electric hybrid that refuels in today’s gas stations, while Google is pushing for a direct jump to hydrogen fuel cells.

History is rarely kind to promoters of radical departures. The software industry is especially fond of layering the new on top of the old (a practice that has been enabled by the constant increase in underlying computing capacity). If you are wondering why your command prompt, shell terminal or text editor opens with a default width of 80 characters, take a trip back to 1928, when IBM defined its 80-column punch card format. Will Google beat the odds or be forced to be more accommodating of existing code?

It’s not the idea of moving to a more abstracted development framework that worries me about Google’s offering (JEE, Spring and Ruby on Rails show that developers want this move anyway, for productivity reasons, even if there is no change in the underlying infrastructure to further motivate it). It’s the fact that by defining their offering at the level of this framework (as opposed to one level below, like Amazon), Google puts itself in the position of having to select the right framework. Sure, they can support more than one. But the speed of evolution in that area of the software industry shows that it’s not mature enough (yet?) for any party to guess where application frameworks are going. Community experimentation has been driving application frameworks, and Google App Engine can’t support this. It can only select and freeze a few frameworks.

Time will tell which approach works best, whether they should exist side by side or whether they slowly merge into a “best of both worlds” offering (Amazon already offers many features, like snapshots, that aim for this “best of both worlds”). Unmanaged code (e.g. C/C++ compiled programs) and managed code (JVM or CLR) have been coexisting for a while now. Traditional applications and utility-enabled applications may do so in the future. For all I know, Google may decide that it makes business sense for them too to offer a Xen-based solution like EC2 and Amazon may decide to offer a more abstracted utility computing environment along the lines of the App Engine. But at this point, I am glad that the leaders in utility computing have taken different paths as this will allow the whole industry to experiment and progress more quickly.

The comparison is somewhat blurred by the fact that the Google offering has not reached the same maturity level as Amazon’s. It has restrictions that are not directly related to the requirements of the underlying infrastructure. For example, I don’t see how the distributed infrastructure prevents the existence of a scheduling service for background jobs. I expect this to be fixed soon. Also, Amazon has a full commercial offering, with a price list and an ecosystem of tools, while Google only offers a very limited beta environment for which you can’t buy extra capacity (but this too is changing).

2 Comments

Filed under Amazon, Everything, Google, Google App Engine, OVM, Portability, Tech, Utility computing, Virtualization, VMware

IT management for the personal CIO

In the previous post, I described how one can easily run their own web applications to beneficially replace many popular web sites. It was really meant as background for the present article, which is more relevant to the “IT management” topic of this blog.

Despite my assertion that recent developments (and the efforts of some hosting providers) have made the proposition of running your own web apps “easy”, it is still not as easy as it should be. What IT management tools would a “personal CIO” need to manage their personal web applications? Here are a few scenarios:

  • get a catalog of available applications that can be installed and/or updated
  • analyze technical requirements (e.g. PHP version) of an application and make sure it can be installed on your infrastructure
  • migrate data and configuration between comparable applications (or different versions of the same application)
  • migrate applications from one hosting provider to another
  • back-up/snapshot data and configuration
  • central access to application stats/logs in simple format
  • uptime, response time monitoring
  • central access to user management (share users and configure across all your applications)
  • domain name management (registration, renewal)

As the CIO of my personal web applications, I don’t need to see Linux patches that need to be applied or network latency problems. If my hosting provider doesn’t take care of these without me even noticing, I am moving to another provider. What I need to see are the controls that make sense to a user of these applications. Many of the bullets listed above correspond to capabilities that are available today, but in a very brittle and hard-to-put-together form. My hosting provider has a one-click update feature but they have a limited application catalog. I wouldn’t trust them to measure uptime and response time for my sites, but there are third party services that do it. I wouldn’t expect my hosting provider to make it easy to move my apps to a new hosting provider, but it would be nice if someone else offered this. Etc. A neutral web application management service for the “personal CIO” could bring all this together and more. While I am at it, it could also help me backup/manage my devices and computers at home and manage/monitor my DSL or cable connection.
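
Some of these controls are almost trivial to prototype, which makes their absence as an integrated service all the more frustrating. For example, a bare-bones version of the uptime/response-time item in the list above could be a cron-driven script like this (a sketch only; the URLs are placeholders, and a real service would keep history, alert you, and probe from several locations):

```python
import socket
import time
import urllib2

socket.setdefaulttimeout(10)  # don't hang forever on a dead site

# Placeholder list of the web apps a "personal CIO" cares about.
SITES = ['http://example.com/blog/', 'http://example.com/photos/']

for url in SITES:
    start = time.time()
    try:
        urllib2.urlopen(url).read()
        print '%s is up (%.2f seconds)' % (url, time.time() - start)
    except Exception, e:
        print '%s is DOWN: %s' % (url, e)
```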

1 Comment

Filed under Everything, IT Systems Mgmt, Portability, Tech

My web apps and me

Registering a domain name: $10 per year
Hosting it with all the features you may need: $80 per year
Controlling your on-line life: priceless

To be frank, the main reason that I do not use Facebook or MySpace is that I am not very social to start with. But, believe it or not, I have a few friends and family members with whom I share photos and personal stories. Not to mention this blog for different kinds of friends and different kinds of stories (you are missing out on the cute toddler photos).

Rather than doing so on Facebook, MySpace, BlogSpot, Flickr, Picasa or whatever the Microsoft copies of these sites are, I maintain a couple of blogs and on-line photo albums on vambenepe.com. They all provide user access control and RSS-based syndication so no-one has to come to vambenepe.com just to check on them. No annoying advertising, no selling out of privacy and no risk of being jerked around by bait-and-switch (or simply directionless) business strategies (“in order to serve you better, we have decided that you will no longer be able to download the high-resolution version of your photos, but you can use them to print with our approved print-by-mail partners”). Have you noticed how people usually do not say “I use Facebook” but rather “I am on Facebook” as if riding a mechanical bull?

The interesting thing is that it doesn’t take a computer genius to set things up in such a way. I use Dreamhost and it, like similar hosting providers, gives you all you need. From the super-easy (e.g. they run WordPress for you) to the slightly more personal (they provide a one-click install of your own WordPress instance backed by your own database) to the do-it-yourself (they give you a PHP or RoR environment to create/deploy whatever app you want). Sure you can further upgrade to a dedicated server if you want to install a servlet container or a CodeGears environment, but my point is that you don’t need to come anywhere near this to own and run your own on-line life. You never need to see a Unix shell, unless you want to.

This is not replacing Facebook lock-in with Dreamhost lock-in. We are talking about an open-source application (WordPress) backed by a MySQL database. I can move it to any other hosting provider. And of course it’s not just blogging (WordPress) but also wiki (MediaWiki), forum (phpBB), etc.
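
That claim of portability is easy to back up: a WordPress site boils down to a MySQL dump plus a directory of files. A nightly cron job along these lines (the paths, database name and credentials are all placeholders) produces a backup that any other hosting provider can restore:

```python
import os
import time

# Placeholder names for a self-hosted WordPress instance.
stamp = time.strftime('%Y%m%d')
os.system('mysqldump -u wp_user -pSECRET wp_db > wp-db-%s.sql' % stamp)
os.system('tar czf wp-files-%s.tar.gz ~/example.com/wordpress/' % stamp)
# To move providers: create a MySQL database there, load the dump,
# untar the files and adjust wp-config.php. That's the whole migration.
```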

Not that every shiny new on-line service can be replaced with a self-hosted application. You may have to wait a bit. For example, there is more to Facebook than a blog plus photo hosting. But guess what. Sounds like Bob Bickel is on the case. I very much hope that Bob and the ex-Bluestone gang aren’t just going to give us a “Facebook in a box” but also something more innovative, that makes it easy for people to run and own their side of a Facebook-like presence, with the ability to connect with other implementations for the social interactions.

We have always been able to run our own web applications, but it used to be a lot of work. My college nights were soothed by the hum of an always-running Linux server (actually a desktop used as a server) under my desk, on which I ran my own SMTP server and HTTPd. My daughter’s “soothing ocean waves” baby toy sounds just the same. There were no turnkey web apps available at the time. I wrote and ran my own Web-based calendar management application in Python. When I left campus, I could have bought some co-location service but it was a hassle and not cheap, so I didn’t bother [*].

I have a lot less time (and Linux administration skills) now than when I left university, so how come it is now attractive for me to run my own web apps again? What changed in the environment?

The main driver is the rise of the LAMP stack and especially PHP. For all the flaws of the platform and the ugliness of the code, PHP has sparked a huge ecosystem. Not just in terms of developers but also of administrators: most hosting providers are now very comfortable offering and managing PHP services.

The other driver is the rise of virtualization. Amazon hosts Xen images for you. But it’s not just the hypervisor version of virtualization. My Dreamhost server, for example, is not a Xen or VMware virtual machine. It’s just a regular server that I share with other users, but Dreamhost has created an environment that provides enough isolation from other users to meet my needs as an individual. The poor man’s virtualization, if you will. Good enough.

These two trends (PHP and virtualization) have allowed Dreamhost and others to create an easy-to-use environment in which people can run and deploy web applications. And it becomes easier every day for someone to compete with Dreamhost on this. Their value to me is not in the hardware they run. It’s in the environment they provide, which saves me from the low-level LAMP administration I don’t have time for. Someone could create such an environment and run it on top of Amazon’s utility computing offering. Which is why I am convinced that such environments will be around for the foreseeable future, Dreamhost or no Dreamhost. Running your own web applications won’t be just for geeks anymore, just like using a GPS is not just for geeks anymore.

Of course this is not a panacea and it won’t allow you to capture all aspects of your on-line life. You can’t host your eBay ratings. You can’t host your Amazon rank as a reviewer. It takes more than just technology to break free, but technology has underpinned many business changes before. In addition to the rise of LAMP and virtualization already mentioned, I am watching with interest the different efforts around data portability: dataportability.org, OpenID, OpenSocial, Facebook API… Except for OpenID, these efforts are driven by Web service providers hoping to channel the demand for integration. But if they are successful, they should give rise to open source applications you can host on your own to enjoy these services without the lock-in. One should also watch tools like WSO2’s Mashup Server and JackBe Presto for their potential to rescue captive data and exploit freed data. On the “social networks” side, the RDF community has been buzzing recently with news that Google is now indexing FOAF documents and exposing the content through its OpenSocial interface.

Bottom line, when you are invited to create a page, account or URL that will represent you or your data, take a second to ask yourself what it would take to do the same thing under your own domain name. You don’t need to be a survivalist freak hiding in a mountain cabin in Montana (“it’s been eight years now, I wonder if they’ve started to rebuild cities after the Y2K apocalypse…”) to see value in more self-reliance on the web, especially when it can be easily achieved.

Yes, there is a connection between this post and the topic of this blog, IT management. It will be revealed in the next post (note to self: work on your cliffhangers).

[*] Some of my graduating colleagues took their machines to the dorm basement and plugged them into a switch there. Those Linux Slackware machines had amazing uptimes of months and years. Their demise didn’t come from bugs, hacking or component failures (even when cats made their litter inside a running computer with an open case) but the fire marshal, and only after a couple of years (the network admins had agreed to turn a blind eye).

[UPDATED 2008/7/7: Oh, yeah, another reason to run your own apps is that you won’t end up threatened with jail time for violating the terms of service. You can still end up in trouble if you misbehave, but they’ll have to charge you with something more real, not a whatever-sticks approach.]

[UPDATED 2009/12/30: Ringside (the Bob Bickel endeavor that I mention above), closed a few months after this post. Too bad. We still need what they were working on.]

2 Comments

Filed under Everything, Portability, Tech, Virtualization

SCA is not just for code portability

(updated on 2007/10/4, see bottom of the article)

David Chappell (not the same person as the Oracle-employed Dave Chappell from my previous post) has a blog entry explaining why there would be little value if Microsoft implemented SCA. The entry is reasonable but, like this follow-up by Stephan Tilkov, it focuses on clarifying the difference between portability (for which SCA helps) and interoperability (for which SCA doesn’t help very much). Seeing it from the IT management point of view, I see another advantage to SCA: it’s a machine readable description of the logic of the composite application, at a useful level of granularity for application and service management. This is something I can use in my application infrastructure to better understand relationships and dependencies. It brings the concepts of the application world to a higher level of abstraction (than servlets, beans, rows etc), one in which I can more realistically automate tasks such as policy propagation, fail-over automation, impact analysis, etc.
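
For readers who have never looked inside one, here is roughly what such a machine-readable description looks like (a hypothetical SCA 1.0 composite; the component and class names are invented for illustration). The components, their wiring and their implementation technologies are all explicit, which is exactly the granularity a management tool needs for dependency and impact analysis:

```xml
<composite xmlns="http://www.osoa.org/xmlns/sca/1.0"
           name="OrderProcessing">
  <component name="OrderService">
    <implementation.java class="com.example.OrderServiceImpl"/>
    <!-- This wire is what lets management software map dependencies
         without reverse-engineering servlets, beans and rows. -->
    <reference name="payment" target="PaymentService"/>
  </component>
  <component name="PaymentService">
    <implementation.java class="com.example.PaymentServiceImpl"/>
  </component>
</composite>
```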

As a result, even if this were an Oracle-only technology, I would still be encouraging Greg and others to build it into the Oracle app server so that I can better manage applications written on that stack. And I would still encourage the Oracle Fusion applications to take advantage of it, for the same reason.

In that perspective, going back to Dave Chappell’s question, would there be value if Microsoft implemented SCA? I think so. It would make it a lot easier for me, and all the management vendors, to efficiently manage composite applications that have components running on both Microsoft and Oracle, for example. I believe Microsoft will need a component model for composite applications and I am sure Don Box has his ideas on this (he’s not yet ready to share his opinion on Dave’s question as you can see). I know of the SML-based work that is being driven by the System Center guys at Microsoft and they see SML as playing that role across applications and infrastructure. I don’t know how much they’ve convinced Don and others that this is the right way.

From an IT management perspective, portability of code doesn’t buy me very much. Portability of my ability to introspect composite applications and consume their metadata independently of the stack they are built on, on the other hand, is of great value. Otherwise we’ll only be able to build automated and optimized application and service management software for the Oracle stack. Which, I guess, would not be a bad first step…

[UPDATED 2007/10/4: If this topic is of interest to you, you might want to go back to some of the links above to read the comments. David Chappell and I had a little back-and-forth in the comments section of his post, and likewise with Don Box in his. In addition, Hartmut Wilms at InfoQ provides his summary of the discussion.]

3 Comments

Filed under Everything, IT Systems Mgmt, Portability, SCA