Defining Cloud from the provider perspective

I have a new definition for Cloud Computing. No, really.

Many discussions have attempted to define Cloud Computing from the perspective of the consumer, to the point where asking “what’s a Cloud” has become a private joke for “let’s waste some time”. Eventually, people settled on the NIST set of definitions, either because they like them (probability 0.1), because they got tired of arguing (probability 0.4) or because they want to sell to the government (probability 0.5).

Well, I have another one. Mine is a definition from the perspective of the Cloud provider (or the creator of Cloud-enablement software). And it’s a simple one.

A Cloud is a computing environment in which the runtime infrastructure and the management infrastructure are indistinguishable.

Ask engineers at Google App Engine to separate their code between the runtime part and the management part. They might not even understand the question.

For companies (like Oracle, where I work) that have a runtime division (Fusion Middleware for us) and a management division (Enterprise Manager), both of which ship products, it’s a challenge.

For companies which only offer one or the other, it’s a huge challenge.

For engineers who have to put it all together, it’s a great time to be in business.

5 Comments

Filed under Cloud Computing, Everything, IT Systems Mgmt, Utility computing

To each decade its Web abomination

In the 1990s, we got the imagemap. Some people decided to build an entire Web page as a giant JPEG backed by an imagemap for links. Mercifully, those are all gone, I think.

In the 2000s, we got Flash. Plenty of people decided that their whole site would be a Flash application. Some of these sites are still around, their owners only now realizing the error.

In the 2010s, we got AJAX. As could be expected, we are now seeing sites built entirely as JavaScript applications. Which is just as wrong.

Comments Off on To each decade its Web abomination

Filed under Everything, Flash, JavaScript

The API, the whole API and nothing but the API

When programming against a remote service, do you like to be provided with a library (or service stub) or do you prefer “the API, the whole API, nothing but the API”?

A dedicated library (assuming it is compatible with your programming language of choice) is the simplest way to get invocations flowing. On the other hand, if you expect your client to last longer than one night of tinkering then you’re usually well-advised to resist making use of such a library in your code. Save yourself license issues, support issues, packaging issues and lifecycle issues. Also, decide for yourself what the right interaction model with the remote API is for your app.

One of the key motivations of SOAP was to prevent having to get stubs from the service provider. That remains an implicit design goal of the recent HTTP APIs (often called “RESTful”). You should be able to call the API directly from your application. If you use a library, e.g. an authentication library, it’s a third-party library, not one provided by the service provider you are trying to connect to.

So are provider-provided (!) libraries always bad? Not necessarily; they can be a good learning/testing tool. Just don’t try to actually embed them in your app. Use them to generate queries on the wire that you can learn from. In that context, a nice feature of these libraries is the ability to write out the exact message that they put on the wire, so you don’t have to intercept it yourself (especially if messages are encrypted on the wire). Even better if you can see the library code, but even as a black box they are a pretty useful way to clarify the more obscure parts of the API.
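Even when a library doesn’t offer such a wire dump, you can usually get one from the HTTP layer underneath it. A minimal Python sketch, using only the standard library (example.com stands in for the real API endpoint you are learning):

```python
import urllib.request

# A debug-enabled opener prints the request line and headers as they go over
# the wire, which is exactly the kind of trace you want when learning an API.
opener = urllib.request.build_opener(
    urllib.request.HTTPHandler(debuglevel=1),
    urllib.request.HTTPSHandler(debuglevel=1),
)

# Placeholder URL; in practice this would be the call the provider's library
# (or your hand-written client) makes against the remote API.
response = opener.open("https://example.com/")
print(response.status, len(response.read()))
```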

A few closing comments:

– In a way, this usage pattern is similar to a tool like the WLST Recorder in the WebLogic Administration Console. You perform the actions using the familiar environment of the Console, and you get back a set of WLST commands as a starting point for writing your script. When you execute your script, there is no functional dependency on the recorder, it’s a WLST script like any other.

– While we’re talking about downloadable libraries that are primarily used as a learning/testing tool, a test endpoint for the API would be nice too (either as part of the library or as a hosted service at a well-known URL). In the case of most social networks, you can create a dummy account for testing; but some other services can’t be tested in a way that is as harmless and inexpensive.

– This question of provider-supplied libraries is one of the reasons why I lament the use of the term “API” as it is currently prevalent. Call me old-fashioned, but to me the “API” is the programmatic interface (e.g. the Java interface) presented by the library. The on-the-wire contract is, in my world, called a service contract or a protocol. As in, the Twitter protocol, or the Amazon EC2 protocol, etc… But then again, I was also the last one to accept using the stupid term “Cloud Computing” instead of “Utility Computing”. Twitter conversations don’t offer the luxury of articulating such reticence so I’ve given up and now use “Cloud Computing” and “API” in the prevalent way.

[UPDATE: How timely! Seconds after publishing this entry I noticed a new trackback on a previous entry on this blog (Cloud APIs are like military parades). The trackback is an article from ProgrammableWeb, asking the exact same question I am addressing here: Should Cloud APIs Focus on Client Libraries More Than Endpoints?]

9 Comments

Filed under API, Automation, Everything, Implementation, Protocols, REST, SOAP

The REST bubble

Just yesterday I was writing about how Cloud APIs are like military parades. To some extent, their REST rigor is a way to enforce implementation discipline. But a large part of it is mostly bling aimed at showing how strong (for an army) or smart (for an API) the people in charge are.

Case in point, APIs that have very simple requirements and yet make a big deal of the fact that they are perfectly RESTful.

Just today, I learned (via the ever-informative InfoQ) about the JBoss SteamCannon project (a PaaS wrapper for Java and Ruby apps that can deploy on different host infrastructures like EC2 and VirtualBox). The project looks very interesting, but the API doc made me shake my head.

The very first thing you read is three paragraphs telling you that the API is fully HATEOAS (Hypermedia as the Engine of Application State) compliant (our URLs are opaque, you hear me, opaque!) and an invitation to go read Roy’s famous take-down of those other APIs that unduly call themselves RESTful even though they don’t give HATEOAS any love.

So here I am, a developer trying to deploy my WAR file on SteamCannon and that’s the API document I find.

Instead of the REST finger-wagging, can I have a short overview of what functions your API offers? Or maybe an example of a request call and its response?

I don’t mean to pick on SteamCannon specifically, it just happens to be a new Cloud API that I discovered today (all the Cloud APIs out there also spend too much time telling you how RESTful they are and not enough time showing you how simple they are). But when an API document starts with a REST lesson and when PowerPoint-waving sales reps pitch “RESTful APIs” to executives, you know this REST thing has gone way beyond anything related to “the fundamentals”.

We have a REST bubble on our hands.

Again, I am not criticizing REST itself. I am criticizing its religious and ostentatious application rather than its practical use based on actual requirements (this was my take on its practical aspects in the context of Cloud APIs).

14 Comments

Filed under API, Application Mgmt, Cloud Computing, Everything, JBoss, Mgmt integration, Protocols, REST, Utility computing

Cloud APIs are like military parades

The previous post (“Amazon proves that REST doesn’t matter for Cloud APIs”) attracted some interesting comments on the blog itself, on Hacker News and in a response post by Mike Pearce (where I assume the photo is supposed to represent me being an AWS fanboy). I failed to promptly follow up on it and address the response, then the holidays came. But Mark Little was kind enough to pick the entry up for discussion on InfoQ yesterday, which brought new readers and motivated me to write a follow-up.

Mark did a very good job at summarizing my point and he understood that I wasn’t talking about the value (or lack of value) of REST in general. Just about whether it is useful and important in the very narrow field of Cloud APIs. In that context at least, what seems to matter most is simplicity. And REST is not intrinsically simpler.

In most places, it isn’t a controversial statement that RPC is easier than REST for developers performing simple tasks. But in the blogosphere, I guess it needs to be argued.

Method calls are how normal developers write normal code. Doing it over the wire is the smallest change needed to invoke a remote API. The complexity with RPC has never been conceptual, it’s been in the plumbing. How do I serialize my method call and send it over? CORBA, RMI and SOAP tried to address that; none of them fully succeeded in keeping it simple and yet generic enough for the Internet. XML-RPC somehow (and unfortunately) got passed over in the process.

So what did AWS do? They pretty much solved that problem by using parameters in the URL as a dead-simple way to pass function parameters. And you get the response as an XML doc. In effect, it’s one-half of XML-RPC. Amazon did not invent this pattern. And the mechanism has some shortcomings. But it’s a pragmatic approach. You get the conceptual simplicity of RPC, without the need to agree on an RPC framework that tries to address way more than what you need. Good deal.
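If you want to see how little there is to it, here is a minimal Python sketch of that interaction pattern. The endpoint, action name and parameter values are made up, and real AWS calls also require request signing, which I’m leaving out:

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# RPC over HTTP, Query-API style: the "method call" is encoded as plain URL
# parameters and the result comes back as an XML document.
params = {
    "Action": "CreateKeyPair",   # the remote "function" to invoke
    "KeyName": "my-test-key",    # its argument
    "Version": "2010-08-31",     # illustrative API version
}
url = "https://api.example-cloud.com/?" + urllib.parse.urlencode(params)

with urllib.request.urlopen(url) as response:  # real AWS calls also carry a signature
    doc = ET.fromstring(response.read())       # parse the XML response

# Pull the value you care about out of the response document (illustrative element name).
print(doc.findtext(".//keyMaterial"))
```

No stubs, no framework, no generated code. That is the entire client stack.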

So, when Mike asks “Does the fact that AWS use their own implementation of an API instead of a standard like, oh, I don’t know, REST, frustrate developers who really don’t want to have to learn another method of communicating with AWS?” and goes on to answer “Yes”, I scratch my head. I’ve met many developers struggling to understand REST. I’ve never met a developer intimidated by RPC. As to the claim that REST is a “standard”, I’d like to read the spec. Please don’t point me to a PhD dissertation.

That being said, I am very aware that simplicity can come back to bite you, when it’s not just simple but simplistic and the task at hand demands more. Andrew Wahbe hit the nail on the head in a comment on my original post:

Exposing an API for a unique service offered by a single vendor is not going to get much benefit from being RESTful.

Revisit the issue when you are trying to get a single client to work across a wide range of cloud APIs offered by different vendors; I’m willing to bet that REST would help a lot there. If this never happens — the industry decides that a custom client for each Cloud API is sufficient (e.g. not enough offerings on the market, or whatever), then REST may never be needed.

Andrew has the right perspective. The usage patterns for Cloud APIs may evolve to the point where the benefits of following the rules of REST become compelling. I just don’t think we’re there and frankly I am not holding my breath. There are lots of functional improvements needed in Cloud services before the burning issue becomes one of orchestrating between Cloud providers. And while a shared RESTful API would be the easiest to orchestrate, a shared RPC API will still be very reasonably manageable. The issue will mostly be one of shared semantics more than protocol.

Mike’s second retort was that it was illogical for me to say that software developers are mostly isolated from REST because they use Cloud libraries. Aren’t these libraries written by developers? What about them, he asks. Well, one of those library developers, Boto‘s Mitch Garnaat, left a comment:

Good post. The vast majority of AWS (or any cloud provider’s) users never see the API. They interact through language libraries or via web-based client apps. So, the only people who really care are RESTafarians, and library developers (like me).

Perhaps it’s possible to have an API that’s so bad it prevents people from using it but the AWS Query API is no where near that bad. It’s fairly consistent and pretty easy to code to. It’s just not REST.

Yup. If REST is the goal, then this API doesn’t reach it. If usefulness is the goal, then it does just fine.

Mike’s third retort was to take issue with that statement I made:

The Rackspace people are technically right when they point out the benefits of their API compared to Amazon’s. But it’s a rounding error compared to the innovation, pragmatism and frequency of iteration that distinguishes the services provided by Amazon. It’s the content that matters.

Mike thinks that

If Rackspace are ‘technically’ right, then they’re right. There’s no gray area. Morally, they’re also right and mentally, physically and spiritually, they’re right.

Sure. They’re technically, mentally, physically and spiritually right. They may even be legally, ethically, metaphysically and scientifically right. Amazon is only practically right.

This is not a ding on Rackspace. They’ll have to compete with Amazon on service (and price), not on API, as they well know and as they are doing. But they are racing against a fast horse.

More generally, the debate about how much the technical merits of an API matter (beyond the point where it gets the job done) is a recurring one. I am talking as a recovering over-engineer.

In a post almost a year ago, James Watters declared that it matters. Mitch Garnaat weighed in on the other side: given how few people use the raw API we probably spend too much time worrying about details, maybe we worry too much about aesthetics, I still wonder whether we obsess over the details of the API’s a bit too much (in case you can’t tell, I’m a big fan of Mitch).

Speaking of people I admire, Shlomo Swidler (in general, only library developers use the raw HTTP. Everyone else uses a library) and Joe Arnold (library integration (fog / jclouds / libcloud) is more important for new #IaaS providers than an API) make the right point. Rather than spending hours obsessing about the finer points of your API, spend the time writing love letters to Mitch and Adrian so they support you in their libraries (also, allocate less of your design time to RESTfulness and more to the less glamorous subject of error handling).

OK, I’ll pile on two more expert testimonies. RightScale’s Thorsten von Eicken (the API itself is more a programming exercise than a fundamental issue, it’s the semantics of the resources behind the API that really matter) and F5’s Lori MacVittie (the World Doesn’t Care About APIs).

Bottom line, I see APIs a bit like military parades. Soldiers know better than to march onto the battlefield in tight formation, wearing bright colors, to the sound of fanfare. So why are parade exercises so prevalent in all armies? My guess is that they are used to impress potential enemies, reassure citizens and reflect the strength of the country’s leaders. But military parades are also a way to ensure internal discipline. You may not need to use parade moves on the battlefield, but the fact that the unit is disciplined enough to perform them means it is also disciplined enough for the tasks that matter. Let’s focus on that angle for Cloud APIs. If your RPC API is consistent enough that its underlying model could be used as the basis for a REST API, you’re probably fine. You don’t need the drum rolls, stiff steps and the silly hats. And no need to salute either.

15 Comments

Filed under Amazon, API, Automation, Cloud Computing, Everything, IT Systems Mgmt, Mgmt integration, Protocols, REST, Specs, Utility computing

Amazon proves that REST doesn’t matter for Cloud APIs

Every time a new Cloud API is announced, its “RESTfulness” is heralded as if it was a MUST HAVE feature. And yet, the most successful of all Cloud APIs, the AWS API set, is not RESTful.

We are far enough down the road by now to conclude that this isn’t a fluke. It proves that REST doesn’t matter, at least for Cloud management APIs (there are web-scale applications of an entirely different class for which it does). By “doesn’t matter”, I don’t mean that it’s a bad choice. Just that it is not significantly different from reasonable alternatives, like RPC.

AWS mostly uses RPC over HTTP. You send HTTP GET requests, with instructions like ?Action=CreateKeyPair added in the URL. Or DeleteKeyPair. Same for any other resource (volume, snapshot, security group…). Amazon doesn’t pretend it’s RESTful, they just call it “Query API” (except for the DevPay API, where they call it “REST-Query” for unclear reasons).

Has this lack of RESTfulness stopped anyone from using it? Has it limited the scale of systems deployed on AWS? Does it limit the flexibility of the Cloud offering and somehow force people to consume more resources than they need? Has it made the Amazon Cloud less secure? Has it restricted the scope of platforms and languages from which the API can be invoked? Does it require more experienced engineers than competing solutions?

I don’t see any sign that the answer is “yes” to any of these questions. Considering the scale of the service, it would be a multi-million dollar blunder if the answer to even one of them were yes.

Here’s a rule of thumb. If most invocations of your API come via libraries for object-oriented languages that more or less map each HTTP request to a method call, it probably doesn’t matter very much how RESTful your API is.
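To illustrate the rule of thumb: the method call a developer actually writes looks exactly the same whether the library hides a Query-style request or a RESTful one underneath. A hypothetical sketch in Python:

```python
import urllib.parse
import urllib.request

class CloudClient:
    """Hypothetical client library: each method wraps one HTTP request."""

    def __init__(self, endpoint):
        self.endpoint = endpoint

    def create_key_pair(self, name):
        # Whether this issues GET /?Action=CreateKeyPair&KeyName=... (RPC style)
        # or POST /keypairs (REST style) is invisible to the caller.
        query = urllib.parse.urlencode({"Action": "CreateKeyPair", "KeyName": name})
        with urllib.request.urlopen(f"{self.endpoint}/?{query}") as resp:
            return resp.read()

# From the developer's point of view it is just a method call:
# client = CloudClient("https://api.example-cloud.com")   # made-up endpoint
# client.create_key_pair("my-test-key")
```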

The Rackspace people are technically right when they point out the benefits of their API compared to Amazon’s. But it’s a rounding error compared to the innovation, pragmatism and frequency of iteration that distinguishes the services provided by Amazon. It’s the content that matters.

If you think it’s rich, for someone who wrote a series of posts examining “REST in practice for IT and Cloud management” (part 1, part 2 and part 3), to now declare that REST doesn’t matter, well, go back to these posts. I explicitly set them up as an effort to investigate whether (and in what way) it mattered, and made it clear that my intuition was that actual RESTfulness didn’t matter as much as simplicity, the AWS API being an example of the latter without the former. As I wrote in my review of the Sun Cloud API, “it’s not REST that matters, it’s the rest”. One and a half years later, I think the case is closed.

17 Comments

Filed under Amazon, Application Mgmt, Cloud Computing, Everything, Implementation, Mgmt integration, REST, Specs, Utility computing

Nice incremental progress in Google App Engine SDK 1.4

When Google released version 1.3.8 of the Google App Engine SDK in October, they introduced an instance console, showing you how many instances are serving your application and some basic metrics about these instances. I wrote a blog post considering the implications of providing this level of visibility to application administrators. That post also pointed out some shortcomings of this first version of the console.

The most glaring problem was that the console showed an “average latency” which was just a straight average of the latencies of all the instances, regardless of how much traffic each one handled. Which is a meaningless number.

Today, Google released an update to the SDK (1.4), and along with it some minor updates to the instance console. Except that, as you can see below, the screen capture in their announcement happens to show three instances that have processed exactly the same number of messages. Which means that we can’t tell whether they have fixed the “unweighted average” problem or not. Is this just by chance? Google, WTF? (which stands for “what’s the formula?”, of course).

I decided it was worth spending a few minutes to find the answer. I don’t have any app currently in use on GAE, but it doesn’t take much work to generate enough load to wake up one of my old apps and get it to spin up a couple of instances. Here is the resulting instance console:

If you run the numbers, you can see that they’ve fixed that issue; the average latency is now weighted based on instance traffic. Thanks Google for listening.

Apparently, not all the updates have trickled down to my version of the instance console. The “requests”, “errors” and “age” columns are missing. I assume they’re on their way. Seeing the age of the instances, especially, is a nice addition, one of those I requested in my blog.

In the grand scheme of things, these minor updates to the console (which remains quite basic) are not the big news. The major announcement with SDK 1.4 is that the dreaded 30 seconds limit on execution time has been lifted for background tasks (those from Task Queue and Cron). It’s now a much more manageable 10 minutes. This doesn’t apply to the execution of Web requests served by your app.

Google App Engine has been under criticism recently, and that 30-second limit (along with reliability issues) figured prominently in the complaints. Assuming the reliability issues are also coming under control, this update will go a long way towards addressing the criticism.

Just so you realize how lucky you are if you are just now starting with Google App Engine, here are the kind of hoops you had to jump through, in the early days, to process any task that took a significant amount of time. This was done a year before the Cron and Task Queue features were added to GAE.

Another nice addition with SDK 1.4 is that you can now retrieve the source code of your application from Google’s servers. Of course you should never need that if you are rigorous and well-organized… Presumably this is only for Python since in the Java case Google’s servers never see the source code.

The steady progress of the GAE SDK continues.

1 Comment

Filed under Application Mgmt, Cloud Computing, Everything, Google, Google App Engine, IT Systems Mgmt, Manageability, Middleware, PaaS, Utility computing

Partial resource update, one more time

Alex Scordellis has a good blog post about how to handle partial PUT in REST. It starts by explaining why partial PUT is needed in the first place. And then (including in the comments) it runs into the issues this brings and proposes some solutions.

I have bad news. There are many more issues.

Let’s pick a simple example. What does it mean if an element is not present in a partial update? Is it an explicit omission, intended to signal that this element should be removed from the representation? Or does it mean “don’t change its current value”? If the latter, then how do I do removal? Do I need partial DELETE like I have partial PUT? Hopefully not, but then I have to have a mechanism to remove elements as part of a PUT. Empty value? That doesn’t necessarily mean the same thing as an absent element. Nil value? And how do I handle this with JSON?

And how do you deal with repeating elements? If you PUT an element of that type, is it an addition or a replacement? If replacement, which one(s) are you replacing? Or do you force me to PUT the entire list? No matter how long it is? Even if it increases the risk of concurrency issues?
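A small example of the ambiguity, in JSON/Python terms (field names made up):

```python
import json

stored = {"name": "web-tier", "min_instances": 2, "admin_email": "ops@example.com"}

# A partial update arrives. What did the client mean?
patch = json.loads('{"min_instances": 3, "admin_email": null}')

# Interpretation 1: absent field = "leave as is", null = "remove the field".
# Interpretation 2: null = "set to null", and removal needs some other signal.
# Interpretation 3 (repeating elements make it worse): if "admin_email" were a
# list, is the new value an addition, a replacement of one entry, or of them all?
updated = dict(stored)
for key, value in patch.items():
    if value is None:
        updated.pop(key, None)   # interpretation 1; the others are just as defensible
    else:
        updated[key] = value

print(updated)   # {'name': 'web-tier', 'min_instances': 3}
```

Nothing in the PUT itself tells you which interpretation the client had in mind; that is exactly the kind of thing a partial update mechanism has to pin down.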

Lots of similar issues. These two are just off the top of my head, memories from hours locked in a room with my HP, IBM, Intel and Microsoft accomplices.

You know what you end up with? You end up with this: partial PUT in WS-RT. I can hear you scream from here.

I am the ghost of dead partial update mechanisms, coming back to haunt you…

As much as WS-* was criticized for re-inventing HTTP, what we see here is HTTP people re-inventing partial resource update mechanisms like those in WSDM, WS-Management and WS-ResourceTransfer. Which is fine, I am in no way advocating that they should re-use these specs.

But let’s realize that while a lot of the complexity in WS-* was unnecessary, some of it actually was a reflection of the complexity of the task at hand. And that complexity doesn’t go away because you get rid of a SOAP envelope and of stupid WS-Addressing headers.

The good news is that we’ve made a lot of the mistakes already and we’ve learned some lessons (see this technical rant, this post-mortem or this experiment). The bad news is that there are plenty of new mistakes waiting to be made.

Good luck. I mean it sincerely.

7 Comments

Filed under API, Everything, IT Systems Mgmt, Manageability, Protocols, REST, Specs, Tech, WS-Management, WS-ResourceTransfer, WS-Transfer, XMLFrag

Cloud management is to traditional IT management what spreadsheets are to calculators

It’s all in the title of the post. An elevator pitch short enough for a 1-story ride. A description for business people. People who don’t want to hear about models, virtualization, blueprints and devops. But people who also don’t want to be insulted with vague claims about “business/IT alignment” and “agility”.

The focus is on repeatability. Repeatability saves work and allows new approaches. I’ve found spreadsheets (and “super-spreadsheets”, i.e. more advanced BI tools) to be a good analogy for business people. Compared to analysts furiously operating calculators, spreadsheets save work and prevent errors. But beyond these cost savings, they allow you to do things you wouldn’t even try to do without them. It’s not just the same process, done faster and cheaper. It’s a more mature way of running your business.

Same with the “Cloud” style of IT management.

3 Comments

Filed under Application Mgmt, Automation, Big picture, Business, Cloud Computing, DevOps, Everything, IT Systems Mgmt, Manageability, Mgmt integration, Utility computing

Lifting the curtain on PaaS Cloud infrastructure (can you handle the truth?)

The promise of PaaS is that application owners don’t need to worry about the infrastructure that powers the application. They just provide application artifacts (e.g. WAR files) and everything else is taken care of. Backups. Scaling. Infrastructure patching. Network configuration. Geographic distribution. Etc. All these headaches are gone. Just pick from a menu of quality of service options (and the corresponding price list). Make your choice and forget about it.

In theory.

In practice no abstraction is leak-proof and the abstractions provided by PaaS environments are even more porous than average. The first goal of PaaS providers should be to shore them up, in order to deliver on the PaaS value proposition of simplification. But at some point you also have to acknowledge that there are some irreducible leaks and take pragmatic steps to help application administrators deal with them. The worst thing you can do is have application owners suffer from a leaky abstraction and refuse to even acknowledge it because it breaks your nice mental model.

Google App Engine (GAE) gives us a nice and simple example. When you first deploy an application on GAE, it is deployed as just one instance. As traffic increases, a second instance comes up to handle the load. Then a third. If traffic decreases, one instance may disappear. Or one of them may just go away for no reason (that you’re aware of).

It would be nice if you could deploy your application on what looks like a single, infinitely scalable, machine and not ever have to worry about horizontal scale-out. But that’s just not possible (at a reasonable cost) so Google doesn’t try particularly hard to hide the fact that many instances can be involved. You can choose to ignore that fact and your application will still work. But you’ll notice that some requests take a lot more time to complete than others (which is typically the case for the first request to hit a new instance). And some requests will find an empty local cache even though your application has had uninterrupted traffic. If you choose to live with the “one infinitely scalable machine” simplification, these are inexplicable and unpredictable events.

Last week, as part of the release of the GAE SDK 1.3.8, Google went one step further in acknowledging that several instances can serve your application, and helping you deal with it. They now give you a console (pictured below) which shows the instances currently serving your application.

I am very glad that they added this console, because it clearly puts on the table the question of how much your PaaS provider should open the kimono. What’s the right amount of visibility, somewhere between “one infinitely scalable computer” and giving you fan speeds and CPU temperature?

I don’t know what the answer is, but unfortunately I am pretty sure this console is not it. It is supposed to be useful “in debugging your application and also understanding its performance characteristics“. Hmm, how so exactly? Not only is this console very simple, it’s almost useless. Let me enumerate the ways.

Misleading

Actually it’s worse than useless, it’s misleading. As we can see on the screen shot, two of the instances saw no traffic during the collection period (which, BTW, we don’t know the length of), while the third one did all the work. At the top, we see an “average latency” value. Averaging latency across instances is meaningless if you don’t weight it properly. In this case, all the requests went to the instance that had an average latency of 1709ms, but apparently the overall average latency of the application is 569.7ms (yes, that’s 1709/3). Swell.
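For the record, here are the two formulas side by side, with made-up request counts consistent with the screen shot (one busy instance, two idle ones):

```python
# (requests, average latency in ms) per instance, as in the screen shot:
# one instance did all the work, the two others were idle.
instances = [(125, 1709.0), (0, 0.0), (0, 0.0)]   # request counts are invented

# Unweighted average across instances (what the console reports):
unweighted = sum(latency for _, latency in instances) / len(instances)

# Traffic-weighted average (the meaningful number):
total_requests = sum(reqs for reqs, _ in instances)
weighted = sum(reqs * latency for reqs, latency in instances) / total_requests

print(round(unweighted, 1))  # 569.7 -- looks great, means nothing
print(round(weighted, 1))    # 1709.0 -- what users actually experienced
```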

No instance identification

What happens when the console is refreshed? Maybe there will only be two instances. How do I know which one went away? Or say there are still three, how do I know these are the same three? For all I know it could be one old instance and two new ones. The single most important data point (from the application administrator’s perspective) is when a new instance comes up. I have no way, in this UI, to know reliably when that happens: no instance identification, no indication of the age of an instance.

Average memory

So we get the average memory per instance. What are we supposed to do with that information? What’s a good number, what’s a bad number? How much memory is available? Is my app memory-bound, CPU-bound or IO-bound on this instance?

Configuration management

As I have described before, change and configuration management in a PaaS setting is a thorny problem. This console doesn’t tackle it. Nowhere does it say which version of the GAE platform each instance is running. Google announces GAE SDK releases (the bits you download), but these releases are mostly made of new platform features, so they imply a corresponding update to Google’s servers. That can’t happen instantly; there must be some kind of roll-out (whether the instances can be hot-patched or need to be recycled). Which means that the instances of my application are transitioned from one platform version to another (and presumably that at a given point in time all the instances of my application may not be using the same platform version). Maybe that’s the source of my problem. Wouldn’t it be nice if I knew which platform version an instance runs? Wouldn’t it be nice if my log files included that? Wouldn’t it be nice if I could request an app to run on a specific platform version for debugging purposes? Sure, in theory all the upgrades are backward-compatible, so it “shouldn’t matter”. But as explained above, “the worst thing you can do is have application owners suffer from a leaky abstraction and refuse to even acknowledge it“.

OK, so the instance monitoring console Google just rolled out is seriously lacking. As is too often the case with IT monitoring systems, it reports what is convenient to collect, not what is useful. I’m sure they’ll fix it over time. What this console does well (and really the main point of this blog) is illustrate the challenge of how much information about the underlying infrastructure should be surfaced.

Surface too little and you leave application administrators powerless. Surface more data but no control and you’ll leave them frustrated. Surface some controls (e.g. a way to configure the scaling out strategy) and you’ve taken away some of the PaaS simplicity and also added constraints to your infrastructure management strategy, making it potentially less efficient. If you go down that route, you can end up with the other flavor of PaaS, the IaaS-based PaaS in which you have an automated way to create a deployment but what you hand back to the application administrator is a set of VMs to manage.

That IaaS-centric PaaS is a well-understood beast, to which many existing tools and management practices can be applied. The “pure PaaS” approach pioneered by GAE is much more of a terra incognita from a management perspective. I don’t know, for example, whether exposing the platform version of each instance, as described above, is a good idea. How leaky is the “platform upgrades are always backward-compatible” assumption? Google, and others, are experimenting with the right abstraction level, APIs, tools, and processes to expose to application administrators. That’s how we’ll find out.

1 Comment

Filed under Application Mgmt, Automation, Cloud Computing, Everything, Google App Engine, IT Systems Mgmt, Manageability, Mgmt integration, Middleware, PaaS, Utility computing

Redeeming the service description document

A bicycle is a convenient way to go buy cigarettes. Until one day you realize that buying cigarettes is a bad idea. At which point you throw away your bicycle.

Sounds illogical? Well, that’s pretty much what the industry has done with service descriptions. It went this way: people used WSDL (and stub generation tools built around it) to build distributed applications that shared too much. Some people eventually realized that was a bad approach. So they threw out the whole idea of a service description. And now, in the age of APIs, we are no more advanced than we were 15 years ago in terms of documenting application contracts. It’s tragic.

The main fallacies involved in this stagnation are:

  • Assuming that service descriptions are meant to auto-generate all-encompassing program stubs,
  • Looking for the One True Description for a given service,
  • Automatically validating messages based on the service description.

I’ll leave the first one aside, it’s been widely covered. Let’s drill in a bit into the other two.

There is NOT One True Description for a given service

Many years ago, in the same galaxy where we live today (only a few miles from here, actually), was a development team which had to implement a web service for a specific WSDL. They fed the WSDL to their SOAP stack. This was back in the days when WSDL interoperability was a “promise” in the “political campaign” sense of the term, so of course it didn’t work. As a result, they gave up on their SOAP stack and implemented the service as a servlet. Which, for a team new to XML, meant a long delay and countless bugs. I’ll always remember the look on the dev lead’s face when I showed him how 2 minutes and a text editor were all you needed to turn the offending WSDL into a completely equivalent WSDL (from the point of view of on-the-wire messages) that their toolkit would accept.

(I forgot what the exact issue was, maybe having operations with different exchange patterns within the same PortType; or maybe it used an XSD construct not supported by the toolkit, and it was just a matter of removing this constraint and moving it from schema to code. In any case something that could easily be changed by editing the WSDL and the consumer of the service wouldn’t need to know anything about it.)

A service description is not the literal word of God. That’s true no matter where you get it from (unless it’s hand-delivered by an angel, I guess). Just because adding “?wsdl” to the URL of a Web service returns an XML document doesn’t mean it’s The One True Description for that service. It’s just the most convenient one to generate for the app server on which the service is deployed.

One of the things that most hurts XML as an on-the-wire format is XSD. But not in the sense that “XSD is bad”. Sure, it has plenty of warts, but what really hurts XML is not XSD per se as much as the all-too-common assumption that if you use XML you need to have an XSD for it (see fat-bottomed specs, the key message of which I believe is still true even though SML and SML-IF are now dead).

I’ve had several conversations like this one:

– The best part about using JSON and REST was that we didn’t have to deal with XSD.
– So what do you use as your service contract?
– Nothing. Just a human-readable wiki page.
– If you don’t need a service contract, why did you feel like you had to write an XSD when you were doing XML? Why not have a similar wiki page describing the XML format?
– …

It’s perfectly fine to have service descriptions that are optimized to meet a specific need rather than automatically focusing on syntax validation. Not all consumers of a service contract need to be using the same description. It’s ok to have different service descriptions for different users and/or purposes. Which takes us to the next fallacy. What are service descriptions for if not syntax validation?

A service description does NOT mean you have to validate messages

As helpful as “validation” may seem as a concept, it often boils down to rejecting messages that could otherwise be successfully processed. Which doesn’t sound quite as useful, does it?

There are many other ways in which service descriptions could be useful, but they have been largely neglected because of the focus on syntactic validation and stub generation. Leaving aside development use cases and looking at my area of focus (application management), here are a few use cases for service descriptions:

Creating test messages (aka “synthetic transactions”)

A common practice in application management is to send test messages at regular intervals (often from various locations, e.g. branch offices) to measure the availability and response time of an application from the consumer’s perspective. If a WSDL is available for the service, we use it to generate the skeleton of the test message and let the admin fill in appropriate values. Rather than a WSDL, we’d much rather have a “ready-to-use” (possibly after admin review) test message that would be provided as part of the service description. Especially as it would be defined by the application creator, who presumably knows a lot more about what makes a safe and yet relevant message to send to the application as a ping.
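For illustration, here is a minimal sketch of the consuming side, assuming the service description (or the application team) hands us such a ready-to-use test message; the URL and payload below are placeholders:

```python
import time
import urllib.request

# Ready-to-use test message, ideally supplied with the service description
# by the application creator (placeholder values here).
PING_URL = "https://app.example.com/orders/ping"
PING_BODY = b'<ping xmlns="urn:example:orders">synthetic</ping>'

def synthetic_transaction():
    """Send one test message and report availability and response time."""
    request = urllib.request.Request(
        PING_URL, data=PING_BODY, headers={"Content-Type": "text/xml"})
    start = time.monotonic()
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            ok = 200 <= response.status < 300
    except OSError:   # network errors and HTTP errors both count as "down"
        ok = False
    return ok, time.monotonic() - start

# Run at regular intervals (from branch offices, etc.) and record the results.
# print(synthetic_transaction())
```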

Attaching policies and SLAs

One of the things that WSDLs are often used for, beyond syntax validation and stub generation, is to attach policies and SLAs. For that purpose, you really don’t need the XSD message definition that makes up so much of the WSDL. You really just need a way to identify operations on which to attach policies and SLAs. We could use a much simpler description language than WSDL for this. But if you throw away the very notion of a description language, you’ve thrown away the baby (a classification of the requests processed by the service) along with the bathwater (a syntax validation mechanism).
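To make this concrete, here is a sketch (in Python, with invented names and thresholds) of the kind of lighter-weight description I mean: just enough to classify requests into operations and hang SLAs and policies off them, with no message schema anywhere in sight.

```python
# A deliberately minimal service description: no XSD, no stubs, just enough
# to put incoming requests into named buckets and attach policies to them.
SERVICE_DESCRIPTION = {
    "service": "order-processing",
    "operations": [
        {"name": "submitOrder",
         "match": {"method": "POST", "path": "/orders"},
         "sla": {"p95_latency_ms": 500, "availability": "99.9%"},
         "policies": ["authenticated", "audited"]},
        {"name": "checkStatus",
         "match": {"method": "GET", "path": "/orders/*"},
         "sla": {"p95_latency_ms": 200},
         "policies": ["authenticated"]},
    ],
}

def classify(method, path):
    """Map an incoming request to the operation bucket its SLA applies to."""
    for op in SERVICE_DESCRIPTION["operations"]:
        rule = op["match"]
        if method == rule["method"] and _path_matches(rule["path"], path):
            return op["name"]
    return "unclassified"

def _path_matches(pattern, path):
    # Toy matcher: a trailing "*" matches any suffix.
    if pattern.endswith("*"):
        return path.startswith(pattern[:-1])
    return path == pattern

# classify("POST", "/orders")     -> "submitOrder"
# classify("GET", "/orders/1234") -> "checkStatus"
```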

Governance / versioning

One benefit of having a service description document is that you can see when it changes. Even if you reduce this to a simple binary value (did it change since I last checked, y/n) there’s value in this. Even better if you can introspect the description document to see which requests are affected by the change. And whether the change is backward-compatible. Offering the “before” XSD and the “after” XSD is almost useless for automatic processing. It’s unlikely that some automated XSD inspection can tell me whether I can keep using my previous messages or I need to update them. A simple machine-readable declaration of that fact would be a lot more useful.
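The kind of machine-readable declaration I have in mind could be as small as this (field names invented for illustration):

```python
# Shipped alongside version 2.1 of the service description:
CHANGE_DECLARATION = {
    "previous_version": "2.0",
    "new_version": "2.1",
    "backward_compatible": True,             # existing messages keep working
    "affected_operations": ["checkStatus"],  # only these requests changed
}
```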

I just listed three, but there are other application management use cases, like governance/auditing, that need a service description.

In the SOAP world, we usually make do with WSDL for these tasks, not because it’s the best tool (all we really need is a way to classify requests in “buckets” – call them “operations” if you want – based on the content of the message) but because WSDL is the only understanding that is shared between the caller and the application.

By now some of you may have already drafted in your head the comment you are going to post explaining why this is not a problem if people just use REST. And it’s true that with REST there is a default categorization of incoming messages. A simple matrix with the various verbs as columns (GET, POST…) and the various resource types as rows. Each message can be unambiguously placed in one cell of this matrix, so I don’t need a service description document to have a request classification on which I can attach SLAs and policies (a toy version of this default classification is sketched right after the list below). Granted, but keep these three things in mind:

  • This default categorization by verb and resource type can be quite granular. Typically you wouldn’t have that many different policies on your application. So someone who understands the application still needs to group the invocations into message categories at the right level of granularity.
  • This matrix is only meaningful for the subset of “RESTful” apps that are truly… RESTful. Not for all the apps that use REST where it’s an easy mental mapping but then define resource types called “operations” or “actions” that are just a REST veneer over RPC.
  • Even if using REST was a silver bullet that eliminated the need for service definitions, as an application management vendor I don’t get to pick the applications I manage. I have to have a solution for what customers actually do. If I restricted myself to only managing RESTful applications, I’d shrink my addressable market by a few orders of magnitude. I don’t have an MBA, but it sounds like a bad idea.
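Here is the toy version of that default verb-by-resource-type classification, for what it’s worth; it assumes the common /{resource-type}/{id} path convention and, as noted above, falls apart for apps that only wear a REST veneer:

```python
def default_bucket(method, path):
    """Classify a request by (HTTP verb, resource type) only.

    Assumes the /{resource-type}/{id}/... path convention; apps that tunnel
    RPC through paths like /api?action=... defeat this immediately.
    """
    segments = [s for s in path.split("/") if s]
    resource_type = segments[0] if segments else ""
    return (method, resource_type)

# default_bucket("GET", "/orders/1234") -> ("GET", "orders")
# default_bucket("POST", "/orders")     -> ("POST", "orders")
# Fine-grained by default: someone still has to group these cells into the
# handful of categories that policies and SLAs are actually written against.
```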

This is not a SOAP versus REST post. This is not an XML versus JSON post. This is not a WSDL versus WADL post. This is just a post lamenting the fact that the industry seems to have either boxed service definitions into a very limited use case, or given up on them altogether. If I wasn’t recovering from standards burnout, I’d look into a versatile mechanism for describing application services in a way that is geared towards message classification more than validation.

Comments Off on Redeeming the service description document

Filed under API, Application Mgmt, Everything, Governance, IT Systems Mgmt, Manageability, Mashup, Mgmt integration, Middleware, Modeling, Protocols, REST, SML, SOA, Specs, Standards

Exalogic, EC2-on-OVM, Oracle Linux: The Oracle Open World early recap

Among all the announcements at Oracle Open World so far, here is a summary of those I was the most impatient to blog about.

Oracle Exalogic Elastic Cloud

This was the largest part of Larry’s keynote; he called it “one big honkin’ cloud”. An impressive piece of hardware (360 2.93GHz cores, 2.8TB of RAM, 960GB SSD, 40TB disk for one full rack) with excellent InfiniBand connectivity between the nodes. And you can extend the InfiniBand connectivity to other Exalogic and/or Exadata racks. The whole package is optimized for the Oracle Fusion Middleware stack (WebLogic, Coherence…) and managed by Oracle Enterprise Manager.

This is really just the start of a long lineage of optimized, pre-packaged, simplified (for application administrators and infrastructure administrators) application platforms. Management will play a central role and I am very excited about everything Enterprise Manager can and will bring to it.

If “Exalogic Elastic Cloud” is too taxing to say, you can shorten it to “Exalogic” or even just “EL”. Please, just don’t call it “E2C”. We don’t want to get into a trademark fight with our good friends at Amazon, especially since the next important announcement is…

Run certified Oracle software on OVM at Amazon

Oracle and Amazon have announced that AWS will offer virtual machines that run on top of OVM (Oracle’s hypervisor). Many Oracle products have been certified in this configuration; AMIs will soon be available. There is a joint support process in place between Amazon and Oracle. The virtual machines use hard partitioning and the licensing rules are the same as those that apply if you use OVM and hard partitioning in your own datacenter. You can transfer licenses between AWS and your data center.

One interesting aspect is that there is no extra fee on Amazon’s part for this. Which means that you can run an EC2 VM with Oracle Linux on OVM (an Oracle-tested combination) for the same price (without Oracle Linux support) as some other Linux distribution (also without support) on Amazon’s flavor of Xen. And install any software, including non-Oracle, on this VM. This is not the primary intent of this partnership, but I am curious to see if some people will take advantage of it.

Speaking of Oracle Linux, the next announcement is…

The Unbreakable Enterprise Kernel for Oracle Linux

In addition to the RedHat-compatible kernel that Oracle has been providing for a while (and will keep supporting), Oracle will also offer its own Linux kernel. I am not enough of a Linux geek to get teary-eyed about the birth announcement of a new kernel, but here is why I think this is an important milestone. The stratification of the application runtime stack is largely a relic of the past, when each layer had enough innovation to justify combining them as you see fit. Nowadays, the innovation is not in the hypervisor, in the OS or in the JVM as much as it is in how effectively they all combine. JRockit Virtual Edition is a clear indicator of things to come. Application runtimes will eventually be highly integrated and optimized. No more scheduler on top of a scheduler on top of a scheduler. If you squint, you’ll be able to recognize aspects of a hypervisor here, aspects of an OS there and aspects of a JVM somewhere else. But it will be mostly of interest to historians.

Oracle has by far the most expertise in JVMs and over the years has built a considerable amount of expertise in hypervisors. With the addition of Solaris and this new milestone in Linux access and expertise, what we are seeing is the emergence of a company for which there will be no technical barrier to innovation on making all these pieces work efficiently together. And, unlike many competitors who derive most of their revenues from parts of this infrastructure, no revenue-protection handcuffs hampering innovation either.

Fusion Apps

Larry also talked about Fusion Apps, but I believe he plans to spend more time on this during his Wednesday keynote, so I’ll leave this topic aside for now. Just remember that Enterprise Manager loves Fusion Apps.

And what about Enterprise Manager?

We don’t have many attention-grabbing Enterprise Manager product announcements at Oracle Open World 2010, because we had a big launch of Enterprise Manager 11g earlier this year, in which a lot of new features were released. Technically these are not Oracle Open World news anymore, but many attendees have not seen them yet so we are busy giving demos, hands-on labs and presentations. From an application and middleware perspective, we focus on end-to-end management (e.g. from user experience to BTM to SOA management to Java diagnostic to SQL) for faster resolution, application lifecycle integration (provisioning, configuration management, testing) for lower TCO and unified coverage of all the key parts of the Oracle portfolio for productivity and reliability. We are also sharing some plans and our vision on topics such as application management, Cloud, support integration etc. But in this post, I have chosen to only focus on new product announcements. Things that were not publicly known 48 hours ago. I am also not covering JavaOne (see Alexis). There is just too much going on this week…

Just kidding, we like it this way. And so do the customers I’ve been talking to.

Comments Off on Exalogic, EC2-on-OVM, Oracle Linux: The Oracle Open World early recap

Filed under Amazon, Application Mgmt, Cloud Computing, Conference, Everything, Linux, Manageability, Middleware, Open source, Oracle, Oracle Open World, OVM, Tech, Trade show, Utility computing, Virtualization, Xen

The PaaS Lament: In the Cloud, application administrators should administrate applications

Some organizations just have “systems administrators” in charge of their applications. Others call out an “application administrator” role, but it is usually overloaded: it doesn’t separate the application platform administrator from the true application administrator. The first one is in charge of the application runtime infrastructure (e.g. the application server, SOA tools, MDM, IdM, message bus, etc). The second is in charge of the applications themselves (e.g. Java applications and the various artifacts that are used to customize the middleware stack to serve the application).

In effect, I am describing something close to the split between the DBA and the application administrators. The first step is to turn this duo (app admin, DBA) into a triplet (app admin, platform admin, DBA). That would be progress, but such a triplet is not actually what I am really after as it is too strongly tied to a traditional 3-tier architecture. What we really need is a first-order separation between the application administrator and the infrastructure administrators (note the plural). And then, if needed, a second-order split between a handful of different infrastructure administrators, one of which may be a DBA (or a DBA++, having expanded to all data storage services, not just relational), another of which may be an application platform administrator.

There are two reasons for the current unfortunate amalgam of the “application administrator” and “application platform administrator” roles. A bad one and a good one.

The bad reason is a shortcoming of the majority of middleware products. While they generally do a good job on performance, reliability and developer productivity, they generally do a poor job of providing a clean separation between the performance/administration functions that are relevant to the runtime and those that are relevant to the deployed applications. Their usual role definitions are structured more along the lines of which actions you can perform than of which entities you can perform them on. From a runtime perspective, the applications are not well isolated from one another either, which means that in real life you have to consider the entire system (the middleware and all deployed applications) if you want to make changes in a safe way.

The good reason for the current lack of separation between application administrators and middleware administrators is that middleware products have generally done a good job of supporting development innovation and optimization. Frameworks appear and evolve to respond to the challenges encountered by developers. Knobs and dials are exposed which allow heavy customization of the runtime to meet the performance and feature needs of a specific application. With developers driving what middleware is used and how it is used, it’s a natural consequence that the middleware is managed in tight correlation with how the application is managed.

Just like there is tension between DBAs and the “application people” (application administrators and/or developers), there is an inherent tension in the split I am advocating between application management and application platform management. The tension flows from the previous paragraph (the “good reason” for the current amalgam): a split between application administrators and application platform administrators would have the downside of dampening application platform innovation. Or rather it redirects it, in a mutation not unlike the move from artisans to industry. Rather than focusing on highly-specialized frameworks and highly-tuned runtimes, the application platform innovation is redirected towards the goals of extreme cost efficiency, high reliability, consistent security and scalability-by-default. These become the main objectives of the application platform administrator. In that perspective, the focus of the application architect and the application administrator needs to switch from taking advantage of the customizability of the runtime to optimize local-node performance towards taking advantage of the dynamism of the application platform to optimize for scalability and economy.

Innovation in terms of new frameworks and programming models takes a hit in that model, but there are ways to compensate. The services offered by the platform can be at different levels of generality. The more generic ones can be used to host innovative application frameworks and tools. For example, a highly-specialized service like an identity management system is hard to use for another purpose, but on the other hand a JVM can be used to host not just business applications but also platform-like things like Hadoop. They can run in the “application space” until they are mature enough to be incorporated in the “application platform space” and become the responsibility of the application platform administrator.

The need to keep a door open for innovation is part of why, as much as I believe in PaaS, I don’t think IaaS is going away anytime soon. Not only do we need VMs for backward-looking legacy apps, we also need polyvalent platforms, like a VM, for forward-looking purposes, to allow developers to influence platform innovation, based on their needs and ideas.

Forget the guillotine, maybe I should carry an axe around. That may help get the point across, that I want to slice application administrators in two, head to toe. PaaS is not a question of runtime. It’s a question of administrative roles.

Comments Off on The PaaS Lament: In the Cloud, application administrators should administrate applications

Filed under Application Mgmt, Cloud Computing, Everything, IT Systems Mgmt, Manageability, Mgmt integration, Middleware, PaaS, Utility computing, Virtualization

Let me explain, officer

I am not in the habit of using a camera in public bathrooms, but since I haven’t written any post in the CrazyStats category for a while I figured this was worth taking the risk of being arrested.  Last weekend, I had the honor of using a urinal which “saves 88% more water than a one gallon urinal”. A completely meaningless statement that masquerades as a statistic (presumably they mean “uses 88% less water”). How much water does a one gallon urinal save? I know how much it consumes (one gallon) but how do you define how much it saves? Compared to what? To a standard one gallon model? Well, a one gallon doesn’t save anything compared to a one gallon, so the urinal I used (if it uses less than one gallon) actually saves infinitely more water than a one gallon urinal. If you are going to make meaningless claims, why stop at 88%?

Marketing claims based on meaningless statistics. It’s not just for Cloud Computing.

1 Comment

Filed under CrazyStats, Off-topic

URL shorteners and privacy: The Good, the Bad and the Cookie

The table below compares various URL shorteners based on how much they value service performance and the privacy of their users.

Here is the short version of the reading guide: a URL shortener which gives a high priority to reliability, performance and privacy will use a 301 (“Moved Permanently”) response code, will not use cache control headers and will not use cookies. A URL shortener which gives higher priority to its own ability to monetize its traffic by tracking users will stray from one or more of these practices.

Here is how a few of the most popular shorteners perform by this measure (red is bad).

For the long version (and an explanation of how I came to create this table) read below the table.

Service name       | Cookie   | Status code | Caching limitations
t.co (Twitter)     |          | 301         | 5 min
bit.ly             | tracking | 301         |
tinyurl.com        |          | 301         |
goo.gl (Google)    |          | 301         | 24h
wp.me (WordPress)  |          | 301         |
snurl.com          |          | 301         | 10h
fb.me (Facebook)   | (*)      | 301         |
twurl.nl           | tracking | 301         |
is.gd              |          |             |
ping.fm            |          | 301         |
p.ly               | tracking | 301         | no caching
ff.im              | tracking | 301         | (**)
u.nu               |          | 301         |
tiny.cc            | tracking | 301         |
snipurl.com        |          | 301         | 10h
chkit.in           | tracking | 301         |
ur1.ca             |          | 302         | no caching
digs.by            |          | 302         | no caching

Notes:

(*) Facebook’s service, fb.me, tries to set a cookie but its content is “locale=en_US” and cannot be used for identification. In addition, it sets the domain to “.facebook.com” in the Set-Cookie directive but since the response comes from another domain (fb.me) the cookie is actually never returned by the browser and therefore useless. It looks like this is a leftover configuration setting copied from the normal facebook.com servers. Defying all expectations, Facebook comes out as one of the most privacy-friendly URL shorteners.

(**) ff.im limits the cache to being “private” which means that your browser can cache the result but a shared proxy (e.g. your company’s proxy) should not cache it. Forcing each user behind that proxy to resolve the URL once. I magnanimously did not ding them for this, even though it’s sub-optimal.

Now for the longer explanation

Despite the potential it offers to stretch out our tweets, I wasn’t too impressed when I learned of Twitter’s plan to roll out (and mandate) its own URL shortening service. My fundamental issue is that URL shortening is made necessary by an arbitrary decision on Twitter’s part (the 140 character limit and the fact that URLs count toward it) and that it would be entirely within their power to make these abominations unneeded. Or, at least, much more rarely needed (when tinyurl.com came out, the main use case was to insert a very long URL in an email without having problems with carriage returns, not to turn third-world countries into purveyors of silly domain names).

Beyond this fundamental issue, my main concerns about Twitter’s t.co mechanism are that it reduces privacy and it demands that you break the HTTP specification.

From a privacy perspective, the issue is that anyone who clicks on these links tells Twitter where they are going. And Twitter can collect and correlate these actions. The easiest way for them (or any other URL shortener) to do this is to use cookies. Cookies aren’t often used as part of redirections, but technically nothing prevents them. So I wanted to see if Twitter used them.

[Side note: in practice there are ways to track your browser without using identifying cookies, not to mention simply using the IP address which works quite well on people who browse from home. Still, identifying cookies are the preferred method.]

From a specification conformance perspective, the problem is that Twitter announced that they would modify the Terms of Service of their API to prevent you from replacing the short URL with the real location once you’ve resolved it the first time (as of this writing they apparently haven’t yet made the ToS change). That behavior would be in violation of the HTTP specification if the redirection used status code 301 (“Moved Permanently”), which states that “any future references to this resource SHOULD use one of the returned URIs” and “clients with link editing capabilities ought to automatically re-link references to the Request-URI to one or more of the new references returned by the server”. So I wanted to see whether t.co indeed returns a 301 (and asks us to violate the spec) or if they use a Temporary Redirect (302 or the new 307), in which case the specification would not be violated but other problems would arise (for example, search engines would not give you PageRank karma for such a link).

The other (spec-compliant) way to force a 301 to call back home once in a while is the (strange but legal) practice of using cache-control headers on permanent redirections. So I also wanted to see how t.co behaves on that front.

And then I decided to also test a few other services, which is how the table above came to be.
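
For those who want to check a service themselves, here is a minimal sketch of the kind of probe behind the table (in Python; the short URL in the example is a placeholder, substitute any link you get from one of the services above). It sends a single GET without following the redirect and prints the three things the table tracks: status code, cookies and caching directives.

import http.client
from urllib.parse import urlparse

def probe(short_url):
    # Connect directly so the redirect is NOT followed automatically.
    parts = urlparse(short_url)
    conn = http.client.HTTPConnection(parts.netloc)
    conn.request("GET", parts.path or "/", headers={"User-Agent": "redirect-probe"})
    resp = conn.getresponse()
    print("Status code  :", resp.status)                      # 301 vs 302/307
    print("Location     :", resp.getheader("Location"))       # the long URL
    print("Set-Cookie   :", resp.getheader("Set-Cookie"))      # tracking cookie?
    print("Cache-Control:", resp.getheader("Cache-Control"))   # caching limitations?
    print("Expires      :", resp.getheader("Expires"))
    conn.close()

probe("http://tinyurl.com/EXAMPLE")  # placeholder short URL

A privacy- and performance-friendly service shows a 301, no Set-Cookie header and no cache-limiting headers.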

Comments Off on URL shorteners and privacy: The Good, the Bad and the Cookie

Filed under Everything, Facebook, Google, Protocols, Security, Social networks, Tech, Testing, Twitter

The necessity of PaaS: Will Microsoft be the Singapore of Cloud Computing?

From ancient Mesopotamia to, more recently, Holland, Switzerland, Japan, Singapore and Korea, the success of many societies has been in part credited to their lack of natural resources. The theory being that it motivated them to rely on human capital, commerce and innovation rather than resource extraction. This approach eventually put them ahead of their better-endowed neighbors.

A similar dynamic may well propel Microsoft ahead in PaaS (Platform as a Service): IaaS with Windows is so painful that it may force Microsoft to focus on PaaS. The motivation is strong to “go up the stack” when the alternative is to cultivate the arid land of Windows-based IaaS.

I should disclose that I work for one of Microsoft’s main competitors, Oracle (though this blog only represents personal opinions), and that I am not an expert Windows system administrator. But I have enough experience to have seen some of the many reasons why Windows feels like a much less IaaS-friendly environment than Linux: e.g. the lack of SSH, the cumbersomeness of RDP, the constraints of the Windows license enforcement system, the Windows update mechanism, the immaturity of scripting, the difficulty of managing Windows from non-Windows machines (despite WS-Management), etc. For a simple illustration, go to EC2 and compare, between a Windows AMI and a Linux AMI, the steps (and time) needed to get from selecting an image to the point where you’re logged in and in control of a VM. And if you think that’s bad, things get even worse when we’re not just talking about a few long-lived Windows server instances in the Cloud but a highly dynamic environment in which all steps have to be automated and repeatable.
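
To make the comparison concrete, here is a rough sketch of the two flows. It uses boto3, which is only one of many ways to script EC2 (and did not exist when this was written); the AMI IDs and key pair name are placeholders. The point is not the exact calls but the extra wait-decrypt-RDP dance Windows requires before you are logged in and in control.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

def launch(ami_id):
    r = ec2.run_instances(ImageId=ami_id, InstanceType="t2.micro",
                          MinCount=1, MaxCount=1, KeyName="my-keypair")
    instance_id = r["Instances"][0]["InstanceId"]
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    return instance_id

# Linux: once the instance is running you can SSH in with your key pair.
linux_id = launch("ami-LINUX-PLACEHOLDER")
# -> ssh -i my-keypair.pem ec2-user@<public-ip>   (moments after boot)

# Windows: you must wait (often many minutes) for the encrypted Administrator
# password to become available, decrypt it, then open an RDP session.
win_id = launch("ami-WINDOWS-PLACEHOLDER")
pwd = ec2.get_password_data(InstanceId=win_id)["PasswordData"]
# pwd stays empty until Windows has finished initializing; once present it
# still has to be base64-decoded and RSA-decrypted with my-keypair.pem
# before you can RDP in.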

I am not saying that there aren’t ways around all this, just like it’s not impossible to grow grapes in Holland. It’s just usually not worth the effort. This recent post by RightScale illustrates both how hard it is and that it is possible if you’re determined. The question is what benefits you get from Windows guests in IaaS and whether they justify the extra work. And also the additional license fee (while many of the issues are technical, others stem more from Microsoft’s refusal to acknowledge that the OS is a commodity). [Side note: this discussion is about Windows as a guest OS and not about the comparative virtues of Hyper-V, Xen-based hypervisors and VMWare.]

Under the DSI banner, Microsoft has been working for a while on improving the management/automation infrastructure for Windows, with tools like PowerShell (which I like a lot). These efforts pre-date the Cloud wave but definitely help Windows try to hold its own on the IaaS battleground. Still, it’s an uphill battle compared with Linux. So it makes perfect sense for Microsoft to move the battle to PaaS.

Just like commerce and innovation will, in the long term, bring more prosperity than focusing on mining and agriculture, PaaS will, in the long term, yield more benefits than IaaS. Even though it’s harder at first. That’s the good news for Microsoft.

On the other hand, lack of natural resources is not a guarantee of success either (as many poor desert countries can testify) and Microsoft will have to fight to be successful in PaaS. But the work on Azure and many research efforts, like the “next-generation programming model for the cloud” (codename “Orleans”) that Mary Jo Foley revealed today, indicate that they are taking it very seriously. Their approach is not restricted by a VM-centric vision, which is often tempting for hypervisor and OS vendors. Microsoft’s move to PaaS is also facilitated by the fact that, while system administration and automation may not be a strength, development tools and application platforms are.

The forward-compatible Cloud will soon overshadow the backward-compatible Cloud and I expect Microsoft to play a role in it. They have to.

10 Comments

Filed under Application Mgmt, Automation, Azure, Cloud Computing, DevOps, Everything, IT Systems Mgmt, Linux, Manageability, Mgmt integration, Microsoft, Middleware, Oslo, PaaS, Research, Utility computing, WS-Management

There should be a word for this (Blog/Twitter edition) part 3

Resuming where we left off (part 1, part 2), here are more words that we need in the age of Twitter and blogs.

#17 The feeling of elation when writing an IM which you know is approaching 140 characters: your fingers start to tense, but you go past it and nothing happens, nothing turns red, and you suddenly feel so free to express yourself. Also works on IRC.

#18 The art of calibrating how many hints you should drop that there is a joke/pun/double-entendre in your tweet. Some jokes are best delivered with a straight face (e.g. without a smiley or a #humor tag), and readers derive more pleasure from less obvious jokes. But the risk is that the joke will go completely unnoticed as people hurriedly scan their timelines.

#19 The trauma of temporarily switching back to a “feature phone” (e.g. a basic clamshell) while waiting for the replacement of a broken smartphone to arrive. In response to this request, Lori MacVittie suggested “retrotrauma” which I like a lot though I may shorten it to “retrauma” or “retroma”.

#20 The intuition that the thought you just had is original enough to interest your readers but probably not original enough to not have been tweeted already. The quasi-certitude that doing a Twitter search on it would find previous occurrences, thereby making you an involuntary plagiarist. The refusal to perform such a search (in violation of the “Google before you Tweet is the new Think before you Speak” adage) before writing your tweet. Or, on the other side, the abandonment of a tweet idea based on the assumption that it’s already out there. E.g. I could think of a few jokes on the HP “invent” tagline in the wake of Mark Hurd’s resignation (“HP Invent… business expenses”) but didn’t bother, based on the assumption that these tweets were already doing the rounds.

#21 A brand, especially a personal one (e.g. Twitter ID, domain name…), that has aged badly because it uses a now-out-of-favor buzzword. Like, soon enough, everything with “Cloud” in it. I still remember, over 10 years later, laughing out loud when I heard a KQED radio program sponsored by Busse Design USA, which was inviting us to visit them at “myBDUPortal.com”. This was in the late nineties, when “portals” were the hot thing on the Internet (as well as the “my” prefix, when Yahoo and others got into personalization). I am happy to see that they are now using a much more reasonable domain name, but Yahoo’s calcified directory still bears witness to their hubris. Look for Busse Design on this listing.

#22 Someone who has never been on-line. I don’t personally feel the need for a new term for this, but we have to find an alternative to this most unfortunate and ambiguous coinage: “digital virgins” (as in “30 percent of Europeans are ‘digital virgins’”).

#23 Chris Hoff wanted a term to describe “someone who tries to escape from the suffocating straight jacket of disingenuousness exposed by their own Twitter timeline.” His proposal: “tweetdini”

As always, submissions are welcome in the comments if you think you’ve coined the right term for any of these.

3 Comments

Filed under Everything, Off-topic, Twitter

The Way of the Weasel

Say you want to play the tough guy on Twitter, but would rather not be taken to task on your proclamations. Here is a technique you can use to publicly insult/challenge/criticize someone by name without them knowing about it.

Let’s assume you want to challenge Internet darling Chuck Norris to a duel, but aren’t too sure that the result of an actual fight would look like The Way of the Dragon (with you as Bruce Lee and Chuck as Chuck). So you would prefer that he didn’t hear about your challenge. Here is the process to follow.

  • First, in the “settings” page of your Twitter account, check the “protect my tweets” option.
  • Then write your challenge tweet, e.g. “I challenge @ChuckNorris to a fight to death but the coward will probably never dare to answer this tweet.”
  • Then, back on the “settings” page, uncheck “protect my tweets”.

Voila. All your followers will see your bravado and Chuck Norris will never hear about it. No trace should remain of this subterfuge once it’s over and the whole thing can be done in a couple of seconds.

Note that this only works if Chuck doesn’t follow you directly. This method prevents someone from noticing your tweet in the list of mentions of their @username, but it doesn’t prevent your followers from seeing the tweet. Which is the whole point, since you want your followers to see what a tough guy you are. You would just rather not face the consequences.

Anyway, I just thought this was an interesting corner case. Not that I or any of my readers would ever do this, but be aware that it’s something someone (who takes Twitter too seriously) could do.

1 Comment

Filed under Everything, Off-topic, Twitter

Updates on Microsoft Oslo and “SSH on Windows”

I’ve been tracking the modeling technology previously known as “Microsoft Oslo” with a sympathetic eye for the almost three years since it was introduced. I look at it from the perspective of model-driven IT management, but the news hasn’t been good on that front lately (except for Douglas Purdy’s encouraging hint).

The prospects got even bleaker today, at least according to the usually-well-informed Mary Jo Foley, who writes: “Multiple contacts of mine are telling me that Microsoft has decided to shelve Quadrant and ‘refocus’ M.” Is “M” the end of the SDM/SML/M model-driven management approach at Microsoft? Or is the “refocus” a hint that M is returning “home” to address IT management use cases? Time (or Doug) will tell…

While we’re talking about Microsoft and IT automation, I have one piece of free advice for the Microsofties: people *really* want to SSH into Windows servers. Here’s how I know. This blog rarely talks about Microsoft but over the course of two successive weekends over a year ago I toyed with ways to remotely manage Windows machines using publicly documented protocols. In effect, showing what to send on the wire (from Linux or any platform) to leverage the SOAP-based management capabilities in recent versions of Windows. To my surprise, these posts (1, 2, 3) still draw a disproportionate amount of traffic. And whenever I look at my httpd logs, I can count on seeing search engine queries related to “windows native ssh” or similar keywords.
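
For a flavor of what “publicly documented protocols” means here, below is a minimal sketch (not the exact content of those three posts) of the simplest possible WS-Management exchange: an Identify request POSTed to a WinRM listener. The host name and port are placeholders, and most real operations also require authentication and WS-Addressing headers on top of this.

import http.client

# A bare WS-Management Identify request; the namespaces come from the
# DMTF WS-Management specification.
ENVELOPE = """<s:Envelope
  xmlns:s="http://www.w3.org/2003/05/soap-envelope"
  xmlns:wsmid="http://schemas.dmtf.org/wbem/wsman/identity/1/wsmanidentity.xsd">
  <s:Header/>
  <s:Body><wsmid:Identify/></s:Body>
</s:Envelope>"""

conn = http.client.HTTPConnection("windows-host.example.com", 5985)  # placeholder host
conn.request("POST", "/wsman", body=ENVELOPE,
             headers={"Content-Type": "application/soap+xml;charset=UTF-8"})
resp = conn.getresponse()
print(resp.status, resp.reason)
print(resp.read().decode("utf-8", errors="replace"))  # IdentifyResponse XML, if the listener allows it
conn.close()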

If heterogeneous Cloud is something Microsoft cares about, they need to better leverage the potential of the PowerShell Remoting Protocol. They can release open-source Python, Java and Ruby client-side libraries. Alternatively, they can drastically simplify the protocol, moving away from its current “binary over SOAP” (you read this right) incarnation. Because the poor Kridek who is looking for the “WSDL for WinRM / Remote Powershell” is in for a nasty surprise if he finds it and thinks he’ll get a ready-to-use stub out of it.

That being said, a brave developer willing to suck it up and create such a Python/Ruby/Java library would probably make some people very grateful.

3 Comments

Filed under Application Mgmt, Automation, Everything, Implementation, IT Systems Mgmt, Manageability, Mgmt integration, Microsoft, Modeling, Oslo, Protocols, SML, SOAP, Specs, Tech, WS-Management

The Tragedy of the Commons in Cloud standards

I wasn’t at the OSCON Cloud Summit this past week, but I’ve spent some time over the weekend trying to collect the good bits. Via Twitter, I had heard echoes of an interesting debate on Cloud standards between Sam Johnston and Benjamin Black. Today I got to see Benjamin’s slides and read reports from two audience members, Charles Engelke and Krishnan Subramanian. Sam argued that Cloud standards are needed, Benjamin that they would be premature.

Benjamin is right about what to think and Sam is right about what to do.

Let me put it differently: Benjamin is right in theory, but it doesn’t matter. Here is why.

Say I’m a vendor and Benjamin convinces me

Assume I truly believe the industry would be better served if we all waited. Does this mean I’ll stay away from Cloud standards efforts for now? Not necessarily, because nothing is stopping my competitors from doing it. In the IT standards world, your only choice is to participate or opt out. For the most part you can’t put your muscle towards stopping an effort. Case in point, Amazon has so far chosen to opt out; has that stopped VMWare and others from going to DMTF and elsewhere to ratify specifications as standards? Of course not. To the contrary, it has made the option even more attractive because when the leader stays home it is a lot easier for less popular candidates to win the prize. So as a vendor-who-was-convinced-by-Benjamin I now have the choice between letting my competitor get his specification rubberstamped (and then hit me with the competitive advantage of being “standard compliant” and even “the standard leader”) or getting involved in an effort that I know to be counterproductive for the industry. Guess what most will choose?

Even the initial sinner (who sets the wheels of premature standardization in motion) may himself be convinced that it’s too early for Cloud standards. But he has to assume that one of his competitors will make the move, and in that context why give them the first-mover advantage (and the choice of the battlefield)? It’s the typical Tragedy of the Commons scenario. By acting in a rational and self-interested way, participants invariably end up creating a bad situation, one that they might all know is against everyone’s self-interest.

And it’s not just vendors.

Say I’m an officer of a Standard-setting organization and Benjamin convinces me

If you expect that I would use my position in the organization to prevent companies from starting a Cloud standard effort there, you live in fantasy-land. Standard-setting organizations compete with one another just as fiercely as companies do. If I have achieved a position of leadership in a given standard organization, the last thing I want is to see another organization lay claim to a strategic and fast-growing area of the IT landscape. It takes a lot of time and money for a company to get elected on the right board and get its employees (or other reliable allies) in the right leadership positions. Or to acquire people already in that place. You only get a return on that investment if the organization manages to be the one where the key standards get created. That’s what’s behind the landgrab reflex of many standards organizations.

And it goes beyond vendors and standards organizations

Say I’m an IT buyer and Benjamin convinces me

Assume I really believe Cloud standards are premature. Assume they get created anyway and I have to choose between a vendor who supports them and one who doesn’t. Do I, as a matter of principle, refuse to consider the “standard-compliant” label in my purchasing decision? Even if I know that the standard shouldn’t have been created, I also know that, all other things being equal, the “standard-compliant” product will attract more tools and complementary solutions and will likely ease future integration problems.

And then there is the question of how I’ll explain this to my boss. Will Benjamin be by my side with his beautiful slides when I am called into an emergency meeting to explain to the CIO why we, unlike the competitors, didn’t pick “a standards-based solution”?

In the real world, the only way to solve problems caused by the Tragedy of the Commons is to have some overarching authority regulate the usage of the resource at risk of being ruined. This seems unlikely to be a workable solution when the resource is not a river to protect from sewer discharges but an IT domain to protect from premature standardization. If called, I’d be happy to serve as benevolent dictator for the IT industry (I could fix a few other things beyond the Cloud standards landgrab issue). But as long as neither I nor anyone else is in a dictatorial position, Benjamin’s excellent exposé has no audience for which his call to arms (or rather to lay down arms) is actionable. I am not saying that everyone agrees with Benjamin, but that even if everyone did it still wouldn’t make a difference. Many of us in the industry share his views and rationally act as if we didn’t.

[UPDATED 2010/7/25: In a nice example of Blog/Twitter synergy, minutes after posting this I was having a conversation on Twitter with Benjamin Black about my interpretation of what he said. Based on this conversation, I realize that I should clarify that what I mean by “standards” in this post is “something that comes out of a standard-setting organization” (whether or not it gets adopted), in other words what Benjamin calls a “standard specification”. He uses the word “standard” to mean “what most people use”, which may or may not be a “standard specification”. That’s a big part of the disconnect that led to our Twitter chat. The other part is that what I presented as Benjamin’s thesis in my post is actually only one of the propositions in his talk, and not even the main one. It’s the proposition that it is damaging for the industry when a standard specification comes out of a standard organization too early. I wasn’t at the conference where Benjamin presented but it’s hard to understand anything else out of slide 61 (“standardize too soon, and you lock to the wrong thing”) and 87 (“to discover the right standards, we must eschew standards”). So if I misrepresented him I believe it was in making it look like this was the focus of his talk while in fact it was only one of the points he made. As he himself clarified for me: “My _actual_ argument is that it doesn’t matter what we think about cloud standards, if they are needed, they will emerge” (again, in this sentence he uses “standards” to mean “something that people have converged on”).

More generally, my main point here has nothing to do with Benjamin, Sam and their OSCON debate, other than the fact that reading about it prompted me to type this blog entry. It’s simply that there is a perversion in the IT standards landscape that makes it impossible for premature standardization *not* to happen. It’s something I’ve written before, e.g. in this post:

Saying “it’s too early” in the standards world is the same as saying nothing. It puts you out of the game and has no other effect. Amazon, the clear leader in the space, has taken just this position. How has this been understood? Simply as “well I guess we’ll do it without them”. It’s sad, but all it takes is one significant (but not necessarily leader) company trying to capitalize on some market influence to force the standards train to leave the station. And it’s a hard decision for others to not engage the pursuit at that point. In the same way that it only takes one bellicose country among pacifists to start a war.

Benjamin is just a messenger; and I wasn’t trying to shoot him.]

[UPDATED 2010/8/13: The video of the debate between Sam Johnston and Benjamin Black is now available, so you can see for yourself.]

6 Comments

Filed under Amazon, Big picture, Cloud Computing, DMTF, Ecology, Everything, Governance, Standards, Utility computing, VMware