
REST in practice for IT and Cloud management (part 1: Cloud APIs)

In this entry I compare four public Cloud APIs (AWS EC2, GoGrid, Rackspace and Sun Cloud) to see what practical benefits REST provides for resource management protocols.

As someone who was involved with the creation of the WS-* stack (especially the parts related to resource management) and who genuinely likes the SOAP processing model, I have a tendency to be a little defensive about REST, which is often defined in opposition to WS-*. On the other hand, as someone who started writing web apps when the state of the art was a CGI Perl script, who loves on-the-wire protocols (e.g. this recent exploration of the Windows management stack from an on-the-wire perspective), who is happy to deal with raw XML (as long as I get to do it with a good library), who appreciates the semantic web, and who values models over protocols, the REST principles are very natural to me.

I have read the introduction and the bible but beyond this I haven’t seen a lot of practical and profound information about using REST (by “profound” I mean something that is not obvious to anyone who has written web applications). I had high hopes when Pete Lacey promised to deliver this through a realistic example, but it seems to have stalled after two posts. Still, his conversation with Stefan Tilkov (video + transcript) remains the most informed comparison of WS-* and REST.

The domain I care the most about is IT resource management (which includes “Cloud” in my view). I am familiar with most of the remote API mechanisms in this area (from SNMP to WBEM, WMI, JMX/RMI, OGSI and WSDM/WS-Management, all the way to a flurry of proprietary interfaces). I can think of ways in which some REST principles would help in this area, but they are mainly along the lines of “any consistent set of principles would help” rather than anything specific to REST. For a while now I have been wondering if I am missing something important about REST and its applicability to IT management, or if it’s mostly a matter of “just pick one protocol and focus on the model” (as well as simply avoiding the various drawbacks of the alternative methods, which is a valid reason but not an intrinsic benefit of REST).

I have been trying to learn from others, by looking at how they apply REST to IT/Cloud management scenarios. The Cloud area has been especially fecund in such specifications so I will focus on this for part 1. Here is what I think we can learn from this body of work.

Amazon EC2

When it came out a few years ago, the Amazon EC2 API, with its equivalent SOAP and plain-HTTP alternatives, did nothing to move me from the view that it’s just a matter of picking a protocol and being consistent. They give you the choice of plain HTTP versus SOAP, but it’s just a matter of tweaking how the messages are serialized (URL parameters versus a SOAP message in the input; whether or not there is a SOAP wrapper in the output). The operations are the same whether you use SOAP or not. The responses don’t even contain URLs. For example, “RunInstances” returns the IDs of the instances, not a URL for each of them. You then call “TerminateInstances” and pass these instance IDs as parameters rather than doing a “delete” on an instance URL. This API seems to have served Amazon (and their ecosystem) well. It’s easy to understand, easy to use and it provides a convenient way to handle many instances at once. Since no SOAP header is supported, the SOAP wrapper adds no value (I remember reading that the adoption rate for the EC2 SOAP API reflects this, though I don’t have a link handy).
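
To make the contrast concrete, here is roughly what the plain-HTTP version of a RunInstances exchange looks like (abridged and from memory, so treat the exact element names and the IDs as illustrative rather than as a copy of the EC2 documentation):

https://ec2.amazonaws.com/?Action=RunInstances&ImageId=ami-12345678&MinCount=1&MaxCount=1&…(auth-parameters)…

<RunInstancesResponse>
  <reservationId>r-12345678</reservationId>
  <instancesSet>
    <item>
      <instanceId>i-2ea64347</instanceId>
      <instanceState><code>0</code><name>pending</name></instanceState>
    </item>
  </instancesSet>
</RunInstancesResponse>

Everything that comes back is an ID, not a URL: the only way to act on the new instance later is to pass i-2ea64347 to another action on the same endpoint.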

Overall, seeing the EC2 API did not weaken my suspicion that there was no fundamental difference between REST and SOAP in the IT/Cloud management field. But I was very aware that Amazon didn’t really “do” REST in the EC2 API, so the possibility remained that someone would, in a way that would open my eyes to the benefits of true REST for IT/Cloud management.

Fast forward to 2009 and many people have now created and published RESTful APIs for Cloud computing. APIs that are backed by real implementations and that explicitly claim RESTfulness (unlike Amazon). Plus, their authors have great credentials in datacenter automation and/or REST design. First came GoGrid, then the Sun Cloud API and recently Rackspace. So now we have concrete specifications to analyze to understand what REST means for resource management.

I am not going to do a detailed comparative review of these three APIs, though I may get to that in a future post. Overall, they are pretty similar in many dimensions. They let you do similar things (create server instances based on images, destroy them, assign IPs to them…). Some features differ: GoGrid supports more load balancing features, Rackspace gives you control of backup schedules, Sun gives you clusters (a way to achieve the kind of manage-as-group features inherent in the EC2 API), etc. Leaving aside the feature-per-feature comparison, here is what I learned about what REST means in practice for resource management from each of the three specifications.

GoGrid

Though it calls itself “REST-like”, the GoGrid API is actually more along the lines of EC2. The first version of their API claimed that “the API is a REST-like API meaning all API calls are submitted as HTTP GET or POST requests”, which is the kind of “HTTP ergo REST” declaration that makes me cringe. It’s been somewhat rephrased in later versions (thank you), though they still use the undefined term “REST-like”. Maybe it refers to their use of “call patterns”. The main difference with EC2 is that they put the operation name in the URI path rather than in the query arguments. For example, EC2 uses

https://ec2.amazonaws.com/?Action=TerminateInstances&InstanceId.1=i-2ea64347&…(auth-parameters)…

while GoGrid uses

https://api.gogrid.com/api/grid/server/delete?name=My+Server+Name&…(auth-parameters)…

So they have action-specific endpoints rather than a do-everything endpoint. It’s unclear to me that this changes anything in practice. They don’t pass resource-specific URLs around (especially since, like EC2, they include the authentication parameters in the URL), they simply pass IDs, again like EC2 (but unlike EC2 they only let you delete one server at a time). So whatever “REST-like” means in their mind, it doesn’t seem to be “RESTful”. Again, the EC2 API gets the job done and I have no reason to think that GoGrid doesn’t also. My comments are not necessarily a criticism of the API. It’s just that it doesn’t move the needle for my appreciation of REST in the context of IT management. But then again, “instruct William Vambenepe” was probably not a goal in their functional spec.

Rackspace

In this “interview” to announce the release of the Rackspace “Cloud Servers” API, lead architects Erik Carlin and Jason Seats make a big deal of their goal to apply REST principles: “We wanted to adhere as strictly as possible to RESTful practice. We iterated several times on the design to make it more and more RESTful. We actually did an update this week where we made some final changes because we just didn’t feel like it was RESTful enough”. So presumably this API should finally show me the benefits of true REST in the IT resource management domain. And to be sure it does a better job than EC2 and GoGrid at applying REST principles. The authentication uses HTTP headers, keeping URLs clean. They use the different HTTP verbs the way they are intended. Well mostly, as some of the logic escapes me: doing a GET on /servers/id (where id is the server ID) returns the details of the server configuration, doing a DELETE on it terminates the server, but doing a PUT on the same URL changes the admin username/password of the server. Weird. I understand that the output of a GET can’t always have the same content as the input of a PUT on the same resource, but here they are not even similar. For non-CRUD actions, the API introduces a special URL (/servers/id/action) to which you can POST. The type of the payload describes the action to execute (reboot, resize, rebuild…). This is very similar to Sun’s “controller URLs” (see below).
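
To make the verb mapping explicit, here is the pattern (with a made-up server ID, and with the payload element names left approximate since this is a sketch of the style rather than a copy of the spec):

GET    /servers/1234           returns the server configuration
DELETE /servers/1234           terminates the server
PUT    /servers/1234           changes the admin username/password

POST   /servers/1234/action
<reboot type="HARD"/>          (or a resize, rebuild… payload, depending on the action)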

I came out thinking that this is a nice on-the-wire interface that should be easy to use. But it’s not clear to me what REST-specific benefit it exhibits. For example, how would this API be less useful if “delete” was another action POSTed to /servers/id/action rather than being a DELETE on /servers/id? The authors carefully define the HTTP behavior (content compression, caching…) but I fail to see how the volume of data involved in using this API necessitates this (we are talking about commands here, not passing disk images around). Maybe I am a lazy pig, but I would systematically bypass the cache because I suspect that the performance benefit would be nothing in comparison to the cost of having to handle in my code the possibility of caching taking place (“is it ok here that the content might be stale? what about here? and here?”).

Sun

Like Rackspace, the Sun Cloud API is explicitly RESTful. And, by virtue of Tim Bray being on board, we benefit from not just seeing the API but also reading, in well-explained detail, the issues, alternatives and choices that went into it. It is pretty similar to the Rackspace API (e.g. the “controller URL” approach mentioned above) but I like it a bit better, and not just because the underlying model is richer (and getting richer every day, as I just realized by re-reading it tonight). It handles many-as-one management through clusters in a way that is consistent with the direct resource access paradigm. And what you PUT on a resource is closely related to what you GET from it.

I have commented before on the Sun Cloud API (though the increasing richness of their model is starting to make my comments less understandable; maybe I should look into changing the links to a point-in-time version of Kenai). It shows that in the end it’s the model, not the protocol, that matters. And Tim is right to see REST in this case as more of a set of hygiene guidelines for on-the-wire protocols than as the enabler for some unneeded scalability (which takes me back to wondering why the Rackspace guys care so much about caching).

Anything learned?

So, what do these APIs teach us about the practical value of REST for IT/Cloud management?

I haven’t written code against all of them, but I get the feeling that the Sun and Rackspace APIs are those I would most enjoy using (Sun because it’s the most polished, Rackspace because it doesn’t force me to use JSON). The JSON part has two components. One is simply my lack of familiarity with using it compared to XML, but I assume I’ll quickly get over this when I start using it. The second is my concern that it will be cumbersome when the models handled get more complex, heterogeneous and versioned, chiefly because of the lack of namespace support. But this is a topic for another day.

I can’t tell if it’s a coincidence that the most attractive APIs to me happen to be the most explicitly RESTful. On the one hand, I don’t think they would be any less useful if all the interactions were replaced by XML RPC calls in which the payloads of the requests and responses correspond to the parameters the APIs define for the different operations. The Sun API could still return resource URLs to me (e.g. a VM URL as a result of creating a VM) and I would send reboot/destroy commands to this VM via XML RPC messages to this URL. How would it matter that everything goes over HTTP POST instead of skillfully choosing the right HTTP verb for each operation? BTW, whether the XML RPC is SOAP-wrapped or not is only a secondary concern.
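
To make the thought experiment concrete, here is the kind of exchange I have in mind (a purely hypothetical sketch, with a made-up host name and element names):

today (RESTful):          DELETE http://cloud.example.com/vm/1234

thought experiment:       POST http://cloud.example.com/vm/1234
                          <destroy/>

In both cases the part that actually helps me is that the VM is identified by a URL the API handed back to me, not the choice of verb.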

On the other hand, maybe the process of following REST alone forces you to come up with a clear resource model that makes for a clean API, independently of many of the other REST principles. In this view, REST is to IT management protocol design what classical music training is to a rock musician.

So, at least for the short-term expected usage of these APIs (automating deployments, auto-scaling, cloudbursting, load testing, etc) I don’t think there is anything inherently beneficial in REST for IT/Cloud management protocols. What matters is the amount of thought you put into it and that it has a clear on-the-wire definition.

What about longer term scenarios? Wouldn’t it be nice to just use a Web browser to navigate HTML pages representing the different Cloud resources? Could I use these resource representations to create mashups tying together current configuration, metrics history and events from wherever they reside? In other words, could I throw away my IT management console because all the pages it laboriously generates today would already exist in the ether, served by the controllers of the resources (or rather as a mashup of what is served by these controllers)? My IT management console would then really be “in the cloud”, meaning not just running in somebody else’s datacenter but rather assembled on the fly from scattered pieces of information that live close to the resources managed. And wouldn’t this be especially convenient if/when I use a “federated” cloud, one that spans my own datacenter and/or multiple Cloud providers? The scalability of REST could then become more relevant, but more importantly its mashup-friendliness and location transparency would be essential.

This, to me, is the intriguing aspect of using REST for IT/Cloud management. This is where the Sun Cloud API would beat the EC2 API. Tim says that in the Sun Cloud “the router is just a big case statement over URI-matching regexps”. Tomorrow this router could turn into five different routers deployed in different locations and it wouldn’t change anything for the API user, because they’d still just follow URLs. Unlike all the other APIs listed above, for which you know the instance ID but you need to somehow know which controller to talk to about this instance. Today it doesn’t matter because there is one controller per Cloud and you use one Cloud at a time. Tomorrow? As Tim says, “the API doesn’t constrain the design of the URI space at all” and this, to me, is the most compelling long-term reason to use REST. But it only applies if you use it properly, rather than just calling your whatever-over-HTTP interface RESTful. And it won’t differentiate you in the short term.

The second part in the “REST in practice for IT and Cloud management” series will be about the use of REST for configuration management and especially federation, where you can expect to read more about the benefits of links (I mean “hypermedia”).

[UPDATE: Part 2 is now available. Also make sure to read the comments below.]


YACSOE

Yet another cloud standards organization effort. This one is better than the others because it has the best domain name.

A press release to announce a Wiki. Sure. Whatever. Electrons are cheap.

Cynicism aside, it can’t hurt. But what would be really useful is if all these working groups opened up their mailing list archives and document repositories so that the Wiki can be a launching pad to actual content rather than a set of one-line descriptions of what each group is supposed to work on. With useful direct links to the most recent drafts and lists of issues under consideration. Similar to the home page of a W3C working group, but across groups. Let’s hope this is a first step in that direction.

I am also interested in where they’ll draw the line between Cloud computing and IT management. If such a line remains.


The CMDBf specification is now a DMTF standard

The CMDBf specification has finished its trek through the DMTF standard process. The last step was board approval and finally here is the official DMTF standard. It’s called version 1.0.0, which is a bit confusing since the version submitted to DMTF was dubbed “version 1.0”. I guess it means that this standard is the first version of the DMTF specification called CMDBf.

If you have been following the process closely, then you won’t find many technical changes since the last public draft. If you last read the specification when it was submitted to DMTF, then you’ll notice several improvements but no drastic change. If you are yet to take a first look at CMDBf, now is the perfect time.

To help you in that endeavor, I plan to update the query pseudo-algorithm to conform to the standard version of the specification when I get a chance. In the meantime, the slightly-outdated one is probably still helpful in wrapping your mind around the query mechanism.

Gentle(wo)men, rev your (query) engines.


File upload/download and remote program execution using WS-Management – a practical solution

The previous blog post described a way to upload and (in theory at least) download text files to/from a remote Windows machine using WS-Management. In practice, the applicability of the method is limited for upload (text files only, slow for large files) and almost nonexistent for download. Here is a much improved version.

This is another example of something that was too obvious for me to see last weekend when I was in the thick of fighting with WS-Management SOAP messages and learning about WMI classes. It just took a day of not thinking about it to have the solution pop into my mind: use ftp.exe. For the longest time (at least since Windows NT) Windows has been shipping with this FTP client. And the documentation shows that you can call it from the command line and provide it with the name of a text file containing the commands to execute. Bingo.

Specifically, here are the steps. Let’s say that I want to run a program called task.exe on a remote Windows machine and that program takes a large binary file (data.bin) as input. I want to transfer both to the remote machine and then run the program. This can be done in 3 simple steps:

Step 1: upload the FTP command file to the remote Windows machine. The content of the command file is below. mgmtserver.myco.com is the name of the machine from which the two files can be retrieved over FTP. I use anonymous FTP here, but you could just as well provide a username and password.

open mgmtserver.myco.com
anonymous
binary
get task.exe
get data.bin
quit

Step 2: execute the FTP commands above. This downloads task.exe and data.bin from mgmtserver.myco.com onto the remote Windows machine.

Step 3: execute the program on the remote Windows machine (“task.exe data.bin”).

Here are the on-the-wire messages corresponding to each step:

Step 1: upload the FTP command file to the remote Windows machine

<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope"
  xmlns:a="http://schemas.xmlsoap.org/ws/2004/08/addressing"
  xmlns:w="http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd">
  <s:Header>
    <a:To>http://server:80/wsman</a:To>
    <w:ResourceURI s:mustUnderstand="true">http://schemas.microsoft.com/wbem/wsman/1/wmi/root/cimv2/Win32_Process</w:ResourceURI>
    <a:ReplyTo>
    <a:Address s:mustUnderstand="true">http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous</a:Address>
    </a:ReplyTo>
    <a:Action s:mustUnderstand="true">http://schemas.microsoft.com/wbem/wsman/1/wmi/root/cimv2/Win32_Process/Create</a:Action>
    <a:MessageID>uuid:9A989269-283B-4624-BAC5-BC291F72E854</a:MessageID>
  </s:Header>
  <s:Body>
    <p:Create_INPUT xmlns:p="http://schemas.microsoft.com/wbem/wsman/1/wmi/root/cimv2/Win32_Process">
      <p:CommandLine>cmd /c echo open mgmtserver.myco.com>ftpscript&amp;&amp;echo
      anonymous>>ftpscript&amp;&amp;echo binary>>ftpscript&amp;&amp;echo get
      task.exe>>ftpscript&amp;&amp;echo get data.bin>>ftpscript&amp;&amp;echo
      quit>>ftpscript</p:CommandLine>
      <p:CurrentDirectory>C:\data\winrm-test</p:CurrentDirectory>
    </p:Create_INPUT>
  </s:Body>
</s:Envelope>

As before, you need to set the Content-Type HTTP header to “application/soap+xml;charset=UTF-8” (or UTF-16).

Step 2: execute the FTP commands to download the files from your server

It’s the same message, except the <p:CommandLine> element now has this value:

<p:CommandLine>ftp -s:ftpscript</p:CommandLine>

Step 3: execute the task.exe program on the remote Windows machine

Again, the same message except that the command line is simply:

<p:CommandLine>C:\data\winrm-test\task.exe data.bin</p:CommandLine>

Note that I have broken this down in three messages for clarity, but you can easily bundle all three steps in one SOAP message. Just use this command line:

<p:CommandLine>cmd /c echo open mgmtserver.myco.com>ftpscript&amp;&amp;echo
anonymous>>ftpscript&amp;&amp;echo binary>>ftpscript&amp;&amp;echo get
task.exe>>ftpscript&amp;&amp;echo get data.bin>>ftpscript&amp;&amp;echo
quit>>ftpscript&amp;&amp;ftp -s:ftpscript&amp;&amp;C:\data\winrm-test\task.exe
data.bin</p:CommandLine>

Of course this can also be used in reverse, to download files from the remote Windows machine rather than upload files to it. Just use PUT or MPUT as FTP commands instead of GET or MGET.
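
For example (assuming the remote machine produced a results.log file that you want to retrieve, and that your FTP server accepts anonymous uploads), the command file for the download direction would simply be:

open mgmtserver.myco.com
anonymous
binary
put results.log
quit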

This mechanism is a major improvement, for many use cases, over what I originally described. I feel a bit like someone who just changed a flat tire by loosening the lug nuts with his teeth and then found the lug wrench under the spare tire.


Uploading a file to a Windows machine via WMI/WS-Management

[UPDATED 2009/6/30: Check the following post for a more practical solution.]

Here is a simple way to upload a text (i.e. not binary) file to a Windows machine. Because my interest is to be able to do it from any platform, I investigated the use of WS-Management. But the method relies on invoking WMI methods over WS-Management, so I don’t see why it would not also work in a straight WMI scenario if you prefer.

I am not a Windows management expert, so there may be a much better way to do this (e.g. BITS). But if what you’re after is the simplest possible way to drop a file on a Windows machine from a non-Windows machine, it doesn’t get much simpler than sending an XML doc over HTTP and calling it a day. Here is how.

The easiest would be if the CIM_DataFile WMI class had a “create” method to create a new file. It doesn’t. But Win32_Process does. Invoking this method creates a new process and you get to specify the command line to execute. All you need to do is come up with a command line that invokes a program that will create the file that you want to upload.

There may be alternatives, but the command line I came up with for this purpose uses the “cmd.exe” interpreter (the Windows command-line shell). By using the “/c” option, you can invoke this interpreter with its instructions as parameters directly on the command line (it gets a bit confusing because we have two “command lines” here, the one that is used to launch the “cmd.exe” shell and the one that is presented inside the “cmd.exe” shell).

Anyway, if you type the following line inside the “start/run” field in Windows

cmd /c echo 1st line > test1.txt

It will have the same effect as opening a command shell, typing “echo 1st line > test1.txt” in it and then closing it. It creates a new file called “test1.txt” with one line of content (“1st line”). If you want a second line, you can do this by adding a second command that uses “>>” (append) instead of “>”. And the two commands can be joined by “&&” to invoke them in one pass. So to create a file with three lines, we’d execute:

cmd /c echo 1st line > test1.txt && echo 2nd line >> test1.txt
&& echo 3rd line >> test1.txt

Now all we have to do is package this in a WS-Management SOAP message and post it to the WS-Management listener of the Windows machine. In the process, we have to escape the “&” in the command line to “&amp;” because of XML syntax rules. The resulting message looks like:

<s:Envelope
  xmlns:s="http://www.w3.org/2003/05/soap-envelope"
  xmlns:a="http://schemas.xmlsoap.org/ws/2004/08/addressing"
  xmlns:w="http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd">
<s:Header>
<a:To>http://localhost/wsman</a:To>
<w:ResourceURI s:mustUnderstand="true">
  http://schemas.microsoft.com/wbem/wsman/1/wmi/root/cimv2/Win32_Process
</w:ResourceURI>
<a:ReplyTo>
<a:Address s:mustUnderstand="true">
  http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous
</a:Address>
</a:ReplyTo>
<a:Action s:mustUnderstand="true">
  http://schemas.microsoft.com/wbem/wsman/1/wmi/root/cimv2/Win32_Process/Create
</a:Action>
<a:MessageID>uuid:9A989269-283B-4624-BAC5-BC291F72E854</a:MessageID>
</s:Header>
<s:Body>
<p:Create_INPUT
  xmlns:p="http://schemas.microsoft.com/wbem/wsman/1/wmi/root/cimv2/Win32_Process">
<p:CommandLine>cmd /c echo 1st line > test1.txt &amp;&amp; echo 2nd line >>
  test1.txt &amp;&amp; echo 3rd line >> test1.txt</p:CommandLine>
<p:CurrentDirectory>C:\data\winrm-test</p:CurrentDirectory>
</p:Create_INPUT>
</s:Body>
</s:Envelope>

You don’t even need a WS-Management toolkit to do this as the only WS-Management header is w:ResourceURI, which can easily be set manually. You don’t need a WS-Addressing library either as all the headers are also static (except for the MessageID, even though nobody will care in practice if you always send the same value; I hereby authorize you to re-use the one in my example as much as you want). As a side note, this is yet another illustration of how useless this header (and more generally WS-Addressing) is in 95% of the cases. And yet the Microsoft WS-Management implementation (like many others) will make a point of faulting if you don’t send it. But ranting against WS-Addressing is a topic for another day (look for a future post titled “WS-IfInteroperabilityWasEasyItWouldNotBeFunWouldIt”).

I should mention that you want to set the Content-Type HTTP header to “application/soap+xml;charset=UTF-8” for this message. Or UTF-16 if that’s what you’re sending.

A few comments:

  • This obviously only works for character-based files, not binaries
  • I’ve noticed that the parsing of the wsa:Action header is pretty minimalistic. The Microsoft implementation seems to just pick up the text behind the last “/”. So you can send “blahblah/Create” and it works just as well as the correct value, “http://schemas.microsoft.com/wbem/wsman/1/wmi/root/cimv2/Win32_Process/Create” (it knows what class to apply the operation on from the Resource URI). Interestingly, there is only one URL ending in “/Create” that doesn’t work and it’s the WS-Transfer “Create” operation (“http://schemas.xmlsoap.org/ws/2004/09/transfer/Create”). That’s because the “Create” operation invoked in the message above is not the WS-Transfer “Create” operation but rather the homonymous operation on the WMI class.
  • Using the “/k” modifier on “cmd” in the command line (instead of “/c”) would also work, but the command shell would stay alive after returning so over time you’d have quite a few of them hanging out and using up memory on the remote machine. Not a good move.
  • As part of this exercise, I noticed an error in the MSDN page describing the “invoke” method of Win32_Process. In the SOAP body, the URI for the “p” namespace prefix uses “…/cim/…” instead of “…/cimv2/…”, which caused my first attempts to fail.

If the file you want to upload is large, you can break the upload over several successive messages similar to the one above. As long as you use the same file name and use “>>” instead of “>” you’ll keep appending to the end of the file until it’s complete.
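
For example, to append two more lines to the test1.txt file created above, the <p:CommandLine> of a follow-up message (everything else unchanged) would simply be:

<p:CommandLine>cmd /c echo 4th line >> test1.txt &amp;&amp; echo 5th line >> test1.txt</p:CommandLine>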

Of course this could be any type of text file, including XML (watch for the character-escaping rules though, both for XML and for “cmd” as you have to apply them in the right sequence). Even better, it could be a Python, Perl or PowerShell script too. And in that case (assuming the corresponding interpreter is installed on the machine) you can use the same mechanism to also invoke the script for execution. So that you use this WS-Management interface just to bootstrap into a more comfortable remote-control mechanism.
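
As a minimal (and hypothetical) illustration, assuming a Python interpreter is on the path of the remote machine, one message could create the script and the next one could run it:

<p:CommandLine>cmd /c echo print("hello from the managed machine") > bootstrap.py</p:CommandLine>

<p:CommandLine>cmd /c python bootstrap.py > output.txt</p:CommandLine>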

The next logical question (for extra credit) is whether WS-Management can be used to read files remotely instead of writing them. In theory yes, though in practice you’re much better off with alternate solutions, like the remote shell extension to WS-Management that I have described as “dumb SSH” previously.

But since you ask, here is the theory. My first attempt was to do a WS-Management “Get” (the Get operation from WS-Transfer) on an instance of CIM_DataFile (using the “Name” selector and setting it to “C:\data\winrm-test\test1.txt”). But this returns the properties of the file rather than its content. Whether this is kosher is an interesting theoretical question to ponder from a REST-beard-stroking perspective, but it’s useless for my file retrieval purpose. As before, one solution is to use the magical Win32_Process “Create” method to overcome the shortcomings of the CIM_DataFile class. The Windows command shell “type” command can be used to display the content of a text file. But the WMI Win32_Process “Create” operation that we use here only returns the processId and a result code, not the stdout stream (unlike the remote shell protocol that I mentioned above). We cannot therefore use it directly to return the output of the “type” command over the wire.

The solution is to use one Win32_Process “Create” operation over WS-Management to write the content of the file in a place where a subsequent WS-Management operation can read it. I can think of two examples off the top of my head: directory names and environment variables.

Here is how you’d do it with directory names. The following command takes the test1.txt file, reads it and creates nested subdirectories, one for each line in the input file. The name of the directory is the content of the corresponding line in the file.

for /f "delims=" %I in (test1.txt) do @mkdir "%I" && cd "%I"

For example, if the file content is

1st line
2nd line
3rd line

The command will generate the following three subdirectories:

1st line
  |_ 2nd line
      |_ 3rd line

What’s the point? You can use WS-Management enumeration to retrieve the names of all directories (using the Win32_Directory WMI class). Now that may be a bit overwhelming, so you want to add a WS-Enumeration filter to your WS-Management request. The Microsoft WS-Management implementation supports the WQL filter syntax that lets you do just that.
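
Here is roughly what the body of such a filtered enumeration request could look like (a sketch based on my reading of the documentation, with an illustrative WQL query; the wsa:Action is the WS-Enumeration “Enumerate” action and the ResourceURI header points at Win32_Directory, or at the cimv2 namespace with a “*” wildcard):

<wsen:Enumerate xmlns:wsen="http://schemas.xmlsoap.org/ws/2004/09/enumeration"
    xmlns:w="http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd">
  <w:Filter Dialect="http://schemas.microsoft.com/wbem/wsman/1/WQL">
    SELECT Name FROM Win32_Directory WHERE Drive = 'C:' AND Path LIKE '%winrm-test%'
  </w:Filter>
</wsen:Enumerate>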

BTW, you can presumably do the same thing with files, but directories, by their nesting, make it easy to read the lines in the order in which they appear in the file. Though you’d quickly run into path length limitations (and characters that are not valid in file/directory names).

A slightly more robust approach may be to set each line of the file in an environment variable (again via the “for”, and using “set” after the “do”). You can then read these environment variables over WS-Management by doing a WS-Transfer Get on the Win32_Environment WMI class. Unlike CIM_DataFile (for which Get only returns properties, not the content), a Get on Win32_Environment includes the value of the environment variable as one of the properties. The pragmatic reasons for this dichotomy are obvious, but the architectural consequences will give a headache to anyone who still has any illusion that WS-Transfer has anything to do with REST.
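
For reference, here is the shape of the headers for such a Get (illustrative; MYVAR1 is a made-up variable name and, as I understand it, the key selectors for Win32_Environment are Name and UserName, with “&lt;SYSTEM&gt;” the conventional value for system-wide variables):

<a:Action s:mustUnderstand="true">http://schemas.xmlsoap.org/ws/2004/09/transfer/Get</a:Action>
<w:ResourceURI s:mustUnderstand="true">http://schemas.microsoft.com/wbem/wsman/1/wmi/root/cimv2/Win32_Environment</w:ResourceURI>
<w:SelectorSet>
  <w:Selector Name="Name">MYVAR1</w:Selector>
  <w:Selector Name="UserName">&lt;SYSTEM&gt;</w:Selector>
</w:SelectorSet>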

As a side note, the “for” instruction can keep no more than 52 variables at a time, so if your file has more than 52 lines you’d have to send successive WS-Management requests and add a “skip” option to the “for” operation on subsequent requests (“skip=52”, “skip=104”, etc…). Again, practicality isn’t much of a concern here, we’re just playing with theory (Ed: “we”? how many people do you expect will still be reading at this point?).

That’s it for today’s episode of “Windows management for the on-the-wire-protocol guy”. Maybe next weekend I’ll take some time to look more into the remote shell over WS-Management protocol extension and how it can be misused/abused.

[UPDATE: The next post describes a more practical approach.]


Whose ******* idea was this?

My last two entries have been uncharacteristically Microsoft-friendly, so it’s time to restore some balance. Coincidentally, I just noticed the latest “alertbox” entry by Jakob Nielsen, about putting an end to password masking (the ******* that appears when you type a password). I actually disagree with Nielsen on this (it’s not just about shoulder-surfing: who hasn’t had to enter a password while sharing their desktop via a projector or a webex-like conference service? Plus, I either know my password very well or I paste it directly from a password management tool; either way, the lack of visual feedback doesn’t bother me).

But, and this is where the Microsoft-bashing starts, there is one area where password-masking is inane: wifi keys. Unlike passwords, these are never things that you have picked yourself, so they are harder to type, often hexadecimal (the one I chose, for my home network, I never have to type). And where do we do this? Either in a meeting room, where the key is written on the white board, or in a dentist’s waiting room, where it is pinned on the wall. In almost all cases, everyone in the room has access to the key. And if it is not on a wall, then it is on a piece of paper that’s right next to my computer and easier to snoop from. Masking this field, as Windows XP does, is plain stupid.

But stupidity turns into depravity and sadism when they force you to type it twice. I understand the reason for entering passwords twice when you initially set them in the system (accidentally entering a different password than what you intended can be trouble). But not when you provide them as a user requesting access (accidentally entering the wrong password just means you have to try again). So why does Windows insist on this? In the best case (I enter the key correctly twice) I’ve had to do double work for the same result. In the worst case (at least one is mistyped) I am in no better situation than if there was only one field but I have done twice the work. And this worst case is twice as likely to happen, since I have twice the opportunity to foul-up.

When confronted with this, I usually type the key in a regular text box (e.g. the search box in Firefox) and copy-paste from there to both fields in the Windows dialog box. But I shouldn’t have to.

While I am at it, do you also want to read what I think about the practice, initiated by MS Word as far as I can tell, of including formatting in copy/paste by default? And how deep you have to go in the “paste special” menu to get the obviously superior behavior (unformatted text)? Not really? OK, I’ll save that for a future rant. Let’s just say that this idea must have come from a relative of the Windows wifi-key-screen moron. Just give me their names and I’ll be the arm of Darwinism.

[UPDATED 2009/6/26: Bruce Schneier agrees with Jakob Nielsen. So this is an issue at the confluence of security and usability on which both security guru Schneier and usability guru Nielsen are wrong. Gurus can’t always be right, but what’s the chance of them being wrong at the same time?]


Native “SSH” on Windows via WS-Management

Did you know that you can now SSH to a Windows machine over WS-Management, and that it is a documented protocol that can be implemented from any platform and programming language? This is big news to me and I am surprised that, as a management protocol geek, I hadn’t heard about it until I started to search MSDN for a related but much smaller feature (file transfer over WS-Management).

OK, so it’s not exactly SSH but it is a remote shell. In fact it comes in two flavors, which I think of as “dumb SSH” and “super SSH”.

Dumb SSH

Dumb SSH is the ability to remotely run a DOS-like command shell over WS-Management. Anyone who has had to use the Windows command shell as a scripting language ersatz understands why I call it “dumb”. I expect that even in Microsoft most would agree (otherwise why would they have created PowerShell?).

Still, you can do quite a few basic things using the Windows command shell and being able to do them remotely is not something to sneer at if you’re building a management product. If you’re interested, you need to read MS-WSMV, the WS-Management Protocol Extensions for Windows Vista specification (available here as a PDF). From the name of the specification, I expected a laundry list of tweaks that the WS-Management and WS-CIM implementation in Vista makes on top of the standards (e.g. proprietary extensions, default values, unsupported features, etc). And there is plenty of that, in sections 3.1, 3.2 and 3.3. The kind of “this is my way” decisions that you’d come to expect from Microsoft on implementing standards. A bit frustrating when you know that they pretty much wrote the standard, but at least it’s well documented. Plus, being one of those who forced a few changes in WS-Management between the Microsoft submission and the DMTF standard (under laments from Microsoft that “it’s too late to change Longhorn”), I am not really in a position to complain that “Longhorn” (now Vista) indeed deviates from the standard.

But then we get to section 3.4 and we enter a new realm. These are not tweaks to WS-Management anymore. It’s a stateful tunneling protocol going over WS-Management, complete with base-64-encoded streams (stdin, stdout, stderr) and signals. It gives you all you need to run a remote command shell over WS-Management. In addition to the base Windows command shell, it also supports “custom remote shells”, which lets you leverage the tunneling mechanism for another protocol than the one made of Windows shell commands. For example, you could build an HTTP emulation over this on top of which you could run WS-Management on top of which… you know where this is going, don’t you?
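
To give a feel for it, creating a command shell is (as far as I can tell from MS-WSMV, so double-check against the spec before relying on this sketch) a WS-Transfer “Create” sent to a shell-specific resource URI (http://schemas.microsoft.com/wbem/wsman/1/windows/shell/cmd) with a body along these lines:

<rsp:Shell xmlns:rsp="http://schemas.microsoft.com/wbem/wsman/1/windows/shell">
  <rsp:InputStreams>stdin</rsp:InputStreams>
  <rsp:OutputStreams>stdout stderr</rsp:OutputStreams>
</rsp:Shell>

Commands and their base-64-encoded stdin/stdout/stderr then flow through shell-specific “Command”, “Send”, “Receive” and “Signal” operations on the shell that was created.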

A more serious example of such a “custom remote shell” is PowerShell, which takes us to…

Super SSH

Imagine SSH with the guarantee that the shell that you log into on the other side is a Python interpreter, complete with full access to the server’s management API. I think that would qualify as “super SSH”, at least for IT management purposes (not so exciting if all you want to do is check your email with mutt). This is equivalent to what you get when the remote shell invoked over WS-Management (or rather WS-Management plus the Vista extensions described above) is PowerShell instead of the Windows command shell. I have always liked PowerShell but it hasn’t really been all that relevant to me (other than as a design study) because of its ties to the Windows platform. Now, thanks to MS-PSRP, the PowerShell Remoting Protocol specification (PDF here), we are only a good Java (or Python, or Ruby) library away from being able to invoke PowerShell commands from any language, anywhere.

I have criticized over-reliance on libraries to shield developers from XML for tasks that really would be much better handled by simply learning to use XML. But in this case we really need a library because there is quite a bit of work involved in this protocol, most of which has nothing to do with XML. We have to fragment/defragment packets, compress/decompress messages, not to mention the security aspects. At this point you may question what the value of doing all this on top of WS-Management is, for which I respectfully redirect you to your local Microsoft technology evangelist, MVP or, in last resort, sales representative.

Even if PowerShell is not your scripting language of choice, you can at least use it to create a bootstrap mechanism that will install whatever execution engine you want (e.g. Ruby) and download scripts from your management server. At which point you can sign out of PowerShell. For some reason, I get the feeling that we just got one step closer to Puppet managing Windows machines.
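
As a minimal illustration of such a bootstrap (hypothetical URL and paths, and assuming outbound HTTP access from the managed machine), a couple of PowerShell commands are enough to pull down and launch whatever comes next:

# download a bootstrap script from the management server (System.Net.WebClient ships with .NET)
(new-object System.Net.WebClient).DownloadFile("http://mgmtserver.myco.com/bootstrap.cmd", "c:\temp\bootstrap.cmd")
# run it; the script can install Ruby/Python/whatever and fetch the real management scripts
& c:\temp\bootstrap.cmd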

A few closing comments

First, while the MS-WSMV part that lets you run a basic command shell seems already available (Vista SP1, Win2K3R2, Win2K8, etc), the PowerShell part is a lot greener. The MS-PSRP specification is marked “preliminary” and the supported platform list only contains Windows 7 and Win2K8R2. Nevertheless, the word from Microsoft is that they have the intention to make this available on XP and above shortly after Windows 7 comes out. Let’s hope this is the case, otherwise this technology will remain largely irrelevant for years to come.

The other caveat comes from the standard angle. In this post, I only concern myself with the technical aspects. If you want to implement these specifications you have to also take into account that they are proprietary specifications with no IP grant (“Microsoft has patents that may cover your implementations of the technologies described in the Open Specifications. Neither this notice nor Microsoft’s delivery of the documentation grants any licenses under those or any other Microsoft patents”) and fully controlled by Microsoft (who could radically change or kill them tomorrow). As to whether Microsoft plans to eventually standardize them, I would again refer you to your friendly local Microsoft representative. I can just predict, based on the content of the specification, that it would make for some interesting debates in the DMTF (or wherever they may go).

This is a big step towards the citizenship of Windows machines in an automated datacenter (and, incidentally, an endorsement for the “these scripts have to grow up” approach to automation). As Windows comes to parity with Unix in remote scripting abilities, the only question remaining (well, in addition to the pesky license) will be “why another mechanism”. Which could be solved either via standardization of MS-PSRP, de-facto adoption (PowerShell on Suse Linux is only one Microsoft-to-Novell check away) or simply using PowerShell as just a bootstrapping mechanism for Puppet or others, as mentioned above.

[UPDATE: On a related topic, these two posts describe ways to transfer files over WS-Management.]


With M (Oslo), is Microsoft on the path to reinventing RDF?

I have given up, at least for now, on understanding what Microsoft wants Oslo (and more specifically the “M” part) to be. I used to pull my hair out reading inconsistent articles and interviews about what M tries to be (graphical programming! DSL! IT models! generic parser! application components! workflow! SOA framework! generic data layer! SQL/T-SQL for dummies! JSON replacement! all of the above!). Douglas Purdy makes a valiant 4-part effort (1, 2, 3, 4) but it’s still not crisp enough for my small brain. Even David Chappell, explainer extraordinaire, seems to throw up his hands (“a modeling platform that can be applied in lots of different ways”, which BTW is the most exact, if vague, description I’ve heard). Rather than articles, I now mainly look at the base specifications and technical documents that show what it actually is. That’s what I did when the Oslo SDK first came out last year. A new technical document came out recently, an update to the MGraph Object Model, so I took another look.

And it turns out that MGraph is… RDF. Or rather, “RDF minus entailment”. And with turtle as the base representation rather than an add-on.

Look at section 3 (“RDF concepts”) in this table of contents from W3C. It describes the core RDF concepts. Keep the first five concepts (sections 3.1 to 3.5) and drop the last one (“3.6: Entailment”). You have MGraph, a graph-oriented object model.

On top of this, the RDF community adds reasoning capabilities with RDF entailment, RDFS, OWL, SWRL, SPIN, etc and a variety of engines that implement these different levels of reasoning.

Microsoft, on the other hand, seems to ignore that direction. Instead, it focuses on creating a good mapping from this graph object model to programming languages. In two directions:

  • from programming languages to the graph model: they make it easy for you to create a domain-specific language (DSL) that can easily be turned into M instances.
  • from the graph model to programming languages: they make it easy for you to work on these M instances (including storing them) using the .NET technology stack.

So, if Microsoft is indeed reinventing RDF as the title of this entry provocatively suggests, then they are taking an interesting detour on the way. Rather than going straight to “model-based inferencing”, they are first focusing on mapping the core MGraph concepts to programming (by regular developers) and user interactions (with regular users). Something that for a long time had not gotten much attention in the RDF world beyond pointing developers to Jena (though it seemed to have improved over the last few years with companies like TopQuadrant; ironically, the Oslo model browser/editor is code-named “Quadrant”).

Whether the Oslo team sees the inferencing fun as a later addition or something that’s not needed is another question, on which I don’t see any hint at this point.

I hope they eventually get to it. But I like the fact that they cleanly separate the ability to represent and manipulate the graph model from the question of whether instances can be inferred. We could use such a reusable graph representation mechanism. Did CMDBf, for example, really have to create a new graph-oriented metamodel and query language? I failed to convince the group to adopt RDF/SPARQL, but I may have been more successful if there had been a cleanly-separated “static” version of RDF/SPARQL, a way to represent and query a graph independently of whether the edges and nodes in the graph (and their types) are declared or inferred. Instead, the RDF stack has entailment deeply embedded and that’s very scary to many.

But as much as I like this separation, I can’t help squirming when I see the first example in the MGraph document:

// Populate a small village with some people
Villagers => {
  Jenn => Person { Name => 'Jennifer', Age => 28, Spouse => Rich },
  Rich => Person { Name => 'Richard', Age => 26, Spouse => Jenn },
  Charly => Person { Name => 'Charlotte', Age => 12 }
},
HaveSpouses => { Villagers.Rich, Villagers.Jenn }

That last line is an eyesore to anyone who has been anywhere near RDF. I have just declared that Rich and Jenn are one another’s spouse, why do I have to add a line that says that they have spouses? What I want is to say that participation in a “Spouse” relationship entails membership in the “hasSpouse” class. And BTW, I also want to mark the “Spouse” relationship as symmetric so I only have to declare it one way and the inverse can be inferred.
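
For the record, here is how the RDF stack lets me declare this once and infer the rest (a quick Turtle sketch with made-up URIs): “spouse” is declared symmetric, and its rdfs:domain gives me the “HasSpouse” membership for free.

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ex:   <http://example.org/village#> .

ex:spouse a owl:SymmetricProperty ;
          rdfs:domain ex:HasSpouse .

ex:Jenn ex:spouse ex:Rich .

# entailed, not declared:
#   ex:Rich ex:spouse ex:Jenn .
#   ex:Jenn a ex:HasSpouse .
#   ex:Rich a ex:HasSpouse .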

So maybe I don’t really know what I want on this. I want the graph model to be separated from the inference logic and yet I want the syntactic simplicity that derives from base entailments like the example above. Is June too early to start a Christmas wish list?

While I am at it, can we please stop putting people’s ages in the model rather than their dates of birth? I know it’s just an example, but I see it over and over in so many modeling examples. And it’s so wrong in 99% of cases. It just hurts.

There are other things about MGraph edges that look strange if you are used to RDF. For example, edges can be labeled or not, as illustrated by the first example of the graph model in the MGraph document.

In this example, “Age” is a labeled edge that points to the atomic node “42”, while the credit score is modeled as a non-atomic node linked from the person via an unlabeled edge. Presumably the “credit score” node is also linked to an atomic node (not shown) that contains the actual score value (e.g. “800”). I can see why one would want to call out the credit score as a node rather than having an edge (labeled “credit score”) that goes to an atomic node containing the actual credit score value (similar to how “age” is handled). For one thing, you may want to attach additional data to that “credit score” node (when was it calculated, which reporting agency provided it, etc) so it helps to have it be a node. But making this edge unlabeled worries me. Originally you may only think of one possible relationship type between a person and a credit score (the person has a credit score). But others may pop up further down the road, e.g. the person could be a loan agent who orders the credit score but the score is about a customer. So now you create a new edge label (“orders”) to link the loan agent person to the credit score. But what happens to all the code that was written previously and navigates the relationship from the person to the score with the expectation that the score is about the person? Do you think that code was careful to only navigate “unlabeled” edges? Unlikely. Most likely it just grabbed whatever credit score was linked to the person. If that code is applied to a person who happens to also be a loan agent, it might well grab a credit score about someone else that happened to be ordered by the loan agent. These unlabeled edges remind me of the practice of not bothering with a “version” field in the first version of your work because, hey, there is only one version so far.

The restriction that a node can have at most one edge with a given label coming out of it is another one that puzzles me. Though it may explain why an unlabeled edge is used for the credit score (since you can get several credit scores for the same person, if you ask different rating agencies). But if unlabeled edges are just a way to free yourself from this restriction then it would be better to remove the restriction rather than work around it. Let’s take the “Spouse” label as an example. For one thing in some countries/cultures having more than one such edge might be possible. And having several ex-spouses is possible in many places. Why would the “ex-spouse” relationship have to be defined differently from “spouse”? What about children? How is this modeled? Would we be forced to have a chain of edges from parent to 1st child to next sibling to next sibling, etc? Good luck dealing with half-siblings. And my model may not care so much about capturing the order (especially if the date of birth is already captured anyway). This reminds me of how most XML document formats force element order in places where it is not semantically meaningful, just because of XSD’s bias towards “sequence”.

Having started this entry by declaring that I don’t understand what M tries to be, I really shouldn’t be criticizing its design choices. The “weird” aspects I point out are only weird in the context of a certain usage but they may make perfect sense in the usage that the Oslo team has in mind. So I’ll stop here. The bottom line is that there are traces, in M, of a nice, reusable, graph-oriented data model with strong bridges (in both directions) to programming languages and user interfaces. That is appealing to me. There are also some strange restrictions that puzzle me. We’ll see where this goes (hopefully this article, “Designing Domains and Models Using M”, will soon contain more than “to be submitted” and I can better understand the M approach). In any case, kudos to the team for being so open about their work and the evolution of their design.


Interesting links

A few interesting links I noticed tonight.

HP Delivers Industry-first Management Capabilities for Microsoft System Center

That’s not going to improve the relationship between the Insight Control group (part of the server hardware group, of Compaq heritage) and the BTO group (part of HP Software, of HP heritage plus many acquisitions) in HP.  The Microsoft relationship was already a point of tension when they were still called SIM and OpenView, respectively.

CA Acquires Cassatt

Constructive destruction at work.

Setting up a load-balanced Oracle Weblogic cluster in Amazon EC2

It’s got to become easier, whether Oracle or somebody else does it. In the meantime, this is a good reference.

[UPDATED 2009/07/12: If you liked the “WebLogic on EC2” article, check out the follow-up: “Full Weblogic Load-Balancing in EC2 with Amazon ELB”.]



/me thinks Google Wave looks like IRC

If you’re not yet seasick with all the reviews of Google Wave, here are a few additional thoughts.

My mental picture for a Wave is that of an IRC channel on which each message is an edit to an XML doc. And where the IRC server (or a bot, like Zakim) keeps a log of all messages. I think it’s the use of bots in Wave as in IRC that pushed me towards this view. The character-per-character update reminded me of the arguments about the comparative values of the Unix “talk” command and IRC. And if the IRC comparison holds water, hang on for the upcoming bot wars. BTW, doesn’t this Wave Federation Protocol look like an ideal opportunity to resurrect the IRC bot attack code that leveraged server splits?

Leaving IRC aside, the other obvious lens through which to look at Wave is the good old WS/REST debate. Let’s brace ourselves for the “is Wave RESTful” analyses that are sure to follow. I’ll note, tongue in cheek, that an alternative (to XMPP) way to implement a Wave could be provided by the WS specifications currently being worked on in the W3C Web Services Resource Access working group: send a succession of WS-RT “Put” messages to a WS-Eventing event sink that, in turn, acts as an event source. Or formalize the sink/source combination more cleanly as a broker from WS-BrokeredNotification. Finally a non-management use case for these specifications! Good luck doing character-by-character updates over this, but I am not sure that this is the most fundamental part of Wave anyway (though it makes for a good demo).

Nick Gall is right to separate the “technology showcase” aspect from the “killer app” aspect. The demo is very nice but it takes more than cool technology to change years of habits and social conventions, supported by hundreds of tools. So I am not sure how much of a killer app this collaboration demo is, however nice. On the other hand, I can see how the underlying framework (or at least the techniques used to create it) could quickly spread. I need more time looking at the federation protocol to decide what I think about it. This blog entry clearly describes the three Ps (product, platform, protocol) and some of the history.

As far as how this may relate to systems management, I don’t see too much alignment from a modeling perspective. What really matters in IT models are the relationships between the entities and Wave puts a lot more focus on the content of each wave than its relationships with others. At least for now. The underlying synchronization techniques on the other hand seem more readily applicable. The Rasmussen brothers previously created Google Maps which I found very inspiring from an IT management point of view. Years later the IT management industry still hasn’t caught up with them.


Cloud APIs need to be complemented by Cloud processes

A lot of attention has been focused on technical standards for Cloud computing, especially over the last month (e.g. DMTF incubator announcement). That’s fine, but before we go crazy with detailed technical standards let’s realize that for Cloud computing (of the public variety at least) to take off we’ll need just as much standardization of non-technical interactions. Namely processes.

This, to me, is one of the most interesting angles on the recent announcement by Amazon AWS that they now support (in limited beta) the ability to load data from storage that is physically shipped to them. Have a look at this announcement and you’ll notice that it spends more time describing a logistical process (how to pack, how to ship…) than technical interfaces (storage device requirements, how to create a manifest…). It is still part of the “AWS Developer Guide” but clearly these instructions are not just for developers.

Many more such processes need to be "standardized" (or at least documented) for companies to be able to use public Clouds efficiently (and to some extent even private Clouds). Let's take SLAs as an example. It sounds good when a Cloud provider says "we offer SLAs". But what does it mean? Does it mean "we advertise some SLA numbers, you're responsible for contacting us (through a phone number hidden somewhere on our site) when you think we've violated them; if we agree with your measurements then you may get a check in the mail at some point in the future"? Not so useful. If, on the other hand, there is a clear definition of the metric that the SLA applies to, a clear definition of how it gets measured (do we trust provider performance reports, customer measurement, a third-party monitor…), a clear process to claim a refund, a clear process to actually provide the refund (credit for future service or direct payment, when/how is the payment made…), then it becomes more useful.
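
To make that contrast concrete, here is a minimal sketch (Python, with field names I made up for illustration; no actual provider's SLA terms are implied) of what a usable, machine-readable SLA definition would have to capture:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SlaTerm:
    """One SLA commitment, spelled out (hypothetical fields, no real provider implied)."""
    metric: str               # e.g. "monthly instance uptime"
    target: float             # e.g. 99.95 (percent)
    measured_by: str          # "provider-report", "customer-probe" or "third-party-monitor"
    measurement_window: str   # e.g. "calendar month, 1-minute samples"
    claim_process: str        # how the customer files a claim (API call, web form, phone...)
    claim_deadline_days: int  # how long after the violation a claim is still accepted
    remedy: str               # "service-credit" or "refund"
    remedy_schedule: str      # e.g. "credited on the next invoice"

@dataclass
class Sla:
    provider: str
    service: str
    terms: List[SlaTerm] = field(default_factory=list)

# The level of precision that makes "we offer SLAs" actually useful:
uptime = SlaTerm(
    metric="monthly instance uptime",
    target=99.95,
    measured_by="third-party-monitor",
    measurement_window="calendar month, 1-minute samples",
    claim_process="ticket filed through the provider's support API",
    claim_deadline_days=30,
    remedy="service-credit",
    remedy_schedule="credited on the next invoice",
)
sla = Sla(provider="SomeCloud", service="compute", terms=[uptime])
```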

I picked the SLA enforcement example because it happens to be an area that the TMF (TeleManagement Forum) has made partially available as a teaser for its eTOM business process framework (aimed at telco providers). The full list of eTOM processes is only available to paying subscribers. One of the goals of the eTOM process framework is “to simplify procurement, serving as a common language between service providers and suppliers”. Another way to say it is that eTOM “recognizes that the enterprise interacts with external parties, and that the enterprise may need to interact with process flows defined by external parties, as in ebusiness interactions”. Exactly what we are talking about when it comes to making public Clouds easily consumable by enterprises. SLA management is just one small part of the overall eTOM framework (if you look for it in this eTOM overview poster it’s in purple, under “assurance”, in the first row).

My point is not to assert that Cloud providers should adopt eTOM. Nobody adopts eTOM directly as a blueprint anyway. But, while the cultures and maturity levels are sometimes different, it is also hard to argue that Cloud providers have nothing to learn from telco providers (many of which are becoming Cloud providers themselves). I shudder at the idea of AT&T teaching another company how to handle customer service, but have you ever tried to call Google?

Readers of this blog are likely to be more familiar with ITIL than eTOM (which, incidentally, incorporates parts of ITIL in its latest version, 8.0). For those who don't know about either, one way to think about it is that Cloud providers would implement processes that look somewhat like eTOM processes, that Cloud consumers would implement IT management processes that follow ITIL best practices to some extent, and that these two sets of processes need to meet for public Clouds to work. I touched on this a few months ago, when I commented on the incorporation of Cloud services in an IT service catalog.

My main point is not about ITIL or eTOM. It’s simply that there are important process aspects to delivering/consuming Cloud services and that they have so far been overshadowed by the technical aspects. The processes sketched in the AWS import/export capability represent the first drop of an upcoming shower.

[UPDATED 2009/5/22: More on telcos becoming Cloud providers from EMC’s Chuck Hollis, with a retort by James Governor. Just listing these as FYI but my main point in this post is not about telcos, it’s about the need to clarify processes, independently of whether the provider is Amazon or AT&T. It’s just that the telcos have been working on such process standardization for a long time. Hoff provides another example of where process standardization is needed in Cloud relationships: right to audit.]

2 Comments

Filed under Amazon, Application Mgmt, Business Process, Cloud Computing, Everything, IT Systems Mgmt, ITIL, Standards, Utility computing

IT automation: the seven roads to management middleware

You can call it a "Cloud operating system", an "adaptive infrastructure framework" or simply "IT management middleware" (my vote) as you prefer. It's the software that underpins the automation engine of your Cloud. You can't have a Cloud without an automation engine, unless you live in a country where IT admins run really fast, never push the wrong button, never plug a cable in the wrong port, can interpret blinking lights at 9,600 baud and are very cheap. The automation engine is what technically makes a Cloud. That engine is an application whose business is to know what needs to be done to maintain the IT environment you use in a state that is acceptable to you at any point in time (where your definition of "acceptable" can evolve). Like any application, you want to keep its business logic neatly isolated from the mundane tasks that it relies on. These mundane tasks include things like:

  • collecting events and delivering them to the right place
  • collecting metrics of the different IT elements
  • discovering available resources and accessing them (with or without agents)
  • performing coordinated actions on IT elements
  • maintaining an audit of management actions
  • securing the management interactions
  • managing long-running tasks and processes
  • etc

That’s what management middleware does. It doesn’t automate anything by itself, but it provides an environment in which it is feasible to implement automation. This middleware is useful even if you don’t automate anything, but it often doesn’t get called out in that scenario. On the other hand, automation means capturing more business logic in software which makes it imperative to clearly layer concerns, at which point the IT management middleware can be more clearly identified within the overall IT management infrastructure.
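
To picture that layering, here is a rough sketch (Python, purely illustrative; the interface and names are my own invention, not from any product) of the kind of services such middleware offers to the automation logic on top:

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List

class ManagementMiddleware(ABC):
    """Mundane services that the automation engine's business logic relies on."""

    @abstractmethod
    def subscribe_events(self, filter_expr: str, handler: Callable[[dict], None]) -> str:
        """Collect events and deliver them to the right place; returns a subscription id."""

    @abstractmethod
    def collect_metrics(self, resource_id: str, metric_names: List[str]) -> Dict[str, float]:
        """Collect metrics from an IT element (agent-based or agentless)."""

    @abstractmethod
    def discover(self, query: str) -> List[str]:
        """Discover available resources and return handles to access them."""

    @abstractmethod
    def execute(self, resource_ids: List[str], action: str, params: dict) -> str:
        """Perform a coordinated action on IT elements; returns a long-running task id."""

    @abstractmethod
    def task_status(self, task_id: str) -> str:
        """Track a long-running task or process."""

    @abstractmethod
    def audit(self, actor: str, action: str, details: dict) -> None:
        """Maintain an audit trail of management actions (securing them is also this layer's job)."""

# The automation engine then carries only business logic, for example:
def scale_out_if_needed(mw: ManagementMiddleware, app: str, threshold: float) -> None:
    for server in mw.discover(f"servers of application '{app}'"):
        if mw.collect_metrics(server, ["cpu_util"]).get("cpu_util", 0.0) > threshold:
            task = mw.execute([app], "add-capacity", {"count": 1})
            mw.audit("automation-engine", "scale-out", {"app": app, "task": task})
            break
```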

This is happening in many different ways. I can count seven roads to IT management middleware, seven ways in which it is emerging as an identifiable actor in data centers. Each road represents a different history and comes with different assumptions and mindsets. And yet, they go after the same base problem of enabling IT management automation. Here is a quick overview of these seven roads.

Road #1: “these scripts have to grow up”

This road starts from all the scripts common in IT operations and matures them. It's based on the realization that they are crucial business assets, just like the applications that they support. And that they implement reusable patterns. Alex Honor described it well here. Puppet and PowerShell are in this category.

Road #2: “it’s just another integration job”

We’ve been doing computer integration almost since the second piece of software was written. There are plenty of mechanisms available to do so. IT management is just another integration problem, so let’s present it in a way that allows us to use our favorite integration tools on it. That’s the driver behind the use of Web Services for management integration (e.g. WSDM/WS-Management): create interfaces to manageable resources so that existing middleware (mostly J2EE application servers, along with their WS stack) can be used to solve the “enterprise IT management” integration problems in a robust and reliable way. The same logic is behind the current wave of REST-based IT management efforts (see this presentation). REST is a good integration approach, so let’s turn IT resources into RESTful resources so we can apply this generic integration mechanism to enterprise IT management. Different tool, but same logical approach. Which is why they can be easily compared.
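
As an illustration of that logic (a hedged sketch; the endpoint, URLs and payloads are invented, not taken from any actual product), this is what "turning IT resources into RESTful resources" buys you: any generic HTTP tooling becomes your integration tool.

```python
import requests  # any generic HTTP tooling works, which is precisely the point

BASE = "https://mgmt.example.com/api"  # hypothetical management endpoint

# Read the state of a managed resource (a server) like any other web resource.
server = requests.get(f"{BASE}/servers/web-01", timeout=10).json()

# Reconfigure it by updating its representation.
server["monitoring"] = {"enabled": True, "interval_seconds": 60}
requests.put(f"{BASE}/servers/web-01", json=server, timeout=10).raise_for_status()

# Decommission it with a plain DELETE, no management-specific toolkit required.
requests.delete(f"{BASE}/servers/web-01", timeout=10).raise_for_status()
```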

Road #3: “top-down”

This is the "high road", the one most intellectually satisfying and most promising in the long term. But also the one with the highest hurdle out of the gate. In this approach, you create a top-down model of your system and you try to mediate management actions through this model. But for this to be practical, you need to hit the sweet spot in many dimensions. You need composable sub-models at a level of granularity that makes them maintainable. You need to force enough uniformity but not so much as to lose all optimizations. You need to decide which of the myriad configuration variables you include in your model. You can't take the traditional approach of "I'll model it and display it to my user, who can decide what to do with it", because the user now is a piece of software and it can't make a judgment call on whether it is OK for parameter foo to differ from the desired state or not. This has been worked on for a long time (remember HP's UDC?) with steady but slow progress. Elastra has some of the most interesting technology there, and a healthy dose of realism and opportunism to make it work.

Think of it as SCA component/composites but not just for software artifacts. Rather, it’s SCA for all IT elements, with wires and policies that are just rich enough to allow meaningful optimization but not too rich to be unmanageable. If you can pull off such model-driven IT management middleware, then the automation code almost writes itself on top of it.
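
Here is a minimal sketch of what such a composite model could look like (Python used only for readability; the names and the level of detail are assumptions, and a real system would add constraints and validation on top):

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Component:
    """An IT element: application tier, database, load balancer, VM pool..."""
    name: str
    kind: str
    config: Dict[str, str] = field(default_factory=dict)  # only the variables worth modeling

@dataclass
class Wire:
    """A relationship between components, just rich enough to drive optimization."""
    source: str
    target: str
    policies: Dict[str, str] = field(default_factory=dict)  # e.g. latency or placement targets

@dataclass
class Composite:
    """The top-down model that management actions are mediated through."""
    components: List[Component]
    wires: List[Wire]

desired_state = Composite(
    components=[
        Component("web-tier", "app-server", {"min_instances": "2", "heap": "2g"}),
        Component("orders-db", "database", {"storage_gb": "200"}),
    ],
    wires=[Wire("web-tier", "orders-db", {"max_latency_ms": "5"})],
)
```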

Road #4: “management integration is another feature of our management console”

That was the road followed by the Big Four. Buy enough of their products (CMDBs, network management console, operations console, service desk, etc…) and you'll get APIs that allow you to leverage their discovery, collection, eventing and process management features. So you can write your automation on top. At least on paper. In reality, these APIs are too inconsistent and import/export-oriented to really support SOA-style (or REST-style) integration, even though they usually have a SOAP and/or plain HTTP option available. It's a challenge just to get point-to-point integration between these products, let alone a true set of management services that can be orchestrated. These vendors know it, and rather than turning their product suites into a real SOI (Service Oriented Infrastructure) they have decided to build/buy automation engines on the side that can be hard-wired with the existing IT management products. That's your IT management middleware, but it comes bundled with the automation engine rather than as an independent layer.

Road #5: “management integration is another feature of our hypervisor”

If the virtual machine (in the x86 virtualization sense of the term, a.k.a. a fake machine) is the basic building block of your IT infrastructure then hypervisor interfaces to manipulate these VMs are pretty much all you need in terms of middleware to build data center automation on top, right? Are we done then? Not really, since there is a lot more to an application than the VMs on which it runs. Still, hypervisors bring the potential of automation to what used to be a hardware domain and as such play a big part in the composition of the IT management middleware of modern data centers.

Road #6: “make it all the same”

From what I understand about how Yahoo, Google (see section 2.1 "System Health Infrastructure"), Microsoft (see the "device manager" and "collection service" parts of Autopilot) and others run their Web applications, they have put a lot of work into that management middleware and have made simplicity a key design goal for it. To that end, they are willing to accept drastic limitations at both ends of the IT infrastructure chain: at the bottom, they actively limit the heterogeneity of resources in the data center. At the top, they limit the capabilities exposed to the business applications. In an extreme scenario, all servers are the same and all the business applications are written to a few execution/persistence/communication environments (think of the GAE SDK as an example). Even if you only approximate this ideal, it's a dramatic simplification that makes your IT management middleware much simpler and thinner.

Road #7: “it’s the Grid”

The Grid computing and HPC (High Performance Computing) communities have long been active in this area. There is a lot of relevant expertise in all the Grid work, but we also need to understand the difference between IT management middleware and the Grid infrastructure as defined by OGSI. OGSI defines a virtualization layer on which to build applications. It doesn’t define how to deploy, manage and configure the physical datacenter infrastructure that allows OGSI interfaces to be exposed to consumers. With regards to HPC, we also need to keep in mind that the profile of the applications is very different from your typical enterprise application (especially the user-driven apps as opposed to batch jobs). In HPC environments, CPUs can run at full capacity for days and new requests just go in a queue. The Web applications of your typical enterprise don’t have this luxury and usually need spare capacity.

All these approaches can complement each other and I am not trying to pin each product/vendor to just one approach. In this post (motivated by this podcast), Stu Charlton discusses the overlap and differences between some of these approaches. Rather than a taxonomy of products, this list of seven roads to IT management middleware is simply a cultural history, a reading guide to understand the background, vocabulary and assumptions of the different solutions. This list cuts across the declarative versus procedural debate (#1 is clearly procedural, #3 is clearly declarative, the others could go either way).

[UPDATED 2009/6/23: Stu has a somewhat related (similarly structured but much more entertainingly written) list of Cloud Computing approaches. I feel good that I have one more item in my list than him.]

3 Comments

Filed under Application Mgmt, Automation, Cloud Computing, Everything, Grid, IT Systems Mgmt, Mgmt integration, Middleware, Utility computing

Oracle buys Virtual Iron

The rumor had some legs. Oracle announced today that it has acquired Virtual Iron for its virtualization management technology. This publicly-available white paper is a great description of the technology and product capabilities.

Here is a short overview (from here).

VI-Center provides the following capabilities:

  • Physical infrastructure: Physical hardware discovery, bare metal provisioning, configuration, control, and monitoring
  • Virtual Infrastructure: Virtual environment creation and hierarchy, visual status dashboards, access controls
  • Virtual Servers: Create, Manage, Stop, Start, Migrate, LiveMigrate
  • Policy-based Automation: LiveCapacity™, LiveRecovery™, LiveMaintenance, Rules Engine, Statistics, Event Monitor, Custom policies
  • Reports: Resource utilization, System events

Interesting footnote: I read that SAP Ventures was an investor in Virtual Iron…

I also notice that the word "cloud" does not appear once in the list of all press releases issued by Virtual Iron over three years. For a virtualization start-up, that's a pretty impressive level of restraint and hype resistance.

1 Comment

Filed under Everything, IT Systems Mgmt, Manageability, Mgmt integration, Oracle, Virtualization, Xen

The law of conservation of hype

To the various conservation laws from physics (e.g. of energy and of momentum), one can add the law of conservation of hype. In the IT industry, as in others, there is only so much bandwidth for over-hyped concepts. Old ones have to move out of the limelight to make room for new ones, independently of their usefulness.

Here is this law, I think, illustrated in action. After running a Google Trends report on “web services”, “SOA”, “virtualization” and “cloud computing”, I downloaded the underlying data and added one line: the total search volume across all four terms. Here is the result:

SOA/WS/Cloud/Virtualization search volume trend
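
(For the record, the "total" line is nothing more than a column sum over the exported data; here is a minimal sketch of the computation, where the exact CSV column headers are my assumption:)

```python
import pandas as pd

# Assumed layout: one date column plus one column per search term.
trends = pd.read_csv("google_trends_export.csv")
terms = ["web services", "SOA", "virtualization", "cloud computing"]

trends["total"] = trends[terms].sum(axis=1)  # the one extra line added to the data
print(trends[terms + ["total"]].head())
```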

The black line, "total", is remarkably flat (if you ignore the annual Christmas-time drop). There is a surge in late 2007 for both WS and SOA that I can't really link to anything (Microsoft first announced Oslo around that time, but I doubt this explains it). Other than this, there is a nice continuity that seems to graphically support the following narrative:

Web services were the hot thing in the beginning of the decade among people who sell and buy corporate IT systems. Then the cool kids decided that Web services were just an implementation technology but what matters is the underlying pattern. So “SOA” became the word to go after. Just ask Sys-con: exit “Web Services Journal”, hello “SOA World magazine”. Meanwhile “virtualization” has been slowly growing and suddenly came Cloud computing. These two are largely an orthogonal concern from the SOA/WS pair but it doesn’t matter. Since they interest the same people, the law of conservation of hype demands that room be made. So down goes SOA.

The bottom line (and the reason why I ran these queries on Google Trends to start with) is that I feel that application integration and architecture concerns have been pushed out of the limelight by Cloud computing, but that important work is still going on there (some definition work and a lot of implementation work). Work that in fact will become critical when Cloud computing grows out of its VM-centric adolescent phase. I plan to write more entries about this connection (between Cloud computing and application architecture) in the future.

[Side note: I also put this post in the crazyStats category because I understand that by carefully picking the terms you include you can show any trend you want for the "total". My real point is not about proving "the law of conservation of hype" (though I believe in it). Rather, it is captured in the previous paragraph.]

5 Comments

Filed under Cloud Computing, CrazyStats, Everything, SOA, Virtualization

Hyperic joins SpringSource

SpringSource's Rod Johnson tells us today that his company just bought Hyperic. The press release is a bit more specific, announcing that SpringSource acquired "substantially all of the assets of Hyperic", which sounds different from acquiring the company itself. Maybe not for SpringSource customers, but possibly for current Hyperic customers (and investors). Acquiring the assets of an open source company may sound like a bit of an oxymoron (though I understand it's not just about the source code), but Hyperic is what's called an "open core" company, which means not all the code is open source (see Tarus' take on it). But the main difference between this and forking might be that you are getting the key employees, who are nice enough to their investors to do it in an orderly way.

Anyway, this is not a business or HR blog, it’s about the technology. And on that front, this looks like an interesting way for SpringSource to expand their monitoring from just the application down into some parts of the infrastructure, at least to some extent. SpringSource’s AMS (Application Management Suite) was already based on Hyperic, so the integration headaches should be minimal. And Hyperic has been doing some Cloud monitoring work too (see this podcast if you want to learn more about it), which if nothing else is PR gold these days (I am not saying it’s just that, but it is that for sure).

As a side note, it is ironic that Hyperic (which started inside Covalent until Javier Soltero spun it off and became its CEO) is now reunited with its mothership (SpringSource acquired Covalent last year).

I am a big proponent of management capabilities in application infrastructure. I applauded Rod Johnson for writing something along the same line last year and I am pleased to see him really push this approach with this acquisition.

Here are the questions that come to my mind when I read about this deal (keep in mind that this is competition from my perspective, so feel free to “question my questions” as you read):

I was going to ask whether this acquisition means that Hyperic users who don't care for Spring are going to see diminishing value as the product becomes more tied to Spring. But if you look at what Hyperic gives you on the resources it manages, it's mainly a list of metrics and a few control operations. These will still be there because they'll be needed for the Spring-centric view anyway. It would be more of a question if Hyperic had advanced discovery features (e.g. examine all the config files of the managed resources and extract infrastructure topology from them). I would wonder if these would still be maintained/improved for non-Spring middleware. But again, not an issue here since I don't think there is much of this in Hyperic today. And since presumably SpringSource made the acquisition in part to cover more resource types in their management offering (Rod talks about DB and VM management in his post), the list of supported infrastructure elements (OS, DB, VM, network…) will presumably grow rather than shrink. What may be trimmed down eventually is the list of application runtimes currently supported. If you're a Hyperic/ColdFusion user you should probably attend the upcoming webcast to hear about the plans.

Still on the topic of Hyperic’s monitoring-only capabilities, it means that if Rod Johnson really wants to provide everything for Java developers to put “applications into production without the mediation of operations”, as he says, then he should keep his checkbook open (as a side note, if a developer puts “applications into production” then s/he doesn’t bypass operations but rather becomes operations; you may not think of yourself as one, but if you’re the one who gets called when the application crashes then you are in “operations”). SpringSource is still a long way from offering the complete picture. Here are my guesses for the management features on Rod’s grocery list:

  • configuration management – many potential acquisition candidates
  • in-depth database management (going beyond the "you want metrics? we've got metrics!" approach to DB management) – fewer candidates

As far as in-house development, I would expect this acquisition to first yield some auto-discovery of application (and infrastructure) topology in a Spring environment. Then they'll have to decide if they want to double-down on Cloud support and build/buy more automation features or rather focus on application-centric management and join the fray of BTM / transaction tracing. Doing both at the same time would be very ambitious. This Register article seems to imply the former (Cloud) but my guess is that SpringSource will make the smart choice of focusing on the latter (application-centric management). I see in the Register that, "Peter Cooper-Ellis, SpringSource's senior vice president of engineering and product management called management of the cloud and virtualized datacenters a strategic driver for the deal". But this sounds more like telling a buzzword-hungry reporter what he wants to hear rather than actual strategy to me. We'll see. I hope this acquisition and its follow-through will help move the industry in the right direction of application-centric management, something that will take more than one company.

[UPDATED 2009/5/7: A nice article on the acquisition by Charles Humble at InfoQ. Though I have to take issue with the assertion that “many aspects of monitoring that are essential in a data centre, such as OS and network monitoring, are irrelevant in the context of the cloud”.]

[UPDATED 2009/6/23: Via Coté, an announcement that shows that the Cloud angle might have more post-acquisition juice than I expected. Unless this thing coasted on momentum alone.]

3 Comments

Filed under Application Mgmt, Business, Cloud Computing, Everything, IT Systems Mgmt, Manageability, Middleware, Open source, Spring, Utility computing

Cloud API: what’s cooking between IBM and VMWare?

In the previous entry, I declared that I had a “guess as to why [the DMTF Cloud] incubator was created without a submission”, that I may later reveal. Well here it is: VMWare and IBM are negotiating a joint Cloud API submission to DMTF and need more time before they can submit it.

This is 100% speculation on my part. It’s not even based on rumors or leaks. I made it up. Here are the data points that influenced me. You decide what they’re worth.

  • VMWare has announced numerous times (comments here and here) that they would submit a vCloud API to DMTF in the first half of 2009.
  • In the transcript of this VMWare webcast we learn that an important part of the vCloud API is its adoption of REST as part of a move towards more abstraction and simplicity (“this is not simply proxy-ing of VIM APIs”).
  • IBM, meanwhile, has been trying to establish a SOAP-based IT management framework for a while. Unsuccessfully so far. WSDM was a first failed attempt. The WS-Management/WSDM reconciliation was another one (I was in the same boat on both of these). The WS-RA working group at W3C (where the ashes of WS-RT are smoldering) could be where the third attempt springs from. But IBM is currently very quiet about their plans (compared to all the conference talks, PowerPoint slides and white papers that heralded the previous two attempts). They obviously haven't given up, but they are planning the next move. And the emergence of Cloud computing in the meantime is redefining the IT automation landscape in a way that they will make sure to incorporate in their updated standards plans.
  • Then comes the DMTF Cloud incubator of which the co-chairs are from VMWare and IBM ("interim" co-chairs in theory, but we know how these things go). Which seems to imply an agreement around a proposal (this is what the incubator process is explicitly designed for: "allow vendors aligned with a certain proposal to move forward and produce an interoperability specification"). But there is no associated specification submission, which suggests that the agreed-upon proposal is still being negotiated.

VMWare has a lot of momentum in a virtualization-focused view of IT automation (the predominant view right now, though I am not sure it will always be) and IBM sees them as the right partner for their third attempt (HP was the main partner in the first, Microsoft in the second). VMWare knows that they are going against Microsoft and they need IBM’s strength to control the standard. This could justify an alliance.

It seems pretty clear that VMWare has an API specification already (they supposedly even gave it to partners). It is also pretty clear that IBM would not agree to it in a wholesale way. For technical and pride reasons. They did it for OVF because it is a narrow specification, but a more comprehensive Cloud API would touch on a lot of aspects where IBM has set ideas and existing products. Here are some of the aspects that may be in contention.

REST versus WS-* – Yes, that old rathole. Having just moved to REST, the VMWare folks probably don’t feel like turning around. IBM has invested a lot in a WS-* approach over the years. It doesn’t mean that they won’t go with the REST approach, but it would take them some time to get over it. Lots of fellows and distinguished engineers would need to be convinced. There are some very REST-friendly parts in IBM (in Rational, in WebSphere) but Tivoli has seemed a lot less so to me. The worst outcome is if they offer both options. If you see this (or if you see XPath/XQuery expressions embedded inside URLs or HTTP headers), run for the escape hatches.
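
To illustrate the kind of hybrid I am warning against (the URLs below are made up, not from any actual API), compare:

```python
# The "worst of both worlds" hybrid: WS-* query mechanics smuggled into REST clothing.
hybrid = (
    "https://cloud.example.com/vms"
    "?filter=//Vm[State='running' and MemoryMB>4096]"  # an XPath expression in a query parameter
)

# Versus resources that are actually addressable on their own terms.
one_vm = "https://cloud.example.com/vms/vm-42"  # one VM, one URL
running_vms = "https://cloud.example.com/vms?state=running&min-memory-mb=4096"
```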

While REST versus WS-* is an easy one to grab onto, I don't think it's the most important issue. Both parties are smart enough to realize it's not that critical (it's the model, not the protocol, that matters).

CBE/WEF – IBM has been trying to get a standard stamp on its Common Base Event format (CBE) forever. When they did (as WEF, the WSDM Event Format) it was in a simplified form (by yours truly, among others) and part of a standard that wasn’t widely adopted. But it’s still there in Tivoli and you can expect it to resurface in some form in their next proposal.

Software packaging – I am not sure what’s up with SDD, but whether it’s this specification or something else I would expect that IBM would have a lot to say about software packaging and patching. A lot more than VMWare probably cares about. Expect IBM’s fingerprints all over that part.

Security – I have criticized IBM many times for the "security considerations" boilerplate that they stick on every specification. But this is an area in which it actually makes sense to have a very focused security analysis, something that IBM could do a lot better than VMWare, I suspect.

ITSM / ITIL – In addition to the technical aspect of IT management operations, there are plenty of process and human aspects. Many areas of ITSM are applicable (e.g. I have written about the role of service catalogs, or you can think about the link to CMDBf). IBM has a lot more exposure there than VMWare.

Grid – IBM's insistence on aligning Grid computing and IT management is one of the things that weighed WSDM down. Will they repeat this? In a way, Cloud computing *is* that junction of IT management and Grid that they were after with WSRF. But how much of the existing GGF Grid infrastructure are they going to try to accommodate? I don't think they'll be too rigid on this, but it's worth watching.

Seeing how the topics above are handled in the VMWare/IBM proposal (if such a proposal ever materializes) will tell the alert readers a lot about the balance of power between VMWare and IBM.

As a side note, there are very smart people in the EMC CTO office (starting with the CTO himself and my friend Tom Maguire) who came from IBM and are veterans of the WSDM/WSRF/OGSI efforts. These people could play an interesting role in the IBM/VMWare relationship if the corporate arrangement between EMC and VMWare allows it (my guess is it doesn’t). Another interesting side note is to ask what Microsoft would do if indeed VMWare and IBM were dancing together on this. Microsoft is listed in the members of the DMTF Cloud incubator, but I notice a certain detachment in this post from Steve Martin. For now at least.

Did I mention that this is all pure speculation on my part? We’ll see what happens. Hopefully it’s at least entertaining. And even if I am wrong, the questions raised (around the links between previous IT management efforts and the new wave of Cloud standards) are relevant anyway. I am still in “lessons learned” mode on this.

[UPDATED 2009/5/5: Here is a first-hand source for the data point that VMWare plans to submit the vCloud API (rather than second-hand reports from reporters): Winston Bumpus (VMWare's Director of Standards Architecture) says that "VMware announced its intention to submit its key elements of the vCloud API to an existing standards organization for the basis of developing an industry standard".]

1 Comment

Filed under Automation, Cloud Computing, DMTF, Everything, Grid, IBM, IT Systems Mgmt, Mgmt integration, OVF, SOAP, Specs, Standards, Utility computing, Virtualization, VMware

A pulp view of Cloud computing politics

As promised, here are some more thoughts on the creation by DMTF of an incubator for Cloud standards. The first part of this entry asks whether DMTF will play nicely with the other kids in the playground. The second part examines the choice of the “incubator” process in DMTF for this work.

Sharing the sandbox with the other kids

In other words, will the DMTF seek collaboration with other standards bodies, as well as less-structured organizations (the different Cloud forums and interest groups out there) and other communities (e.g. open source projects)? The short answer is "no", for reasons explained below.

The main reason is that companies don't have the same level of influence in all organizations. Unless you're IBM, who goes in force pretty much everywhere, you place your bets. If you are very influential in organization A but not in B, then the choice of whether a given piece of work happens in A or B decides the amount of influence you'll have on it. That's very concrete. When companies see it that way, the public-facing discussions about the "core competencies" of the different organizations are just hand-waving that has little actual weight in the decision. Just like plaintiffs pick friendly jurisdictions to press charges (e.g. East Texas for patent holders), companies try to choose the standards organization they want the game to be played in. As a result, companies influential in the DMTF want the DMTF to do the work and companies influential in other organizations would rather have the other organization. Since by definition those influential companies shape the will of the organizations, you see organizations always trying to grow to cover more ground. For example, VMWare has invested quite a lot in DMTF. I don't know if they are even members of OGF (at least they are not organizational members) so it makes a huge difference to them. Sure they could just as well ramp up in OGF. But at a cost.

That's a general rule that applies to the DMTF as it does to others. But collaboration is especially hard for DMTF because it is on the "opaque" side of the openness scale (e.g. compare it to OASIS, W3C and OGF, which have large amounts of publicly-accessible working documents and mailing list archives). It's hard to collaborate if the others can't even see what you're doing.

But, you may ask, doesn’t the Cloud incubator charter list “Work register(s) with appropriate alliance partners” as a deliverable, and aren’t “work registers” what DMTF calls its collaboration agreements with other organizations? Surely they are taking this collaboration to heart, aren’t they? Let me tell you a story.

Once upon a time, there was a work register in place between DMTF and the OASIS WSDM technical committee which said things like “OASIS web service standardization for resource sharing and provisioning will be cross-leveraged in DMTF’s CIM and WBEM standards” and “recommendations related to management of and management using web services will be submitted to OASIS”. Then Microsoft submitted WS-Management, a replacement for WSDM, to DMTF and DMTF used the work register as a doormat.

Don’t get me wrong though. I do believe that Cloud standards are closely related to IT management automation and that the DMTF has a central role to play there. I am not arguing against DMTF’s attempt to tackle this. I am just doing a reality check on the prospect of open and meaningful collaboration with other organizations.

OGF is not standing still and has also staked its claim to the Cloud (also focusing on the IaaS form of Cloud computing): it’s called OCCI for Open Cloud Computing Interface and will share its documents here. OGF and DMTF have long had a work register too (it includes an eerily familiar sounding sentence, “Grid technology will be cross-leveraged in the DMTF’s CIM and WBEM standards”). Looks like it is going to endure its first stress test.

As for the less structured Cloud gatherings (like CCIF), they’ll be welcome as long as they play the cheerleader role (“If this group forms a Cloud trade association, I can see us establishing an alliance with the DMTF to coordinate the messaging and driving adoption of the DMTF standards”) or are happy providing feedback into a black hole (“DMTF already has a process for providing feedback: http://www.dmtf.org/standards/feedback/ so no additional legal agreements need be made for community members to provide their input”). These are from Mark Carlson, the DMTF VP of Alliances, in a thread about the incubator announcement on the CCIF mailing list. BTW, Mark is a very fair-minded person and an ardent promoter of collaboration (disclosure: he once gave me a ride in a cool Volvo convertible to the Martha’s Vineyard airport so I could catch my puddle-jumper back to Boston, so I owe him). It’s not him personally, it’s the DMTF that is so tightfisted.

The use of the “incubator” process

This second part is for standards junkies and other process wonks who run their family dinners by Robert’s rules of order. Normal people should feel free to move on.

I am not at all surprised to see the incubator process being used here, but I am surprised to see it used in the absence of a submitted specification. I expected VMWare to submit a vCloud API document to this group. What’s a rubber stamp for if you don’t have a piece of paper to stamp with it?

I have my guess as to why this incubator was created without a submission, but that’s a topic for a future post (a good soap opera writer knows to pace the drama).

In any case, this leaves us in an interesting situation. The incubator process document (DSP 4008) itself says that “the purpose of this is to allow vendors aligned with a certain proposal to move forward and produce an interoperability specification without being blocked by those who would prefer a different proposal”. What’s the “proposal” that members of this incubator align with? That Cloud computing is important? Not something that too many people would dispute at this time.

This has interesting repercussions from a process standpoint. The incubator process pushes you towards an informational specification that is then sent to a new working group for quick ratification. The quick ratification is, in effect, the reward for doing the work in the incubator rather than in private. But this Cloud incubator is currently chartered to produce proposed changes to OVF and other DMTF standards (rather than a new specification). Say it does that, what happens to the proposed changes then? Presumably they are sent to the working groups that own the original specifications, but what directives do these groups get from the board? Are they expected to roll over and alter their specifications as demanded by the Cloud incubator? Or do these changes come as comments like any other, for the groups to handle however they see fit?

Take a concrete example. Oracle, BMC, CA and Fujitsu are very involved in the DMTF CMDBf working group but not (that I can see) in the incubator. If the Cloud incubator comes up with changes needed in CMDBf for Cloud usage, are these companies supposed to accept the changes even if they are disruptive to the original goals of the CMDBf specification? Same goes for WS-Management and even OVF. It’s one thing for an incubator to produce its own specification, it is another entirely to go and try to change someone else’s work. Presumably this wouldn’t stand (or would it?).

The lack of a submission to this incubator may end up creating a lot of argument about the interpretation of DSP 4008. For one thing, the DSP is not precise about when a submission to an incubator can take place. Since an incubator is meant to assemble people who agree with a given proposal, you’d expect that the proposal would be there at the start (so people can self-select and only join if they buy into it). But this is not explicit in the process.

The more Cloud API standardization unfolds, the more it looks like the previous attempt.

[UPDATED 2009/5/5: I just saw that Winston Bumpus has been blogging recently on the VMWare exec blog. Hopefully he will soon have his own feed for those of us interested in Cloud standards, an area in which he is a major actor. In this entry he describes his view of the DMTF incubator process. It doesn’t really align with my reading of the incubator process document though. Winston sees it as “a place for ideas to be developed or incubate before specifications are created”, while I see the process as geared towards work that starts from an existing submission. In any case, what really matters is less what the process says than how it is used, and so far it seems that it is being used as Winston describes.]

4 Comments

Filed under Cloud Computing, CMDBf, DMTF, Everything, Grid, IT Systems Mgmt, Mgmt integration, Standards, Utility computing, VMware

DMTF calls the ball on Cloud standards

To no surprise to industry watchers (and especially the small subset of them who read this blog), the DMTF has announced today (warning, PDF) that they are creating their very first "incubator" group and it is chartered with standardizing deployment, management and portability of Cloud systems. You probably skipped it at the time (you're forgiven), but you may now be motivated to go back and read this short analysis of the DMTF incubator process. And now you know why I bothered to look into this never-used, two-year-old process. Since it was DMTF-internal information, I couldn't at the time explain that my motivation was the preparations under way for this Cloud computing incubator.

Since the press release talks about Cloud compatibility and since I am obviously in very self-referencing mood today, I have to point to this “reality check on Cloud portability” for a historical perspective.

Three things to notice in the charter (warning, PDF) of the incubator:

First and foremost, it explicitly takes a very IaaS-centric view of Cloud computing. And within that, a very VM-driven view. VMWare could have written it…

“Virtualization technology and the evolution from software packages that can be created and deployed as a collection of virtual images is becoming the primary focus for delivering and managing software solutions into enterprise customers today”. I guess the “is becoming” formulation provides enough wiggle room (interesting rhetorical twist that lets you make a prognostic and yet use the present tense) that one can’t really call them on it and ask how many enterprise software systems are actually delivered and managed as virtual machines today (see my colleague Adam’s view of what it will take).

Let’s next look at the description of the deliverables:

  • Cloud taxonomy:
    – Terms and definitions
  • Cloud Interoperability whitepaper
  • Informational specifications:
    – Proposed OVF changes for cloud usage
    – Proposed Profiles for management of resources exposed by a cloud
    – Proposed changes to other DMTF standards
  • Requirements for trust for cloud resource management.
  • Work register(s) with appropriate alliance partners (See below)

We find the requisite "cloud taxonomy" (all the blog chatter about this a few months ago died without producing much alignment beyond the good old "IaaS, PaaS and SaaS"; or did I miss something?). The interesting aspect to notice is the lack of a new specification in the list. Just adjustments to the current ones (including OVF) and some profiling on top. I guess we are much closer to Cloud interoperability and portability than I thought! And the lessons from the past have been learned.

The third thing to notice is the names of the "interim co-chairs". Who happen to be from VMWare and IBM. Who also happen to be the DMTF President and DMTF Chairman. In case you had any doubt, this is very high profile in DMTF. Especially for something that's theoretically only an "incubator". It may just be an egg, but there is a baby T-Rex in it.

Who's missing from the party? Two groups of people. First, DMTF members who chose not to join (Oracle, CA, BMC…). And more importantly, the non-DMTF members who may nevertheless have a few ideas about Clouds: Google, Amazon, Salesforce and all the small Cloud pure-plays. You know, the kind of people who publish their docs in HTML rather than just PDF.

[Note: this is a quick first take written over lunch. More thoughts about the choice of the “incubator process” and the prospects for collaboration with other standards groups to follow, maybe as soon as tonight. — UPDATE: done]

3 Comments

Filed under Cloud Computing, DMTF, Everything, IT Systems Mgmt, OVF, Portability, Standards, Utility computing, Virtualization, VMware

A post-mortem on the previous IT management revolution

Before rushing to standardize “Cloud APIs”, let’s take a look back at the previous attempt to tackle the same problem, which is one of IT management integration and automation. I am referring to the definition of specifications that attempted to use the then-emerging SOAP-based Web services framework to easily integrate IT management systems and their targets.

Leaving aside the “Cloud” spin of today and the “Web services” frenzy of yesterday, the underlying problem remains to provide IT services (mostly applications) in a way that offers the best balance of performance, availability, security and economy. Concretely, it is about being able to deploy whatever IT infrastructure and application bits need to be deployed, configure them and take any required ongoing action (patch, update, scale up/down, optimize…) to keep them humming so customers don’t notice anything bothersome and you don’t break any regulation. Or rather so that any disruption a customer sees and any mandate you violate cost you less than it would have cost to avoid them.

The realization that IT systems are moving more and more towards distributed/connected applications was the primary reason that pushed us towards the definition of Web services protocols geared towards management interactions. By providing a uniform and network-friendly interface, we hoped to make it convenient to integrate management tasks vertically (between layers of the IT stack) and horizontally (across distributed applications). The latter is why we focused so much on managing new entities such as Web services, their execution environments and their conversations. I’ll refer you to the WSMF submission that my HP colleagues and I made to OASIS in 2003 for the first consistent definition of such a management framework. The overview white paper even has a use case called “management as a service” if you’re still not convinced of the alignment with today’s Cloud-talk.

Of course there are some differences between Web service management protocols and Cloud APIs. Virtualization capabilities are more advanced than when the WS effort started. The prospect of using hosted resources is more realistic (though still unproven as a mainstream business practice). Open source components are expected to play a larger role. But none of these considerations fundamentally changes the task at hand.

Let’s start with a quick round-up and update on the most relevant efforts and their status.

Protocols

WSMF (Web Services Management Framework): an HP-created set of specifications, submitted to the OASIS WSDM working group (see below). Was subsumed into WSDM. Not only a protocol BTW, it includes a basic model for Web services-related artifacts.

WS-Manageability: An IBM-led alternative to parts of WSDM, also submitted to OASIS WSDM.

WSDM (Web Services Distributed Management): An OASIS technical committee. Produced two standards (a protocol, “Management Using Web Services” and a model of Web services, “Management Of Web Services”). Makes use of WSRF (see below). Saw a few implementations but never achieved real adoption.

OGSI (Open Grid Services Infrastructure): A GGF (the organization now known as OGF) standard to provide a service-oriented resource manipulation infrastructure for Grid computing. Replaced with WSRF.

WSRF: An OASIS technical committee which produced several standards (the main one is WS-ResourceProperties). Started as an attempt to align the GGF/OGSI approach to resource access with the IT management approach (represented by WSDM). Saw some adoption and is currently quietly in use under the covers in the GGF/OGF space. Basically replaced OGSI but didn't make it in the IT management world because its vehicle there, WSDM, didn't.

WS-Management: A DMTF standard, based on a Microsoft-led submission. Similar to WSDM in many ways. Won the adoption battle with it. Based on WS-Transfer and WS-Enumeration.

WS-ResourceTransfer (aka WS-RT): An attempt to reconcile the underlying foundations of WSDM and WS-Management. Stalled as a private effort (IBM, Microsoft, HP, Intel). Was later submitted to the W3C WS-RA working group (see below).

WSRA (Web Services Resource Access): A W3C working group created to standardize the specifications that WS-Management is built on (WS-Transfer etc) and to add features to them in the form of WS-RT (which was also submitted there, in order to be finalized). This is (presumably) the last attempt at standardizing a SOAP-based access framework for distributed resources. Whether the window of opportunity to do so is still open is unclear. Work is ongoing.

WS-ResourceCatalog: A discovery helper companion specification to WS-Management. Started as a Microsoft document, went through the "WSDM/WS-Management reconciliation" effort, emerged as a new specification that was submitted to DMTF in May 2007. Not heard of since.

CMDBf (Configuration Management Database Federation): A DMTF working group (and soon to be standard) that mainly defines a SOAP-based protocol to query repositories of configuration information. Not linked with (or dependent on) any of the specifications listed above (it is debatable whether it belongs in this list or is part of a new breed).

Modeling

DCML (Data Center Markup Language): The first comprehensive effort to model key elements of a data center, their relationships and their policies. Led by EDS and Opsware. Never managed to attract the major management vendors. Transitioned to an OASIS member section and died of being ignored.

SDM (System Definition Model): A Microsoft specification to model an IT system in a way that includes constraints and validation, with the goal of improving automation and better linking the different phases of the application lifecycle. Was the starting point for SML.

SML (Service Modeling Language): Currently a W3C “proposed recommendation” (soon to be a recommendation, I assume) with the same goals as SDM. It was created, starting from SDM, by a consortium of companies that eventually submitted it to W3C. No known adoption other than the Eclipse COSMOS project (Microsoft was supposed to use it, but there hasn’t been any news on that front for a while). Technically, it is a combination of XSD and Schematron. It appears dead, unless it turns out that Microsoft is indeed using it (I don’t know whether System Center is still using SDM, whether they are adopting SML, whether they are moving towards M or whether they have given up on the model-centric vision).

CML (Common Model Library): An effort by the SML authors to create a set of model elements using the SML metamodel. Appears to be dead (no news in a long time and the cml-project.org domain name that was used seems abandoned).

SDD (Solution Deployment Descriptor): An OASIS standard to define a packaging mechanism meant to simplify the deployment and configuration of software units. It is to an application archive what OVF is to a virtual disk. Little adoption that I know of, but maybe I have a blind spot on this.

OVF (Open Virtualization Format): A recently released DMTF standard. Defines a packaging and descriptor format to distribute virtual machines. It does not define a common virtual machine format, but rather a wrapper around it. Seems to have some momentum. Like CMDBf, it may be best thought of as part of a new breed rather than directly associated with WS-Management and friends.

This is not an exhaustive list. I have left aside the eventing aspects (WS-Notification, WS-Eventing, WS-EventNotification) because, while relevant, they are a larger discussion and this entry is too long already (see here and here for some updates from late last year on the eventing front). It also does not cover the Grid work (other than OGSI/WSRF to the extent that they intersect with the IT management world), even though a lot of the work that took place there is just as relevant to Cloud computing as the IT management work listed above. Especially CDDLM/CDL, an abandoned effort to port SmartFrog to the then-hot XML standards, from which there are plenty of relevant lessons to extract.

The lessons

What does this inventory tell us that's relevant to future Cloud API standardization work? The first lesson is that protocols are easy and models are hard. WS-Management and WSDM technically get the job done. CMDBf will be a good query language. But none of the model-related efforts listed above seem to have hit the mark of "doing the job". With the possible exception of OVF, which is promising (though the current expectations on it are often beyond what it really delivers). In general, the more focused and narrow a modeling effort is, the more successful it seems to be (with OVF as the most focused of the list and CML as the other extreme). That's lesson learned number two: models that encompass a wide range of systems are attractive, but impossible to deliver. Models that focus on a small sub-area are the way to go. The question is whether these specialized models can at least share a common metamodel or other base building blocks (a type system, a serialization, a relationship model, a constraint mechanism, etc), which would make life easier for orchestrators. SML tries (tried?) to be all that, with no luck. RDF could be all that, but hasn't managed to get noticed in this context. The OVF and SDD examples seem to point out that the best we'll get is XML as a shared foundation (a type system and a serialization). At this point, I am ready to throw in the towel on achieving more modeling uniformity than XML provides, and ready to do the needed transformations in code instead. At least until the next window of opportunity arrives.
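
What "doing the needed transformations in code" means in practice is not glamorous. A hedged sketch (element names invented; the only point is that XML is the shared foundation and the mapping logic lives in code rather than in a shared metamodel):

```python
import xml.etree.ElementTree as ET

# Hypothetical packaging-oriented document (OVF-like in spirit; element names invented).
package_xml = """
<Package><VirtualSystem id="web-01"><Memory unit="MB">4096</Memory></VirtualSystem></Package>
"""

def to_inventory_model(doc: str) -> ET.Element:
    """Map a packaging-oriented model into a different, inventory-oriented model."""
    src = ET.fromstring(doc)
    inventory = ET.Element("Inventory")
    for vs in src.findall("VirtualSystem"):
        item = ET.SubElement(inventory, "ConfigurationItem", {"name": vs.get("id", "")})
        mem = vs.find("Memory")
        if mem is not None and mem.text:
            ET.SubElement(item, "Attribute", {"key": "memory_mb"}).text = mem.text
    return inventory

print(ET.tostring(to_inventory_model(package_xml), encoding="unicode"))
```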

I wish that rather than being 80% protocols and 20% models, the effort in the WS-based wave of IT management standards had been the other way around. So we’d have a bit more to show for our work, for example a clear, complete and useful way to capture the operational configuration of application delivery services (VPN, cache, SSL, compression, DoS protection…). Even if the actual specification turns out to not make it, its content should be able to inform its successor (in the same way that even if you don’t use CIM to model your server it is interesting to see what attributes CIM has for a server).

It's less true with protocols. Either you use them (and they're very valuable) or you don't (and they're largely irrelevant). They don't capture domain knowledge that's intrinsically valuable. What value does WSDM provide, for example, now that it's collecting dust? How much will the experience inform its successor (other than trying to avoid the WS-Addressing disaster)? The trend today seems to be that a more direct use of HTTP ("REST") will replace these protocols. Sure. Fine. But anyone who expects this break from the past to be a vaccination against past problems is in for a nasty surprise. Because, and I am repeating myself, it's the model, stupid. Not the protocol. Something I (hopefully) explained in my comments on the Sun Cloud API (before I knew that caring about this API might actually become part of my day job) and something on which I'll come back in a future post.

Another lesson is the need for clear use cases. Yes, it feels silly to utter such an obvious statement. But trust me, standards groups still haven't gotten this. It took years spent on WSDM and then WS-Management before I realized that most people were not going after management integration, as I was, but rather manageability. Where "manageability" is concerned with discovering and monitoring individual resources, while "management integration" is concerned with providing a systematic view of the environment, with automation as the goal. In other words, manageability standards can allow you to get a traditional IT management console without the need for agents. Management integration standards can allow you to coordinate your management systems and automate their orchestration. WS-Management is for manageability. CMDBf is in the management integration category. Many of the (very respectful and civilized) head-butting sessions I engaged in during the WSDM effort can be traced back to the difference between these two sets of use cases. And there is plenty of room for such disconnect in the so-loosely-defined "Cloud" world.
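
If the distinction sounds abstract, here is roughly how it cashes out in code (a sketch with invented interfaces on both sides, not tied to any specification):

```python
# Manageability: discover and monitor individual resources, console-style.
def manageability_view(resource):
    """The kind of access a manageability standard (WS-Management-style) gives you."""
    return {
        "identity": resource.get_identity(),
        "health": resource.get_health(),
        "metrics": resource.get_metrics(["cpu", "memory"]),
    }

# Management integration: coordinate management systems, with automation as the goal.
def remediate(cmdb, monitoring, automation, service_name):
    """The kind of orchestration a management-integration standard aims to enable."""
    impacted = cmdb.query(f"components of service '{service_name}' with open alerts")
    for component in impacted:
        if monitoring.alert_severity(component) == "critical":
            automation.run_playbook("restart-and-verify", target=component)
```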

We have also learned (or re-learned) that arbitrary non-backward-compatible versioning, e.g. for political or procedural reasons as with WS-Addressing, is a crime. XML namespaces (of the XSD and WSDL types, as well as URIs used in similar ways in specifications, e.g. to identify a dialect or profile) are tricky, because they don't carry backward compatibility metadata and because of the practice of using organizations' domain names in the URI (as opposed to specification-specific names that can be easily transferred, e.g. cmdbf.org versus dmtf.org/cmdbf). In the WS-based management world, we inherited these problems at the protocol level from the generic WS stack. Our hands are more or less clean, but only because we didn't have enough success/longevity to generate our own versioning problems at the model level. But those would have been there had these models been able to see the light of day (CML) or see adoption (DCML).
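
To show what "backward compatibility metadata" could look like, here is a small sketch (a convention of my own invention, not something any of these specifications define): a document advertises not just its namespace but the older namespaces it remains compatible with, so a consumer can decide programmatically whether to proceed.

```python
# Hypothetical version-compatibility check, driven by metadata rather than guesswork.
SUPPORTED_BY_CONSUMER = {"urn:example:cloud-model:1.0", "urn:example:cloud-model:1.1"}

document_metadata = {
    "namespace": "urn:example:cloud-model:1.2",
    "backward_compatible_with": [
        "urn:example:cloud-model:1.1",
        "urn:example:cloud-model:1.0",
    ],
}

def can_consume(meta: dict, supported: set) -> bool:
    """Accept the document if we support its version or any version it claims compatibility with."""
    candidates = {meta["namespace"], *meta.get("backward_compatible_with", [])}
    return bool(candidates & supported)

assert can_consume(document_metadata, SUPPORTED_BY_CONSUMER)
```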

There are also practical lessons that can be learned about the tactics and strategies of the main players. Because it looks like they may not change very much, as corporations or even as individuals. Karla Norsworthy speaks for IBM on Cloud interoperability standards in this article. Andrew Layman represented Microsoft in the post-Manifestogate Cloud patch-up meeting in New York. Winston Bumpus is driving the standards strategy at VMWare. These are all veterans of the WS-Management, WSDM and related wars collaborations (and more generally the whole WS-* effort for the first two). For the details of what there is to learn from the past in that area, you’ll have to corner me in a hotel bar and buy me a few drinks though. I am pretty sure you’d get your money’s worth (I am not a heavy drinker)…

In summary, here are my recommendations for standardizing Cloud APIs, based on lessons from the Web services management effort. The theme is “focus on domain models”. The line items:

  • Have clear goals for each effort. E.g. is your use case to deploy and run an existing application in a Cloud-like automated environment, or is it to create new applications that efficiently take advantage of the added flexibility? Very different problems.
  • If you want to use OVF, then beef it up to better apply to Cloud situations, but keep it focused on VM packaging: don’t try to grow it into the complete model for the entire data center (e.g. a new DCML).
  • Complement OVF with similar specifications for other domains, like the application delivery systems listed above. Informally try to keep these different specifications consistent, but don’t over-engineer it by repeating the SML attempt. It is more important to have each specification map well to its domain of application than it is to have perfect consistency between them. Discrepancies can be bridged in code, or in a later incarnation.
  • As you segment by domain, as suggested in the previous two bullets, don’t segment the models any further within each domain. Handle configuration, installation and monitoring issues as a whole.
  • Don’t sweat the protocols. HTTP, plain old SOAP (don’t call it POS) or WS-* will meet your needs. Pick one. You don’t have a scalability challenge as much as you have a model challenge, so don’t get distracted here (see the sketch after this list). If you use REST, do it in the mindset that Tim Bray describes: “If you’re going to do bits-on-the-wire, Why not use HTTP? And if you’re going to use HTTP, use it right. That’s all.” Not as something that needs to scale to Web scale or as a rebuff of WS-*.
  • Beware of versioning. Version only when operational changes require it, not for organizational reasons. Provide metadata to assert and encourage backward compatibility.
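To illustrate the “don’t sweat the protocols” bullet above, here is a minimal sketch with a made-up model and made-up endpoints: the same hypothetical “stop this instance” operation serialized over plain HTTP and over SOAP. The domain knowledge lives in the model (the instance identifier and its desired state); the envelope is interchangeable.

import requests

INSTANCE_ID = "i-12345"

def stop_instance_plain_http(base_url):
    # Plain HTTP: the operation is a POST against the instance's own URL.
    return requests.post(f"{base_url}/instances/{INSTANCE_ID}",
                         json={"desiredState": "stopped"}, timeout=5)

def stop_instance_soap(endpoint_url):
    # SOAP 1.2: the same model, wrapped in an envelope and posted to one endpoint.
    envelope = f"""<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope">
      <s:Body>
        <StopInstance xmlns="http://example.org/cloud-model/v1">
          <InstanceId>{INSTANCE_ID}</InstanceId>
          <DesiredState>stopped</DesiredState>
        </StopInstance>
      </s:Body>
    </s:Envelope>"""
    return requests.post(endpoint_url, data=envelope,
                         headers={"Content-Type": "application/soap+xml"}, timeout=5)

Either function carries exactly the same information; the part worth standardizing is the instance model, not the choice between the two.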

This is not a recipe for the ideal result but it is what I see as practically achievable. And fault-tolerant, in the sense that the failure of one piece would not negate the value of the others. As much as I have constrained expectations for Cloud portability, I still want it to improve to the extent possible. If we can’t get a consistent RDF-based (or RDF-like in many ways) modeling framework, let’s at least apply ourselves to properly understanding and modeling the important areas.

In addition to these general lessons, there remains the question of which specific specifications will/should transition to the Cloud universe. Clearly not all of them, since not all of them even made it in the “regular” IT management world for which they were designed. How many then? Not surprisingly (since IBM had a big role in most of them), Karla Norsworthy, in the interview mentioned above, asserts that “infrastructure as a service, or virtualization as a paradigm for deployment, is a situation where a lot of existing interoperability work that the industry has done will surely work to allow integration of services”. And just as unsurprisingly Amazon’s Adam Selipsky, whose company had nothing to do with the previous wave but finds itself in a leadership position with respect to Cloud Computing, is a lot more circumspect: “whether existing standards can be transferred to this case [of cloud computing] or if it’s a new topic is [too] early to say”. OVF is an obvious candidate. WS-Management is by far the most widely implemented of the bunch, so that gives it an edge too (it is apparently already in use for Cloud monitoring, according to this press release by an “innovation leader in automated network and systems monitoring software” that I had never heard of). Then there is the question of what IBM has in mind for WS-RT (and the other specifications that the WS-RA working group is toiling on). If it’s not used as part of a Cloud API then I really don’t know what it will be used for. But selling it as such is going to be an uphill battle. CMDBf is a candidate too, as a model-neutral way to manage the configuration of a distributed system. But here I am, violating two of my own recommendations (“focus on models” and “don’t isolate config from other modeling aspects”). I guess it will take another pass to really learn…

[UPDATED 2009/5/7: Senior moment! When writing this entry I forgot that I wrote an earlier entry (in late 2007) specifically to describe the difference between “manageability” and “management integration”. So here it is, if you care for more details on this topic.]


Filed under Automation, Cloud Computing, Everything, IT Systems Mgmt, Manageability, Mgmt integration, Modeling, People, Portability, REST, SML, SOAP, Specs, Standards, Utility computing, Virtualization, WS-Management, WS-ResourceCatalog, WS-ResourceTransfer

New release of Oracle Composite Application Monitor and Modeler

Version 10.2.0.5 of OCAMM (Oracle Composite Application Monitor and Modeler) was recently released. This is the former QuickVision product, which came to Oracle through the ClearApp acquisition last year and is doing very well in the Enterprise Manager portfolio. In this new release, it has grown to cover additional middleware and application products. So now, in addition to the existing support (J2EE, BPEL, Portal…), it lets you model and monitor deployments of the Oracle Service Bus (OSB) as well as AIA process integrations. These metadata-rich environments fit very naturally into the OCAMM approach of discovering the distributed model from metadata and then collecting and reporting usage metrics across the whole system in the context of that model.
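For readers who have not seen a model-driven monitoring tool in action, here is a purely illustrative toy sketch of the general approach (this is not OCAMM’s API or data format, just my own shorthand): discover the topology from deployment metadata, then report per-component metrics in the context of that model.

def discover_model(service_metadata):
    """Build a graph of components from deployment metadata
    (e.g. which proxy service routes to which business service)."""
    return {svc["name"]: svc.get("routesTo", []) for svc in service_metadata}

def report(model, metrics):
    """Report per-component metrics in the context of the discovered model."""
    for component, downstream in model.items():
        line = f"{component}: {metrics.get(component, {}).get('avg_ms', 'n/a')} ms"
        if downstream:
            line += f"  -> calls {', '.join(downstream)}"
        print(line)

model = discover_model([{"name": "OrderProxy", "routesTo": ["OrderService"]},
                        {"name": "OrderService"}])
report(model, {"OrderProxy": {"avg_ms": 12}, "OrderService": {"avg_ms": 45}})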

For a complete list of improvements over release 10.2.0.4.2, refer to the release notes, where we see this list of new features:

  • Oracle Service Bus Support: Oracle Service Bus (formerly called Aqualogic Service Bus) versions 2.6, 2.6.1, 3.0, and 10gR3 are now supported.
  • Oracle AIA Support: Oracle Application Integration Architecture (AIA) 2.2.1 and 2.3 are now supported.
  • WLS and WLP Support: WebLogic Server and WebLogic Portal version 10.3 are now supported.
  • Agent Deployment Added: Agent deployment added to CAMM Administration UI to streamline configuration of new resources.
  • Testing Added in Resource Configuration: Connectivity parameter testing added in resource configuration.
  • Testing Added in Repository Configuration: Connectivity parameter testing added for CAMM database repository configuration.
  • EJB and Web Services Support: EJB and Web Services Annotation are now supported.

You can get the bits on OTN. Here is the installation and configuration guide and here is the user’s guide.


Filed under Application Mgmt, BPEL, Everything, IT Systems Mgmt, Manageability, Mgmt integration, Middleware, Modeling, Oracle