William Vambenepe's blog

IT management in a changing IT world

Archive for the 'Specs' Category

18
Nov
2008

Who said WS-Transfer is for REST?

by William Vambenepe

One more post on the “REST over SOAP” topic, recently revived by the birth of the W3C WS Resource Access working group. Then I’ll go quiet for a bit and let people actually working on it show me why I am wrong to worry about WS-RT.

Before that, I just want to clarify one thing. People seem to assume that WS-Transfer was created as a way to support the creation of RESTful systems that communicate over SOAP. As much as I can tell, this is simply not true.

I never worked for Microsoft and I was not in the room when WS-Transfer was created. But I know what WS-Transfer was created to support: chiefly, it was WS-Management and the Devices Profile for Web Services, neither of which claims to have anything to do with REST. It’s just that they both happen to deal with resources (that word again!) that have properties and they want to access (mostly retrieve, really) the values of these properties. But in both cases, these resources have a lot more than just state. You can call all sorts of type-specific operations on them. No uniform interface. It’s not REST and it’s not trying to be REST. The Devices Profile also happens to make heavy use of WS-Discovery and I am pretty sure that UDP broadcasts aren’t a recommend Web-scale design pattern. And no “hypermedia” in sight in either spec either.

A specification is not RESTful. An application system is. And most application systems that use WS-Transfer don’t even try to be RESTful. Mocking WS-Transfer for not being as good as HTTP to support REST systems is like mocking an airplane for not being as good as your hatchback for grocery shopping. It’s true, but who cares.

So let’s not reflexively attack WS-Transfer for assumed purposes. And similarly, let’s not reflexively defend WS-Transfer as a good way to build RESTful systems.

Just to clarify, this is not meant as a defense of WS-Transfer. I think that, at least in the context of its original purpose, it should be gutted to only its GET operation. The PUT and DELETE tasks should be handled by domain-specific operations. Which would have the consequence of making it look less like a REST wannabe. But my recommendation aims at improving its applicability to the management domain, not at making it comply to an architecture style that is not (at least currently) used in that domain.

12
Nov
2008

WS Resource Access working group starting at W3C

by William Vambenepe

Things went quiet for a while, but the W3C Web Services Resource Access Working Group has finally taken life, as was announced last week. It’s a well-know PR trick to announce bad news on a Friday such that it goes undetected, is it a coincidence that W3C picked a Friday for this announcement?

As you can tell by this last remark, I have no trouble containing my enthusiasm about this new group. Which should not come as a surprise to regular readers of this blog (see this, this, this and this, chronologically).

The most obvious potential pushback against this effort is the questionable architectural need to redo over SOAP what can be done over simple HTTP. Along the lines of Erik Wilde’s “HTTP over SOAP over HTTP” post. But I don’t expect too much noise about this aspect, because even on the blogosphere people eventually get tired of repeating the same arguments. If some really wanted to put up a fight against this, it would have been done when the group was first announced, not now. That resource modeling party is over.

While I understand the “WS-Transfer is just HTTP over SOAP over HTTP” argument, this is not my problem with this group. For one thing, this group is not really about WS-Transfer, it’s about WS-ResourceTransfer (WS-RT) which adds fine-grained resource access on top of WS-Transfer. Which is not something that HTTP gives you out of the box. You may argue that this is not needed (just model your addressable resources in a fine-grained way and use “hypermedia” to navigate between them) but I don’t really buy this. At least not in the context of IT management models, which is where the whole thing started. You may be able to architect an IT management system in such RESTful way, but even if you can it’s too far away from current IT modeling practices to be practical in many scenarios (unfortunately, as it would be a great complement to an RDF-based IT model). On the other hand, I am not convinced that this fine-grained access needs to go beyond “read” (i.e. no need for “fine-grained write”).

The next concern along that “HTTP over SOAP over HTTP” line of thought might then be why build this on top of SOAP rather than on top of HTTP. I don’t really buy this one either. SOAP, through the SOAP processing model (mainly the use of headers, something that WS-RT unfortunately butchers) is better suited than HTTP for such extensions. And enough of them have already been defined that you may want to piggyback on. The main problem with SOAP is the WS-Addressing tumor that grew on it (first I thoughts it was just a wart, but then it metastatized). WS-RT is affected by it, but it’s not intrinsic to WS-RT.

Finally, it would be a little hard for me to reject SOAP-based resources access altogether, having been associated with many such systems: WSMF, WSDM/WSRF, WS-Management and even WS-RT in its pre-submission days (and my pre-Oracle days). Not that I have signed away my rights to change my mind.

So my problem with WS-RAWG is not a fundamental architectural problem. It’s not even a problem with the defects in the current version of WS-RT. They are fixable and the alternative specifications aren’t beauty queens either.

Rather, my concerns are focused on the impact on the interoperability landscape.

When WS-RT started (when I was involved in it), it was as part of a convergence effort between HP, IBM, Intel and Microsoft. With the plan to use this to unify the competing WS-Management and WSDM/WSRF stacks. Sure it was also an opportunity to improve things a bit, but 90% of the value came from the convergence/unification aspect, not technical improvements.

With three of the four companies having given up on this, it isn’t much of a convergence anymore. Rather then paring-down the number of conflicting options that developers have to chose from (a choice that usually results in “I won’t pick either sine there is no consensus, I’ll just do it my own way”), this effort is going to increase it. One more candidate. WS-Management is not going to go away, and it’s pretty likely that in W3C WS-RT will move further away from it.

Not to mention the fact that CMDBf (and its SOAP-based graph-oriented query protocol) has since emerged and is progressing towards standardization. At this point, my (notoriously buggy) crystal ball shows a mix of WS-management and CMDBf taking the prize overall. With WS-Management used to access individual resources and CMDBf used to access any kind of overall system view. Which, as a side note, means that DMTF has really taken this game over (at least in the IT management domain) from W3C and OASIS. Not that W3C really wanted to be part of the game in the first place…

30
Oct
2008

First in-depth look at Microsoft’s Oslo and the “M” modeling language

by William Vambenepe

Microsoft’s PDC is taking place this week and more details were shared with the attendees about project Oslo, an effort announced last year to drastically improve the use of models across the application lifecycle. Some code is available (I think the Quadrant code is only for PDC attendees but the Oslo SDK is available to everyone). I am not at PDC, I didn’t see any presentation and I didn’t download any code. But Microsoft has also posted technical details on MSDN and, as far as I am concerned, that’s the most time-effective way to spend a couple of hours learning about Oslo. BTW, the way they share these early design descriptions and accept to make their evolution public is admirable.

For those who only want to spend 10 minutes rather than 2 hours, here are the thoughts that came to my mind as I was reading.

Overall I am somewhat underwhelmed, but not necessarily in a bad way. I know that’s a little schizophrenic so let me explain. After hearing a lot about how Oslo was the next big thing in modeling, it is a little surprising to read a document that can be summarized as “modeling is good, so go create some SQL tables and store them in a RDBMS”. That’s the underwhelming part. But on the other hand, it is more down to earth and practically-minded than I feared. And this is just a summary, in truth there is more than just “use SQL”.

Half of the MSDN documentation basically explains how to use SQL Server to store application models (as of today, the “Developing Models for the Metadata Store” section has only one sub-section, “SQL Server Guidelines for Modeling in the Oslo Repository“). Does this mean that all .NET applications will eventually have to carry with them a deployment of SQL Server 2008 even if they don’t use it to store the their operational data? Sure there are a few extra repository services (e.g. finer-grained change auditing) but most Oslo repository services are generic SQL Server features. That section has quite a lot of T-SQL, but it’s pretty readable. It also has a lot of dependencies on following naming conventions which makes me think that directly creating T-SQL code is not the best approach.

Fortunately there is an alternative, the “M” language. It’s a schema language with a built-in constraint mechanism. I found it more data-oriented (as opposed to resource-oriented) than I expected. Even though “each model is really a set of data structures, relationships, and constraints in serialized form“, there is a lot more support for data structures and constraints than for relationships. It’s just a foreign key. Relationships aren’t items and don’t have any property (or “field” as they’re called in “M”). For example, the relationship between a student’s enrollment record and a given class can’t have, as property, the grade that the student got for that class (as in the example in section 4.1.4 of the second LC of SML). To model this in “M” you need to create another item (e.g. “courseEnrollment”) and have a relationship from the student to that item and another one from that item to the “course” item itself. Or to replace the foreign key in the student table with a complex structure that contains both the foreign key and the properties of the relationship. At the end it has the same expressiveness potential, but in a less streamlined form. I assume Microsoft took this approach for performance reasons.

I am going on a limb here, but it may also be a difference between development-time concerns and operation-time concerns. During development (all the way to testing and packaging), you can still mostly get away with a relatively simple containment structure. You care about the components of your application and how they are packaged inside or next to one another. Sure you care about who calls who outside of the deployment unit but that’s not as core a concern as getting your class dependencies right, your tests in order and your installer configured. In fact, some of the “who calls who” bindings will be only be realized at runtime. Oslo, at least so far, clearly seems more focused on development time than operations so support for a relationship-rich model may not seem critical. At operations time, on the other hand, you don’t really care so much about how things were packaged before installation. You care a lot more about who invokes who (especially for modern distributed applications), what the network layout is, what resources a ticket is attached to, etc. The model looks a lot more like a graph with complex relationships. Something that “M” doesn’t seem ideally suited for.

Except for this caveat, I like “M”. It’s not anti-XML (you can represent values as XML if you’d like) but it avoids the “the answer is XML/XSD what is the question” approach to modeling that is sometimes a little too prevalent. “M” is a much better schema language for IT systems than XSD. I especially like its approach to types. A value is not intrinsically of a given type. A type is a condition that you happen to meet or not at the current time (”take heart little field, you can be anything you want when you grow up”). As such, you can be of several types at the same time. Refined types are potatoes inside potatoes (not sure if “M” supports definition of types as unions and/or intersection of existing types, for intersection I want to write something like”type NewType : OldType1 where this in OldType2” but there is no “this” in “M”). That approach to types (and the way constraints leverage types) is reminiscent of RDF/OWL. It’s a classification more than a typification, but I understand why they didn’t want to call it “class”. The similarities with RDF/OWL don’t go any further. As I wrote earlier, “M”is very data-focused and not resource-focused: as far as I can tell “M” types are defined syntactically, not semantically (the semantics come as a consequence). For example, I don’t think that you can assert that a given item representing a person is of type “friendly” if there is no corresponding data in the item. You’d have to first create a boolean field called “friendly” and define that those that have that field set to “true” are of type “friendly”. Unlike in RDF/OWL where you can just assert that a subject is “friendly”.

Here is another reason why you can’t have “semantics-only” types: “if you do not specify the type of a field or value, M infers a type for it“. Two things don’t sound quite right to me here. First a detail: the sentence (like others in the doc) talks about “the” type of a field of value, while there can be more than one. More importantly, what’s the point of this feature? How does it help me to have my IRC nickname classified as a post code or as a password just because it happens to be made of a compatible combination of letters and numbers? Maybe it makes sense as a storage optimization, but why does it make sense to expose this to the user?

I also like the way “extents” work. The current description of that feature is pretty limited, but based on how it is used in other parts I think one of its usages is to support a non-OO equivalent to inheritance: create two extents, one for the “superclass” and one for the “subclass” where each only contains the properties/fields defined at that level. You should get both of them in order to have the full picture (all the fields). This is, if I understand it correctly, similar to something I have been (unsuccessfully so far because “XML doesn’t do it this way”) trying to sell to the DMTF CMDBf working group: model inheritance through a set of non-overlapping records rather than dealing with a type hierarchy on record types. It’s not just that it makes relational storage easier (even though it does and that’s probably why “M” does it this way), it also makes your query/select operations a lot easier to specify and implement.

All in all (and without having gone through the exercise of defining actual models in “M”), it seems like a fine schema language (except that its dependency on the CLR base types is unpractical for users outside of the Microsoft universe) but I am not sure if it is beefy enough to be a good IT management metamodel. When the document says that “the Oslo repository provides open and flexible access to the data it contains, which enables direct access to SQL Server views of the underlying data. There are no complex data access layers or APIs” it sounds better than saying “it’s just SQL, so map your model to it and if you want relationships or type inheritance just build it on top of it and quit whining”. But it is an admission of limitation at the same time as a claim of simplicity. I also smell an assumption that LINQ will provide enough hand-holding that non-SQL-savvy developers will be ok. We’ll see.

And then there is MGrammar. Things get a little confusing at that point if you try to relate MGrammar to “M”. Actually, the FAQ states that “the M language consists of three parts: MGraph, MSchema and MGrammar“. This came a bit as a surprise to me since at that point I had finished reading (not in details but not too quickly either) the “M” documentation and I hadn’t seen these names mentioned once. Looks like there is some documentation consistency issues here, but that’s hardly surprising considering this is a “hyper-early (pre-alpha)” release as Doug Purdy puts it.

I think that everything that I have referred to as “M” above is MSchema.

MGrammar is something different altogether: it’s the source of the Domain Specific Language (DSL) references we’ve been hearing in relationship with Oslo. Technically, MGrammar is a BNF on steroids plus an automatically generated parser for your syntax. Cute. I assume that “M” (i.e. MSchema) is built as MGrammar-defined DSL but I am not sure why I would care. I am all for reuse and if someone at Microsoft thought that there was something reusable in the way they defined MSchema then it’s a good thing to expose this tool. But where does it come into play in application modeling? The last thing I want is people inventing completely independent languages to describe different domains. I am all for specialization, but a common underlying metamodel is pretty nice when you have to make sense of a whole system. I don’t see any such commonality in MGrammar: as far as I can tell it can be used to define anything from PostScript to sonnets.

From the FAQ, the connection point between MGrammar and MSchema is MGraph (MGrammar languages are parsed into an MGraph, MSchema “builds on MGraph”). That’s nice, but since neither the MSchema nor the MGrammar documentation mention MGraph I don’t really know what to make of this. David Chappell’s white paper also mentions MSchema and MGrammar but not MGraph. The introduction to the MGrammar Language Specification states that “the data that results from Mg [a.k.a. MGrammar] processing is compatible with Mg’s sister language, The Oslo Modeling Language, M, which provides a SQL-compatible schema and query language that can be used to further process the underlying information“. Compatible? I need more information here. In any case, MGrammar sounds like a fun project for a techie. Who am I to deny Microsoft engineers their fun. Jokes aside, I am probably missing something here seeing how prevalent the DSL message is in all discussions of Oslo. Look at the “highlights of this book” section for the upcoming Oslo/M book from the creators of the “M” language: half of it is about the DSL support and there must be a reason beyond pure geekery. As a side note, if you buy this book you need to understand what little shelf life it will have (I can give you a good price on a lightly-used Hailstorm/”.Net my services” specification book).

Aside from the “M” language itself, there are a few models described in the documentation. One corresponds to BPMN (actually, it says that it “closely aligns with” BPPMN 1.1, does this imply that they are not quite the same?). The fact that this model supports imports from Visio is a nice feature.

The Application model (one of the places where you can see “extents” in action) scares me a little bit because I doubt that two different people would use the same “extents” to describe the same software elements. Unless of course that’s being done for them by a pre-defined mapping to their development framework (.NET) enacted by their common development tool (Visual Studio). Which may be the assumption. Yet, the Application model is defined in generic terms, not Microsoft-specific (with a couple of slip-ups, like a WebApplicationModule being defined as a “Web application (module) implemented by IIS or WAS“. Maybe I’ll feel better about the generic applicability of this Application model when I see a full-fledged description (e.g. including relationship semantics as captured in foreign key field names) and an example.

At the bottom of that Application model, there is a lonely “Manageable” type to use if you have a LifecycleState field. This reinforces my impression that despite the claims to link development time with operational time, a lot of the focus to date has been on the former rather than the latter.

The ServiceModel model will look familiar to people familiar with SCA and is presumably complementary to the WorkflowModel and WorkflowServiceModel models, both of which are directly mapped to Windows Workflow Foundation. I guess that’s where Oslo and Dublin touch one another. I am still glad they are now clearly separated.

There is also a “Quadrant” model which concerns me a bit (it seems to be used to store customization of the Quadrant UI which, while convenient to store straight in the repository, doesn’t strike me as necessarily belonging there).

At this point, the question is not whether Microsoft can build Oslo as it is currently defined. SQL Server 2008 already exists, the usage guidelines aren’t unrealistic and even the “M-to-T-SQL” translation doesn’t seem too hard for Microsoft to implement (the SDK presumable already contains an implementation). I have no doubt they can deliver the system they describe. What I don’t know is whether and how it will be actually useful.

Describing “M” in details is good. Describing how the repository is implemented on top of SQL Server 2008 is interesting but not so relevant. What I’d like to see is a description of how all this gets used. How does it change the Visual Studio experience? How does it change the installation process/format? How does it support round-tripping between lifecycle stages (e.g. if the developer changes the workflow model, does that original BPMN model get consequently updated)? How does it relate to SLAs and policies? How does it apply to application monitoring? How does it apply to configuration management, to the change process? Etc. In short, what’s the Oslo ecosystem going to be.

These questions aren’t completely ignored in the MSDN documentation, but they are dispensed with in a couple of pages: “Application Development and Lifecycle Improvements” and “IT Operations Benefits“. The former states, for example, that “having the Oslo repository act as a central location for these models also enables a connection between the design and implementation models. This connection helps prevent these models from becoming disconnected during the development process“. Which all sounds good but is just a set of assertions that we have heard many times before (not just from Microsoft). How do “M” and the Oslo repository really make this true?

On the “IT Operations Benefits” side, things are equally blurry: “the Oslo repository can store all types of machine and application configuration data. When consistently updated, this configuration data is a catalog of the current state of all monitored machines and applications in the environment“. Notice the “when consistently updated” hand wave. That’s kind of the crux if you really want to manage across the lifecycle. How will they achieve this consistency? By centralizing all changes through a model-driven controller a la SDM/SML? Through ongoing discovery and/or change notifications? By relying on good old ITIL/MOF processes?

The FAQ declares that “having a common approach does not necessarily correlate to one physical store, but more of a federated model and we believe that some of the new Repository, along with existing investments in both Configuration Management Database (CMDB) and Team Foundation Server (TFS), will form the foundation for a common Microsoft metadata strategy and should be supported across our set of products“. OK, but who is the source of truth for application configuration data? The Oslo repository or the CMDB? Is one the desired state and the other the observed state? Does the CMDB go back to simply being a Service Desk (and if so, does the Oslo repository take on the responsibility to enforce change processes, something that requires more than the security model in Oslo)? If the CMDB is still going to use SML as its metamodel, how do you efficiently federate across such different metamodels as SML (i.e. XSD + schematron + relationships) and “M”?

Lots of questions remaining. What will Oslo have turned into in a few years? A business process design/implementation/monitoring suite (there is a strong workflow feel to many parts)? A generic drag-and-drop programming environment (”the fact that entire features are already described by models means that for a wide array of application and component categories you can start using visual tools to design and implement your components“)? A control center for end to end application management? All of the above? Nothing?

This was just a quick brain dump after reading the documents. Actually, I just realized it somehow got pretty long (congrats if you’re still reading). I hope this post is not too disorganized. Oslo is an interesting effort, but, as Microsoft is first to admit, it’s at a very early stage. I am just surprised that this first release spends so much time on the “how” rather than the “what”. Maybe it’s just because I only got my information from the MSDN documentation. We’ll see when more content from PDC finds its way online. I just want the slides, watching recorded presentations is rarely time-efficient (and you can expect them to require Silverlight).

Speaking of Silverlight, there is this new site on Oslo if you think watching some videos is worth installing Silverlight. Those screenshots don’t motivate me sufficiently.

[UPDATED 2008/10/30: Rather than going to bed I Googled around a bit and found a  post by Martin Fowler that answers some of my questions about MGrammar, MGraph and MSchema. MGraph is for instances, MSchema is for types. It answers some plumbing question, but I still have questions about expected usage and relevance to applications modeling.]

[UPDATED 2008/10/30: I also found the recordings and slides from past PDC sessions. Nice job Microsoft for this quick turnaround time, even if you require Sliverlight and/or the PPTX viewer. The sessions are:

  • TL23 A Lap around "Oslo" (Doug Purdy, Vijaye Raji)
  • TL27 "Oslo": The Language (Don Box, David Langworthy)
  • TL18 "Oslo": Customizing and Extending the Visual Design Experience (Don Box, Florian Voss)
  • TL28 "Oslo": Repository and Models (Chris Sells)

The first two sessions (deliverd Tuesday) have a replay and slides, the others should, I assume, follow soon.]

[UPDATED 2008/11/3: A nice overview of Oslo by Aaron Skonnard. Unlike most other Oslo articles over the last week, this one tries to paint the (yet-to-be-realized) full picture of the Oslo ecoystem. He mentions that "other Microsoft products and technologies are expected to build on Oslo to provide other runtimes. A few that have already been announced include Microsoft System Center (Operations Manager) and Team Foundation Server (TFS) in Visual Studio Team System". It's interesting that he qualifies System Center to be more specifically "operations manager" rather than "configuration manager" but I wouldn't read too much into it at this point.]

29
Oct
2008

CMDBf work in progress

by William Vambenepe

The DMTF CMDBf working group (of which I am part) has released a work in progress version of the CMDBf specification. The changes from the submitted version are minor. It’s mostly a move to the DMTF template. More important (but not drastic) changes should appear in the next release.

14
Oct
2008

Reviewing DMTF OVF as a “preliminary standard”

by William Vambenepe

OVF 1.0.0d is out as a “preliminary standard” so I gave it a quick read over the weekend. Things have not changed much since the “work in progress” document published this summer, which itself wasn’t a big change from the original specification. As I wrote in the review of the “work in progress”, the DMTF tightened the language of the  specification more than it added features.

Since there aren’t too many technical changes (see the end of this post if you’re interested in a few), the interesting discussion is about the marketing of this specification. And boy does it have wings on that front. The level of visibility the specification has received is pretty amazing, especially considering that it doesn’t really do that much technically. But you wouldn’t know it by reading all the announcements about OVF:

  • VMWare supports OVF packaging (which version?) with its new VMWare Studio.
  • Citrix uses OVF in Kensho to create a platform-agnostic VM management.
  • An Open Source “implementation” of OVF has been created. I put “implementation” between quotes because since OVF per se doesn’t do much its implementation is mostly a specialized command line editor for its XML descriptor. It requires a a vendor-specific runtime for deployment/activation. This is not a criticism of the open source project BTW, just a statement of fact about the spec.
  • Enomaly lists “OVF format support” on its roadmap for Q1 2009.
  • Microsoft support for OVF in products is supposedly “on the board” which doesn’t mean very much but their overall marketing/PR response to OVF has been surprisingly positive for a standard that they don’t control.

I have criticized the DMTF marketing efforts in the past (“give away pens and key chains”) but I must admit that, to the extent that DMTF had a significant role in promoting OVF adoption (in addition to marketing efforts directly from the vendors), it is a very nice marketing success. Well done, and so much for my cynicism. OVF may also have benefited from all the interest in the general topic of virtualization/cloud standards (the “cloud” association is silly, of course, but as we’ve just seen I am not a marketing genius) and the fact that there isn’t much else to talk about on these topics. So by default OVF becomes the name to put on your “standards” banner. Right place at the right time for the vendors behind it.

Speaking of the vendors, I have no insight into the functioning of the OVF working group, but judging by the specification’s foreword VMware is throwing plenty of resources at DMTF: it employs the working group chair and both co-editors, which is pretty atypical in my experience in standards efforts. People are usually sensitive to appearances of one company having disproportionate influence and try to distribute responsibilities around, at least on paper. Add to this VMWare’s recent ramp-up at the DMTF board level. They seem to know what they want. And indeed I can see how the industry leader would want some basic level of standardization, but not too much, which is currently just what OVF offers. We’ll see what’s next in store, if anything.

The specification itself is not marketing-free. According to line 122, “it supports the full range of virtual hard disk formats used for hypervisors today, and it is extensible, which will allow it to accommodate formats that may arise in the future”. Sure, in the same way that my car fully supports passengers of all nationalities (and is extensible enough to transport citizens of yet-to-be created countries - and maybe even other planets, as long as they come with buttocks to sit on). Since OVF doesn’t really do anything with the virtual hard disk formats, it can “support” pretty much any such format.

Speaking of extensibility, OVF clearly tries to have a good story there. Section 7.3 tries to move away from the usual “hey, it’s XML, you can add elements/attributes anywhere” approach towards the definition of new “sections”. This seems a bit drastic. Time will tell if this is visionary or short-sighted. OVF also plans to move towards “an extension model based on the design of the open content model in XML Schema 1.1″. I am not following XSD 1.1 too closely, but it is wise for OVF to not build too much dependency on it at least for now. And it seems to me that an extension model is not something that you plan to “plan [...] to add” but rather something you need to define from the start (sounds like the good old “the next version will add versioning support”, or “no keyboard detected, press F8 to continue”).

But after all this comes what looks to me, from an extensibility perspective, like a big no-no: using (section 8.1) simple strings (e.g. “vmx-4″, “xen-3″) to represent types of virtual systems. You’d think that in 2008 people would have heard about URIs as a way to allow extensibility and prevent name clashes. On further reading, this doesn’t seem to be the fault of OVF as they get this property (vssd:VirtualSystemType) straight out of the politely named DMTF SVP (System Virtualization Profile) specification, itself a preliminary standard. But that’s not much of an excuse because I suspect large overlap of participation between the two groups and in any case you don’t have to take dependencies on something that’s not right (speaking as someone who authored several specs that took a dependency on WS-Addressing, I shouldn’t give lessons). In any case, I am not on top of all virtualization-related work in DMTF but it seems to me that if they are not going to use URIs then someone should step up and maintain a registry of these identifying “virtual system type” strings.

BTW, when left to its own device OVF does a better job. For example, it properly uses URIs to identify the virtual disk format (section 5.2).

One of the few new features is the addition of the ovf:bound attribute on virtual hardware element items (section 8.3) to specify whether the item description represents the normal, minimal or maximal allocation. My heads spins a bit when trying to apply this metadata to the rasd:Limit property (with ovf:bound=”min” the value of the rasd:Limit element would represent the minimal value of the maximum quantity or resources that will be granted, which takes some parsing effort), but I think it more or less squares out.

The final standard should not differ greatly from this version, so at this point we pretty much know what OVF will be technically. The real question is how it will be used and what, if anything, is going to come to complement it.

[UPDATED 2008/10/14: Good timing. OVF-loving Kensho just launched.]

23
Sep
2008

State modeling: party over, go home now.

by William Vambenepe

Is the Northwest weather softening Savas? Is it the food? I just read the “how do I model state? let me count the ways” article that he, Ian Foster, Paul Watson and Mark McKeown published in the September 2008 Communications of the ACM. In the article, the authors attempt to recap (and advance?) the 5 years-old debate between the WSRF, HTTP-only and “no convention” (e.g. Zen-SOAP as used in CMIS) approaches to interacting with stateful resources over the Web. If you were anywhere near OGF (then called GGF) around 2003, you know what I am talking about. And you remember how heated the arguments were. There was something about this subject (or maybe it was the people involved) that consistently generated great showmanship (and some bruised egos) in the debates.

With that in mind, reading this article felt like watching a Chinese opera adaptation of Apocalypse Now. Or listening to Heavy Metal with the base dialed down to zero.

This would have been a very useful article to have in 2003. At the time, it would have clearly framed the question, shown the overwhelming similarities and small differences between the approaches and allowed people to see that there wasn’t actually that much to debate at a fundamental level, but mainly practical considerations to juggle. It may have prevented the quasi-religious war that erupted.

It took a while, but that period of religious war is well over now and we are firmly in the “I’ve heard you, you’ve heard me, do what you want I’ll do what I want” stage. WSRF people are still doing WSRF (or equivalent like WSRT). REST people are HTTPing right and left. They don’t meet much but when they do they don’t bump shoulders anymore. And in a way this article is a good illustration of this much more dispassionate environment.

So why am I complaining? Because these fights were fun! At least from a spectator’s point of view, but I suspect that Savas and the gang had plenty of fun too (not sure about the other side who, at least at first, expected “why are you throwing away OGSI” kind of pushback rather than this more radical-sounding response).

I printed this ACM article a little bit on the off chance that it would provide some new way to look at the problem, one that hadn’t emerged in the past five years. But in retrospect I think my true motivation was that I expected it to capture, like in the days, some of the entertainment value of a radio talk show. Instead, the excitement level in this article is in the league of NPR’s StarDate astronomy report.

I feel cheated. I haven’t learned anything new and I haven’t been entertained either. This article feels like the end of the party, when the bottles are being put away, the lights are flickering and bad music is playing to nudge the last guests out of the house.

Now that I am grumpy, I guess I have to point out a few highly questionable statements in the article in retribution:

“Fortunately, there seems to be industry support for an integration of the WS-Transfer and WS-RF approaches, based on a WS-Transfer substrate - the WS-ResourceTransfer specification.” See the last two paragraphs of this entry.

“Support for WS-Addressing has since become quasi-universal, and now few find its use objectionable.” Time to pull out the Victor Hugo quote I have been saving for a special occasion: “Et s’il n’en reste qu’un, je serai celui-là“. But frankly I very much doubt that I am the only one still shaking his head sadly in contemplation of WS-Addressing.

In fact, Stu agrees with me on this (see item #6a in his list of disagreements with the article). Looks like he too was made a bit grumpy by the article, for different reasons.

There is one more debatable choice in this article, and it’s more serious than the two above. It introduces an arbitrary difference between the WS-Transfer and HTTP approaches. Compare the third lines of tables 4 and 5 (retrieving the status of a specific job). According to the article, WS-Transfer gives you the choice between two options:

  • retrieve the entire state of the job and fish for the status field inside of it (the approach in table 4), or
  • “a new operation (for example GetEPRtoPart) is defined that requests that a new state representation be exposed, through a different EPR, representing parts of the original state representation”

The way it works for HTTP, on the other hand is through an “application-specific convention” (in this example, appending “/status” at the end of the URL).

Except there is no reason why this third approach cannot be used in the WS-Transfer scenario. The article says that  “in WS-Transfer, the same effect [accessing a subset of the resource state] can be achieved, but only by defining an auxiliary operation that returns an EPR to a desired subset”. What, pray tell, prevents a WS-Transfer implementation from having an “application-specific convention” just like the HTTP kids next door? It can be at the URL level (e.g. adding “/status”). Or at the EPR reference parameter level. The latter is actually exactly what WS-Management does, using the wsman:SelectorSet header. It does not, as the article claims, define a special operation to get these fine-grained EPR. It uses an application convention to do so (which, in the case of WS-Management, happens to be “whatever Windows implements”, but that’s a different debate).

By the way, this question of “convention over specification” is where I don’t quite follow Stu (see his point #4 in his aforementioned list of disagreements) and his invocation of the “hypermedia constraint”. I don’t see how any of the four specifications he calls to the rescue (HTML form submission, XForms submission options, Atompub service documents and URI templates) would prevent me from having to have an application-specific agreement about how to retrieve the state (as opposed to another subset of the representation, like the creation date). URI templates, for example, might support how this agreement is expressed but it doesn’t replace it.

The article does a pretty good job at showing how close the alternatives are (even though, as illustrated above, it still portrays them as more different than they need to be). I am not saying it’s a bad article for the Communications of the ACM. I am saying that the Communications of the ACM is a bad medium for one of the few nerdy debates that have genuine entertainment value.

[UPDATED 2008/10/2: Jim Webber, Savas Parastatidis and Ian Robinson provide a full REST example for InfoQ: how to GET a cup of coffee. Includes state considerations discussed in the ACM article.]

18
Sep
2008

Last call for SML and SML-IF

by William Vambenepe

The SML working group at W3C has published the “last call” working draft of version 1.1 of the SML and SML-IF (”IF” stands for “interchange format”) specifications. You have until October 3rd to tell them what you think.

With all the Oslo fun, the OMG embrace and the silence from System Center there are more questions than answers about the use of SML at Microsoft. But the Eclipse COSMOS project (IBM and friends) is, as far as I know, valiantly going forward with the store/validator implementation. Which may or may not be the same codebase as what was used for the recent CMDBf interop demo (I am not sure how the SML and CDMBf implementations in COSMOS are articulated).

The COSMOS group also recently published an overview of SML. It doesn’t try to tell you why you’d want to use SML but it’s a good and succint description of what SML is technically (from an XML developer’s perspective).

17
Sep
2008

Here be (XML) dragons

by William Vambenepe

Spoiler alert: if you like to learn things the hard way, don’t follow this link. It points to a clear description of all the problems, frustrations, disillusions and “ah ah!” moments that are ahead of you as you start to use XML and grow into an expert.

If, on the other hand, you like to be fully prepared and informed when you choose a technology and if you don’t mind sacrificing some adventure and excitement in the process, then you owe it to yourself to read Erik Wilde and Robert Glushko’s XML Fever article. Even if you already consider yourself an XML expert. Especially if you do.

I knew I would like it when I read this in the introduction:

Advanced strains of XML fever often take hold after exposure to the proliferation of more complex and esoteric XML-based technologies layered on top of it. These advanced diseases are harder to catch, but they are also harder to remedy because people who have caught these advanced strains tend to congregate with others with the same diseases and they are continually reinfecting each other.

Oh yes they do. And they speak with such authority that they infect others around them. People who don’t even understand these “more complex and esoteric XML-based technologies” end up being convinced of their magical properties and the need to use them.

I am not going to attempt to summarize the article because it is too tightly packed with great content to be summarized without being butchered. The “tree trauma” section alone could probably save the world billions of dollars in lost productivity if it was widely read.  I’ll just quote a few sections to motivate you to go read the whole thing.

Tree tremors. Whereas tree trauma (discussed earlier) is a basic strain of XML fever caused by the various flavors of trees in XML technologies, tree tremors are a more serious condition afflicting victims trying to manage data in XML that is not inherently tree-structured. The most common causes are data models requiring nontree graph structures and document models needing overlapping structures. In both cases, mapping these models to XML’s tree model results in XML structures that cannot conveniently represent the application-level model.

(…)

The choice of schema languages, however, is more often determined by available tool support and acquired habits than by a thorough analysis of what would be the most appropriate language.

(…)

Triple shock. While RDF itself is simple, large datasets easily contain millions of triples (for truly large datasets this can go up to billions), and managing and querying such a big dataset can become a considerable challenge. If the schema of these large datasets is simple, but ontology overkill has set in and it has been reformulated as an ontology, handling this dataset may become considerably harder, without any immediate benefit.

This is true not just for RDF (a graph model that can be serialized in XML) but for any non-tree model that can be serialized in XML (which is to say any model one can think of). Including every graph model.

Maybe it would help if the article stated more clearly that it’s ok to serialize such a model as XML (e.g. for transmission) as long as you don’t process it (at the application level) as XML. As long as it gets accessed using an API and concepts that are aligned with the semantics of the model.

Imagine that you are receiving an RDF dataset over the wire. You could (if your app runs on the network card rather than in CPU) process it as a bunch of electrical impulses, but that wouldn’t be very convenient. You could process it as a bunch of bits, but that’s still hard. You could process it as a character stream but that’s not that much better. You could process it as XML but that’s still no great. Or you could process it as RDF triplets and be home on time to have dinner with your family. It’s not the fact that it is represented as XML at some point that’s the problem, it’s the fact that your application processes it as XML. Said in another way, just because it makes sense to store it or to send it over the network in XML doesn’t mean that you have to process it as XML in your application.

There is at least one more problem (not covered by the article) that people will eventually run into. You’d think that XML technologies are a consistent and complementary set. Not true. The lack of consistency is illustrated by the “tree trauma” section of the article. But there is also a complementarity problem, in the sense that there are large gaps between the specifications, as anyone who has tried to serialize an XPath nodeset has found out.

As the article points out, all this doesn’t mean that XML is bad or useless. XML technologies can be very useful, but for not for all tasks.

11
Sep
2008

CMIS, APP, Zen-SOAP and WS-KitchenSink: some data points

by William Vambenepe

The recent release of an early draft of a content management specification (CMIS, for Content Management Interoperability Services) provides an interesting perspective on not just SOAP-versus-REST but also Zen-SOAP versus WS-KitchenSink.

I know little about content management and I have no comment about the specification from that respect. Others have better informed opinions on that aspect.

What is of interest to me, and where I have some experience, is the way the spec-defined operations are bound to underlying protocols. Here is the way the specification is structured: Part I describes the data model and the operations exposed by all the services. Part II comes in two flavors: a REST binding (based on APP, the Atom Publishing Protocol) and a Web services binding (based on SOAP).

This is the first time, to my knowledge, that someone (who presumably isn’t a participant in the SOAP/REST religious war but simply wants to get something done) describes two ways to achieve a real-life task, using either APP or SOAP. I expect that this will attract a lot of attention and provide data in the SOAP versus REST debate.

But this is not what I want to write about. I’ll just point out that the REST binding specification somehow is twice as long as the SOAP binding specification, which I find intriguing but not necessarily meaningful (things are looking good for your bet Sanjiva).

What really caught my attention is how SOAP is used in CMIS. You can hardly tell it’s SOAP. CMIS just defines XML messages to be used as payload for requests and responses. You would be excused for forgetting halfway through your implementation that you’re supposed to wrap those in a SOAP envelope. Headers are a no-show. The specification says it uses SOAP faults but it actually goes out of its way to avoid the existing elements for fault code and fault message and instead invent its own. The only SOAP feature it really uses is MTOM.

Except for the MTOM part, this reminds me of what SOAP was at the beginning of the decade, before any header had been defined (other than those used as illustration in the SOAP specification itself). I want to call it Zen-SOAP, by opposition to the WS-KitchenSink approach in which even simple, synchronous, clear-text, request-response SOAP exchanges somehow get saddled with a half dozen WS-Addressing headers before they’ve even left the gate (did I mention that I don’t like WS-Addressing?).

Another comedian in the WS-KitchenSink theater troupe is the WS-Transfer stack and especially WS-ResourceTransfer (WS-RT). Unless I read too much into this draft of CMIS, its content is devastating in two ways for WS-ResourceTransfer: in one fell swoop it shows that the specification is mostly useless and it destroys the argument that WS-ResourceTransfer needs to be stand-alone as opposed to just a part of WS-Management.

In “who needs XPath fragment-level PUT?”, I tried to make the case that the use of XPath in WS-RT to do fine-grained updates is a case of over-engineering. That there is no real need for it. Still, in that article I try to think of cases where the feature might be justified. I came up with two and I wrote that “one is if the resource actually is a document (as opposed to having its state represented by a document). For example, a wiki page”. But I dismissed it because wiki-land is REST country. I didn’t think of it at the time, but there is an “enterprise” version of wiki, a world in which, presumably, SOAP is well-regarded: Content Management Systems. Surely, if there is a domain that needs a fine-grained SOAP-based document editing protocol it’s the CMS world.

Today’s release of CMIS demolishes this use case with two punches to the guts:

  • They do have a query language, but it is SQL-based, not XPath-based.
  • The query is only used for reads, not for updates. Updates are done through specialized operations (addObjectToFolder, moveObject, updateProperties, createRelationship…).

This goes beyond not using a generic fine-grained update mechanism. It also goes against using any generic GET/SET operation. The blow reaches all the way to WS-Transfer. For all this, CMIS comes out a much simpler specification and it also frees itself from the web of dependencies (on specifications at different stages of standardization) that has plagued specifications that use WS-Transfer and will plague WS-Federation for using WS-RT.

It will be interesting to see what happens when the WS-* architects and Microsoft and IBM get hold of the CMIS specification and of its authors in their companies. I am especially worried about the fate of the IBM CMIS authors. The recent news about Oslo show that the XML people at Microsoft are a lot more willing to put the XML tools back in the box when needed.

In truth, the CMIS authors do appear to need some help from the SOAP experts in their companies, if only to fix the way they use SOAP faults and to help the poor soul who put this comment in the WSDL:

<!– had to use include - .net wsdl.exe code generator doesn’t seem to like imports on the schema –>

But they might be getting more “suggestions” than they bargained for. In the same way that the WS-Federation folks were going on their own merry way until it was “suggested” to them by someone (who probably had an agenda) to use WS-RT. I’ll try to keep an eye on how CMIS evolves.

In the meantime, I find in CMIS data points that reinforce my opinion that WS-Transfer should be absorbed by WS-Management, WS-MeX and WS-Federation should return to defining their own operations and WS-RT should be left to die (or, for a more positive spin, be used as inspiration in the next version of WS-Management).

[UPDATED 2008/10/02: Roy Fielding doesn't like the so-called-RESTful binding. Sam Ruby cautiously defends it. Links via Billy Cripe.]

10
Sep
2008

Oslo, blog posts and my crystal ball

by William Vambenepe

There is more and more information coming out about Oslo in anticipation of the Microsoft PDC in October.

David Chappell recorded a video about it last month. More recently Doug Purdy and Don Box each posted a short description of Oslo. Don describes the goal of Oslo as “simplify the process of developing, deploying, and managing software”. But when he lists ancestor technologies to illustrate that “Microsoft has been moving in this direction for over a decade now”, they are all about development, not management: COM type libraries, .NET metadata attributes, XAML. Interesting that neither SDM nor SML gets a mention. Neither did SCA by the way, but I wasn’t really expecting that one… :-)

Maybe the I am the only one looking for a SDM/SML echo here, just because I came to hear of Oslo through the DSI angle. Am I wrong to see Oslo as an enabler for DSI? This eWeek article doesn’t have anything to do with IT management. Reading it, Oslo is all about allowing people to write code through drag and drop. Yawn. And Don Box endorses the article.

Maybe it’s just me (an IT management guy more than a software development guy) but I don’t care so much about how the application model is created. I care a lot more about what it allows you to do in terms of IT management. Please don’t make me pull out the often-quoted figure about the percentage of IT budget spent on operations versus development/licensing. The eWeek piece fails to excite me, but fortunately David Chappell’s video interview is a lot more aligned with my thinking, so I still hold hopes for Oslo as an IT management enabler. Here is my approximate transcript of an example that David provides (at around 4:20) in the video:

“If someone comes to you and says i’ve got this business process and the SLA is not being met, what do you do? You’ve got to trace this through the right business process and the right application that supports that part of the process and find the machine it runs on and maybe look at the workflow that implements it and maybe look at the services that it provides. This involves talking to business analysts, or the IT pros or the architect or the developer, all of whom have their own view of the world, their own tools, their own prospective. The repository provides a common place to store all this stuff, to link it all together, and with a visual editor to have a common tool that lets you actually go through and answer this kind of questions.”

Now you’re talking.

And if Oslo is not the new blood of DSI, then what is? The DSI story is getting dated, SML is fading in our memories and of the three parts that supposedly compose DSI (”virtualized infrastructure, design for operations, and knowledge-driven management”), only virtualization is actually represented on the list of technologies on the DSI home page. Has DSI turned into just allowing System Center to manage a hypervisor? I still hold hopes that the Oslo data is going to spice things up there. It would be good for the industry at large, not just Microsoft.

I won’t be at the PDC but it will be interesting to see what filters out of these sessions. The first session in the list adds management of hybrid application systems (hybrid as in “cloud/on-premise combination” or “software+services” as Microsoft calls it), to the long “can do” list for Oslo. Impressive, if there is some meat behind the abstract. I think this task is often overlooked in discussions around management aspects of Cloud computing (see “the new, interesting thing is going to be the IT infrastructure to manage your usage of utility computing services as well as their interactions with your in-house software” in this previous entry).

Yes, I am reading way too much into session abstracts, but while I am at it I can’t help noticing that there is a lot of SQL and very little XML/XSD/XPath mentioned there. Even though one of the presenters is Gudge, the only person I have ever met who fully understands XSD (actually even he doesn’t, I’ve seen him in the WS-I days have to refer to… his book).

Even though I am sure we’ll be told that SML can be built on top of Oslo, the SQL orientation won’t make that so easy (I want to see how to build XSD+Schematron validation on top of a relational store using Oslo’s drag and drop development tool). And it puts Microsoft on a different architectural direction from IBM, who, as far as I can tell, thinks that the world is a big XML document. Neither is the most appropriate for IT management models. I prefer a graph model and associated graph queries along the lines of SPARQL or CMDBf.

But that’s just late-night idle speculations on my part (aka “blogging”). Let’s see what comes out in October.

[UPDATED 2008/9/10: Interesting timing. Microsoft is joining OMG, home of UML and BPMN. Coming next: a submission of a "new version" of UML and BPMN that happens to contain the extensions and tweaks that Microsoft made to them in the process of implementing Oslo. This, BTW, is the final nail in the SML coffin (SML isn't even mentioned in the press release).]

Categories