William Vambenepe's blog

IT management in a changing IT world

Archive for the 'SML' Category

30
Oct
2008

First in-depth look at Microsoft’s Oslo and the “M” modeling language

by William Vambenepe

Microsoft’s PDC is taking place this week and more details were shared with the attendees about project Oslo, an effort announced last year to drastically improve the use of models across the application lifecycle. Some code is available (I think the Quadrant code is only for PDC attendees but the Oslo SDK is available to everyone). I am not at PDC, I didn’t see any presentation and I didn’t download any code. But Microsoft has also posted technical details on MSDN and, as far as I am concerned, that’s the most time-effective way to spend a couple of hours learning about Oslo. BTW, the way they share these early design descriptions and accept to make their evolution public is admirable.

For those who only want to spend 10 minutes rather than 2 hours, here are the thoughts that came to my mind as I was reading.

Overall I am somewhat underwhelmed, but not necessarily in a bad way. I know that’s a little schizophrenic so let me explain. After hearing a lot about how Oslo was the next big thing in modeling, it is a little surprising to read a document that can be summarized as “modeling is good, so go create some SQL tables and store them in a RDBMS”. That’s the underwhelming part. But on the other hand, it is more down to earth and practically-minded than I feared. And this is just a summary, in truth there is more than just “use SQL”.

Half of the MSDN documentation basically explains how to use SQL Server to store application models (as of today, the “Developing Models for the Metadata Store” section has only one sub-section, “SQL Server Guidelines for Modeling in the Oslo Repository“). Does this mean that all .NET applications will eventually have to carry with them a deployment of SQL Server 2008 even if they don’t use it to store the their operational data? Sure there are a few extra repository services (e.g. finer-grained change auditing) but most Oslo repository services are generic SQL Server features. That section has quite a lot of T-SQL, but it’s pretty readable. It also has a lot of dependencies on following naming conventions which makes me think that directly creating T-SQL code is not the best approach.

Fortunately there is an alternative, the “M” language. It’s a schema language with a built-in constraint mechanism. I found it more data-oriented (as opposed to resource-oriented) than I expected. Even though “each model is really a set of data structures, relationships, and constraints in serialized form“, there is a lot more support for data structures and constraints than for relationships. It’s just a foreign key. Relationships aren’t items and don’t have any property (or “field” as they’re called in “M”). For example, the relationship between a student’s enrollment record and a given class can’t have, as property, the grade that the student got for that class (as in the example in section 4.1.4 of the second LC of SML). To model this in “M” you need to create another item (e.g. “courseEnrollment”) and have a relationship from the student to that item and another one from that item to the “course” item itself. Or to replace the foreign key in the student table with a complex structure that contains both the foreign key and the properties of the relationship. At the end it has the same expressiveness potential, but in a less streamlined form. I assume Microsoft took this approach for performance reasons.

I am going on a limb here, but it may also be a difference between development-time concerns and operation-time concerns. During development (all the way to testing and packaging), you can still mostly get away with a relatively simple containment structure. You care about the components of your application and how they are packaged inside or next to one another. Sure you care about who calls who outside of the deployment unit but that’s not as core a concern as getting your class dependencies right, your tests in order and your installer configured. In fact, some of the “who calls who” bindings will be only be realized at runtime. Oslo, at least so far, clearly seems more focused on development time than operations so support for a relationship-rich model may not seem critical. At operations time, on the other hand, you don’t really care so much about how things were packaged before installation. You care a lot more about who invokes who (especially for modern distributed applications), what the network layout is, what resources a ticket is attached to, etc. The model looks a lot more like a graph with complex relationships. Something that “M” doesn’t seem ideally suited for.

Except for this caveat, I like “M”. It’s not anti-XML (you can represent values as XML if you’d like) but it avoids the “the answer is XML/XSD what is the question” approach to modeling that is sometimes a little too prevalent. “M” is a much better schema language for IT systems than XSD. I especially like its approach to types. A value is not intrinsically of a given type. A type is a condition that you happen to meet or not at the current time (”take heart little field, you can be anything you want when you grow up”). As such, you can be of several types at the same time. Refined types are potatoes inside potatoes (not sure if “M” supports definition of types as unions and/or intersection of existing types, for intersection I want to write something like”type NewType : OldType1 where this in OldType2” but there is no “this” in “M”). That approach to types (and the way constraints leverage types) is reminiscent of RDF/OWL. It’s a classification more than a typification, but I understand why they didn’t want to call it “class”. The similarities with RDF/OWL don’t go any further. As I wrote earlier, “M”is very data-focused and not resource-focused: as far as I can tell “M” types are defined syntactically, not semantically (the semantics come as a consequence). For example, I don’t think that you can assert that a given item representing a person is of type “friendly” if there is no corresponding data in the item. You’d have to first create a boolean field called “friendly” and define that those that have that field set to “true” are of type “friendly”. Unlike in RDF/OWL where you can just assert that a subject is “friendly”.

Here is another reason why you can’t have “semantics-only” types: “if you do not specify the type of a field or value, M infers a type for it“. Two things don’t sound quite right to me here. First a detail: the sentence (like others in the doc) talks about “the” type of a field of value, while there can be more than one. More importantly, what’s the point of this feature? How does it help me to have my IRC nickname classified as a post code or as a password just because it happens to be made of a compatible combination of letters and numbers? Maybe it makes sense as a storage optimization, but why does it make sense to expose this to the user?

I also like the way “extents” work. The current description of that feature is pretty limited, but based on how it is used in other parts I think one of its usages is to support a non-OO equivalent to inheritance: create two extents, one for the “superclass” and one for the “subclass” where each only contains the properties/fields defined at that level. You should get both of them in order to have the full picture (all the fields). This is, if I understand it correctly, similar to something I have been (unsuccessfully so far because “XML doesn’t do it this way”) trying to sell to the DMTF CMDBf working group: model inheritance through a set of non-overlapping records rather than dealing with a type hierarchy on record types. It’s not just that it makes relational storage easier (even though it does and that’s probably why “M” does it this way), it also makes your query/select operations a lot easier to specify and implement.

All in all (and without having gone through the exercise of defining actual models in “M”), it seems like a fine schema language (except that its dependency on the CLR base types is unpractical for users outside of the Microsoft universe) but I am not sure if it is beefy enough to be a good IT management metamodel. When the document says that “the Oslo repository provides open and flexible access to the data it contains, which enables direct access to SQL Server views of the underlying data. There are no complex data access layers or APIs” it sounds better than saying “it’s just SQL, so map your model to it and if you want relationships or type inheritance just build it on top of it and quit whining”. But it is an admission of limitation at the same time as a claim of simplicity. I also smell an assumption that LINQ will provide enough hand-holding that non-SQL-savvy developers will be ok. We’ll see.

And then there is MGrammar. Things get a little confusing at that point if you try to relate MGrammar to “M”. Actually, the FAQ states that “the M language consists of three parts: MGraph, MSchema and MGrammar“. This came a bit as a surprise to me since at that point I had finished reading (not in details but not too quickly either) the “M” documentation and I hadn’t seen these names mentioned once. Looks like there is some documentation consistency issues here, but that’s hardly surprising considering this is a “hyper-early (pre-alpha)” release as Doug Purdy puts it.

I think that everything that I have referred to as “M” above is MSchema.

MGrammar is something different altogether: it’s the source of the Domain Specific Language (DSL) references we’ve been hearing in relationship with Oslo. Technically, MGrammar is a BNF on steroids plus an automatically generated parser for your syntax. Cute. I assume that “M” (i.e. MSchema) is built as MGrammar-defined DSL but I am not sure why I would care. I am all for reuse and if someone at Microsoft thought that there was something reusable in the way they defined MSchema then it’s a good thing to expose this tool. But where does it come into play in application modeling? The last thing I want is people inventing completely independent languages to describe different domains. I am all for specialization, but a common underlying metamodel is pretty nice when you have to make sense of a whole system. I don’t see any such commonality in MGrammar: as far as I can tell it can be used to define anything from PostScript to sonnets.

From the FAQ, the connection point between MGrammar and MSchema is MGraph (MGrammar languages are parsed into an MGraph, MSchema “builds on MGraph”). That’s nice, but since neither the MSchema nor the MGrammar documentation mention MGraph I don’t really know what to make of this. David Chappell’s white paper also mentions MSchema and MGrammar but not MGraph. The introduction to the MGrammar Language Specification states that “the data that results from Mg [a.k.a. MGrammar] processing is compatible with Mg’s sister language, The Oslo Modeling Language, M, which provides a SQL-compatible schema and query language that can be used to further process the underlying information“. Compatible? I need more information here. In any case, MGrammar sounds like a fun project for a techie. Who am I to deny Microsoft engineers their fun. Jokes aside, I am probably missing something here seeing how prevalent the DSL message is in all discussions of Oslo. Look at the “highlights of this book” section for the upcoming Oslo/M book from the creators of the “M” language: half of it is about the DSL support and there must be a reason beyond pure geekery. As a side note, if you buy this book you need to understand what little shelf life it will have (I can give you a good price on a lightly-used Hailstorm/”.Net my services” specification book).

Aside from the “M” language itself, there are a few models described in the documentation. One corresponds to BPMN (actually, it says that it “closely aligns with” BPPMN 1.1, does this imply that they are not quite the same?). The fact that this model supports imports from Visio is a nice feature.

The Application model (one of the places where you can see “extents” in action) scares me a little bit because I doubt that two different people would use the same “extents” to describe the same software elements. Unless of course that’s being done for them by a pre-defined mapping to their development framework (.NET) enacted by their common development tool (Visual Studio). Which may be the assumption. Yet, the Application model is defined in generic terms, not Microsoft-specific (with a couple of slip-ups, like a WebApplicationModule being defined as a “Web application (module) implemented by IIS or WAS“. Maybe I’ll feel better about the generic applicability of this Application model when I see a full-fledged description (e.g. including relationship semantics as captured in foreign key field names) and an example.

At the bottom of that Application model, there is a lonely “Manageable” type to use if you have a LifecycleState field. This reinforces my impression that despite the claims to link development time with operational time, a lot of the focus to date has been on the former rather than the latter.

The ServiceModel model will look familiar to people familiar with SCA and is presumably complementary to the WorkflowModel and WorkflowServiceModel models, both of which are directly mapped to Windows Workflow Foundation. I guess that’s where Oslo and Dublin touch one another. I am still glad they are now clearly separated.

There is also a “Quadrant” model which concerns me a bit (it seems to be used to store customization of the Quadrant UI which, while convenient to store straight in the repository, doesn’t strike me as necessarily belonging there).

At this point, the question is not whether Microsoft can build Oslo as it is currently defined. SQL Server 2008 already exists, the usage guidelines aren’t unrealistic and even the “M-to-T-SQL” translation doesn’t seem too hard for Microsoft to implement (the SDK presumable already contains an implementation). I have no doubt they can deliver the system they describe. What I don’t know is whether and how it will be actually useful.

Describing “M” in details is good. Describing how the repository is implemented on top of SQL Server 2008 is interesting but not so relevant. What I’d like to see is a description of how all this gets used. How does it change the Visual Studio experience? How does it change the installation process/format? How does it support round-tripping between lifecycle stages (e.g. if the developer changes the workflow model, does that original BPMN model get consequently updated)? How does it relate to SLAs and policies? How does it apply to application monitoring? How does it apply to configuration management, to the change process? Etc. In short, what’s the Oslo ecosystem going to be.

These questions aren’t completely ignored in the MSDN documentation, but they are dispensed with in a couple of pages: “Application Development and Lifecycle Improvements” and “IT Operations Benefits“. The former states, for example, that “having the Oslo repository act as a central location for these models also enables a connection between the design and implementation models. This connection helps prevent these models from becoming disconnected during the development process“. Which all sounds good but is just a set of assertions that we have heard many times before (not just from Microsoft). How do “M” and the Oslo repository really make this true?

On the “IT Operations Benefits” side, things are equally blurry: “the Oslo repository can store all types of machine and application configuration data. When consistently updated, this configuration data is a catalog of the current state of all monitored machines and applications in the environment“. Notice the “when consistently updated” hand wave. That’s kind of the crux if you really want to manage across the lifecycle. How will they achieve this consistency? By centralizing all changes through a model-driven controller a la SDM/SML? Through ongoing discovery and/or change notifications? By relying on good old ITIL/MOF processes?

The FAQ declares that “having a common approach does not necessarily correlate to one physical store, but more of a federated model and we believe that some of the new Repository, along with existing investments in both Configuration Management Database (CMDB) and Team Foundation Server (TFS), will form the foundation for a common Microsoft metadata strategy and should be supported across our set of products“. OK, but who is the source of truth for application configuration data? The Oslo repository or the CMDB? Is one the desired state and the other the observed state? Does the CMDB go back to simply being a Service Desk (and if so, does the Oslo repository take on the responsibility to enforce change processes, something that requires more than the security model in Oslo)? If the CMDB is still going to use SML as its metamodel, how do you efficiently federate across such different metamodels as SML (i.e. XSD + schematron + relationships) and “M”?

Lots of questions remaining. What will Oslo have turned into in a few years? A business process design/implementation/monitoring suite (there is a strong workflow feel to many parts)? A generic drag-and-drop programming environment (”the fact that entire features are already described by models means that for a wide array of application and component categories you can start using visual tools to design and implement your components“)? A control center for end to end application management? All of the above? Nothing?

This was just a quick brain dump after reading the documents. Actually, I just realized it somehow got pretty long (congrats if you’re still reading). I hope this post is not too disorganized. Oslo is an interesting effort, but, as Microsoft is first to admit, it’s at a very early stage. I am just surprised that this first release spends so much time on the “how” rather than the “what”. Maybe it’s just because I only got my information from the MSDN documentation. We’ll see when more content from PDC finds its way online. I just want the slides, watching recorded presentations is rarely time-efficient (and you can expect them to require Silverlight).

Speaking of Silverlight, there is this new site on Oslo if you think watching some videos is worth installing Silverlight. Those screenshots don’t motivate me sufficiently.

[UPDATED 2008/10/30: Rather than going to bed I Googled around a bit and found a  post by Martin Fowler that answers some of my questions about MGrammar, MGraph and MSchema. MGraph is for instances, MSchema is for types. It answers some plumbing question, but I still have questions about expected usage and relevance to applications modeling.]

[UPDATED 2008/10/30: I also found the recordings and slides from past PDC sessions. Nice job Microsoft for this quick turnaround time, even if you require Sliverlight and/or the PPTX viewer. The sessions are:

  • TL23 A Lap around "Oslo" (Doug Purdy, Vijaye Raji)
  • TL27 "Oslo": The Language (Don Box, David Langworthy)
  • TL18 "Oslo": Customizing and Extending the Visual Design Experience (Don Box, Florian Voss)
  • TL28 "Oslo": Repository and Models (Chris Sells)

The first two sessions (deliverd Tuesday) have a replay and slides, the others should, I assume, follow soon.]

[UPDATED 2008/11/3: A nice overview of Oslo by Aaron Skonnard. Unlike most other Oslo articles over the last week, this one tries to paint the (yet-to-be-realized) full picture of the Oslo ecoystem. He mentions that "other Microsoft products and technologies are expected to build on Oslo to provide other runtimes. A few that have already been announced include Microsoft System Center (Operations Manager) and Team Foundation Server (TFS) in Visual Studio Team System". It's interesting that he qualifies System Center to be more specifically "operations manager" rather than "configuration manager" but I wouldn't read too much into it at this point.]

18
Sep
2008

Last call for SML and SML-IF

by William Vambenepe

The SML working group at W3C has published the “last call” working draft of version 1.1 of the SML and SML-IF (”IF” stands for “interchange format”) specifications. You have until October 3rd to tell them what you think.

With all the Oslo fun, the OMG embrace and the silence from System Center there are more questions than answers about the use of SML at Microsoft. But the Eclipse COSMOS project (IBM and friends) is, as far as I know, valiantly going forward with the store/validator implementation. Which may or may not be the same codebase as what was used for the recent CMDBf interop demo (I am not sure how the SML and CDMBf implementations in COSMOS are articulated).

The COSMOS group also recently published an overview of SML. It doesn’t try to tell you why you’d want to use SML but it’s a good and succint description of what SML is technically (from an XML developer’s perspective).

10
Sep
2008

Oslo, blog posts and my crystal ball

by William Vambenepe

There is more and more information coming out about Oslo in anticipation of the Microsoft PDC in October.

David Chappell recorded a video about it last month. More recently Doug Purdy and Don Box each posted a short description of Oslo. Don describes the goal of Oslo as “simplify the process of developing, deploying, and managing software”. But when he lists ancestor technologies to illustrate that “Microsoft has been moving in this direction for over a decade now”, they are all about development, not management: COM type libraries, .NET metadata attributes, XAML. Interesting that neither SDM nor SML gets a mention. Neither did SCA by the way, but I wasn’t really expecting that one… :-)

Maybe the I am the only one looking for a SDM/SML echo here, just because I came to hear of Oslo through the DSI angle. Am I wrong to see Oslo as an enabler for DSI? This eWeek article doesn’t have anything to do with IT management. Reading it, Oslo is all about allowing people to write code through drag and drop. Yawn. And Don Box endorses the article.

Maybe it’s just me (an IT management guy more than a software development guy) but I don’t care so much about how the application model is created. I care a lot more about what it allows you to do in terms of IT management. Please don’t make me pull out the often-quoted figure about the percentage of IT budget spent on operations versus development/licensing. The eWeek piece fails to excite me, but fortunately David Chappell’s video interview is a lot more aligned with my thinking, so I still hold hopes for Oslo as an IT management enabler. Here is my approximate transcript of an example that David provides (at around 4:20) in the video:

“If someone comes to you and says i’ve got this business process and the SLA is not being met, what do you do? You’ve got to trace this through the right business process and the right application that supports that part of the process and find the machine it runs on and maybe look at the workflow that implements it and maybe look at the services that it provides. This involves talking to business analysts, or the IT pros or the architect or the developer, all of whom have their own view of the world, their own tools, their own prospective. The repository provides a common place to store all this stuff, to link it all together, and with a visual editor to have a common tool that lets you actually go through and answer this kind of questions.”

Now you’re talking.

And if Oslo is not the new blood of DSI, then what is? The DSI story is getting dated, SML is fading in our memories and of the three parts that supposedly compose DSI (”virtualized infrastructure, design for operations, and knowledge-driven management”), only virtualization is actually represented on the list of technologies on the DSI home page. Has DSI turned into just allowing System Center to manage a hypervisor? I still hold hopes that the Oslo data is going to spice things up there. It would be good for the industry at large, not just Microsoft.

I won’t be at the PDC but it will be interesting to see what filters out of these sessions. The first session in the list adds management of hybrid application systems (hybrid as in “cloud/on-premise combination” or “software+services” as Microsoft calls it), to the long “can do” list for Oslo. Impressive, if there is some meat behind the abstract. I think this task is often overlooked in discussions around management aspects of Cloud computing (see “the new, interesting thing is going to be the IT infrastructure to manage your usage of utility computing services as well as their interactions with your in-house software” in this previous entry).

Yes, I am reading way too much into session abstracts, but while I am at it I can’t help noticing that there is a lot of SQL and very little XML/XSD/XPath mentioned there. Even though one of the presenters is Gudge, the only person I have ever met who fully understands XSD (actually even he doesn’t, I’ve seen him in the WS-I days have to refer to… his book).

Even though I am sure we’ll be told that SML can be built on top of Oslo, the SQL orientation won’t make that so easy (I want to see how to build XSD+Schematron validation on top of a relational store using Oslo’s drag and drop development tool). And it puts Microsoft on a different architectural direction from IBM, who, as far as I can tell, thinks that the world is a big XML document. Neither is the most appropriate for IT management models. I prefer a graph model and associated graph queries along the lines of SPARQL or CMDBf.

But that’s just late-night idle speculations on my part (aka “blogging”). Let’s see what comes out in October.

[UPDATED 2008/9/10: Interesting timing. Microsoft is joining OMG, home of UML and BPMN. Coming next: a submission of a "new version" of UML and BPMN that happens to contain the extensions and tweaks that Microsoft made to them in the process of implementing Oslo. This, BTW, is the final nail in the SML coffin (SML isn't even mentioned in the press release).]

10
Jul
2008

Who needs XPath fragment-level PUT?

by William Vambenepe

WS-Management and WS-ResourceTransfer (WS-RT) both provide a mechanism to modify the XML representation of the state of a resource in a fine-grained way. The mechanisms differ a bit: WS-Management defines a SOAP header and distinguishes PUT from DELETE at the WS-Transfer operation level, while WS-RT uses the SOAP body and tunnels “modes” (remove, modify, insert) on top of the PUT WS-Transfer operation. But in their complete form both use XPath to point to any arbitrary nodeset and update it.

WS-ResourceProperties (WS-RP) takes a simpler approach. While it too supports XPath-driven retrieval of the content, it doesn’t attempt to provide an XPath-like level of flexibility when it comes to updating the content. All it offers is SET, INSERT, UPDATE and DELETE operations at the level of a property (a top-level child of the XML representation) and nothing more granular.

In this respect at least, WS-RP makes a better choice than its competitor and its aspiring successor.

First, XPath-driven updates sound easy but in fact are hard to specify. Not surprisingly, the current specifications do a pretty incomplete job at it. They often seem to assume that the XPath used to target the value to change returns only one node, but nothing guarantees this. If it picks up more than one node, do you replace all these nodes by the new values as a block (the new values get inserted once, presumably at the location of the first selected node) or do you replace each selected node by all the new values (in which case they get duplicated as needed)? Also, the specifications say nothing about what constitutes compatibility between the targeted nodes and the replacement nodes. One might assume that a “don’t be stupid” approach is all that’s needed. But there is no obvious line between “stupid” and “useful”. Does a request to replace a text node by an attribute node make sense? Not in a strongly-typed world, but a more forgiving implementation might just insert the text value of the attribute in the place of the text node to get to a valid result. What about replacing an element by a text node? Some may reject it for incompatible types but, unless the schema prevents mixed content, it may well result in a perfectly valid document. All in all, specifying a reliable way to edit XML is a pretty hairy task. Much harder than reading XML. It requires very careful considerations that have very little do with on-the-wire protocol considerations. Which is why doing this as part of a SOAP specification is a strange choice. The XQuery group is much more qualified for this. There must be a reason why that group decided to punt on this until they had taken care of the easier “read” case.

Second, it’s usually not all that useful anyway. Which is why the lack of precision in WS-Management’s specification of the fragment PUT haven’t really been a problem so far: people haven’t fully implemented that feature. A lot of the implementations are backed by a CIMOM, an MBean or some other OO store. In these stores, the exposed granularity is typically at the attribute level. The interactions used by programmers and consoles are also at that level. The XPath-driven update is then only used as a mechanism to update many properties at once (rather than going deep into individual properties) but that’s using a machine gun to kill a fly. The WS-RP approach supports these use cases without calling on XPath.

Third, XPath-driven PUT is really hard to implement unless your back-end store happens to be an XML database. You may end up having to write your own XPath parser and interpreter, an exercise during which you will face some impedance mismatches. Your back-end store may not have notions of property order for example, or attribute versus element. How do you handle these XPath instructions? And what kind of interoperability results from implementers having to make these decisions on their own? Implementing XPath selection on a GET is a lot simpler. All it assumes is that there is an XML serialization of the result, on which you can run the XPath expression before shipping it out. That XML serialization is a given in the SOAP world already. But doing an XPath-driven PUT injects XML considerations in your store itself, not just in the communication path.

Those are the practical reasons. In short, it makes the specifications at best complex and at worst non-interoperable, for a feature that is rarely needed. That should be enough already, but there are some architectural reasons to stay away too.

WS-Transfer is sometimes sold for REST over SOAP. And fragment-level WS-Transfer (what WS-Management and WS-RT do) is then REST on steroids. Sorry, not true. REST on crack if anything.

I am not a REST expert, but I know enough to understand that “everything has a URI” really means “anything meaningful has a URI”. It’s the difference between a crystal structure and a pile of mud. REST lets you interact directly with any node in the crystal, but there is a limited number of entities that are considered worthwhile of being a node. There is design involved (sorry, you can’t suddenly fire your architects, as attractive as that sounds). You can’t point to the space between two nodes in the crystal. XPath-on-top-of-WS-Transfer, on the other hand, lets you plunge your spoon anywhere in the pile of mud and scoop out whatever happens to be there.

Let’s take a look at WS-Federation (here is the latest draft), the only specification in a standard body that I know of that is currently using WS-RT. Whether it’s a wise choice or not for them, from a governance perspective, is a separate topic that I won’t cover here (answer: no. oops).

From a technical perspective, it is interesting to see how they went about using WS-RT PUT. They use it to update pseudonyms. But even though there is an XML representation for the pseudonyms, they don’t want to allow users to update any arbitrary part of that XML. So they create a specific dialect (the fed:FilterPseudonyms defined in section 6.1) that lets you, based on semantics that are meaningful in the specific domain covered by the specification, point to pseudonyms.

I believe most potential users of WS-RT PUT are in the same case as WS-Federation and are better served by a domain-specific way to identify entities of interest. At least the WS-Federation authors realized it rather than saying “great, WS-RT XPath fragment PUT gives us all this flexibility for free” and settling their implementers with the impossible task of producing interoperable implementations. Of course this begs the question of why WS-Federation uses WS-RT in the first place. A charitable interpretation is to pin this on overzealous re-use of all things WS-*. A more cynical interpretation sees this as a contrived precedent manufactured in an attempt “prove” that WS-RT provides features of general use rather than specific to the management domain.

Having described at length why XPath-driven updates aren’t as useful as they may seem, I can still think of two cases where a such a generic mechanism to modify an XML document could be useful. One is if the resource actually is a document (as opposed to having its state represented by a document). For example, a wiki page. But I haven’t exactly noticed wiki creators and users clamoring for wiki-over-SOAP, have you? The other situation is if you have a true model-driven system that is supported by a comprehensive system description and validation framework. The kind of thing that SML is trying to deliver. By using Schematron (rather than just XSD which is very limited in its expressivity beyond mere syntactical validation) to provide model validation. This would, in theory, allow the requester to validate the updated model before sending the change request. The change would still be validated on the receiver side (either explicitly or implicitly because a non-valid new model would simply fail when applied to the system), but the existence of the validation framework guarantees a high rate of successs (the sender would rarely send non-valid change requests). That’s very nice and exciting, but we don’t have this. SML is, as far as I can see, going nowhere fast in terms of adoption. Standardizing a model exchange protocol for that use case is, at this point in time, premature. Maybe one day.

14
Jun
2008

More clues on the Oslo/SCA/SML trail: it’s “D”

by William Vambenepe

I just found out that I completly missed some interesting information about Oslo-related efforts at Microsoft. Back in February, Mary-Jo Foley reported on a new modeling language (code-name “D”, apparently) that is part of this initiative. And more recently she reported that David Chappell gave a presentation about Oslo (and more generally Microsoft’s SOA plans) at TechEd. He reportedly said that we should expect a new “schema language” (which Mary-Jo thinks is “D”). What I want to know is what its relationship is with SML/SDM and SCA.

Mary-Jo might not know about SCA and SML but I know that David does. He wrote this white paper about SCA and an article arguing that “Microsoft Should Not Support SCA” (based on an a questionable assessment that SCA is only about portability). He and I also had a little back-and-forth about SCA, SML and Microsoft in the comments section of his post. Unfortunately, David hasn’t blogged about Microsoft’s SOA strategy for a while for us non-TechEd people.

In addition to Mary-Jo’s report, the only information I was about to quickly dig out about David’s presentation is this blog post on Microsoft’s Israel site. Looks like David gave the same presentation at TechEd Israel 2008. Anyone who understands Hebrew cares to translate the blog? Fortunately there is a two-minutes video (also available here) in which we can hear David talk (in English). During the second of the two minutes you’ll hear and see something that could come straight out of a SCA presentation…

For some reason, David’s TechEd Israel presentation doesn’t seem to be listed here and TechEd online tells me that “Featured videos are unavailable at this time”. That’s both for IT Professionals and Developers. But of course they forced me to install Silverlight before telling me that.

[UPDATED 2008/8/11: Here is a 14 minutes video interview of David Chappell providing an update on Oslo.]

09
Jun
2008

Recent IT management announcements

by William Vambenepe

There were a few announcements relevant to the evolution of IT management over the last week. The most interesting is VMware’s release of the open-source (BSD license) VI SDK, a Java API to manage a host system and the virtual machines that run on it. Interesting that they went the way of a language-specific API. The alternatives, to complement/improve their existing web services SDK, would have been: define CIM classes and implement a WBEM provider (using CIM-HTTP and/or WS-Management), use WS-Management but without the CIM part (define the model as native XML, not XML-from-CIM), use a RESTful HTTP-driven interface to that same native XML model or, on the more sci-fi side, go the MDA way with a controller from which you retrieve the observed state and to which you specify the desired state. The Java API approach is the easiest one for developers to use, as long as they can access the Java ecosystem and they are mainly concerned with controlling the VMWare entities. If the management application also deals with many other resources (like the OS that runs in the guest machines or the hardware under the host, both of which are likely to have CIM models), a more model-centric approach could be more handy. The Java API of course has an underlying model (described here), but the interface itself is not model-centric. So what with all the DMTF-love that VMWare has been displaying lately (OVF submission, board membership, hiring of the DMTF president…). Should we expect a more model-friendly version of this API in the future? How does this relate to the DMTF SVPC working group that recently released some preliminary profiles? The choice to focus on beefing-up the Java-centric management story (which includes Jython, as VMWare was quick to point out) rather than the platform-agnostic, on-the-wire-interop side might be seen by the more twisted minds as a way to not facilitate Microsoft’s “manage VMWare today to replace it tomorrow” plan any more than necessary.

Speaking of Microsoft, in unrelated news we also got a heartbeat from them on the Oslo project: a tech preview of some of the components is scheduled for October. When Oslo was announced, there was a mix of “next gen BizTalk” aspects and “developer-driven DSI” aspects. From this report, the BizTalk part seems to be dominating. No word on use of SML.

And finally, SOA Software (who was previously called Digital Evolution and who acquired Blue Titan, Flamenco and LogicLibrary, in case you’re trying to keep track) has released a “SOA Development Governance Product”. Nothing too exciting from what I can see on InfoQ about it, but that’s a pretty superficial evaluation so don’t let me stop you. Am I the only one who twitches whenever “federation” is used to mean at worst “import” or at best “synchronization”? Did CMDBf start that trend? BTW, is it just an impression or did SOA Software give InfoQ a list of the questions they wanted to be asked?

14
May
2008

WS-ManagementHammer: don’t do it but if you are going to do it anyway then…

by William Vambenepe

With the IBM/Microsoft/Intel/HP WSDM/WS-Management convergence now implicitly (if not yet officially) dead, it will be interesting to see what IBM is going to do with WSRF. WSRF is being used today, rarely explicitly but rather in an embedded fashion. People who use WSDM use it, people who use CDDLM use it, people who use the Globus Toolkit use it, etc. IBM could write off the convergence work (WS-ResourceTransfer, which was published as a draft, and WS-ResourceEnumeration and WS-EventNotification which were never published) and stick to using the existing WSRF specifications when they need the corresponding functionality. That’s what I hope they do.

Alternatively, they could decide to get the forceps out of the drawer. They can create a new, IBM-friendly (e.g. Fujitsu, CA, Cisco…) private consortium to take over the unfinished drafts (if the IBM/Microsoft/Intel/HP legal agreement allows this) or start new ones. Or they could go directly to W3C, OASIS or OGF and push for a new working group to do the work in the open (and since no-one else would really care about this work IBM should have relatively free hands there, the way Microsoft did in DMTF when IBM chose to boycott WS-Management). Why W3C would care and why OASIS or OGF would want to start commitees to obsolete their existing work is a separate question.

While I hope that IBM doesn’t try to push another pile of WS-* resouce management specifications on an industry that already has too many, if they do I hope that at least they’ll do it right. And that means doing away with the approach embedded in WS-ResourceTransfer. Having personally been involved in many iterations on this problem, I hope to have some insight to contribute.

Along the lines of the age-old parental advice “don’t do it but if you are going to do it then use a condom”, here is my advice to anyone thinking of doing another iteration on the WSRF question: don’t do it but if you are going to do it then be specific about what problem you are addressing.

First, let’s separate three scenarios.

Database query

WS-ResourceTransfer should not be seen as a way to query an XML database. Use XQuery for this.

REST

While architecturally it should be possible to build RESTful applications on top of WS-Transfer’s operations, this is simply not what is happening. WS-Transfer is being used either by CIM people (who get to it via WS-Management) or by big-SOA people (who get is as part of the whole WS-* stack) and neither of them is doing anything remotely RESTful. So just leave that aside and don’t see WS-ResourceTransfer as a way to do “fine-grained REST”. No REST user is loosing sleep over WS-ResourceTransfer being in limbo.

A flexible way to interact with a complex system

This is the use case that you should focus on. You have a system made up of many parts (e.g. a composite application or a server that is made of many components) that you can represent as an XML document. The XML repesentation contains some important information about the system, but it isn’t the system. There are identified resources within the system that have lifecycles, management capabilities and internal parameters. Not everything relevant is captured in the XML model. This is why it is different from an XML database.

In general, I don’t think that XML is the best way to represent complex IT systems. It has plenty of complications that are not relevant to IT management and it doesn’t elegantly support the representation of graphs, often the most natural way to represent such a system (more on this here). CMDBf, with its graph-oriented approach, is a better choice in general. But there are plenty of areas (especially smaller, well-defined, sub-systems) in which XML formats have been defined to represent systems. SCA and SML for example.

In the case where you are dealing with such an XML-described system, then there is value in standard ways to simplify interactions with the system and its parts. But here too, we need to distinguished different patterns rather than trying to handle them all in the same way.

Filtering/sequencing of returned data

Complex IT systems can generate a lot of configuration and/or monitoring data and often you only care for a small subset. For example, an asset record has dozens of elements (lease terms, owner, assigned user…) but you may only care to retrieve the date the lease expires. When you do a GET on the record, you want to qualify it by specifying that only that date needs to be returned. That’s what WS-RP, WS-RT and the WS-Management wsman:TransferFragment header allow. In a variation of this, you want all the data but you don’t want it in one go, you want to pull it piece by piece. That’s what WS-Enumeration gives you. The problem with all these specifications is that they only offer that feature when you are retrieving the resource representation (a WS-Transfer GET or equivalent), not for other operations. But how is this different from invoking an AirlineBooking operation and saying that you only want to be sent the confirmation code, not the full itinerary, equipment type, assigned seat, etc? Bundling this inside WS-RT (or equivalent) is not helpful. A generic SOAP header that can go on any message would be more appropriate (the definition of this header would need to pay special attention to security considerations, especially if the response is signed, because it could be abused to trick the server into sending, and signing, specifically-crafted messages).

Interacting with a sub-element of the system

If you have a handle to a computer system resource and you know that it has one CPU and that this CPU is represented by the /comp:CPU element of the system, why would you need to use some out-of-band discovery mechanism to interact with that CPU? It’s right there, you can see it, you can point to it. Surely there must be a way to address operations to it directly, right? WS-Management tries to do it with its wsman:Selector mechanism, but the selectors are not tied to the model and require, effectively, a separate out-of-band agreement for addressing. There shouldn’t be a need for such an additional agreement once an agreement has already been reached on the model.

What is needed is a way, for systems that have a known XML model, to address message to subpart by using the model itself to support that addressing. Call it SOAPy mashup if you want to feel like you are part of the cool kids. I described such a mechanism a while ago. In effect, it is an improvement on wsman:Selector that an eventual new iteration of WSRF should at least consider.

In some cases, namely when the operation is a WS-Transfer GET, this capability overlaps with the “filtering of returned data” capability. One way to look at it is that you are doing a GET at the level of the overall computer system and filtering the results down to the part that represents the CPU. Another way to look at it is that you are pinpointing the message to a subset of the model (the CPU part) and doing an unmodified GET on it. It doesn’t matter how you choose to think about it. In my proposal, these two ways produce the same message. Like the wave view and particle view of a photon, that in the end, describe the same physical entity with each being the best representation for a set of situations.

The problem with WS-RT and its predecessors is that it doesn’t recognise that this is just the intersection of two orthogonal concerns (filering of output versus addressing of sub-elements) and only handles that intersection.

Interacting with a set of resources as a set

The same kind of expression (typically XPath) that lets you point at a sub-element inside of a system also lets you point at a set of such sub-elements. But even though from an XPath perspective there isn’t much of a different (the first one just happens to return a nodeset that contains only one node), from an architectural perspective it is a very different use case. If you want to support such a use case then you have handle it as such and define all the associated semantics (sequential/parallel execution, fault handling, partial completion, resource-specific permissions…). You can’t just cross your fingers and assume that you get such features “for free” just because XPath can return a nodeset.

I know that this post illustrates a way of giving free advice that virtually ensures that it gets ignored. Similar (if you’ll allow the big stretch) to the way Chirac and Villepin were arguing againt an Iraq invasion in ways that probably reinforced the Bush administration’s determination to do it. When will the world finally learn to appreciate the oh-so-slightly obnoxious undertone that is inherently French (because, let me tell you, we’re not about to loose it)? At least, when my grandchildren ask me “where were you when IBM invented WS-ManagementHammer?” I can point to this post and say “I tried to stop it, I tried”.

[UPDATED 2008/5/15: How timely! Just after publishing this I find, via Coté, what looks like another example of French abrasiveness in the systems management world: the attitude, name and the way Jeff ends with a French-language quote make it quite likely that the "Jacques" person discounting the fact that his company's SNMP agent is broken is indeed a compatriot. French obnoxiousness aside, and despite my respect for standards, my advice to Jeff is that if a given SNMP agent works with HP, IBM, BMC and CA you will probably save yourself time in the long run by finding a way to support it (even if it is not spec-compliant) rather than getting the vendor to change. There are lots of sites out there that work fine with Firefox and IE but are not compliant with Web standards. Good luck getting them all fixed.]

[UPDATED 2008/7/14: I don't really plan to turn this post into a ongoing set of updates about "French attitude" but since today is Bastille Day I'll point to this map of the world as seen from Paris. If I wasn't on strike right now, I'd explain why the commenter is wrong to assert that "French self-deprecating humour" is rare.]

06
May
2008

System Center “Cross Platform Extension”: too many distractions

by William Vambenepe

I was hoping that by the time MMS was over there would be more clarity about the “Cross Platform Extension” to System Center that Microsoft announced there. But most of the comments I have seen have focused on two non-technical aspects: Microsoft is interested in heterogeneous management and Microsoft makes use of open source. That’s also the focus of Coté’s coverage.

So what? Is it still that exciting, in 2008, to learn that Microsoft recognizes that Linux and OSS are major players in enterprise computing? If Steve Ballmer eventually gets hold of Yahoo, do you think his first priority will be to move all the servers to Windows or to build up its search and advertising audience? It’s been now 10 years since the Halloween documents came out. They can be seen as the start of Microsoft’s realization that Linux/OSS are here for good. It is not surprising to see that one of their main authors is now the driving force behind WS-Management, an effort that illustrates the acceptance of heterogeneity and the need to deal with it (on Microsoft’s terms if possible, of course). The WS-Management effort started years ago and it was a clear sign that Microsoft knew it had to tackle heterogeneous management (despite the reassuring talk that “it’s all about making Windows the most manageable platform” to HP and others). Basically, Microsoft is using WS-Management to support heterogeneity without having to do too much work: by creating an industry standard that everyone writes to and that Microsoft uses internally. Heterogeneous management is intrinsic to DSI if DSI is to be anything more than a demo.

But all of this was known before MMS 2008 to anyone who was paying attention. Instead of all this Microsoft/OSS/heterogeneous talk, I am a lot more interested in the technical aspects of the “Cross Platform Extension”.

OpenPegasus has been around for a long time, as a C++ CIMOM with a bunch of associated providers and CIM-XML interoperability over HTTP with CIM clients. I don’t know where WS-Management support was on the OpenPegasus development timeline, but even without Microsoft getting involved it would have eventually happened. And this should have been sufficient for System Center to access the CIMOM (BTW, does System Center not support CIM-XML when WS-Management is not present and if it does then what is different in practice with WS-Management?).

I can see how Microsoft would bring some extra (and much welcome) development resources for the WS-Management implementation (BTW the guys at Intel already have an open-source C implementation of WS-Management) as well as some extra marketing/visibility/distribution. Nice, but not earth-shattering. Do they bring anything else to OpenPegasus?

And what else is in the “Cross Platform Extension” in addition to an OpenPegasus WS-Management-capable CIMOM? Is there any extra modeling capability beyond CIM? Any Microsoft-specific classes? Any discovery/reconciliation capability? How much actual configuration management versus just monitoring? Security? Health models? Desired state management? Or is it just a WS-Management CIMOM? Any pointer to specific information is welcome.

Of course the underlying question is whether others than Microsoft can manage resources that have an OpenPegasus-based System Center management pack on them. The Open Management Consortium guys have talked about an open management agent. Could, against all expectations, Microsoft be the one delivering it?

In the IT management world, there are the big 4 (HP, BMC, CA and IBM), the little 4 (Zenoss, Hyperic, GroundWorks and openQRM) and the mighty 3 (Oracle, Microsoft and EMC). Sorry John, I am reclaiming the use of the “mighty” term: your “mighty 2″ (or 2.5) are really still the “little 2″ (or 2.5). At least for now.

The interesting thing is that in that industry configuration there are topics on which the little ones and the mighty ones share common interests. For example, the big 4 have a lot more management packs for all kinds of resources, built up over the years. Some standard-based mechanism that partially resets the stage helps the little ones and the mighty ones better compete against the big 4. Even better if it has an attractive (and extensible) implementation ready in the form of an agent. But let’s be clear that it takes more than a CIMOM to make a management pack. You need domains-specific expertise in the form of health models, deployment/configuration scripts and/or descriptors, configuration validation, role management etc. Thus my questions about what else (beyond CIM over WS-Management) Microsoft is bringing to the table. SML and CML are supposed to address this space, but I didn’t hear them mentioned once in the MMS coverage.

[UPDATED on 2008/5/7: Another perspective on Microsoft and open source: Microsoft Ex-Pats Developing Open Source Software Outside of Redmond]

[UPDATED 2008/5/7: I got an answer to the question about System Center support for CIM-XML: it doesn't have it. So indeed it's either WS-Management of WMI. If you're a Linux box, that means it's WS-Management.]

25
Mar
2008

Elastra and data center configuration formats

by William Vambenepe

I heard tonight for the first time of a company called Elastra. It sounds like they are trying to address a variation of the data center automation use cases covered by Opsware (now HP) and Bladelogic (now BMC). Elastra seems to be in an awareness-building phase and as far as I can tell it’s working (since I heard about them). They got to me through John’s blog. They are also using the more conventional PR channel (and in that context they follow all the cheesy conventions: you get to “unlock the value”, with “the leading provider” who gives you “a new product that revolutionizes…” etc, all before the end of the first paragraph). And while I am making fun of the PR-talk I can’t help zeroing on this quote from the CEO, who “wanted to pick up where utility computing left off – to go beyond the VM and toward virtualizing complex applications that span many machines and networks”. Does he feels the need to narrowly redefine “utility computing” (who knew that all that time “utility computing” was just referring to a single hypervisor?) as a way to justify the need for the new “cloud” buzzword (you’ll notice that I haven’t quite given up yet, this post is in the “utility computing” category and I still do not have a “cloud” category)?

The implied difference with Opsware and Bladelogic seems to be that while these incumbent (hey Bladelogic, how does it feel to be an “incumbent”?) automate data center management tasks in old boring data centers, Elastra does it in clouds. More specifically “public and private compute clouds”. I think I know roughly what a public cloud is supposed to be (e.g. EC2), but a private cloud? How is that different from a data center? Is a private cloud a data center that has the Elastra management software deployed? In that case, how is automating private clouds with Elastra different from automating data centers with Elastra? Basically it sounds like they don’t want to be seen as competing with Opsware and Bladelogic so they try to redefine the category. Which makes it easier to claim (see above) to be “the leading provider of software for designing, deploying, and managing applications in public and private compute clouds” without having the discovery or change management capabilities of Opsware (or anywhere near the same number of customers).

John seems impressed by their “public cloud” capabilities (I don’t think he has actually tested them yet though) and I trust him on that. Knowing the complexities of internal data centers, I am a lot more doubtful of the “private cloud” claims (at least if I interpret them correctly).

Anyway, I am getting carried away with some easy nitpicking on the PR-talk, but in truth it uses a pretty standard level of obfuscation/hype for this type of press release. Sad, I know.

The interesting thing (and the reason I started this blog entry in the first place) is that they seem to have created structures to capture system design (ECML) and deployment (EDML) rules. From John’s blog:

“At the core of Elastra’s architecture are the system design specifications called ECML and EDML. ECML is an XML markup language to specify a cloud design (i.e., multiple system design of firewalls, load balancers, app servers, db servers, etc…). The EDML markup provides the provisioning instructions.”

John generously adds “Elastra seems to be the first to have designed their autonomics into a standards language” which seems to assume that anything in XML is a standard. Leaving the “standard” debate aside, an XML format does tend improve interoperability and that’s a good thing.

So where are the specifications for these ECML and EDML formats? I would be very interested in reading them, but they don’t appear to be available anywhere. Maybe that would be a good first step towards making them industry standards.

I would be especially interested in comparing this to what the SML/CML effort is going after. Here are some propositions that need to be validated or disproved. Comparing SML/CML to ECML/EDML could help shade light on them:

  • SML/CML encompasses important and useful datacenter automation use cases.
  • Some level of standardization of cross-domain system design/deployment/management is needed.
  • SML/CML will be too late.
  • SML/CML will try to do too many things at once.

You can perform the same exercise with OVF. Why isn’t OVF based on SML? If you look at the benefits that could be theoretically be derived by that approach (hardware, VM, network and application configuration all in the same metamodel) it tells you about all that is attractive about SML. At the same time, if you look at the fact that OVF is happening while CML doesn’t seem to go anywhere, it tells you that the “from the very top all the way down to the very bottom” approach that SML is going after is very difficult to pull off. Especially with so many cooks in the kitchen.

And BTW, what is the relationship between ECML/EDML and OVF? I’d like to find out where the Elastra specifications land in all this. In the worst case, they are just an XML rendering of the internals of the Elastra application, mixing all domains of the IT stack. The OOXML of data center automation if you want. In the best case, it is a supple connective tissue that links stiffer domain-specific formats.

[UPDATED 2008/3/26: Elastra's "introduction to elastic programing" white paper has a few words about the relationship between OVF and EDML: "EDML builds on the foundation laid by Open Virtual Machine Format (OVF) and extends that language's capabilities to specify ways in which applications are deployed onto a Virtual Machine system". Encouraging, if still vague.]

[UPDATED 2008/3/31: A week ago I hadn't heard of Elastra and now I learn that I had been tracking the blog of its lead-architect-to-be all along! Maybe Stu will one day explain what a "private cloud" is. His description of his new company seems to confirm my impression that they are really focused (for now at least) on "public clouds" and not the Opsware-like "private clouds" automation capabilities. Maybe the "private clouds" are just in the business plan (and marketing literature) to be able to show a huge potential markets to VCs so they pony up the funds. Or maybe they really plan to go after this too. Being able to seamlessly integrate both (for mixed deployments) is the holly grail, I am just dubious that focusing on this rather than doing one or the other "right" is the best starting point for a new company. My guess is that despite the "private cloud" talk, they are really focusing on "public clouds" for now. That's what I would do anyway.]

[UPDATED on 2008/6/25: Stephen O'Grady has an interesting post about the role of standards in Cloud computing. But he only looks at it from the perspective of possible standardization of the interfaces used by today's Cloud providers. A full analysis also needs to include the role, in Cloud Computing, of standards (app runtime standards, IT management standards, system modeling standards, etc...) that started before Cloud computing was big. Not everything in Cloud computing is new. And even less is new about how it will be used. Especially if, as I expect, utility computing and on-premise computing are going to become more and more intertwined, resulting in the need to manage them as a whole. If my app is deployed at Amazon, why doesn't it (and its hosts) show up in my CMDB and in my monitoring panel? As Coté recently wrote, "as the use of cloud computing for an extension of data centers evolves, you could see a stronger linking between Hyperic’s main product, HQ and something like Cloud Status".]

04
Mar
2008

SML version 1.1 enters last call

by William Vambenepe

Time is running out if you want to provide comments on the SML specification being standardized at W3C. It entered “last call” yesterday. You can read the SML draft and the SML-IF draft.

I unsuccessfully searched for a list of changes made to the submitted version (I was a co-author so I know that one well). Failing that, a very quick scan of the current drafts didn’t reveal any major surprise. If I run into a useful summary of the changes I’ll update this post to link to it.

Categories