William Vambenepe's blog

IT management in a changing IT world

Inappropriate antibiotic treatment is another common form of buy cheap cialis generic levitra viagra misuse.There are many perspectives from which to understand and pfizer viagra online it.One drawback to the large molecule antibiotics is that they will have viagra in canada difficulty crossing membranes and travelling systemically throughout the body.Antibiotic resistance has become a serious problem in both developed and generic viagra from india nations.This leads to more frequent use of newer and more expensive free sample prescription for viagra, which in turn leads to the rise of resistance to those drugs.

Archive for the 'SML' Category

24
Apr
2009

A post-mortem on the previous IT management revolution

by William Vambenepe

Before rushing to standardize “Cloud APIs”, let’s take a look back at the previous attempt to tackle the same problem, which is one of IT management integration and automation. I am referring to the definition of specifications that attempted to use the then-emerging SOAP-based Web services framework to easily integrate IT management systems and their targets.

Leaving aside the “Cloud” spin of today and the “Web services” frenzy of yesterday, the underlying problem remains to provide IT services (mostly applications) in a way that offers the best balance of performance, availability, security and economy. Concretely, it is about being able to deploy whatever IT infrastructure and application bits need to be deployed, configure them and take any required ongoing action (patch, update, scale up/down, optimize…) to keep them humming so customers don’t notice anything bothersome and you don’t break any regulation. Or rather so that any disruption a customer sees and any mandate you violate cost you less than it would have cost to avoid them.

The realization that IT systems are moving more and more towards distributed/connected applications was the primary reason that pushed us towards the definition of Web services protocols geared towards management interactions. By providing a uniform and network-friendly interface, we hoped to make it convenient to integrate management tasks vertically (between layers of the IT stack) and horizontally (across distributed applications). The latter is why we focused so much on managing new entities such as Web services, their execution environments and their conversations. I’ll refer you to the WSMF submission that my HP colleagues and I made to OASIS in 2003 for the first consistent definition of such a management framework. The overview white paper even has a use case called “management as a service” if you’re still not convinced of the alignment with today’s Cloud-talk.

Of course there are some differences between Web service management protocols and Cloud APIs. Virtualization capabilities are more advanced than when the WS effort started. The prospect of using hosted resources is more realistic (though still unproven as a mainstream business practice). Open source component are expected to play a larger role. But none of these considerations fundamentally changes the task at hand.

Let’s start with a quick round-up and update on the most relevant efforts and their status.

Protocols

WSMF (Web Services Management Framework): an HP-created set of specifications, submitted to the OASIS WSDM working group (see below). Was subsumed into WSDM. Not only a protocol BTW, it includes a basic model for Web services-related artifacts.

WS-Manageability: An IBM-led alternative to parts of WSDM, also submitted to OASIS WSDM.

WSDM (Web Services Distributed Management): An OASIS technical committee. Produced two standards (a protocol, “Management Using Web Services” and a model of Web services, “Management Of Web Services”). Makes use of WSRF (see below). Saw a few implementations but never achieved real adoption.

OGSI (Open Grid Services Infrastructure): A GGF (the organization now known as OGF) standard to provide a service-oriented resource manipulation infrastructure for Grid computing. Replaced with WSRF.

WSRF: An OASIS technical committee which produced several standards (the main one is WS-ResourceProperties). Started as an attempt to align the GGF/OGSI approach to resource access with the IT management approach (represented by WSDM). Saw some adoption and is currently quietly in use under the cover in the GGF/OGF space. Basically replaced OGSI but didn’t make it in the IT management world because its vehicle there, WSDM, didn’t.

WS-Management: A DMTF standard, based on a Microsoft-led submission. Similar to WSDM in many ways. Won the adoption battle with it. Based on WS-Transfer and WS-Enumeration.

WS-ResourceTransfer (aka WS-RT): An attempt to reconcile the underlying foundations of WSDM and WS-Management. Stalled as a private effort (IBM, Microsoft, HP, Intel). Was later submitted to the W3C WS-RA working group (see below).

WSRA (Web Services Resource Access): A W3C working group created to standardize the specifications that WS-Management is built on (WS-Transfer etc) and to add features to them in the form of WS-RT (which was also submitted there, in order to be finalized). This is (presumably) the last attempt at standardizing a SOAP-based access framework for distributed resources. Whether the window of opportunity to do so is still open is unclear. Work is ongoing.

WS-ResourceCatalog : A discovery helper companion specification to WS-Management. Started as a Microsoft document, went through the “WSDM/WS-Management reconciliation” effort, emerged as a new specification that was submitted to DMTF in May 2007. Not heard of since.

CMDBf (Configuration Management Database Federation): A DMTF working group (and soon to be standard) that mainly defines a SOAP-based protocol to query repositories of configuration information. Not linked with (or dependent on) any of the specifications listed above (it is debatable whether it belongs in this list or is part of a new breed).

Modeling

DCML (Data Center Markup Language): The first comprehensive effort to model key elements of a data center, their relationships and their policies. Led by EDS and Opsware. Never managed to attract the major management vendors. Transitioned to an OASIS member section and died of being ignored.

SDM (System Definition Model): A Microsoft specification to model an IT system in a way that includes constraints and validation, with the goal of improving automation and better linking the different phases of the application lifecycle. Was the starting point for SML.

SML (Service Modeling Language): Currently a W3C “proposed recommendation” (soon to be a recommendation, I assume) with the same goals as SDM. It was created, starting from SDM, by a consortium of companies that eventually submitted it to W3C. No known adoption other than the Eclipse COSMOS project (Microsoft was supposed to use it, but there hasn’t been any news on that front for a while). Technically, it is a combination of XSD and Schematron. It appears dead, unless it turns out that Microsoft is indeed using it (I don’t know whether System Center is still using SDM, whether they are adopting SML, whether they are moving towards M or whether they have given up on the model-centric vision).

CML (Common Model Library): An effort by the SML authors to create a set of model elements using the SML metamodel. Appears to be dead (no news in a long time and the cml-project.org domain name that was used seems abandoned).

SDD (Solution Deployment Descriptor): An OASIS standard to define a packaging mechanism meant to simplify the deployment and configuration of software units. It is to an application archive what OVF is to a virtual disk. Little adoption that I know of, but maybe I have a blind spot on this.

OVF (Open Virtualization Format): A recently released DMTF standard. Defines a packaging and descriptor format to distribute virtual machines. It does not defined a common virtual machine format, but a wrapper around it. Seems to have some momentum. Like CMDBf, it may be best thought of as part of a new breed than directly associated with WS-Management and friends.

This is not an exhaustive list. I have left aside the eventing aspects (WS-Notification, WS-Eventing, WS-EventNotification) because while relevant it is larger discussion and this entry to too long already (see here and here for some updates from late last year on the eventing front). It also does not cover the Grid work (other than OGSI/WSRF to the extent that they intersect with the IT management world), even though a lot of the work that took place there is just as relevant to Cloud computing as the IT management work listed above. Especially CDDLM/CDL an abandoned effort to port SmartFrog to the then-hot XML standards, from which there are plenty of relevant lessons to extract.

The lessons

What does this inventory tell us that’s relevant to future Cloud API standardization work? The first lesson is that protocols are easy and models are hard. WS-Management and WSDM technically get the job done. CMDBf will be a good query language. But none of the model-related efforts listed above seem to have hit the mark of “doing the job”. With the possible exception of OVF which is promising (though the current expectations on it are often beyond what it really delivers). In general, the more focused and narrow a modeling effort is, the more successful it seems to be (with OVF as the most focused of the list and CML as the other extreme). That’s lesson learned number two: models that encompass a wide range of systems are attractive, but impossible to deliver. Models that focus on a small sub-area are the way to go. The question is whether these specialized models can at least share a common metamodel or other base building blocks (a type system, a serialization, a relationship model, a constraint mechanism, etc), which would make life easier for orchestrators. SML tries (tried?) to be all that, with no luck. RDF could be all that, but hasn’t managed to get noticed in this context. The OVF and SDD examples seems to point out that the best we’ll get is XML as a shared foundation (a type system and a serialization). At this point, I am ready to throw the towel on achieving more modeling uniformity than XML provides, and ready to do the needed transformations in code instead. At least until the next window of opportunity arrives.

I wish that rather than being 80% protocols and 20% models, the effort in the WS-based wave of IT management standards had been the other way around. So we’d have a bit more to show for our work, for example a clear, complete and useful way to capture the operational configuration of application delivery services (VPN, cache, SSL, compression, DoS protection…). Even if the actual specification turns out to not make it, its content should be able to inform its successor (in the same way that even if you don’t use CIM to model your server it is interesting to see what attributes CIM has for a server).

It’s less true with protocols. Either you use them (and they’re very valuable) or you don’t (and they’re largely irrelevant). They don’t capture domain knowledge that’s intrinsically valuable. What value does WSDM provide, for example, now that’s it’s collecting dust? How much will the experience inform its successor (other than trying to avoid the WS-Addressing disaster)? The trend today seems to be that a more direct use of HTTP (”REST”) will replace these protocols. Sure. Fine. But anyone who expects this break from the past to be a vaccination against past problems is in for a nasty surprise. Because, and I am repeating myself, it’s the model, stupid. Not the protocol. Something I (hopefully) explained in my comments on the Sun Cloud API (before I knew that caring about this API might actually become part of my day job) and something on which I’ll come back in a future post.

Another lesson is the need for clear use cases. Yes, it feels silly to utter such an obvious statement. But trust me, standards groups still haven’t gotten this. It’s not until years spent on WSDM and then WS-Management that I realized that most people were not going after management integration, as I was, but rather manageability. Where “manageability” is concerned with discovering and monitoring individual resources, while “management integration” is concerned with providing a systematic view of the environment, with automation as the goal. In other words, manageability standards can allow you to get a traditional IT management console without the need for agents. Management integration standards can allow you to coordinate your management systems and automate their orchestration. WS-Management is for manageability. CMDBf is in the management integration category. Many of the (very respectful and civilized) head-butting sessions I engaged in during the WSDM effort can be traced back to the difference between these two sets of use cases. And there is plenty of room for such disconnect in the so-loosely-defined “Cloud” world.

We have also learned (or re-learned) that arbitrary non-backward compatible versioning, e.g. for political or procedural reasons as with WS-Addressing, is a crime. XML namespaces (of the XSD and WSDL types, as well as URIs used in similar ways in specifications, e.g. to identify a dialect or profile) are tricky, because they don’t have backward compatibility metadata and because of the practice to use organizations domain names in the URI (as opposed to specification-specific names that can be easily transferred, e.g. cmdbf.org versus dmtf.org/cmdbf). In the WS-based management world, we inherited these problems at the protocol level from the generic WS stack. Our hands are more or less clean, but only because we didn’t have enough success/longevity to generate our own versioning problems, at the model level. But those would have been there had these models been able to see the light of day (CML) or see adoption (DCML).

There are also practical lessons that can be learned about the tactics and strategies of the main players. Because it looks like they may not change very much, as corporations or even as individuals. Karla Norsworthy speaks for IBM on Cloud interoperability standards in this article. Andrew Layman represented Microsoft in the post-Manifestogate Cloud patch-up meeting in New York. Winston Bumpus is driving the standards strategy at VMWare. These are all veterans of the WS-Management, WSDM and related wars collaborations (and more generally the whole WS-* effort for the first two). For the details of what there is to learn from the past in that area, you’ll have to corner me in a hotel bar and buy me a few drinks though. I am pretty sure you’d get your money’s worth (I am not a heavy drinker)…

In summary, here are my recommendations for standardizing Cloud API, based on lessons from the Web services management effort. The theme is “focus on domain models”. The line items:

  • Have clear goals for each effort. E.g. is your use case to deploy and run an existing application in a Cloud-like automated environment, or is it to create new applications that efficiently take advantage of the added flexibility. Very different problems.
  • If you want to use OVF, then beef it up to better apply to Cloud situations, but keep it focused on VM packaging: don’t try to grow it into the complete model for the entire data center (e.g. a new DCML).
  • Complement OVF with similar specifications for other domains, like the application delivery systems listed above. Informally try to keep these different specifications consistent, but don’t over-engineer it by repeating the SML attempt. It is more important to have each specification map well to its domain of application than it is to have perfect consistency between them. Discrepancies can be bridged in code, or in a later incarnation.
  • As you segment by domain, as suggested in the previous two bullets, don’t segment the models any further within each domain. Handle configuration, installation and monitoring issues as a whole.
  • Don’t sweat the protocols. HTTP, plain old SOAP (don’t call it POS) or WS-* will meet your need. Pick one. You don’t have a scalability challenge as much as you have a model challenge so don’t get distracted here. If you use REST, do it in the mindset that Tim Bray describes: “If you’re going to do bits-on-the-wire, Why not use HTTP? And if you’re going to use HTTP, use it right. That’s all.” Not as something that needs to scale to Web scale or as a rebuff of WS-*.
  • Beware of versioning. Version for operational changes only, not organizational reasons. Provide metadata to assert and encourage backward compatibility.

This is not a recipe for the ideal result but it is what I see as practically achievable. And fault-tolerant, in the sense that the failure of one piece would not negate the value of the others. As much as I have constrained expectations for Cloud portability, I still want it to improve to the extent possible. If we can’t get a consistent RDF-based (or RDF-like in many ways) modeling framework, let’s at least apply ourselves to properly understanding and modeling the important areas.

In addition to these general lessons, there remains the question of what specific specifications will/should transition to the Cloud universe. Clearly not all of them, since not all of them even made it in the “regular” IT management world for which they were designed. How many then? Not surprisingly (since IBM had a big role in most of them), Karla Norsworthy, in the interview mentioned above, asserts that “infrastructure as a service, or virtualization as a paradigm for deployment, is a situation where a lot of existing interoperability work that the industry has done will surely work to allow integration of services”. And just as unsurprisingly Amazon’s Adam Selipsky, who’s company has nothing to with the previous wave but finds itself in leadership position WRT to Cloud Computing is a lot more circumspect: “whether existing standards can be transferred to this case [of cloud computing] or if it’s a new topic is [too] early to say”. OVF is an obvious candidate. WS-Management is by far the most widely implemented of the bunch, so that gives it an edge too (it is apparently already in use for Cloud monitoring, according to this press release by an “innovation leader in automated network and systems monitoring software” that I had never heard of). Then there is the question of what IBM has in mind for WS-RT (and other specifications that the WS-RA working group is toiling on). If it’s not used as part of a Cloud API then I really don’t know what it will be used for. But selling it as such is going to be an uphill battle. CMDBf is a candidate too, as a model-neutral way to manage the configuration of a distributed system. But here I am, violating two of my own recommendations (”focus on models” and “don’t isolate config from other modeling aspects”). I guess it will take another pass to really learn…

[UPDATED 2009/5/7: Senior moment! When writing this entry I forgot that I wrote an earlier entry (in late 2007) specifically to describe the difference between "manageability" and "management integration". So here it is, if you care for more details on this topic.]

02
Mar
2009

CMDBf is a lot more and a lot less than you think

by William Vambenepe

The DMTF CMDBf working group has recently published an updated draft of its specification. The final version should follow soon and I don’t expect major changes so now is not a bad time to start thinking about what this baby can do.

Since CMDBf stands for “configuration management database federation”, you might think the obvious answer to the “what can it do” question is “build a federation of configuration management databases”. Except it’s not. Despite its name, CMDBf provides little support for federation unless you take a very loose definition of the term. The specification gives you a query language and a very simple registration interface, with a sprinkle of metadata to improve interoperability. The query language lets you talk to a CMDB to retrieve information on configuration items (CIs) that it knows about. The registration interface lets you keep a CMDB informed of changes to CIs that it may care about. If you want to build on top of this a real federation, one that scales to the type of environment that CMDBs are used for today, you have to go further than what the specification provides. What CMDBf does give you is some amount of integration between CMDBs (at the protocol level at least, not at the model level). It may not sound like much but it is a lot of progress on the current situation and the right incremental step, whether you are aiming for true federation as the end goal or not.

That’s the “a lot less than you think” part. So, what’s the “a lot more than you think” part? Good stuff all around:

CMDBf provides a metamodel that is well-suited for complex IT systems and it provides an elegant graph-oriented query language on top of it. The most convenient representation for an IT system is neither “one big XML document” nor “a sea of nodes and edges”. CMDBf gives you a middle ground: a graph model with XML leaf nodes. So you can precisely model the relationships between your IT elements using explicit relationships (with their own records), but you can also attach a well-understood piece of XML to an item as a record without having to break that XML into a bunch of tiny relationships.

I am pretty sure there are other domains, beyond IT systems, for which this would be useful. It will be interesting to see if the CMDBf specification gets considered outside of its intended scope. But these domains are more likely to end up using RDF/OWL/SPARQL instead. Not everyone has made the leap from XML as a tool to XML as a religion, which made CMDBf necessary for us. But let’s not veer into another rant.

Let’s go back instead to describing how useful CDMBf can be to IT systems management, independently of any “federation” objective. Let me put it this way: if one was to create from scratch a configuration store for IT systems they should strongly consider the CMDBf conceptual model as the base metamodel. And something along the lines of the CMDBf Query (though not necessarily through its XML serialization) as the native query language for it. Most CMDBf implementers of course are not in this situation. Rather than writing the store from scratch they will create a CMDBf wrapper/interface on their current CMDB. And that’s fine too. CMDBf will work well as an interoperability protocol. Putting aside my gripes about XPath overuse, CMDBf strikes a reasonable balance that makes it implementable on top of any back-end technology (relational, XML, RDF, in-memory objects, bags of name-value pairs…). And the query patterns it supports map well to CMDB-to-CMDB integration use cases. But it is underselling it, in my view, to restrict it to this over-the-wire interoperability scenario. CMDBf also provides a very useful foundation for local access to the CMDB. CMDBf graph queries can support powerful visualization of the content of the CMDB. They can support the definition of configuration rules. They can support in-depth inspection of relationships (e.g. fault tree).

And that may jsut be the beginning. It could take three directions after v1:

The first one, as always for a standard, is that it is ignored and becomes irrelevant. I have to reluctantly list this one first, because it is statistically the most likely for a new standard. Especially one that is not a ratification of an existing de facto standard. And one that threatens an important control point for vendors. A slight variation on this scenario is for CMDBf to succeed from a marketing perspective, as a checkmark that most vendors tick, but not as a true technology. This is the “smokescreen” scenario from Mr. Skeptic. One scenario that worries me is that CMDBf could fail because of the poor models of the CMDBs that implement it. If your IT model is not granular enough or if it matches the UI of your application more than the semantics of the IT components, then CMDBf will expose these shortcomings and probably be blamed for them (with bad models, “shoot the messenger” becomes “shoot the protocol”).

The second possible direction is that CMDBf provides enough value in integrating CMDBs that people want more and challenge the group to deliver on the “f” part, federation. That could take the form of a combination of:

  • better integration with other protocols (mostly from the WS-Management family, like WS-Enumeration and WS-Eventing),
  • reconciliation support (here are ways to address it),
  • some model transformations or canonical models,
  • some optimizations in the query mechanism for distributed queries (e.g. data partition rules).

The third possible direction (not exclusive) is for CMDBf to become the basis for a standard rule language for IT models. Yeah, another one (remember SML?). SPIN and SML show us how a generic query language can be used to support configuration rules. I very much like SPIN but it requires adopting RDF as a metamodel, which is a hard sell in XML-land. SML suffers technically from being too reliant on an inappropriate validation tool (XSD) and treating relationships as a second thought rather than an integral part of the model. Which is fine in many areas (EMF does it too), but not, in my view, when modeling IT systems.

If we are not going to use RDF/SPIN then let’s copy them. We can use the CMDBf metamodel (graph-based) where SPIN uses RDF. We can use the CMDBf query language (graph-oriented) where SPIN uses SPARQL. Since CMDBf queries use XPath, we see some commonalities with SML (which uses XPath through Schematron). But in CMDBf XPath is scoped to the leaf nodes of the graph, not the entire model as it is in SML. In other words, SML adds relationship traversal to XPath, while CMDBf adds XPath to its relationship-aware queries. It’s a matter of who’s on top. It sounds academic but it isn’t.

Does the industry really want standardized, re-usable configuration rules? SML/CML seem to say no. The push towards Cloud interop, on the other hand, begs for it. At least if you believe in programming your environment in a way that is partialy declarative rather than entirely procedural.

[UPDATED 2009/3/5: Rob England (a.k.a. Mr. Skeptic as I refer to him above) provides a geek-to-English translation for this post. Neat!]

12
Jan
2009

A new SPIN on enriching a model with domain knowledge (constraints and inferences)

by William Vambenepe

Back when I was at HP and we got involved with what turned into SML (now a W3C candidate recommendation), we tried to make a case for the specification to be based on RDF/OWL rather than XML/XSD/Schematron. It was a strange situation from a technical perspective because RDF is a better foundation for an IT model than XML, but on the other hand XSD/Schematron is a better choice for validation than OWL. OWL is focused on inference, not validation (because of both fundamental design choices, e.g. the open world assumption, and language expressiveness limitations).

So our options were to either use the right way to represent the system (RDF) combined with the wrong way to capture constraints (OWL) or to use the wrong way to represent the system (XML) combined with the right way to constrain it (mostly Schematron, with some limited help from XSD). At the end, of course, this subtle technical debate was crushed under the steamroller of vendor politics and RDF never got a fair chance anyway.

The point of this little background story is to describe the context in which I read this announcement from Holger Knublauch of TopQuadrant: the new version of their TopBraid Composer tool introduces SPIN, a way to complement OWL with a SPARQL-based constraint checking and inference mechanism.

This relates to SML in two ways.

First, there are similarities in the approach: Schematron leverages the XPath language, used to query XML, to create validation rules. SML then marries Schematron with XSD, for a more powerful validation mechanism. Compare this to SPIN: SPIN leverages the SPARQL query language, used to query RDF, to create validation/inference rules. SPIN also marries this with OWL, for a more powerful validation/inference mechanism.

But beyond the mirroring structures of SPIN and SML, the most interesting thing is that it looks like SPIN could nicely solve the conundrum, described above, of RDF being the right foundation for modeling IT systems but OWL being the wrong constraint mechanism. SPIN may do a better job than SML at what SML is aiming to do (validation rules). And at the same time, you get “for free” (or as close to “for free” as you can get with software, which is still far from “free”) a pretty powerful inference mechanism. The most powerful I know of, short of using a general programming language to capture your inference rules (and good luck with maintaining these rules).

This may sound like sci-fi, but it’s the next logical step for IT configuration standardization. Let’s look at where we are today:

  • SML (at W3C) is an attempt to standardize the expression of constraints.
  • CMDBf (at DMTF) is standardizing how the model content is queried (and, to some limited extent at this point, federated).
  • And recently IBM authored a proposal for a reconciliation specification for items in the model and sent it to an Eclipse group (COSMOS).

But once you tackle reconciliation, you are already half-way into inferencing territory. At least if you want to reconcile between models, not just between instances expressed in the same model. Because the models may not be defined at the same level of granularity, and before you can reconcile items you need to infer finer-grained entities in your coarser-grained model (or vice-versa) so that you can reconcile apples with apples.

Today, inferencing for IT models is done as part of the “discovery packs” that you can buy along with your IT management model repository. But not very well, in general. Because the way you write such a discovery module for the HP Universal CMDB is very different from how you write it for the BMC CMDB, IBM’s CCMDB or as a plug-in for Oracle Enterprise Manager or Microsoft System Center. Not to mention the smaller, more specialized, players. As a result, there is little incentive for 3rd party domain experts to put work into capturing inference rules since the work cannot be widely leveraged.

I am going a bit off-topic here, but one interesting thing about standardization of inferencing for IT management, if it happens, is that it is going to be very hard to not use RDF, OWL and some flavor of SPARQL (SPIN or equivalent) there. And once you do that, the XML-based constraint mechanisms (SML or others) are going to be in for a rough ride. After resisting the RDF stack for constraints, queries and basic reconciliation (because the added value was supposedly not “worth the cost” for each of these separately), the XML dam might get a crack for inferencing. And once RDF starts to trickle through that crack, the whole dam is going to come down in a big wave. Just to be clear, this is a prophetic long-term vision, not a prediction for 2009 (unfortunately).

In the meantime, I’d like to take this SPIN feature a… spin (sorry) when I find some time. We’ll see if I can install the new beta of TopBraid composer despite having used up, a year ago, my evaluation license of the earlier version of the product. Despite what I had hopped at some point, this is not directly applicable to my current work, so I am not sure I want to buy a license. But who knows, SPIN may turn out to be the change that eventually puts RDF back on my “day job” list (one can dream)…

It’s also nice that Holger took the pain to deliver SPIN not just as a feature of his product but also as a stand-alone specification, which should make it pretty easy for anyone who has a SPARQL engine handy to support it. Hopefully the next step will be for him to clarify the IP terms for the specification and to decide whether or not he wants to eventually submit it for standardization. Maybe to the W3C SML working group? :-) I’d have a hard time resisting joining if he did.

30
Oct
2008

First in-depth look at Microsoft’s Oslo and the “M” modeling language

by William Vambenepe

Microsoft’s PDC is taking place this week and more details were shared with the attendees about project Oslo, an effort announced last year to drastically improve the use of models across the application lifecycle. Some code is available (I think the Quadrant code is only for PDC attendees but the Oslo SDK is available to everyone). I am not at PDC, I didn’t see any presentation and I didn’t download any code. But Microsoft has also posted technical details on MSDN and, as far as I am concerned, that’s the most time-effective way to spend a couple of hours learning about Oslo. BTW, the way they share these early design descriptions and accept to make their evolution public is admirable.

For those who only want to spend 10 minutes rather than 2 hours, here are the thoughts that came to my mind as I was reading.

Overall I am somewhat underwhelmed, but not necessarily in a bad way. I know that’s a little schizophrenic so let me explain. After hearing a lot about how Oslo was the next big thing in modeling, it is a little surprising to read a document that can be summarized as “modeling is good, so go create some SQL tables and store them in a RDBMS”. That’s the underwhelming part. But on the other hand, it is more down to earth and practically-minded than I feared. And this is just a summary, in truth there is more than just “use SQL”.

Half of the MSDN documentation basically explains how to use SQL Server to store application models (as of today, the “Developing Models for the Metadata Store” section has only one sub-section, “SQL Server Guidelines for Modeling in the Oslo Repository“). Does this mean that all .NET applications will eventually have to carry with them a deployment of SQL Server 2008 even if they don’t use it to store the their operational data? Sure there are a few extra repository services (e.g. finer-grained change auditing) but most Oslo repository services are generic SQL Server features. That section has quite a lot of T-SQL, but it’s pretty readable. It also has a lot of dependencies on following naming conventions which makes me think that directly creating T-SQL code is not the best approach.

Fortunately there is an alternative, the “M” language. It’s a schema language with a built-in constraint mechanism. I found it more data-oriented (as opposed to resource-oriented) than I expected. Even though “each model is really a set of data structures, relationships, and constraints in serialized form“, there is a lot more support for data structures and constraints than for relationships. It’s just a foreign key. Relationships aren’t items and don’t have any property (or “field” as they’re called in “M”). For example, the relationship between a student’s enrollment record and a given class can’t have, as property, the grade that the student got for that class (as in the example in section 4.1.4 of the second LC of SML). To model this in “M” you need to create another item (e.g. “courseEnrollment”) and have a relationship from the student to that item and another one from that item to the “course” item itself. Or to replace the foreign key in the student table with a complex structure that contains both the foreign key and the properties of the relationship. At the end it has the same expressiveness potential, but in a less streamlined form. I assume Microsoft took this approach for performance reasons.

I am going on a limb here, but it may also be a difference between development-time concerns and operation-time concerns. During development (all the way to testing and packaging), you can still mostly get away with a relatively simple containment structure. You care about the components of your application and how they are packaged inside or next to one another. Sure you care about who calls who outside of the deployment unit but that’s not as core a concern as getting your class dependencies right, your tests in order and your installer configured. In fact, some of the “who calls who” bindings will be only be realized at runtime. Oslo, at least so far, clearly seems more focused on development time than operations so support for a relationship-rich model may not seem critical. At operations time, on the other hand, you don’t really care so much about how things were packaged before installation. You care a lot more about who invokes who (especially for modern distributed applications), what the network layout is, what resources a ticket is attached to, etc. The model looks a lot more like a graph with complex relationships. Something that “M” doesn’t seem ideally suited for.

Except for this caveat, I like “M”. It’s not anti-XML (you can represent values as XML if you’d like) but it avoids the “the answer is XML/XSD what is the question” approach to modeling that is sometimes a little too prevalent. “M” is a much better schema language for IT systems than XSD. I especially like its approach to types. A value is not intrinsically of a given type. A type is a condition that you happen to meet or not at the current time (”take heart little field, you can be anything you want when you grow up”). As such, you can be of several types at the same time. Refined types are potatoes inside potatoes (not sure if “M” supports definition of types as unions and/or intersection of existing types, for intersection I want to write something like”type NewType : OldType1 where this in OldType2” but there is no “this” in “M”). That approach to types (and the way constraints leverage types) is reminiscent of RDF/OWL. It’s a classification more than a typification, but I understand why they didn’t want to call it “class”. The similarities with RDF/OWL don’t go any further. As I wrote earlier, “M”is very data-focused and not resource-focused: as far as I can tell “M” types are defined syntactically, not semantically (the semantics come as a consequence). For example, I don’t think that you can assert that a given item representing a person is of type “friendly” if there is no corresponding data in the item. You’d have to first create a boolean field called “friendly” and define that those that have that field set to “true” are of type “friendly”. Unlike in RDF/OWL where you can just assert that a subject is “friendly”.

Here is another reason why you can’t have “semantics-only” types: “if you do not specify the type of a field or value, M infers a type for it“. Two things don’t sound quite right to me here. First a detail: the sentence (like others in the doc) talks about “the” type of a field of value, while there can be more than one. More importantly, what’s the point of this feature? How does it help me to have my IRC nickname classified as a post code or as a password just because it happens to be made of a compatible combination of letters and numbers? Maybe it makes sense as a storage optimization, but why does it make sense to expose this to the user?

I also like the way “extents” work. The current description of that feature is pretty limited, but based on how it is used in other parts I think one of its usages is to support a non-OO equivalent to inheritance: create two extents, one for the “superclass” and one for the “subclass” where each only contains the properties/fields defined at that level. You should get both of them in order to have the full picture (all the fields). This is, if I understand it correctly, similar to something I have been (unsuccessfully so far because “XML doesn’t do it this way”) trying to sell to the DMTF CMDBf working group: model inheritance through a set of non-overlapping records rather than dealing with a type hierarchy on record types. It’s not just that it makes relational storage easier (even though it does and that’s probably why “M” does it this way), it also makes your query/select operations a lot easier to specify and implement.

All in all (and without having gone through the exercise of defining actual models in “M”), it seems like a fine schema language (except that its dependency on the CLR base types is unpractical for users outside of the Microsoft universe) but I am not sure if it is beefy enough to be a good IT management metamodel. When the document says that “the Oslo repository provides open and flexible access to the data it contains, which enables direct access to SQL Server views of the underlying data. There are no complex data access layers or APIs” it sounds better than saying “it’s just SQL, so map your model to it and if you want relationships or type inheritance just build it on top of it and quit whining”. But it is an admission of limitation at the same time as a claim of simplicity. I also smell an assumption that LINQ will provide enough hand-holding that non-SQL-savvy developers will be ok. We’ll see.

And then there is MGrammar. Things get a little confusing at that point if you try to relate MGrammar to “M”. Actually, the FAQ states that “the M language consists of three parts: MGraph, MSchema and MGrammar“. This came a bit as a surprise to me since at that point I had finished reading (not in details but not too quickly either) the “M” documentation and I hadn’t seen these names mentioned once. Looks like there is some documentation consistency issues here, but that’s hardly surprising considering this is a “hyper-early (pre-alpha)” release as Doug Purdy puts it.

I think that everything that I have referred to as “M” above is MSchema.

MGrammar is something different altogether: it’s the source of the Domain Specific Language (DSL) references we’ve been hearing in relationship with Oslo. Technically, MGrammar is a BNF on steroids plus an automatically generated parser for your syntax. Cute. I assume that “M” (i.e. MSchema) is built as MGrammar-defined DSL but I am not sure why I would care. I am all for reuse and if someone at Microsoft thought that there was something reusable in the way they defined MSchema then it’s a good thing to expose this tool. But where does it come into play in application modeling? The last thing I want is people inventing completely independent languages to describe different domains. I am all for specialization, but a common underlying metamodel is pretty nice when you have to make sense of a whole system. I don’t see any such commonality in MGrammar: as far as I can tell it can be used to define anything from PostScript to sonnets.

From the FAQ, the connection point between MGrammar and MSchema is MGraph (MGrammar languages are parsed into an MGraph, MSchema “builds on MGraph”). That’s nice, but since neither the MSchema nor the MGrammar documentation mention MGraph I don’t really know what to make of this. David Chappell’s white paper also mentions MSchema and MGrammar but not MGraph. The introduction to the MGrammar Language Specification states that “the data that results from Mg [a.k.a. MGrammar] processing is compatible with Mg’s sister language, The Oslo Modeling Language, M, which provides a SQL-compatible schema and query language that can be used to further process the underlying information“. Compatible? I need more information here. In any case, MGrammar sounds like a fun project for a techie. Who am I to deny Microsoft engineers their fun. Jokes aside, I am probably missing something here seeing how prevalent the DSL message is in all discussions of Oslo. Look at the “highlights of this book” section for the upcoming Oslo/M book from the creators of the “M” language: half of it is about the DSL support and there must be a reason beyond pure geekery. As a side note, if you buy this book you need to understand what little shelf life it will have (I can give you a good price on a lightly-used Hailstorm/”.Net my services” specification book).

Aside from the “M” language itself, there are a few models described in the documentation. One corresponds to BPMN (actually, it says that it “closely aligns with” BPPMN 1.1, does this imply that they are not quite the same?). The fact that this model supports imports from Visio is a nice feature.

The Application model (one of the places where you can see “extents” in action) scares me a little bit because I doubt that two different people would use the same “extents” to describe the same software elements. Unless of course that’s being done for them by a pre-defined mapping to their development framework (.NET) enacted by their common development tool (Visual Studio). Which may be the assumption. Yet, the Application model is defined in generic terms, not Microsoft-specific (with a couple of slip-ups, like a WebApplicationModule being defined as a “Web application (module) implemented by IIS or WAS“. Maybe I’ll feel better about the generic applicability of this Application model when I see a full-fledged description (e.g. including relationship semantics as captured in foreign key field names) and an example.

At the bottom of that Application model, there is a lonely “Manageable” type to use if you have a LifecycleState field. This reinforces my impression that despite the claims to link development time with operational time, a lot of the focus to date has been on the former rather than the latter.

The ServiceModel model will look familiar to people familiar with SCA and is presumably complementary to the WorkflowModel and WorkflowServiceModel models, both of which are directly mapped to Windows Workflow Foundation. I guess that’s where Oslo and Dublin touch one another. I am still glad they are now clearly separated.

There is also a “Quadrant” model which concerns me a bit (it seems to be used to store customization of the Quadrant UI which, while convenient to store straight in the repository, doesn’t strike me as necessarily belonging there).

At this point, the question is not whether Microsoft can build Oslo as it is currently defined. SQL Server 2008 already exists, the usage guidelines aren’t unrealistic and even the “M-to-T-SQL” translation doesn’t seem too hard for Microsoft to implement (the SDK presumable already contains an implementation). I have no doubt they can deliver the system they describe. What I don’t know is whether and how it will be actually useful.

Describing “M” in details is good. Describing how the repository is implemented on top of SQL Server 2008 is interesting but not so relevant. What I’d like to see is a description of how all this gets used. How does it change the Visual Studio experience? How does it change the installation process/format? How does it support round-tripping between lifecycle stages (e.g. if the developer changes the workflow model, does that original BPMN model get consequently updated)? How does it relate to SLAs and policies? How does it apply to application monitoring? How does it apply to configuration management, to the change process? Etc. In short, what’s the Oslo ecosystem going to be.

These questions aren’t completely ignored in the MSDN documentation, but they are dispensed with in a couple of pages: “Application Development and Lifecycle Improvements” and “IT Operations Benefits“. The former states, for example, that “having the Oslo repository act as a central location for these models also enables a connection between the design and implementation models. This connection helps prevent these models from becoming disconnected during the development process“. Which all sounds good but is just a set of assertions that we have heard many times before (not just from Microsoft). How do “M” and the Oslo repository really make this true?

On the “IT Operations Benefits” side, things are equally blurry: “the Oslo repository can store all types of machine and application configuration data. When consistently updated, this configuration data is a catalog of the current state of all monitored machines and applications in the environment“. Notice the “when consistently updated” hand wave. That’s kind of the crux if you really want to manage across the lifecycle. How will they achieve this consistency? By centralizing all changes through a model-driven controller a la SDM/SML? Through ongoing discovery and/or change notifications? By relying on good old ITIL/MOF processes?

The FAQ declares that “having a common approach does not necessarily correlate to one physical store, but more of a federated model and we believe that some of the new Repository, along with existing investments in both Configuration Management Database (CMDB) and Team Foundation Server (TFS), will form the foundation for a common Microsoft metadata strategy and should be supported across our set of products“. OK, but who is the source of truth for application configuration data? The Oslo repository or the CMDB? Is one the desired state and the other the observed state? Does the CMDB go back to simply being a Service Desk (and if so, does the Oslo repository take on the responsibility to enforce change processes, something that requires more than the security model in Oslo)? If the CMDB is still going to use SML as its metamodel, how do you efficiently federate across such different metamodels as SML (i.e. XSD + schematron + relationships) and “M”?

Lots of questions remaining. What will Oslo have turned into in a few years? A business process design/implementation/monitoring suite (there is a strong workflow feel to many parts)? A generic drag-and-drop programming environment (”the fact that entire features are already described by models means that for a wide array of application and component categories you can start using visual tools to design and implement your components“)? A control center for end to end application management? All of the above? Nothing?

This was just a quick brain dump after reading the documents. Actually, I just realized it somehow got pretty long (congrats if you’re still reading). I hope this post is not too disorganized. Oslo is an interesting effort, but, as Microsoft is first to admit, it’s at a very early stage. I am just surprised that this first release spends so much time on the “how” rather than the “what”. Maybe it’s just because I only got my information from the MSDN documentation. We’ll see when more content from PDC finds its way online. I just want the slides, watching recorded presentations is rarely time-efficient (and you can expect them to require Silverlight).

Speaking of Silverlight, there is this new site on Oslo if you think watching some videos is worth installing Silverlight. Those screenshots don’t motivate me sufficiently.

[UPDATED 2008/10/30: Rather than going to bed I Googled around a bit and found a  post by Martin Fowler that answers some of my questions about MGrammar, MGraph and MSchema. MGraph is for instances, MSchema is for types. It answers some plumbing question, but I still have questions about expected usage and relevance to applications modeling.]

[UPDATED 2008/10/30: I also found the recordings and slides from past PDC sessions. Nice job Microsoft for this quick turnaround time, even if you require Sliverlight and/or the PPTX viewer. The sessions are:

  • TL23 A Lap around "Oslo" (Doug Purdy, Vijaye Raji)
  • TL27 "Oslo": The Language (Don Box, David Langworthy)
  • TL18 "Oslo": Customizing and Extending the Visual Design Experience (Don Box, Florian Voss)
  • TL28 "Oslo": Repository and Models (Chris Sells)

The first two sessions (deliverd Tuesday) have a replay and slides, the others should, I assume, follow soon.]

[UPDATED 2008/11/3: A nice overview of Oslo by Aaron Skonnard. Unlike most other Oslo articles over the last week, this one tries to paint the (yet-to-be-realized) full picture of the Oslo ecoystem. He mentions that "other Microsoft products and technologies are expected to build on Oslo to provide other runtimes. A few that have already been announced include Microsoft System Center (Operations Manager) and Team Foundation Server (TFS) in Visual Studio Team System". It's interesting that he qualifies System Center to be more specifically "operations manager" rather than "configuration manager" but I wouldn't read too much into it at this point.]

18
Sep
2008

Last call for SML and SML-IF

by William Vambenepe

The SML working group at W3C has published the “last call” working draft of version 1.1 of the SML and SML-IF (”IF” stands for “interchange format”) specifications. You have until October 3rd to tell them what you think.

With all the Oslo fun, the OMG embrace and the silence from System Center there are more questions than answers about the use of SML at Microsoft. But the Eclipse COSMOS project (IBM and friends) is, as far as I know, valiantly going forward with the store/validator implementation. Which may or may not be the same codebase as what was used for the recent CMDBf interop demo (I am not sure how the SML and CDMBf implementations in COSMOS are articulated).

The COSMOS group also recently published an overview of SML. It doesn’t try to tell you why you’d want to use SML but it’s a good and succint description of what SML is technically (from an XML developer’s perspective).

10
Sep
2008

Oslo, blog posts and my crystal ball

by William Vambenepe

There is more and more information coming out about Oslo in anticipation of the Microsoft PDC in October.

David Chappell recorded a video about it last month. More recently Doug Purdy and Don Box each posted a short description of Oslo. Don describes the goal of Oslo as “simplify the process of developing, deploying, and managing software”. But when he lists ancestor technologies to illustrate that “Microsoft has been moving in this direction for over a decade now”, they are all about development, not management: COM type libraries, .NET metadata attributes, XAML. Interesting that neither SDM nor SML gets a mention. Neither did SCA by the way, but I wasn’t really expecting that one… :-)

Maybe the I am the only one looking for a SDM/SML echo here, just because I came to hear of Oslo through the DSI angle. Am I wrong to see Oslo as an enabler for DSI? This eWeek article doesn’t have anything to do with IT management. Reading it, Oslo is all about allowing people to write code through drag and drop. Yawn. And Don Box endorses the article.

Maybe it’s just me (an IT management guy more than a software development guy) but I don’t care so much about how the application model is created. I care a lot more about what it allows you to do in terms of IT management. Please don’t make me pull out the often-quoted figure about the percentage of IT budget spent on operations versus development/licensing. The eWeek piece fails to excite me, but fortunately David Chappell’s video interview is a lot more aligned with my thinking, so I still hold hopes for Oslo as an IT management enabler. Here is my approximate transcript of an example that David provides (at around 4:20) in the video:

“If someone comes to you and says i’ve got this business process and the SLA is not being met, what do you do? You’ve got to trace this through the right business process and the right application that supports that part of the process and find the machine it runs on and maybe look at the workflow that implements it and maybe look at the services that it provides. This involves talking to business analysts, or the IT pros or the architect or the developer, all of whom have their own view of the world, their own tools, their own prospective. The repository provides a common place to store all this stuff, to link it all together, and with a visual editor to have a common tool that lets you actually go through and answer this kind of questions.”

Now you’re talking.

And if Oslo is not the new blood of DSI, then what is? The DSI story is getting dated, SML is fading in our memories and of the three parts that supposedly compose DSI (”virtualized infrastructure, design for operations, and knowledge-driven management”), only virtualization is actually represented on the list of technologies on the DSI home page. Has DSI turned into just allowing System Center to manage a hypervisor? I still hold hopes that the Oslo data is going to spice things up there. It would be good for the industry at large, not just Microsoft.

I won’t be at the PDC but it will be interesting to see what filters out of these sessions. The first session in the list adds management of hybrid application systems (hybrid as in “cloud/on-premise combination” or “software+services” as Microsoft calls it), to the long “can do” list for Oslo. Impressive, if there is some meat behind the abstract. I think this task is often overlooked in discussions around management aspects of Cloud computing (see “the new, interesting thing is going to be the IT infrastructure to manage your usage of utility computing services as well as their interactions with your in-house software” in this previous entry).

Yes, I am reading way too much into session abstracts, but while I am at it I can’t help noticing that there is a lot of SQL and very little XML/XSD/XPath mentioned there. Even though one of the presenters is Gudge, the only person I have ever met who fully understands XSD (actually even he doesn’t, I’ve seen him in the WS-I days have to refer to… his book).

Even though I am sure we’ll be told that SML can be built on top of Oslo, the SQL orientation won’t make that so easy (I want to see how to build XSD+Schematron validation on top of a relational store using Oslo’s drag and drop development tool). And it puts Microsoft on a different architectural direction from IBM, who, as far as I can tell, thinks that the world is a big XML document. Neither is the most appropriate for IT management models. I prefer a graph model and associated graph queries along the lines of SPARQL or CMDBf.

But that’s just late-night idle speculations on my part (aka “blogging”). Let’s see what comes out in October.

[UPDATED 2008/9/10: Interesting timing. Microsoft is joining OMG, home of UML and BPMN. Coming next: a submission of a "new version" of UML and BPMN that happens to contain the extensions and tweaks that Microsoft made to them in the process of implementing Oslo. This, BTW, is the final nail in the SML coffin (SML isn't even mentioned in the press release).]

10
Jul
2008

Who needs XPath fragment-level PUT?

by William Vambenepe

WS-Management and WS-ResourceTransfer (WS-RT) both provide a mechanism to modify the XML representation of the state of a resource in a fine-grained way. The mechanisms differ a bit: WS-Management defines a SOAP header and distinguishes PUT from DELETE at the WS-Transfer operation level, while WS-RT uses the SOAP body and tunnels “modes” (remove, modify, insert) on top of the PUT WS-Transfer operation. But in their complete form both use XPath to point to any arbitrary nodeset and update it.

WS-ResourceProperties (WS-RP) takes a simpler approach. While it too supports XPath-driven retrieval of the content, it doesn’t attempt to provide an XPath-like level of flexibility when it comes to updating the content. All it offers is SET, INSERT, UPDATE and DELETE operations at the level of a property (a top-level child of the XML representation) and nothing more granular.

In this respect at least, WS-RP makes a better choice than its competitor and its aspiring successor.

First, XPath-driven updates sound easy but in fact are hard to specify. Not surprisingly, the current specifications do a pretty incomplete job at it. They often seem to assume that the XPath used to target the value to change returns only one node, but nothing guarantees this. If it picks up more than one node, do you replace all these nodes by the new values as a block (the new values get inserted once, presumably at the location of the first selected node) or do you replace each selected node by all the new values (in which case they get duplicated as needed)? Also, the specifications say nothing about what constitutes compatibility between the targeted nodes and the replacement nodes. One might assume that a “don’t be stupid” approach is all that’s needed. But there is no obvious line between “stupid” and “useful”. Does a request to replace a text node by an attribute node make sense? Not in a strongly-typed world, but a more forgiving implementation might just insert the text value of the attribute in the place of the text node to get to a valid result. What about replacing an element by a text node? Some may reject it for incompatible types but, unless the schema prevents mixed content, it may well result in a perfectly valid document. All in all, specifying a reliable way to edit XML is a pretty hairy task. Much harder than reading XML. It requires very careful considerations that have very little do with on-the-wire protocol considerations. Which is why doing this as part of a SOAP specification is a strange choice. The XQuery group is much more qualified for this. There must be a reason why that group decided to punt on this until they had taken care of the easier “read” case.

Second, it’s usually not all that useful anyway. Which is why the lack of precision in WS-Management’s specification of the fragment PUT haven’t really been a problem so far: people haven’t fully implemented that feature. A lot of the implementations are backed by a CIMOM, an MBean or some other OO store. In these stores, the exposed granularity is typically at the attribute level. The interactions used by programmers and consoles are also at that level. The XPath-driven update is then only used as a mechanism to update many properties at once (rather than going deep into individual properties) but that’s using a machine gun to kill a fly. The WS-RP approach supports these use cases without calling on XPath.

Third, XPath-driven PUT is really hard to implement unless your back-end store happens to be an XML database. You may end up having to write your own XPath parser and interpreter, an exercise during which you will face some impedance mismatches. Your back-end store may not have notions of property order for example, or attribute versus element. How do you handle these XPath instructions? And what kind of interoperability results from implementers having to make these decisions on their own? Implementing XPath selection on a GET is a lot simpler. All it assumes is that there is an XML serialization of the result, on which you can run the XPath expression before shipping it out. That XML serialization is a given in the SOAP world already. But doing an XPath-driven PUT injects XML considerations in your store itself, not just in the communication path.

Those are the practical reasons. In short, it makes the specifications at best complex and at worst non-interoperable, for a feature that is rarely needed. That should be enough already, but there are some architectural reasons to stay away too.

WS-Transfer is sometimes sold for REST over SOAP. And fragment-level WS-Transfer (what WS-Management and WS-RT do) is then REST on steroids. Sorry, not true. REST on crack if anything.

I am not a REST expert, but I know enough to understand that “everything has a URI” really means “anything meaningful has a URI”. It’s the difference between a crystal structure and a pile of mud. REST lets you interact directly with any node in the crystal, but there is a limited number of entities that are considered worthwhile of being a node. There is design involved (sorry, you can’t suddenly fire your architects, as attractive as that sounds). You can’t point to the space between two nodes in the crystal. XPath-on-top-of-WS-Transfer, on the other hand, lets you plunge your spoon anywhere in the pile of mud and scoop out whatever happens to be there.

Let’s take a look at WS-Federation (here is the latest draft), the only specification in a standard body that I know of that is currently using WS-RT. Whether it’s a wise choice or not for them, from a governance perspective, is a separate topic that I won’t cover here (answer: no. oops).

From a technical perspective, it is interesting to see how they went about using WS-RT PUT. They use it to update pseudonyms. But even though there is an XML representation for the pseudonyms, they don’t want to allow users to update any arbitrary part of that XML. So they create a specific dialect (the fed:FilterPseudonyms defined in section 6.1) that lets you, based on semantics that are meaningful in the specific domain covered by the specification, point to pseudonyms.

I believe most potential users of WS-RT PUT are in the same case as WS-Federation and are better served by a domain-specific way to identify entities of interest. At least the WS-Federation authors realized it rather than saying “great, WS-RT XPath fragment PUT gives us all this flexibility for free” and settling their implementers with the impossible task of producing interoperable implementations. Of course this begs the question of why WS-Federation uses WS-RT in the first place. A charitable interpretation is to pin this on overzealous re-use of all things WS-*. A more cynical interpretation sees this as a contrived precedent manufactured in an attempt “prove” that WS-RT provides features of general use rather than specific to the management domain.

Having described at length why XPath-driven updates aren’t as useful as they may seem, I can still think of two cases where a such a generic mechanism to modify an XML document could be useful. One is if the resource actually is a document (as opposed to having its state represented by a document). For example, a wiki page. But I haven’t exactly noticed wiki creators and users clamoring for wiki-over-SOAP, have you? The other situation is if you have a true model-driven system that is supported by a comprehensive system description and validation framework. The kind of thing that SML is trying to deliver. By using Schematron (rather than just XSD which is very limited in its expressivity beyond mere syntactical validation) to provide model validation. This would, in theory, allow the requester to validate the updated model before sending the change request. The change would still be validated on the receiver side (either explicitly or implicitly because a non-valid new model would simply fail when applied to the system), but the existence of the validation framework guarantees a high rate of successs (the sender would rarely send non-valid change requests). That’s very nice and exciting, but we don’t have this. SML is, as far as I can see, going nowhere fast in terms of adoption. Standardizing a model exchange protocol for that use case is, at this point in time, premature. Maybe one day.

14
Jun
2008

More clues on the Oslo/SCA/SML trail: it’s “D”

by William Vambenepe

I just found out that I completly missed some interesting information about Oslo-related efforts at Microsoft. Back in February, Mary-Jo Foley reported on a new modeling language (code-name “D”, apparently) that is part of this initiative. And more recently she reported that David Chappell gave a presentation about Oslo (and more generally Microsoft’s SOA plans) at TechEd. He reportedly said that we should expect a new “schema language” (which Mary-Jo thinks is “D”). What I want to know is what its relationship is with SML/SDM and SCA.

Mary-Jo might not know about SCA and SML but I know that David does. He wrote this white paper about SCA and an article arguing that “Microsoft Should Not Support SCA” (based on an a questionable assessment that SCA is only about portability). He and I also had a little back-and-forth about SCA, SML and Microsoft in the comments section of his post. Unfortunately, David hasn’t blogged about Microsoft’s SOA strategy for a while for us non-TechEd people.

In addition to Mary-Jo’s report, the only information I was about to quickly dig out about David’s presentation is this blog post on Microsoft’s Israel site. Looks like David gave the same presentation at TechEd Israel 2008. Anyone who understands Hebrew cares to translate the blog? Fortunately there is a two-minutes video (also available here) in which we can hear David talk (in English). During the second of the two minutes you’ll hear and see something that could come straight out of a SCA presentation…

For some reason, David’s TechEd Israel presentation doesn’t seem to be listed here and TechEd online tells me that “Featured videos are unavailable at this time”. That’s both for IT Professionals and Developers. But of course they forced me to install Silverlight before telling me that.

[UPDATED 2008/8/11: Here is a 14 minutes video interview of David Chappell providing an update on Oslo.]

09
Jun
2008

Recent IT management announcements

by William Vambenepe

There were a few announcements relevant to the evolution of IT management over the last week. The most interesting is VMware’s release of the open-source (BSD license) VI SDK, a Java API to manage a host system and the virtual machines that run on it. Interesting that they went the way of a language-specific API. The alternatives, to complement/improve their existing web services SDK, would have been: define CIM classes and implement a WBEM provider (using CIM-HTTP and/or WS-Management), use WS-Management but without the CIM part (define the model as native XML, not XML-from-CIM), use a RESTful HTTP-driven interface to that same native XML model or, on the more sci-fi side, go the MDA way with a controller from which you retrieve the observed state and to which you specify the desired state. The Java API approach is the easiest one for developers to use, as long as they can access the Java ecosystem and they are mainly concerned with controlling the VMWare entities. If the management application also deals with many other resources (like the OS that runs in the guest machines or the hardware under the host, both of which are likely to have CIM models), a more model-centric approach could be more handy. The Java API of course has an underlying model (described here), but the interface itself is not model-centric. So what with all the DMTF-love that VMWare has been displaying lately (OVF submission, board membership, hiring of the DMTF president…). Should we expect a more model-friendly version of this API in the future? How does this relate to the DMTF SVPC working group that recently released some preliminary profiles? The choice to focus on beefing-up the Java-centric management story (which includes Jython, as VMWare was quick to point out) rather than the platform-agnostic, on-the-wire-interop side might be seen by the more twisted minds as a way to not facilitate Microsoft’s “manage VMWare today to replace it tomorrow” plan any more than necessary.

Speaking of Microsoft, in unrelated news we also got a heartbeat from them on the Oslo project: a tech preview of some of the components is scheduled for October. When Oslo was announced, there was a mix of “next gen BizTalk” aspects and “developer-driven DSI” aspects. From this report, the BizTalk part seems to be dominating. No word on use of SML.

And finally, SOA Software (who was previously called Digital Evolution and who acquired Blue Titan, Flamenco and LogicLibrary, in case you’re trying to keep track) has released a “SOA Development Governance Product”. Nothing too exciting from what I can see on InfoQ about it, but that’s a pretty superficial evaluation so don’t let me stop you. Am I the only one who twitches whenever “federation” is used to mean at worst “import” or at best “synchronization”? Did CMDBf start that trend? BTW, is it just an impression or did SOA Software give InfoQ a list of the questions they wanted to be asked?

14
May
2008

WS-ManagementHammer: don’t do it but if you are going to do it anyway then…

by William Vambenepe

With the IBM/Microsoft/Intel/HP WSDM/WS-Management convergence now implicitly (if not yet officially) dead, it will be interesting to see what IBM is going to do with WSRF. WSRF is being used today, rarely explicitly but rather in an embedded fashion. People who use WSDM use it, people who use CDDLM use it, people who use the Globus Toolkit use it, etc. IBM could write off the convergence work (WS-ResourceTransfer, which was published as a draft, and WS-ResourceEnumeration and WS-EventNotification which were never published) and stick to using the existing WSRF specifications when they need the corresponding functionality. That’s what I hope they do.

Alternatively, they could decide to get the forceps out of the drawer. They can create a new, IBM-friendly (e.g. Fujitsu, CA, Cisco…) private consortium to take over the unfinished drafts (if the IBM/Microsoft/Intel/HP legal agreement allows this) or start new ones. Or they could go directly to W3C, OASIS or OGF and push for a new working group to do the work in the open (and since no-one else would really care about this work IBM should have relatively free hands there, the way Microsoft did in DMTF when IBM chose to boycott WS-Management). Why W3C would care and why OASIS or OGF would want to start commitees to obsolete their existing work is a separate question.

While I hope that IBM doesn’t try to push another pile of WS-* resouce management specifications on an industry that already has too many, if they do I hope that at least they’ll do it right. And that means doing away with the approach embedded in WS-ResourceTransfer. Having personally been involved in many iterations on this problem, I hope to have some insight to contribute.

Along the lines of the age-old parental advice “don’t do it but if you are going to do it then use a condom”, here is my advice to anyone thinking of doing another iteration on the WSRF question: don’t do it but if you are going to do it then be specific about what problem you are addressing.

First, let’s separate three scenarios.

Database query

WS-ResourceTransfer should not be seen as a way to query an XML database. Use XQuery for this.

REST

While architecturally it should be possible to build RESTful applications on top of WS-Transfer’s operations, this is simply not what is happening. WS-Transfer is being used either by CIM people (who get to it via WS-Management) or by big-SOA people (who get is as part of the whole WS-* stack) and neither of them is doing anything remotely RESTful. So just leave that aside and don’t see WS-ResourceTransfer as a way to do “fine-grained REST”. No REST user is loosing sleep over WS-ResourceTransfer being in limbo.

A flexible way to interact with a complex system

This is the use case that you should focus on. You have a system made up of many parts (e.g. a composite application or a server that is made of many components) that you can represent as an XML document. The XML repesentation contains some important information about the system, but it isn’t the system. There are identified resources within the system that have lifecycles, management capabilities and internal parameters. Not everything relevant is captured in the XML model. This is why it is different from an XML database.

In general, I don’t think that XML is the best way to represent complex IT systems. It has plenty of complications that are not relevant to IT management and it doesn’t elegantly support the representation of graphs, often the most natural way to represent such a system (more on this here). CMDBf, with its graph-oriented approach, is a better choice in general. But there are plenty of areas (especially smaller, well-defined, sub-systems) in which XML formats have been defined to represent systems. SCA and SML for example.

In the case where you are dealing with such an XML-described system, then there is value in standard ways to simplify interactions with the system and its parts. But here too, we need to distinguished different patterns rather than trying to handle them all in the same way.

Filtering/sequencing of returned data

Complex IT systems can generate a lot of configuration and/or monitoring data and often you only care for a small subset. For example, an asset record has dozens of elements (lease terms, owner, assigned user…) but you may only care to retrieve the date the lease expires. When you do a GET on the record, you want to qualify it by specifying that only that date needs to be returned. That’s what WS-RP, WS-RT and the WS-Management wsman:TransferFragment header allow. In a variation of this, you want all the data but you don’t want it in one go, you want to pull it piece by piece. That’s what WS-Enumeration gives you. The problem with all these specifications is that they only offer that feature when you are retrieving the resource representation (a WS-Transfer GET or equivalent), not for other operations. But how is this different from invoking an AirlineBooking operation and saying that you only want to be sent the confirmation code, not the full itinerary, equipment type, assigned seat, etc? Bundling this inside WS-RT (or equivalent) is not helpful. A generic SOAP header that can go on any message would be more appropriate (the definition of this header would need to pay special attention to security considerations, especially if the response is signed, because it could be abused to trick the server into sending, and signing, specifically-crafted messages).

Interacting with a sub-element of the system

If you have a handle to a computer system resource and you know that it has one CPU and that this CPU is represented by the /comp:CPU element of the system, why would you need to use some out-of-band discovery mechanism to interact with that CPU? It’s right there, you can see it, you can point to it. Surely there must be a way to address operations to it directly, right? WS-Management tries to do it with its wsman:Selector mechanism, but the selectors are not tied to the model and require, effectively, a separate out-of-band agreement for addressing. There shouldn’t be a need for such an additional agreement once an agreement has already been reached on the model.

What is needed is a way, for systems that have a known XML model, to address message to subpart by using the model itself to support that addressing. Call it SOAPy mashup if you want to feel like you are part of the cool kids. I described such a mechanism a while ago. In effect, it is an improvement on wsman:Selector that an eventual new iteration of WSRF should at least consider.

In some cases, namely when the operation is a WS-Transfer GET, this capability overlaps with the “filtering of returned data” capability. One way to look at it is that you are doing a GET at the level of the overall computer system and filtering the results down to the part that represents the CPU. Another way to look at it is that you are pinpointing the message to a subset of the model (the CPU part) and doing an unmodified GET on it. It doesn’t matter how you choose to think about it. In my proposal, these two ways produce the same message. Like the wave view and particle view of a photon, that in the end, describe the same physical entity with each being the best representation for a set of situations.

The problem with WS-RT and its predecessors is that it doesn’t recognise that this is just the intersection of two orthogonal concerns (filering of output versus addressing of sub-elements) and only handles that intersection.

Interacting with a set of resources as a set

The same kind of expression (typically XPath) that lets you point at a sub-element inside of a system also lets you point at a set of such sub-elements. But even though from an XPath perspective there isn’t much of a different (the first one just happens to return a nodeset that contains only one node), from an architectural perspective it is a very different use case. If you want to support such a use case then you have handle it as such and define all the associated semantics (sequential/parallel execution, fault handling, partial completion, resource-specific permissions…). You can’t just cross your fingers and assume that you get such features “for free” just because XPath can return a nodeset.

I know that this post illustrates a way of giving free advice that virtually ensures that it gets ignored. Similar (if you’ll allow the big stretch) to the way Chirac and Villepin were arguing againt an Iraq invasion in ways that probably reinforced the Bush administration’s determination to do it. When will the world finally learn to appreciate the oh-so-slightly obnoxious undertone that is inherently French (because, let me tell you, we’re not about to loose it)? At least, when my grandchildren ask me “where were you when IBM invented WS-ManagementHammer?” I can point to this post and say “I tried to stop it, I tried”.

[UPDATED 2008/5/15: How timely! Just after publishing this I find, via Coté, what looks like another example of French abrasiveness in the systems management world: the attitude, name and the way Jeff ends with a French-language quote make it quite likely that the "Jacques" person discounting the fact that his company's SNMP agent is broken is indeed a compatriot. French obnoxiousness aside, and despite my respect for standards, my advice to Jeff is that if a given SNMP agent works with HP, IBM, BMC and CA you will probably save yourself time in the long run by finding a way to support it (even if it is not spec-compliant) rather than getting the vendor to change. There are lots of sites out there that work fine with Firefox and IE but are not compliant with Web standards. Good luck getting them all fixed.]

[UPDATED 2008/7/14: I don't really plan to turn this post into a ongoing set of updates about "French attitude" but since today is Bastille Day I'll point to this map of the world as seen from Paris. If I wasn't on strike right now, I'd explain why the commenter is wrong to assert that "French self-deprecating humour" is rare.]

Categories