Category Archives: CMDB

Illustrative algorithm for CMDBf 1.0 Query operation

When I posted an algorithm for the server side implementation of a CMDBf Query call for version 0.95 of the specification, the interoperability testing session based on that version was over and I was pretty sure no-one but those of us who participated in that session would write an implementation of 0.95. But I published the algorithm anyway since I thought it was helpful to anyone who wanted to understand the specification in depth, even if they were not implementing it. Now that 1.o is out, there is a much higher probability of people implementing the specification, so I figured it would be worth updating the algorithm to take into account the changes from 0.95 to 1.0. So here it is.

One caveat. This algorithm assumes that the query request does not make use of the xpathExpression element because, as I have explained in my review of CMDBf 1.0, I don’t think interoperability is achievable on this feature in the current state of the specification.

As a note of caution, the previous version of the algorithm was backed by my implementation of CMDBf 0.95 for the interoperability testing, so I felt pretty confident about it. For this version of the algorithm I have not written a corresponding implementation and I have not done interoperability testing with anyone, it’s just based on my reading of the specification. The handling of depthLimit in particular is a little tricky and needs to be validated by implementation (what with creating a bunch of dummy item and relationship templates with temporary names and later going back to the original template names), please let me know if you find it flawed.

And, as previously, this is in no way an optimal implementation strategy. It is the most direct and obvious set of steps that I can come up with to implement the Query call in a way that exactly conforms to the specification. There are lots of ways to make this go faster, such as the ones I mentioned in a previous post (e.g. breaking out of loops once an instance has been removed, or not recalculating L1 and L2 over and over again for relationships in the same working set that share a source/target) plus new ones such as being smarter than my brute-force approach to handling depthLimit (in step 2).

All this said, here is the algorithm:

1) for each itemTemplate, calculate the set of all items (including relationships since they are a subclass of item) that obey the instanceIdConstraint and recordConstraint elements in the template (if present). Call this the working set for the itemTemplate.
2) for each relationshipTemplate RT that has a depthLimit element:

2.1) for i ranging from 1 to the value of maxIntermediateItems for RT:

2.1.1) create an itemTemplate that is an exact copy of the itemTemplate referenced by RT’s sourceTemplate, except that it has a new, unique temporary id (keep a record linking that new id to the id of the original source itemTemplate).
2.1.2) create an itemTemplate that is an exact copy of the itemTemplate referenced by RT’s targetTemplate, except that it has a new, unique, temporary id (keep a record linking that new id to the id of the original target itemTemplate).
2.1.3) for j ranging 1 from i:

2.1.3.1) create an itemTemplate that is an exact copy of the itemTemplate referenced by RT’s intermediateItemTemplate, except that it has a new, unique, temporary id (keep a record linking that new id to the id of the original intermediary itemTemplate).
2.1.3.2) create a relationshipTemplate that is an exact copy of RT, except that its source is the itemTemplate created in the previous iteration of the current loop (or the itemTemplate created in step 2.1.1 if j=1), its target is the itemTemplate created in the previous step and it has a new, unique, temporary id (keep a record linking that new id to RT’s id).

2.1.4) create a relationshipTemplate that is an exact copy of RT, except that its source is the last itemTemplate created in the 2.1.3 loop, its target is the itemTemplate created in 2.1.2 and it has a new, unique, temporary id (keep a record linking that new id to RT’s id).

3) for each relationshipTemplate calculate the set of all relationships that obey the instanceIdConstraint and recordConstraint elements in the template (if present). Call this the working set for the relationshipTemplate.
4) set need_to_loop = true
5) while (need_to_loop == true)

5.1) set need_to_loop = false
5.2) for each relationshipTemplate RT

5.2.1) let ITsource be the itemTemplate that is referenced as sourceTemplate by RT. Calculate the set of all items (including relationships since they are a subclass of item) that obey at least one of the instanceIdConstraint elements in ITsource (assuming there is at least one such element) and all the recordConstraint elements in ITsource. Call this the working set for ITsource.
5.2.2) let ITtarget be the itemTemplate that is referenced as targetTemplate by RT. Calculate the set of all items (including relationships since they are a subclass of item) that obey at least one of the instanceIdConstraint elements in ITtarget (assuming there is at least one such element) and all the recordConstraint elements in ITtarget. Call this the working set for ITtarget.
5.2.3) for each relationship R in the working set for RT

5.2.3.1) if the source of R is not in the working set for ITsource, then remove R from the RT working set
5.2.3.2) if the target of R is not in the working set for ITtarget, then remove R from the RT working set
5.2.3.3) if RT has a source/@minimum or a source/@maximum attribute

5.2.3.3.1) find the list L1 of all relationships in the working set for RT that have the same source as R
5.2.3.3.2) if RT has source/@minimum and the cardinality of L1 is less than this minimum then remove all relationships in L1 from the RT working set
5.2.3.3.3) if RT has source/@maximum and the cardinality of L1 is more than this maximum then remove all relationships in L1 from the RT working set

5.2.3.4) if RT has a target/@minimum or a target/@maximum attribute

5.2.3.4.1) find the list L2 of all relationships in the working set for RT that have the same target as R
5.2.3.4.2) if RT has target/@minimum and the cardinality of L2 is less than this minimum then remove all relationships in L2 from the RT working set
5.2.3.4.3) if RT has target/@maximum and the cardinality of L2 is more than this maximum then remove all relationships in L2 from the RT working set

5.3) for each itemTemplate IT:

5.3.1) let sourceRTset be the set of all relationshipTemplates that references IT as its sourceTemplate
5.3.2) let targetRTset be the set of all relationshipTemplates that references IT as its targetTemplate
5.3.3) for each item I in the IT working set

5.3.3.1) for each relationshipTemplate sourceRT in sourceRTset, if there is no relationship in the working set for sourceRT that uses I as its source, remove I from the IT working set and set need_to_loop to true
5.3.3.2) for each relationshipTemplate targetRT in targetRTset, if there is no relationship in the working set for targetRT that uses I as its source, remove I from the IT working set and set need_to_loop to true

6) process the eventual contentSelector elements and/or the @suppressFromResult attributes on the templates that have matching items/relationships in the response to remove or pair down items and relationships as requested
7) package the resulting items and relationships in a way that conforms to the CMDBf response message format (including putting each item in the <nodes> element with the appropriate @templateId attribute and putting each relationship in the <edges> element with the appropriate @templateId).
8) replace all the temporary template ids (from step 2) that appear in templateId attributes in the response with the original ids of the items and template based on the records that were kept in step 2.

Just to clarify things, what I do in step 2 is simply make explicit all the itemTemplates and relationshipTemplates that are made implicit by the depthLimit element, so that we can provide with a simpler algorithm after that assumes that all relationshipTemplate correspond to direct relationships (no intermediary). And in step 8 I hide the fact that this took place.

[UPDATED 2009/5/1: For some reason this entry is attracting a lot of comment spam, so I am disabling comments. Contact me if you’d like to comment.]

8 Comments

Filed under CMDB, CMDB Federation, CMDBf, Graph query, IT Systems Mgmt, Pseudo-algorithm, Query, Specs, Standards, Tech

Review of the CMDBf specification version 1.0

Having read the recently released CMDBf 1.0 specification over the weekend, I see several improvements since 0.95, including:

  • the introduction of depthLimit
  • the lastModified metadata element
  • the ability to specify more than one instanceId in a template
  • the ability to advertise what parts of the specification you implement
  • the definition of faults

But while 1.0 is more complete than 0.95, I think it makes it harder to achieve interoperability. Here are the main friction points for interop:

New role for XPath

The xpathExpression element (which replaces xpath1Selector) changes in two very important ways. First, rather than being limited to XPath 1.0, it now also allows XPath 2.0. Support for this is a lot harder to achieve for people who don’t use XML as the backend format for their data. Considering the current state of adoption of XPath 2.0 and the low level of XML complexity exposed by most CMDB models, I don’t think it was opportune to bring this into CMDBf yet. And my guess is that most implementations will stay away from this. But there is a second change, less obvious but even more problematic. XPath is not just another constraint mechanism for a CMDBf template anymore, one that returns a boolean result indicating whether the instance meets the constraint or not, as it used to be in 0.95. It is now an alternative selection and filtering mechanism that lives in parallel to all the other elements in a template (and can’t mix with them). Overall, I think this change goes too far in the direction of turning a shared agreement to exchange data in XML into an assumption that the internal data models are all based on XML. And the killer with regards to interoperability is that the specification says nothing about how the resulting node sets are serialized in the response. There may be a serialization for the XPath 2.0 model, but there is no such thing for XPath 1.0 and I don’t see in the current state of the specification how two implementations have any chance to interoperate when using this feature.

Introduction of linkDepth

As I mentioned earlier, linkDepth is a very useful addition (even though it pales in comparison to the inferencing capabilities that could have been derived from basing CMDBf on RDF). But it is also a complicated feature. The intermediateItemTemplate attribute is a good re-use of the existing plumbing, but it needs at least a detailed example. I trust that the group will generate one once they’ve caught their breath from putting out the specification.

Service capability metadata

There is a new section (#6) to provide ways to describe what CMDBf features an implementation supports. But it is a very granular representation. Basically, for every feature you can describe if you support it or not. So someone may describe that they support everything inside propertyValue, except for the “like” operator. And someone else might support all the operators but not the caseSensitive modifier. That might be ok for human consumptions, but automated scenarios rely on pre-programmed queries and that is made very hard by all the possible combinations of options. What we need is a few well-defined profiles that people implement fully. Starting of course with a profile that rules out xpathExpression.

Record metadata

This new version introduces metadata on records. While recordId and lastModified are probably well understood and interoperably usable I am a bit more dubious about whether baselineId and snapshotId are going to be interoperable across vendors based on their limited description in the specification. The nice thing is that this metadata can not only be returned but also searched on. Well, at least that’s the intent. But this goes through the recordMetadata attribute on propertyValue which, while present in the pseudo-schema, is missing in the XSD…

The contentSelector element

This new element is more flexible that the propertySubsetDirective element that it replaces. In addition to specifying what properties you want returned it also allows you to specify that you only want certain record types and/or that you only want the record(s) that were used to satisfy constraints in the template. Those are nice additions, but the way the second part is implemented (through the use of the matchedRecords attribute) seems to assume that only one record in the instance was used to match all the constraints in the template. This is not necessarily the case, an instance can be selected by having different records match the different constraints in the template as long as it has at least one matching record per constraint (line 765 says “the item satisfies all the constraints”, not “a record of the item satisfies all the constraints” and you can also see this in the example in section 4.2 where the records mentioned on lines 637 and 639 don’t have to be the same). So do you return all records that have a role in matching the template, or only those (if there is any) that matches all the constraints on their own as the text seems to imply? And if several record combinations inside an instance can be used to match the constraints in a template, do I return all of them or can I just pick any subset that matches? Also, how can I say that I want all records that established the template match, independently of their type? There doesn’t seem to be a way to do this, or is it by putting a contentSelector element with no child element and the matchedRecords attribute set to false? There won’t be much interoperability on this feature until all this is clarified.

Relationships as items

A major change between 0.95 and 1.0 is that now a relationship can match an itemTemplate. For example, if you ask for all items that were modified during the last 24 hours you will get all the items and all the relationships that meet that criteria while in the previous version you’d have to explicitly request the relationships with a relationshipTemplate if you wanted to get them too). There is a good case to be made for either view and the one that works best largely depends on your backend implementation technology (RDF, objects, SQL, CIM…). But the important thing is for the spec to be clear and on this point I think the change wasn’t made explicit enough in the query section of the specification. If Van hadn’t called my attention to this on his blog, I would have missed this important change.

Security boilerplate

There is a person at IBM (probably located in a well-stoked underground bunker in upstate NY) who has instilled the fear of god in all IBM employees (at least all those who author publicly available specifications) and forces them to include a boilerplate “security considerations” section everywhere. I have co-authored several documents with IBM employees and it never fails, even thought it doesn’t add anything useful to the specification. You should see the look of fear on the face of the IBM employees when someone else suggests doing without it. We somehow managed to sneak one such slimmer specification past the IBMers with CMDBf 0.95 but I see that this has been “corrected” in 1.0. I hope that whatever painful punishment Scott, Jacob, Andrew and Mark (or their families and pets) were subjected to in the process by the IBM security ogre wasn’t too cruel. Sure, this doesn’t really impact interoperability, but now that I don’t work for a company that makes money from ink anymore, I have even less patience for this bloating.

OK, that’s enough back seat driving for now. Hopefully the standards group that will take over the specification will address all these questions. In the context of the entire specification, these are pretty small issues and mostly easy to fix. And the CMDBf group can go on to address the hard issues of federation (including security-related issues that abound in this field if one really wants to tackle them). The current specification is a useful graph-oriented query language that is a good match for CMDB data. But it’s really just a query language (plus a simple registration system).

[UPDATE: while updating the CMDBf query algorithm, I noticed another small error: maxIntermediateItems is an attribute in the pseudo-schema but an element in the schema. Something else to fix in the next version.]

3 Comments

Filed under CMDB, CMDB Federation, CMDBf, Everything, Graph query, IT Systems Mgmt, ITIL, Query, Specs, Standards, Tech