Illustrative algorithm for CMDBf 1.0 Query operation

When I posted an algorithm for the server-side implementation of a CMDBf Query call for version 0.95 of the specification, the interoperability testing session based on that version was over and I was pretty sure no-one but those of us who participated in that session would write an implementation of 0.95. But I published the algorithm anyway since I thought it was helpful to anyone who wanted to understand the specification in depth, even if they were not implementing it. Now that 1.0 is out, there is a much higher probability of people implementing the specification, so I figured it would be worth updating the algorithm to take into account the changes from 0.95 to 1.0. So here it is.

One caveat. This algorithm assumes that the query request does not make use of the xpathExpression element because, as I have explained in my review of CMDBf 1.0, I don’t think interoperability is achievable on this feature in the current state of the specification.

As a note of caution, the previous version of the algorithm was backed by my implementation of CMDBf 0.95 for the interoperability testing, so I felt pretty confident about it. For this version of the algorithm I have not written a corresponding implementation and I have not done interoperability testing with anyone; it is based purely on my reading of the specification. The handling of depthLimit in particular is a little tricky and needs to be validated by implementation (what with creating a bunch of dummy item and relationship templates with temporary names and later going back to the original template names); please let me know if you find it flawed.

And, as previously, this is in no way an optimal implementation strategy. It is the most direct and obvious set of steps that I can come up with to implement the Query call in a way that exactly conforms to the specification. There are lots of ways to make this go faster, such as the ones I mentioned in a previous post (e.g. breaking out of loops once an instance has been removed, or not recalculating L1 and L2 over and over again for relationships in the same working set that share a source/target), plus new ones such as being smarter than my brute-force approach to handling depthLimit (in step 2).

All this said, here is the algorithm:

1) for each itemTemplate, calculate the set of all items (including relationships since they are a subclass of item) that obey the instanceIdConstraint and recordConstraint elements in the template (if present). Call this the working set for the itemTemplate.
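
To make step 1 concrete, here is a minimal Python sketch of the working-set computation. Everything in it is hypothetical scaffolding: repository.all_items(), matches_instance_id() and matches_record_constraint() stand in for whatever the underlying data store provides, and the constraint semantics follow the wording of step 5.2.1 (at least one instanceIdConstraint, if any are present, and all recordConstraints).

```python
# Hypothetical sketch of step 1: compute the working set for an itemTemplate.
# all_items(), matches_instance_id() and matches_record_constraint() are
# placeholders for whatever the underlying repository provides.

def item_working_set(item_template, repository):
    working_set = set()
    for item in repository.all_items():  # relationships included: they are a subclass of item
        id_constraints = item_template.instance_id_constraints
        # at least one instanceIdConstraint must match, if any are present...
        if id_constraints and not any(matches_instance_id(item, c)
                                      for c in id_constraints):
            continue
        # ...and every recordConstraint must match
        if not all(matches_record_constraint(item, rc)
                   for rc in item_template.record_constraints):
            continue
        working_set.add(item)
    return working_set
```
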
2) for each relationshipTemplate RT that has a depthLimit element:

2.1) for i ranging from 1 to the value of maxIntermediateItems for RT:

2.1.1) create an itemTemplate that is an exact copy of the itemTemplate referenced by RT’s sourceTemplate, except that it has a new, unique temporary id (keep a record linking that new id to the id of the original source itemTemplate).
2.1.2) create an itemTemplate that is an exact copy of the itemTemplate referenced by RT’s targetTemplate, except that it has a new, unique, temporary id (keep a record linking that new id to the id of the original target itemTemplate).
2.1.3) for j ranging from 1 to i:

2.1.3.1) create an itemTemplate that is an exact copy of the itemTemplate referenced by RT’s intermediateItemTemplate, except that it has a new, unique, temporary id (keep a record linking that new id to the id of the original intermediary itemTemplate).
2.1.3.2) create a relationshipTemplate that is an exact copy of RT, except that its source is the itemTemplate created in the previous iteration of the current loop (or the itemTemplate created in step 2.1.1 if j=1), its target is the itemTemplate created in the previous step and it has a new, unique, temporary id (keep a record linking that new id to RT’s id).

2.1.4) create a relationshipTemplate that is an exact copy of RT, except that its source is the last itemTemplate created in the 2.1.3 loop, its target is the itemTemplate created in 2.1.2 and it has a new, unique, temporary id (keep a record linking that new id to RT’s id).
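
Here is a hedged Python sketch of this step 2 expansion. The template objects and their attribute names (source_template, target_template, intermediate_item_template, depth_limit.max_intermediate_items) are invented for illustration, and I am assuming the expanded copies stand for direct relationships (so they drop the depthLimit):

```python
import copy
import uuid

# Hypothetical sketch of step 2: expand a relationshipTemplate with a
# depthLimit into explicit chains of templates. temp_id_map records the
# temporary-to-original id links that step 8 needs.

def expand_depth_limit(rt, all_templates, temp_id_map):
    def clone(template):
        c = copy.deepcopy(template)
        c.id = uuid.uuid4().hex          # new, unique, temporary id
        temp_id_map[c.id] = template.id  # keep a record linking new id to original
        all_templates.append(c)
        return c

    for i in range(1, rt.depth_limit.max_intermediate_items + 1):  # step 2.1
        source = clone(rt.source_template)        # step 2.1.1
        target = clone(rt.target_template)        # step 2.1.2
        previous = source
        for j in range(1, i + 1):                 # step 2.1.3
            intermediate = clone(rt.intermediate_item_template)  # step 2.1.3.1
            link = clone(rt)                                     # step 2.1.3.2
            link.depth_limit = None  # assumption: copies represent direct relationships
            link.source_template, link.target_template = previous, intermediate
            previous = intermediate
        final = clone(rt)                         # step 2.1.4
        final.depth_limit = None
        final.source_template, final.target_template = previous, target
```
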

3) for each relationshipTemplate calculate the set of all relationships that obey the instanceIdConstraint and recordConstraint elements in the template (if present). Call this the working set for the relationshipTemplate.
4) set need_to_loop = true
5) while (need_to_loop == true)

5.1) set need_to_loop = false
5.2) for each relationshipTemplate RT

5.2.1) let ITsource be the itemTemplate that is referenced as sourceTemplate by RT. Calculate the set of all items (including relationships since they are a subclass of item) that obey at least one of the instanceIdConstraint elements in ITsource (assuming there is at least one such element) and all the recordConstraint elements in ITsource. Call this the working set for ITsource.
5.2.2) let ITtarget be the itemTemplate that is referenced as targetTemplate by RT. Calculate the set of all items (including relationships since they are a subclass of item) that obey at least one of the instanceIdConstraint elements in ITtarget (assuming there is at least one such element) and all the recordConstraint elements in ITtarget. Call this the working set for ITtarget.
5.2.3) for each relationship R in the working set for RT

5.2.3.1) if the source of R is not in the working set for ITsource, then remove R from the RT working set
5.2.3.2) if the target of R is not in the working set for ITtarget, then remove R from the RT working set
5.2.3.3) if RT has a source/@minimum or a source/@maximum attribute

5.2.3.3.1) find the list L1 of all relationships in the working set for RT that have the same source as R
5.2.3.3.2) if RT has source/@minimum and the cardinality of L1 is less than this minimum then remove all relationships in L1 from the RT working set
5.2.3.3.3) if RT has source/@maximum and the cardinality of L1 is more than this maximum then remove all relationships in L1 from the RT working set

5.2.3.4) if RT has a target/@minimum or a target/@maximum attribute

5.2.3.4.1) find the list L2 of all relationships in the working set for RT that have the same target as R
5.2.3.4.2) if RT has target/@minimum and the cardinality of L2 is less than this minimum then remove all relationships in L2 from the RT working set
5.2.3.4.3) if RT has target/@maximum and the cardinality of L2 is more than this maximum then remove all relationships in L2 from the RT working set

5.3) for each itemTemplate IT:

5.3.1) let sourceRTset be the set of all relationshipTemplates that reference IT as their sourceTemplate
5.3.2) let targetRTset be the set of all relationshipTemplates that reference IT as their targetTemplate
5.3.3) for each item I in the IT working set

5.3.3.1) for each relationshipTemplate sourceRT in sourceRTset, if there is no relationship in the working set for sourceRT that uses I as its source, remove I from the IT working set and set need_to_loop to true
5.3.3.2) for each relationshipTemplate targetRT in targetRTset, if there is no relationship in the working set for targetRT that uses I as its target, remove I from the IT working set and set need_to_loop to true
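
A sketch of the fixed-point loop of steps 4 and 5, under the same caveats. Here ws maps a template id to its current working set, and source_min/source_max/target_min/target_max stand in for the source/@minimum, source/@maximum, target/@minimum and target/@maximum attributes. Note that I read steps 5.2.1 and 5.2.2 as referring to working sets computed once, in the same way as steps 1 and 3 (which matters for the temporary templates created in step 2), rather than recomputed on every pass:

```python
# Hypothetical sketch of steps 4 and 5. ws maps a template id to its current
# working set (a Python set); attribute names are invented for illustration.

def prune_to_fixed_point(item_templates, relationship_templates, ws):
    need_to_loop = True                                 # step 4
    while need_to_loop:                                 # step 5
        need_to_loop = False
        for rt in relationship_templates:               # step 5.2
            source_ws = ws[rt.source_template.id]       # steps 5.2.1 / 5.2.2
            target_ws = ws[rt.target_template.id]
            for r in list(ws[rt.id]):                   # step 5.2.3
                if r.source not in source_ws:           # step 5.2.3.1
                    ws[rt.id].discard(r)
                    continue
                if r.target not in target_ws:           # step 5.2.3.2
                    ws[rt.id].discard(r)
                    continue
                if rt.source_min is not None or rt.source_max is not None:
                    l1 = {x for x in ws[rt.id] if x.source == r.source}
                    if rt.source_min is not None and len(l1) < rt.source_min:
                        ws[rt.id] -= l1                 # step 5.2.3.3.2
                    elif rt.source_max is not None and len(l1) > rt.source_max:
                        ws[rt.id] -= l1                 # step 5.2.3.3.3
                if rt.target_min is not None or rt.target_max is not None:
                    l2 = {x for x in ws[rt.id] if x.target == r.target}
                    if rt.target_min is not None and len(l2) < rt.target_min:
                        ws[rt.id] -= l2                 # step 5.2.3.4.2
                    elif rt.target_max is not None and len(l2) > rt.target_max:
                        ws[rt.id] -= l2                 # step 5.2.3.4.3
        for it in item_templates:                       # step 5.3
            source_rts = [rt for rt in relationship_templates
                          if rt.source_template is it]  # step 5.3.1
            target_rts = [rt for rt in relationship_templates
                          if rt.target_template is it]  # step 5.3.2
            for i in list(ws[it.id]):                   # step 5.3.3
                if any(all(r.source != i for r in ws[rt.id]) for rt in source_rts) \
                   or any(all(r.target != i for r in ws[rt.id]) for rt in target_rts):
                    ws[it.id].discard(i)                # steps 5.3.3.1 / 5.3.3.2
                    need_to_loop = True
```
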

6) process any contentSelector elements and/or @suppressFromResult attributes on the templates that have matching items/relationships in the response to remove or pare down items and relationships as requested
7) package the resulting items and relationships in a way that conforms to the CMDBf response message format (including putting each item in the <nodes> element with the appropriate @templateId attribute and putting each relationship in the <edges> element with the appropriate @templateId).
8) replace all the temporary template ids (from step 2) that appear in templateId attributes in the response with the original template ids, based on the records that were kept in step 2.
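
Step 8 is just a lookup in the mapping recorded during the step 2 expansion. Here is a hypothetical sketch (the response objects and attribute names are again invented); note that several temporary templates map back to the same original id, so their results end up grouped under that one templateId in the response:

```python
# Hypothetical sketch of step 8: map temporary template ids in the response
# back to the original ids recorded during the step 2 expansion.

def restore_template_ids(response, temp_id_map):
    for element in response.nodes + response.edges:  # <nodes> and <edges> alike
        element.template_id = temp_id_map.get(element.template_id,
                                              element.template_id)
```
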

Just to clarify things, what I do in step 2 is simply make explicit all the itemTemplates and relationshipTemplates that are made implicit by the depthLimit element, so that the rest of the algorithm can be simpler and assume that every relationshipTemplate corresponds to a direct relationship (no intermediary). And in step 8 I hide the fact that this took place.

[UPDATED 2009/5/1: For some reason this entry is attracting a lot of comment spam, so I am disabling comments. Contact me if you’d like to comment.]

8 Responses to Illustrative algorithm for CMDBf 1.0 Query operation

  1. Hi,

    I just had a quick scan through the specification, which to be honest I have not really tracked, and I am somewhat disappointed that there is not a clear distinction between the information model and the managed model within the CMDB. This is something I introduced at OpenView during a time I was contracted to HP for solution design. I had thought when I left that others would have taken this up, and maybe they have, but apparently not within this specification, which seems more focused on the data source adaptors. I need to really read this in greater depth.

    In my opinion the biggest mistake made is not having this distinction, as it then opens up the possibility of other data sources that are more dynamic, transient and less managed, though very important when performing incident and problem management. In my model the actual runtime state of systems and components, which can bear some resemblance to the static and managed configuration, is also accessible to ITIL processes. In fact I based my whole design of JXInsight JVMInsight on this.

    I introduced this in version 4.1 (http://www.jinspired.com/products/jxinsight/new-in-4.1.html) and further extended it in JXInsight 5.1 (http://www.jinspired.com/products/jxinsight/new-in-5.0.html).

    It is like merging delta debugging with configuration management, where configuration reflects the software execution model and not just the files and records that parameterize the operations of systems and processes.

    I intend on writing up a blog entry that explains this and shows how CMDB concepts can be brought into the developers world.

    kind regards,

    William Louth
    JXInsight Product Architect
    CTO, JINSPIRED

    http://www.jinspired.com

  2. vbp

    I am intrigued by the “information model” vs. “managed model” distinction that you make. I am not sure I understand what you really mean by that, but it seems to echo some of my thinking about connected, task-specialized models (as opposed to modeling anything you can think of, including the kitchen sink, with the hope that someone can make use of that data). I’ll read the promised entry on your blog about bringing “CMDB concepts into the developers world” with interest when it’s there. It’d be great if you could come back and post another comment here to let us know when the entry is ready; my guess is that other readers would be interested too.
    I am not sure when you consulted for OpenView. Too bad we didn’t meet back then (even though I was less focused on application management than I am now).

  3. Hi William,

    The information model represents the complete state of the universe (an IT domain, let’s say). This state can be highly dynamic (volatile) or static, and spread across many repositories (information stores). One should not constrain a repository to represent some persistent store such as a registry, database or file system. A repository could have a life cycle that is as short as the lifespan of an operating system process.

    The managed model is a subset of the information model that is placed under control in terms of a change management process. It is possible to have nested or multiple managed models, with each having a sphere of control that is tied to the life cycle of an entity, including an IT management domain, a CI, or even an incident. At the end of the day a diagnostic image generated in the event of an error within a production system is really a snapshot (and subset) of the information model. Some of the information model might relate to items and aspects of the current managed model(s).

    The problem with current CMDB systems is that they do not distinguish between what is actually managed and what is unmanaged and possibly pulled on demand from federated repositories (note I said repositories rather than federated CMDBs). Now this distinction is actually good for end users of systems, including service desk users trying to ascertain, let’s say, the system state of a user logging a service call related to his/her laptop. The hardware units might be part of the managed model but not necessarily the software packages, yet knowing the software packages installed might be useful within the context of that particular service call, though not within the overall IT management. The software packages belong to the information model but are not placed directly under change management because the overhead of their management is deemed higher than any perceived benefits.

    Now when this issue arises today within organizations adopting ITIL with the current crop of solutions, the immediate reaction from the change and configuration management team is to place all software packages installed on laptops in the CMDB. The CMDB starts to grow and grow, and all those export and import scripts keep ITIL consultants busy all day long. But I understand we (IT management) do need to track, let’s say, the provisioning of new software on the user’s laptop; the problem is that CMDB designers always want to model every instance of the CI type, which means all the other users’ laptops. The solution is to create a managed model which represents the current state (hardware and software packages) and the proposed state (transition), and tie its management sphere to the life cycle of the service call and associated work orders. This lessens the overall workload of the global management model and makes the change management model local and hopefully efficient.

    I am rambling a bit here but I hope I have started to paint my vision which is simpler (at least from a process) and more scalable.

    Kind regards,

    William

  4. I really should have reviewed the last posting for spelling and grammar mistakes.

    Let me try once more with regard to the second last paragraph. The managed model of the particular user laptop is hidden (encapsulated) within the entity it is associated with. This allows us to model more CI types but be more selective in the instances as well as the lifetime of the (change) management sphere.

    Simply put: management models need to be contextual. We need different types of models, with different levels of control and life cycles that reflect their usage within the context of the particular IT management organization, and balance overhead with value.

    regards,

    William

  5. Hi,

    When I worked at HP OpenView in the ServiceDesk team in Amsterdam, I was in fact hired solely for performance engineering and testing. At that time (2003) I had very little experience of ITIL, but I had spent a large amount of time at Borland thinking about possible designs for service management solutions for Java enterprise applications. During my performance work, which focused on issues with the design of the HP model, I came to the conclusion that the HP approach was fundamentally flawed. This was probably easier for me to see because I did not have the baggage that others had inherited from their initial attempts at a CMDB offering in the ITSM product OpenView had acquired.

    Following on from this I looked in detail at the other management domains and how HP had tried to tackle them, and came to a similar conclusion – the interpretation of ITIL was wrong (not that ITIL is perfect) and the translation to a product design was poor. I became the official party pooper for the whole of OpenView – the joys and sorrows of a vision.

    regards,

    William

  6. vbp

    Yes, I couldn’t help noticing an integration with the OpenView “object server” in the page describing JXInsight 4.1 that you linked to. I wasn’t directly involved in this, but I remember lots of discussions around it some time ago. Things have changed quite a lot at HP Software with the Peregrine and Mercury acquisitions, both technically and strategically. You would probably not recognize the organization. Sounds like you’re now in a position to prove your ideas with your own product and company. I’ll keep an eye on JInspired.

  7. Pingback: William Vambenepe’s blog » Blog Archive » An interesting business process query language

  8. Pingback: William Vambenepe’s blog » Blog Archive » Review of the CMDBf specification version 1.0