10 things stopping you from becoming a data-driven organisation.

Man penning through an iPad looking at data (for an unknown reason)
Data. Pleasure or Pain? Photo by Adeolu Eletu on Unsplash

Recently I was asked whether I had any advice or ‘best practice’ for organisations seeking to become ‘data driven’. This isn’t my long-term specialism, so I was flattered, but I’m not really the right person to declare ‘best practice’ and I said so. But then I reflected a while on all the interesting work I had done for clients in this area (which turns out to be a fair bit) and I realised that some of the really difficult things I’ve tussled with might be helpful lessons for others, so I pulled them together here.

These 10 things are not a manuscript of wisdom from a textbook, but they’re real-world lessons.

Lessons from the pain. Photo by Gemma Chua-Tran on Unsplash
  1. The problem is almost certainly behavioural and cultural, but the preference will be to think it’s the technology and poor data (or predecessors!).
  2. There’s no point having data and generating insight if there isn’t an operating model and decision culture that can consume it.
  3. Remember: “Without data, you’re just another person with an opinion”. Then remember that it’s 2020 and an opinion is often the only thing people care about. So plan for this reality and be sure you anticipate an emotional response.
  4. Don’t let people think they’re going to jump to ‘Data Driven’. There are a bunch of stages that I think need to be acknowledged (Data informed, Data led…!) even if not transitioned through.
  5. The 4 levels of data analytics are fundamentally true in my experience. People will want predictive analytics without the basics (descriptive analytics) under control. That’s not to say that you cannot move onwards with what you have and provide some valuable analytics based on partial or sketchy data. The issues materialise though, because organisational focus now shifts to the new and exciting and forgets to fix the foundations of your data… and you know what they say about a house built on sand…
  6. Kill the endless, existential conversations defining the difference between data, information, knowledge etc. They’re broadly a waste of time (and assuming you use time to deliver value, then perhaps this is poor expenditure).
  7. Treat data as an asset, and manage it as one.
  8. Understand the underpinning meta-model that each person holds in their head. Whether or not anyone wants to talk about it, it will exist, so you might as well manage that correctly.
  9. Don’t confuse the data with the system that stores it.
  10. Teach people to understand the basic difference between the ‘thing’ and the ‘fact about the thing’ (i.e. dimensions/entities/types/classes vs properties/attributes/fields). If you don’t pay attention to this, your data makes no sense.
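
To make point 10 a little more concrete, here is a minimal sketch in Python of the ‘thing’ versus the ‘fact about the thing’. The Customer class and its three fields are made up purely for illustration; they don’t come from any client work.

```python
from dataclasses import dataclass, fields


@dataclass
class Customer:       # the 'thing' (entity / type / class); hypothetical example
    customer_id: int  # a 'fact about the thing' (property / attribute / field)
    name: str         # another fact about the thing
    country: str      # and another


# One particular customer: a single 'thing' carrying its own facts.
alice = Customer(customer_id=1, name="Alice", country="UK")

thing = type(alice).__name__             # the kind of thing we are describing
facts = [f.name for f in fields(alice)]  # the names of the facts we hold about it

print(thing)
print(facts)
```

If people start treating "country" as a thing in one report and as a fact about a customer in another, without realising they have done so, this is roughly where the data stops making sense.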

For a bonus point, I’d also make the case for having a decent understanding of the motives. Typically I see an expectation that it will improve decision quality (it might) but that’s only a small fraction of what you should be thinking about. When moving towards a data-driven organisation, there can be improvements in Decision Efficiency, Decision Repeatability and (my favourite) Decision Velocity.

Creating quick and beautiful UML diagrams with UMLet

If you need to quickly pull together a simple view for a diagram, document or drawing, there’s no need to spin up a modelling tool or wrestle with Visio (or, more likely, wrestle with your IT department to get them to install Visio for you!). Look no further than UMLet (that’s https://www.umlet.com/).

I’ve been using it for years but just thought it a good idea to mention it here.

It’s pretty simple to get the hang of, and because diagrams are defined in text, it’s really easy to copy and paste areas of your diagram. Take a look at a simple Sequence Diagram I pulled together in about 5 minutes:

Which has the underlying code of:

title: Combining Private and Public Primitives
SUBMITTER~id1|SERVICE (Private)~id2|SERVICE (Public)~id3|SIP (Private)~id4|SIP (Public)~id5|COMPOSITOR (Public)~id6|EVALUATOR (Public)~id7

iframe{:Registering a Private Service with a Private SIP (Out of Band)

id2->id4:id2,id4:Service Contract
id4->id4:id4,id4:Register and store
id2->id4:id2,id4:Service Contract (update TTL, Location)
iframe}

iframe{:Registering a Public Service with a Public SIP (Out of Band)

id3->id5:id3,id5:Service Contract
id5->id5:id5,id5:Register and store
id3->id5:id3,id5:Service Contract (update TTL, Location)
iframe}

iframe{:Finding available SERVICEs
id1->id4:Request for Service Contract
id4->id4:id4: Find the Service Contract
id4->id1:Service Contract

id1->id5:Request for Service Contract
id5->id5:id5: Find the Service Contract
id5->id1:Service Contract
iframe}

iframe{:Using SERVICEs and evaluating risk
id1->id2:Input (to be kept private)
id2->id2:id2: Evaluate Input for risk
id2->id1:Service Result (a)

id1->id3:Input (can be used in public)
id3->id3:id3: Evaluate Input for risk
id3->id1:Service Result (b)

id1->id6:Service result (a) + Existing Risk Profile
id6->id6:id6: Combined Service Result and Risk Profile
id6->id1:Risk Profile

id1->id6:Service result (a) + Existing Risk Profile
id6->id6:id6: Combined Service Result and Risk Profile
id6->id1:Risk Profile

id1->id7:Risk Profile
id7->id7:id7: Evaluate Risk Profile
iframe}



Measuring the unmeasurable

As systems engineers, we are often required to quantify and measure concepts that initially appear too abstract to get a handle on. This is often a problem at the initial stages of the systems engineering lifecycle, and particularly at the project start-up or engineering mobilisation phases of the project lifecycle. Customers (and comparable internal stakeholders with similar interests, such as project control) will start making requests of the engineering team along the lines of “how secure is the solution?” or “how modifiable is it?”. Whilst one would hope that any requirements team worth their salt has agreed a decent requirements set that is well parameterised, there will always be idealistic high-level requirements that feel insufficiently defined and immeasurable.

Whilst the inexperienced engineer might make initial judgements based around convoluted methods of pseudo-assessment, there are a number of approaches that might be better suited and are worth examination. By pseudo-assessment, I refer to methods used to elicit approximations of quantification and measurement based on either subjective views from experienced Subject Matter Experts (SMEs) or reflective judgements based on measurements taken on related areas. “It is highly secure because we have built it in accordance with the RMADS” or “It is easily modifiable because it has a component based architecture” for example.

Described here is a formal method of quantifying abstract qualities such as information security, reliability or data quality and, where appropriate, applying metrics to those areas. The seasoned systems engineer will no doubt shrug off such methods as obvious, but they deserve not only explicit mention (and thus this text) but perhaps clarification and, where possible, references to real-world areas in which they can be used. This work is not my own; it is mainly based on a paper by Pontus Johnson, Lars Nordstrom and Robert Lagerstrom from the Royal Institute of Technology, Sweden. I came across it in the publication “Enterprise Interoperability – New Challenges and Approaches”, published by Springer, which will set you back a little over a hundred pounds at current UK prices. For the “real” version (including the maths), see their paper titled “Formalizing Analysis of Enterprise Architecture”. My interpretation (or bastardisation!) is a personal account of some of the concepts and I do not claim to be the authority on this (disclaimer over!).

The description here is considerably less formal than the paper from which it came, and will no doubt be criticised for being dumbed down. However, it serves only to re-highlight the method’s use and perhaps make it more accessible. If the reader enjoys getting involved in the maths, they are welcome to go and access the paper and produce their own interpretation. In fact, this is encouraged.


Architecture Theory Diagrams

When looking to parameterise and measure an abstract property, a reasonable approach would be to examine what “goes into” that property to make it what it is. A simple format for this method would be as follows:

1) Decompose the abstract property into sub properties

2) Try to quantify and measure the sub properties

3) Aggregate the answered properties according to a schema to answer the initial abstract property.

This method makes a number of important assumptions:

1) You believe that the abstract property can be formally decomposed into suitable sub properties

2) You trust that the composition of the sub properties fully describes the abstract property

It will be noted that the method does not rely on every sub property of the abstract property being sufficiently parameterised and measurable at the first attempt. Step 2 of the formal decomposition method is (theoretically infinitely) recursive: a sub property that cannot be measured directly is itself decomposed, and so on, until measurable sub properties are found.
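
For those who prefer code to prose, the three steps and the recursion can be sketched in a few lines of Python. This is my own rough illustration, not anything from the paper: the Property class, the evaluate function, the example property names and the use of a plain unweighted average as the aggregation “schema” are all assumptions made for the sake of the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Property:
    """A property that is either directly measurable or decomposed further."""
    name: str
    # Hypothetical convention: a callable returning a score between 0 and 1,
    # set only when the property can be measured directly.
    measure: Optional[Callable[[], float]] = None
    sub_properties: List["Property"] = field(default_factory=list)


def evaluate(prop: Property) -> float:
    """Score a property, recursing into sub properties when needed."""
    if prop.measure is not None:
        return prop.measure()  # step 2: quantify and measure
    if not prop.sub_properties:
        raise ValueError(f"{prop.name!r} is neither measurable nor decomposed")
    scores = [evaluate(sub) for sub in prop.sub_properties]  # the recursion
    return sum(scores) / len(scores)  # step 3: assumed schema (plain average)


# Step 1: decompose the abstract property (names are illustrative only).
security = Property("Information Security", sub_properties=[
    Property("Link Encrypted", measure=lambda: 1.0),
    Property("Perimeter Protection", sub_properties=[
        Property("Firewall Installed", measure=lambda: 1.0),
        Property("Firewall Rules Reviewed", measure=lambda: 0.0),
    ]),
])

score = evaluate(security)
print(score)  # 0.75
```

The ValueError is the second assumption above made visible: if a branch can be neither measured nor decomposed any further, the method has nothing to say about it.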

The Architecture Theory Diagram (ATD) approach extends this approach in a number of useful ways.

First, the ATD method formalises the nomenclature of abstract property decomposition by providing us with the following terms:

An Operationalised Property is a property for which it is believed to be practicably possible to get a credible measure. For example, for the abstract property Information Security, Operationalised Properties might be properties such as Link Encrypted or Firewall Installed (clearly both Boolean enumerated attributes).

Intermediate Properties are neither abstract nor operationalised. These properties exist only to serve the purpose of providing useful decomposition steps between the abstract properties and the operationalised properties.

Definitional Relations merely illustrate that a property is defined by its sub properties. This is broadly equivalent to a Composition relation in UML. The interesting thing about definitional relations here is that they carry weightings. That is to say, not all sub properties are equal, and each can be weighted according to its influence on the parent property. (Before anyone comments that the notation uses a UML aggregation symbol… this is the notation given for ATDs!)
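
As a rough illustration of what a weighting on a definitional relation buys you, here is a small Python sketch. To be clear, the normalised weighted average used here is my own simplification and not the formula from the paper, and the weights and values are invented.

```python
from typing import Dict, Tuple


def weighted_value(children: Dict[str, Tuple[float, float]]) -> float:
    """Combine sub property values into a parent value.

    children maps a sub property name to (weight, value). The normalised
    weighted average is an assumed, simplified schema.
    """
    total_weight = sum(weight for weight, _ in children.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(weight * value for weight, value in children.values())
    return weighted_sum / total_weight


# Invented example: encryption is judged three times as influential as the
# firewall on the parent property Information Security.
information_security = {
    "Link Encrypted": (3.0, 1.0),      # weight 3, property holds
    "Firewall Installed": (1.0, 0.0),  # weight 1, property does not hold
}

result = weighted_value(information_security)
print(result)  # 0.75
```

With equal weights the same two values would give 0.5, so the weighting is doing real work: it encodes a judgement about which sub properties matter most.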


Operationalised Properties are then given property values, and this is where we receive even more flexibility. The values assigned are derived from expert opinion, direct measurement or otherwise, and are enumerated according to a suitable schema. The “plausibility” factor is the belief we have that the property actually carries the value attributed to it (Dempster-Shafer Theory).
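
For anyone who has not met Dempster-Shafer theory, the sketch below shows the standard textbook definitions of belief and plausibility for a single Boolean property. It is not the paper’s formulation, and the mass values are invented. The idea is that evidence can be assigned to “yes”, to “no”, or left uncommitted across both. Belief is the evidence that definitely supports a value, and plausibility is the evidence that does not contradict it.

```python
from typing import Dict, FrozenSet

# A mass function assigns evidence to subsets of the possible values.
Mass = Dict[FrozenSet[str], float]


def belief(mass: Mass, hypothesis: FrozenSet[str]) -> float:
    """Sum of the masses of non-empty sets wholly contained in the hypothesis."""
    return sum(m for focal, m in mass.items() if focal and focal <= hypothesis)


def plausibility(mass: Mass, hypothesis: FrozenSet[str]) -> float:
    """Sum of the masses of sets that overlap the hypothesis."""
    return sum(m for focal, m in mass.items() if focal & hypothesis)


# Invented evidence for the property Firewall Installed.
firewall_installed: Mass = {
    frozenset({"yes"}): 0.5,         # evidence that it is installed
    frozenset({"no"}): 0.25,         # evidence that it is not
    frozenset({"yes", "no"}): 0.25,  # uncommitted: we simply don't know
}

yes = frozenset({"yes"})
bel = belief(firewall_installed, yes)
pl = plausibility(firewall_installed, yes)
print(bel, pl)  # 0.5 0.75
```

The gap between the two numbers is the part of the evidence that remains uncommitted, which is exactly what a single probability would hide.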

Hopefully you will quickly see that the next step is to aggregate the values back up the decomposition so that a value can be calculated for the abstract property. The accumulation of value to the abstract property follows the weighted definitional relations, and from here the maths gets quite complicated. I shall make no attempt to explain it… partly because at points it is beyond me but, if this has started to give you a flavour of the “art of the possible”, then I strongly encourage you to look for the paper (or get in touch) and use it for your own purposes.

The strength of this method is that it produces suitably credible values for abstract properties, backed by maths that does the computation for you. The weighted definitional relations and the incorporation of Dempster-Shafer theory supply the format for compiling these values into a useful measure of the abstract property.

I would certainly encourage anyone who has a use for this method to explore it, or adapt it for their purposes and, as always, I would welcome comment, feedback or thoughts.