This article explores what data means and how understanding this transforms its use in business and the sciences. If you work in any numerate discipline or business, the chances are you have heard about data so often by now that you almost think you know what it is.
We hear about data analysts and how they differ from data scientists. We hear about how important it is for a company to have a data strategy and why data is so important for beating the current (as of the writing of this article) Covid-19 pandemic. We keep hearing about how data is important in making decisions on everything from policy to business to healthcare. However, what data actually means gets little attention in the mainstream. It’s assumed knowledge.
DATA ORIGINATING FROM THE WORLD AROUND US
To understand what data is and what it means to the human mind, we must delve into cognitive neuroscience and how the mind perceives things. Data is an abstraction of the world around us. The amount of data that can be abstracted from any object or phenomenon is limited only by the amount of knowledge we have of it.
Let’s take the human thumb, for example. We can measure its dimensions (length, width and height) and express them on a scale such as centimetres or millimetres. We can measure its colour and put that on a scale such as RGB. We can measure its texture and put that on a scale such as coarseness. We can measure the length of the fingernail from the nailbed to the tip of the finger and compare it to the length of the overall thumb. That would be a proportion. We can count the number of ligaments holding the thumb joint together.
We can look at the pattern of concentric circles on the skin and understand that each person’s pattern is different, and that this constitutes the unique human fingerprint. We could look at and count the bones, the tendons, the surface area of the skin, the contours on it. The list could go on. The list would be as long as the amount of knowledge we have of the particular object, in this case the thumb itself.
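As a rough illustration of the thumb example, the sketch below records a handful of the measurements mentioned above as a single data record. The class name, field names and all values are hypothetical and chosen only for illustration; a different observer with different knowledge would pick different fields.

```python
from dataclasses import dataclass


@dataclass
class ThumbObservation:
    """One observer's abstraction of a thumb (hypothetical fields)."""
    length_mm: float
    width_mm: float
    height_mm: float
    colour_rgb: tuple[int, int, int]
    nail_length_mm: float

    def nail_to_thumb_ratio(self) -> float:
        # The proportion of nail length to overall thumb length
        return self.nail_length_mm / self.length_mm


# Illustrative values only, not real measurements
thumb = ThumbObservation(
    length_mm=60.0,
    width_mm=20.0,
    height_mm=18.0,
    colour_rgb=(224, 172, 105),
    nail_length_mm=12.0,
)
ratio = thumb.nail_to_thumb_ratio()
print(f"nail/thumb ratio: {ratio:.2f}")
```

Every field in the record is a decision about what to capture. Ligament counts, fingerprint patterns or skin contours are absent here simply because this sketch chose not to record them.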
DATA AS PERCEIVED BY THE HUMAN MIND
What we can see, therefore, is that data is not only innate to the subject of the observation but also depends on the prior knowledge base of the observer. That knowledge base has a direct effect on the type of data that can be abstracted from a given observation. This is captured in the brilliant book “On Looking: Eleven Walks with Expert Eyes” by Alexandra Horowitz.
Eleven experts walking the same road would notice completely different aspects of reality depending on their prior knowledge. A botanist may notice the plants, an architect the buildings, a civil engineer the construction of the roads, and so on.
How does this understanding of data affect businesses? For a start, we must realise that the data available to us about a particular scenario is a “snapshot” of reality. Just as a photographer taking a picture leaves out vast amounts of the scene, the data we capture about a given scenario is a small fraction of the “reality” in front of us.
So an airline deciding whether to fly a particular route based on projected demand must always realise that the projection will be wrong. It may be wrong by a large amount or by a small amount, but it will be wrong. Similarly, weather forecasts are always wrong to some degree, large or small depending on the accuracy of the prediction. Keeping this in mind is especially important whenever decisions rest on projections.
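The size of such a miss can be measured once the real figures arrive. The sketch below compares projected and actual passenger numbers for a single route and summarises the gap as a mean absolute percentage error. All numbers are invented for illustration, and this error measure is just one common choice, not something the airline example prescribes.

```python
# Hypothetical passenger numbers for one route over four months (illustrative only)
projected = [1200, 1350, 1100, 1500]
actual = [1150, 1420, 900, 1510]

# Signed error per month: positive means the projection was too high
errors = [p - a for p, a in zip(projected, actual)]

# Mean absolute percentage error: the average size of the miss relative to actual demand
mape = sum(abs(e) / a for e, a in zip(errors, actual)) / len(actual) * 100

print(errors)
print(f"MAPE: {mape:.1f}%")
```

No month has an error of exactly zero, and the third month misses by far more than the others. The projection is still useful, but only alongside an idea of how wrong it tends to be.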
Therefore, qualitative data (“news about a situation”) and tacit knowledge (otherwise known as gut feel or intuition), rather than quantitative facts alone, often play a huge role in decision making in the sciences and in business. This is what great CEOs like Steve Jobs are known for. “Intuition is a very powerful thing,” he told writer Walter Isaacson, “more powerful than intellect.”
Authors like Daniel Kahneman have shown that this “intuition” only works in a field of expertise where the person has built up a scaffolding of tacit knowledge. This is what is called “expert intuition”, and it develops where there is a high level of regularity, such as in a chess game or in predicting a partner’s behaviour.
An interdisciplinary approach to collecting data is also very important. A maxim of management, attributed to Peter Drucker, is that if you can’t measure it, you can’t manage it. John D. Rockefeller was a pioneer in measuring data for his business when he started Standard Oil. He would, in his own words, navigate through numbers: “I charted my course by figures, nothing but figures.”
Waqas Ahmed writes in his book, The Polymath, that it is important to develop an interdisciplinary worldview in which we understand different subjects and come to a holistic perspective on any topic at hand.
Similarly, Charlie Munger of Berkshire Hathaway fame speaks of a latticework of mental models, which picks ideas from different fields and weaves them together into a coherent framework.
This holistic perspective is really important when looking at any business situation, so that data can be collected and interpreted from many different angles.
DATA AND THE DIFFERING NARRATIVES IT GENERATES
Finally, we need to realise that the same data can generate different narratives. Good examples of this are politics and stock picking: with nearly the same information available, people pick different stocks in varying amounts, or hold different political opinions, depending on how heavily they weigh certain narratives in their heads.
A good example of this is the death of Total CEO Christophe de Margerie in a plane crash apparently caused by human error, which has given rise to many conspiracy theories about what happened.
Similar conspiracy theories exist in almost every domain, and people looking at the same field of information come to differing conclusions.
This subjective judgement, or bias, in decision making makes data and analysis two separate fields altogether. In business this distinction isn’t always very clear. For example, is the job of a data scientist to interpret the data or simply to run machine learning models on a given data set? Each part of the process, from how the data is collected all the way to how we present the results, has an impact on the product of the data science process and therefore on the decisions made in the end.
Decision making, or generating narratives, is not synonymous with data itself and is a separate process. Therefore, it is of paramount importance in data-driven decision making to understand how the conclusions were generated and the methods used to arrive at them. This is why, in scientific papers, the methods have to be shown and elaborated on in a separate section. In business, however, this is not always the case, and management can mistakenly assume that the results are synonymous with the data available.
CONCLUSION
To conclude, data is one of the most important aspects of the new digital economy and is often called the “new oil”.
I find this particularly amusing, having spent 8 years in the oil and gas industry. However, to truly appreciate the power of data we need to understand what data is and how it leads to decision making in business. We also need to realise that data isn’t always stored as unstructured text or in relational databases and that decisions are not synonymous with the data itself. To appreciate that, you just have to look at the coronavirus sceptics who consider the pandemic to be a hoax.