The main question
I often hear that companies have a lot of data that is never used or analysed, and often that is true. However, I also come across companies that believe they are collecting data to answer a specific question. When they start analysing it, they realize either that the data cannot fulfil the intended purpose, completely or even partly, or, even worse, that what they intended to measure does not fully answer their main question.
So what should be considered when collecting data? First and foremost, don’t start by thinking about the data itself. Start by thinking about the problem you want to solve, the question you want to answer.
I’ll give you an example. One of our clients had collected data for over a year and asked us to build a model based on it. The client wanted to use the model to support decision-making during the tendering process, essentially to answer the question: should we buy product A or product B, from supplier X or supplier Z? A complication was that future costs (e.g. maintenance costs) had to be included in the equation.
Sometimes analytics is more important than data
We received a lot of data to analyse and concluded we simply could not use it in the way expected. Why not? Because of data granularity, incomplete data due to “free-text” fields, missing key fields, etc.
What did we do? We thought about the main question again. What was the client looking for? A tool to support them during tendering. What was the objective? To minimize costs. We realized that the main question had to be formulated around two ambitions: not only estimating lifecycle costs, but also ensuring that offers were transparent and comparable.
We then built a tool for creating scenarios not only around the products and their expected lifecycle costs, but also around the commercial terms offered by each supplier. This small tweak in how we tackled the main question made a big difference. The client could now compare different scenarios, visualize expected savings and select the most beneficial option. Of course, we still advised on how the data should be collected in the future in order to improve the model, but the initial analysis, based on only a little data, already answered a large part of the question.
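To make the idea of scenario comparison concrete, here is a minimal Python sketch. It is not the tool we built for the client. The `Offer` fields, the two commercial terms (a discount and a warranty period) and all figures are hypothetical, and the cost formula is deliberately simple: it ignores inflation and the time value of money.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """One tender scenario: a product from a supplier, plus commercial terms."""
    product: str
    supplier: str
    purchase_price: float      # list price (hypothetical currency units)
    annual_maintenance: float  # expected maintenance cost per year
    discount_rate: float       # commercial term: fraction off the list price
    warranty_years: int        # commercial term: years with maintenance covered


def lifecycle_cost(offer: Offer, horizon_years: int) -> float:
    """Discounted purchase price plus maintenance for years not under warranty."""
    net_price = offer.purchase_price * (1 - offer.discount_rate)
    paid_years = max(horizon_years - offer.warranty_years, 0)
    return net_price + offer.annual_maintenance * paid_years


def rank_offers(offers, horizon_years):
    """Return (offer, cost) pairs sorted from cheapest to most expensive."""
    costed = [(o, lifecycle_cost(o, horizon_years)) for o in offers]
    return sorted(costed, key=lambda pair: pair[1])


# Made-up offers, for illustration only.
offers = [
    Offer("A", "X", 100_000, 8_000, 0.00, 0),
    Offer("A", "Z", 105_000, 8_000, 0.05, 2),
    Offer("B", "X", 90_000, 12_000, 0.00, 1),
]

ranking = rank_offers(offers, horizon_years=10)
for offer, cost in ranking:
    print(f"Product {offer.product} from supplier {offer.supplier}: {cost:,.0f}")
```

In this made-up example, the offer with the highest list price ranks as the cheapest over ten years once the discount and warranty are taken into account, which is exactly the kind of effect that only shows up when commercial terms are part of the scenario.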
The lessons learned from this case
- Think about your main question first, and consider all factors affecting the question; sometimes the most obvious ones can be overlooked.
- It is important to consider not only what data you will collect, but also how you will collect it. For example, free-text fields can yield a lot of usable data, or none at all, depending on how they are filled in.
- You don’t always need huge amounts of data to answer your questions – the power lies in the analysis.