Data Analytics: A Systematic Approach to Discover Needles in a Haystack
Imagine the following conversation between a weary business leader and an enthusiastic IT expert:
Business: “We are inundated with data. We are so busy that we don’t even have the time to modify our reports and so, we keep churning out the same reports day after day. I am sure there are some nuggets of insights in all these mounds of data.”
IT: “Our storage is overflowing and we need you to authorize additional storage. We wish we could help you to delve into the data if only you could tell us what you are looking for.”
It’s a virtual deadlock! We are quite sure that this type of conversation often plays out in corporate corridors these days. While business leaders know how to run their business in general, they are also looking for help in spotting emerging trends so that they can fine-tune their business actions. If critical strategic insights can be routinely drawn from operational data, excess data can stop being a burden and turn into a pillar of strength.
How does one break this deadlock and effectively engage with business customers? To handle this situation, we offer a simple approach illustrated below:
Step 1 – Define the Need
Knowing the problem is half the solution, as the adage goes. The catch lies in knowing what insights/trends to look for. A conversation can help in discovering it.
For example, an airline that generates reams of data on actual flight arrivals and departures over several months is quite likely interested in its On Time Performance (OTP) record. Basic analytics tools applied to this data may produce the top and bottom OTP rankings, perhaps by parameters such as seat occupancy or aircraft type. But that kind of retrospective data is hardly useful for making business changes, because such reports reflect what is possible to compute rather than what the business needs.
On the other hand, consider a conclusion such as: “8% of flights with a departure delay of over 12 minutes were caused by a delay in locating 3 or more passengers in the airport, whose luggage had to be identified and offloaded. These instances were split evenly across 4 airports, and 80% were on weekends.” That is an actionable insight! A conversation with customers and their operational entities would help in identifying such requirements.
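As a rough illustration of how such a drill-down might be computed, here is a minimal Python sketch. The flight records, airport codes, cause labels, and the function name `delay_breakdown` are all hypothetical and invented for this example; only the 12-minute threshold comes from the text above. Real operational data would have a very different shape.

```python
from collections import Counter
from datetime import date

# Hypothetical records (made up for illustration):
# (flight date, airport, departure delay in minutes, recorded delay cause)
FLIGHTS = [
    (date(2024, 3, 2), "AAA", 25, "passenger_search"),   # Saturday
    (date(2024, 3, 3), "BBB", 18, "passenger_search"),   # Sunday
    (date(2024, 3, 4), "AAA", 40, "weather"),            # Monday
    (date(2024, 3, 5), "CCC", 5, "none"),                # Tuesday
    (date(2024, 3, 9), "CCC", 15, "crew"),               # Saturday
    (date(2024, 3, 10), "BBB", 30, "weather"),           # Sunday
]


def delay_breakdown(flights, threshold_min=12, cause="passenger_search"):
    """Summarize how often one cause explains delays above a threshold."""
    delayed = [f for f in flights if f[2] > threshold_min]
    matching = [f for f in delayed if f[3] == cause]

    # Share of delayed flights attributed to the chosen cause.
    share = len(matching) / len(delayed) if delayed else 0.0

    # Split of those flights by airport.
    by_airport = dict(Counter(f[1] for f in matching))

    # weekday() returns 5 for Saturday and 6 for Sunday.
    weekend = sum(1 for f in matching if f[0].weekday() >= 5)
    weekend_share = weekend / len(matching) if matching else 0.0

    return {
        "share_of_delayed": share,
        "by_airport": by_airport,
        "weekend_share": weekend_share,
    }


if __name__ == "__main__":
    print(delay_breakdown(FLIGHTS))
```

The point of the sketch is that the question (which cause, which threshold, which breakdown) has to come from the business conversation; the computation itself is simple once the need is defined.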
In fact, we recommend that this step be conducted in two phases. In the first phase, make the customer aware of what is possible. Armed with this realization, customers will come back in the second phase with specifics of what exactly can help their business. A workshop or a brainstorming session would be an appropriate format for this step.
Step 2 – Gather Data Sources
The quality of the output is a function of the quality of the input; as the saying goes, ‘garbage in, garbage out.’ What data sets are required to discover critical insights? Business users often have a keen understanding of their data and its underlying causal relationships. However, it is also necessary to include data that has no apparent cause-and-effect relationship with what we are looking for, in order to discover any hidden relationships. Basically, if we already know everything we are looking for, there is nothing to find!
This step is often iterative and is referred to as ‘feature selection’ by many data analytics practitioners.
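One simple way to screen candidate data columns is to rank them by the strength of their correlation with the quantity of interest. The sketch below is only an assumption about how this might be done, not a prescribed method: Pearson correlation is one of many possible scoring techniques and only captures linear relationships. The column names and values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / (spread_x * spread_y)


def rank_features(columns, target):
    """Rank candidate columns by absolute correlation with the target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical candidate columns, including one with no obvious link to delays.
COLUMNS = {
    "seat_occupancy": [0.6, 0.7, 0.8, 0.9, 0.95],
    "gate_changes": [0, 1, 0, 2, 3],
    "constant_flag": [1, 1, 1, 1, 1],
}
DELAY_MINUTES = [5, 12, 6, 22, 31]

if __name__ == "__main__":
    for name, score in rank_features(COLUMNS, DELAY_MINUTES):
        print(f"{name}: {score:.3f}")
```

Because the step is iterative, a ranking like this would typically be rerun as new data sources are added or dropped.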
Step 3 – Iterative Fine Tuning
This step very likely needs involvement and guidance from the business users. Apart from pure analytics, there is often a learning component in our quest for gathering business insights.
Business users have to provide critical feedback on the output delivered by such a system, and learning is indeed an important part of this initiative. This feedback is what nudges the software in the right direction and helps it converge on useful results.
Based on specific business feedback, certain rules and features may be re-tuned and re-validated. The more complex the insight demanded, the more iterations of tuning and learning the application requires.
Step 4 – Establish Completion Criteria
Business requirements are dynamic and prone to change. So, technically, the business insights required are equally dynamic. However, it is practical to define ‘stopping’ or ‘DONE’ criteria that determine success for each phase of the analytics exercise. It is possible to move on to new objectives only when the pre-established success criteria for the current phase have been met.
However, there can be some unexpected twists in this process. While enterprises may possess huge amounts of data, it is not always true that they have good quality data. For example, consider a customer with a complex software development initiative who wants to predict, before actually running a regression, the likelihood of defects and the modules that may fail. If the historical data erroneously indicates a low regression failure rate of only 2%, a learning system built on it may also prove defective, because the system’s learning is likely skewed. Hence, enterprise data needs pre-treatment before use, and the conversation with the customer could cover this.
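A basic sanity check on historical data can surface this kind of skew before any model is trained. The following Python sketch is an illustration under stated assumptions: the outcome labels, the function names, and the 5% review threshold are hypothetical, and a suitable threshold would have to be agreed with the business users.

```python
def failure_rate(outcomes):
    """Fraction of regression runs recorded as failures."""
    if not outcomes:
        raise ValueError("no historical outcomes to analyze")
    return sum(1 for outcome in outcomes if outcome == "fail") / len(outcomes)


def needs_review(outcomes, min_expected_rate=0.05):
    """Flag the data set when failures look suspiciously rare.

    min_expected_rate is an assumed threshold for illustration only.
    """
    return failure_rate(outcomes) < min_expected_rate


# Hypothetical history: 2 recorded failures in 100 regression runs.
HISTORY = ["fail"] * 2 + ["pass"] * 98

if __name__ == "__main__":
    print(f"failure rate: {failure_rate(HISTORY):.2%}")
    print("needs review:", needs_review(HISTORY))
```

A flag here does not prove the data is wrong; it only marks the data set as one to discuss with the customer before it is used for learning.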
The next time your customers come to you with a strident call for help, start a conversation. We hope these steps will help you make the next moves in drawing your customers into a constructive collaboration.
If some of your customers already have such queries, we’d be happy to meet and discuss your requirements.