Every technical project involves some sort of analytics, ranging from simply reporting key facts to predicting new events. With the rise of Big Data, and as our ability to collect and fuse data from different sources increases, advanced data types such as time series, spatial data, and graph data are moving into the analytic mainstream. However, many business leaders struggle to understand how their analytics efforts stack up against their competitors' and what more they could be doing to gain a sustainable competitive advantage.
At Elder Research, we define ten increasingly sophisticated levels of analytics. Before describing them, we need to define the categories of modeling technology.
FOUR CATEGORIES OF MODELING TECHNOLOGY
We divide modeling technology into four categories based on the type of knowledge required for use:
- Descriptive – deterministically summarize data
- Expert-Driven – computationally encode expert opinions and assumptions
- Data-Driven – induce new rules or formulas from data
- Data+Expert – combine deductive and inductive reasoning to determine causes from measured effects
Expert-driven modeling is deductive – it reasons from theory to specific cases.
Data-driven modeling is inductive – it reasons from specific cases (data) to a theory (model).
These sources of knowledge – data or expert – are independent; a modeling technology can rely on either or both, to varying degrees.
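The contrast between the two sources of knowledge can be made concrete with a small sketch. Everything here is assumed for illustration: the toy transaction data, the expert's threshold of 100, and the function names are hypothetical and do not come from the framework itself. The expert-driven rule applies a stated opinion to each case, while the data-driven routine induces its threshold from the labeled cases.

```python
# Hypothetical toy data: (transaction_amount, is_fraud) pairs invented for illustration.
DATA = [(20, 0), (35, 0), (50, 0), (80, 0), (120, 1), (150, 1), (200, 1), (90, 1)]

def expert_rule(amount, threshold=100):
    """Expert-driven (deductive): the threshold encodes an assumed expert opinion."""
    return 1 if amount > threshold else 0

def accuracy(threshold, data):
    """Fraction of cases that the rule 'amount > threshold' classifies correctly."""
    hits = sum((1 if amount > threshold else 0) == label for amount, label in data)
    return hits / len(data)

def learn_threshold(data):
    """Data-driven (inductive): pick the observed amount that best separates the cases."""
    candidates = sorted(amount for amount, _ in data)
    return max(candidates, key=lambda t: accuracy(t, data))

learned = learn_threshold(DATA)
print("expert threshold accuracy:", accuracy(100, DATA))
print("learned threshold:", learned, "accuracy:", accuracy(learned, DATA))
```

On this toy data the induced threshold fits the observed cases better than the assumed expert value, but that result depends entirely on the data being representative, which is the caveat the next paragraph raises.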
Though there are potential pitfalls at all levels, we believe the accuracy and quality of the answer improve as you move up the levels. We grade data-driven inductive modeling approaches superior to expert-driven ones, as the inductive techniques allow unknown rules or relationships to be discovered from the data and are less susceptible to the biases and misconceptions common to human reasoning. On the other hand, expert-driven approaches are preferred if the data is filtered or poorly represents the full situation. Ideally, an analytic approach combines both expert-driven and data-driven modeling.
TEN LEVELS OF ANALYTICS
There are further distinctions within each of the four categories of modeling technology that are useful for applying algorithms to specific problems. We have identified ten increasingly complex levels of analytics. Very often, higher levels depend on techniques from lower ones; for example, data-driven analytics techniques often rely on optimization and simulation techniques when learning structure or parameters.
The figure below summarizes the ten levels, with complexity (and with it, power and danger) rising from bottom to top and from left to right. Moving upward, the tasks shift from simple Descriptive analytics to Predictive and Prescriptive analytics, and from Business Intelligence to Advanced Analytics. The left-to-right position reflects the intensity and complexity within a category.
Optimization is shown as the most advanced form of expert-driven technique, as domain knowledge is essential to creating a useful simulation or equation to optimize. But the search for parameter values is usually automated, so it can be considered a transitional form leading into the next category, data-driven modeling.
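As a minimal sketch of that split between expert knowledge and automated search, consider a hypothetical pricing problem. The demand curve, unit cost, and price range below are assumptions standing in for an expert's equation; only the search over candidate values is automated.

```python
UNIT_COST = 10  # assumed unit cost, invented for illustration

def demand(price):
    """Expert-supplied assumption: demand falls linearly as price rises."""
    return max(0, 100 - 2 * price)

def profit(price):
    """Expert-supplied objective to optimize."""
    return (price - UNIT_COST) * demand(price)

def grid_search(objective, candidates):
    """Automated part: evaluate every candidate and keep the best one."""
    return max(candidates, key=objective)

best_price = grid_search(profit, range(10, 51))
print(best_price, profit(best_price))
```

An exhaustive grid is used here only because the example is tiny; any other search procedure could take its place without changing the expert-supplied equation.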
Each level of the data-driven approaches increases complexity and power over the previous one. Parameter learning employs optimization to find the best parameter values for a fixed model structure. Structure learning performs an additional search over a large set of possible model structures. Ensembles combine multiple models having different strengths and weaknesses into a single model, which is typically more accurate and stable than any of its components. This combination of strengths makes ensemble methods the highest form of data-driven modeling.
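The three data-driven levels can be sketched on a toy regression problem. This is an illustration under stated assumptions, not a prescribed implementation: the data, the restriction to one-feature linear models, the use of training error to compare structures, and simple averaging as the ensemble rule are all simplifications chosen to keep the example short.

```python
from statistics import mean

def fit_line(xs, ys):
    """Parameter learning: least-squares a and b for the fixed structure y = a*x + b."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx

def mse(model, rows, ys, feature):
    """Mean squared error of a one-feature linear model on the given rows."""
    a, b = model
    return mean((a * r[feature] + b - y) ** 2 for r, y in zip(rows, ys))

def learn_structure(rows, ys):
    """Structure learning: additionally search over which input feature to use."""
    fits = {j: fit_line([r[j] for r in rows], ys) for j in range(len(rows[0]))}
    best = min(fits, key=lambda j: mse(fits[j], rows, ys, j))
    return best, fits

def ensemble_predict(fits, row):
    """Ensemble: combine the component models by averaging their predictions."""
    return mean(a * row[j] + b for j, (a, b) in fits.items())

# Hypothetical toy data: two candidate features per row, one target value each.
ROWS = [(1, 5), (2, 3), (3, 8), (4, 1)]
YS = [2.1, 3.9, 6.2, 7.8]

best_feature, fits = learn_structure(ROWS, YS)
print("chosen feature:", best_feature, "parameters:", fits[best_feature])
print("ensemble prediction:", ensemble_predict(fits, (2, 3)))
```

Each function builds on the one before it: structure learning calls parameter learning for every candidate structure, and the ensemble reuses all of the fitted models. A practical version would compare structures on held-out data and would usually weight or select ensemble members rather than average them blindly.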
Causal modeling draws from both data-driven and expert-driven techniques. It is like an automatic scientist using theory and data to refine a hypothesis. Expert-driven modeling depends on the expert knowing the cause, and data-driven modeling can reveal a possible cause, but only causal modeling can confirm a cause-and-effect relationship by combining both forms of knowledge to rule out alternatives.
This framework can help teams understand where they are operating and which approaches best suit the analytic question and challenge they face. As one moves up the levels, the complexity and power of analytics increase. This makes greater accuracy possible when the work is done right, but creates more danger when it is done wrong! Successful analytics teams know how to combine data- and expert-driven approaches to maximize insight and value.
Download the Ten Levels of Analytics eBook to learn more about the ten levels of analytics and the extension of the levels to encompass emerging data types such as time series, spatial data, and graph data. These advanced data types add data complexity as a second dimension for categorizing analytics, alongside algorithmic sophistication.