The Nuances of Model Interpretability


Wayne Folta

Date Published: July 7, 2017

There is a growing literature on interpretable machine learning and on explaining black-box outputs to the humans who will make real decisions based on the results. Predictive model interpretability is a nuanced and complex subject. For example, AlphaGo, the experimental deep learning system created by Google to play the ancient board game Go, made headlines recently for defeating a Go grandmaster for the first time. This was a major milestone for a machine learning system, since Go is significantly more complex than chess. When Go masters took an interest in AlphaGo’s winning strategy, the program’s creators faced a familiar question: Why did it choose certain moves?


According to Tony McCaffrey, the author of an article in the Harvard Business Review, “much of the information was distributed across large swaths of the neural network and was not available to be summarized in a reasoned argument as to why a particular move was chosen.”

But why do game moves matter in the context of solving business problems? The following analogy is helpful:

Game Moves = Decisions Made

Players = Subject Matter Experts

AlphaGo made some creative and successful moves, but they were not moves that an expert human player would have made in the same context. When making a move, a human expert thinks about questions like: How risky is this decision? Could it lose me the game? What is the basis for this decision? McCaffrey continued, “Until summarizing is possible, managers and CEOs who are experts in their various fields won’t trust a computer — or another human — that cannot offer a reason for why a creative yet risky move is the best one to make.” In other words, McCaffrey is arguing that implementation requires trust, and trust requires more interpretability.

What is Interpretability?

At Elder Research, we often consider a tradeoff between model interpretability and model accuracy. For example, to enhance predictive accuracy we may find it best to use a neural network. But the customer wants the model results to be “interpretable”, so we compromise and use a logistic regression instead. What does “interpretable” really mean when we or our clients talk about it? Certainly, the person interpreting the model makes a difference: are they an expert (subject-matter or statistical) or a non-expert, and what decision-making role do they play? We also need to determine why the model is being interpreted. For government clients it may be for compliance or regulation, while a commercial company may be trying to determine whether to adopt a certain business practice. We may be interpreting the model to validate its correctness or to understand how it works.

We’ve condensed some of these different versions of interpretability into four typical questions that our clients might ask:

  • How does your technique work?
  • How does the model actually work?
  • What did you learn from the trained model?
  • What’s behind the model scores?

Let’s look at each of these questions to see how they define interpretability from different perspectives. Each question can often be taken in two different ways, and misinterpreting it might endanger our client relationships and the success of our projects.

How Does Your Technique Work?

Effective communication between the analytics team, leadership, and key stakeholders is vital for project success. When project stakeholders or business leaders ask this question, they are usually seeking answers to two underlying questions. The first is: Is what you are proposing reasonable? Business leaders want to feel comfortable moving forward with a solution, because they know the effort will cost money and resources and put their reputation on the line. The second is: Can you effectively convey technical concepts to us? They want to know that the analytics team can communicate with key stakeholders and those in non-technical business disciplines to build support for the project, rather than muddying the waters with jargon and perceived techno-intimidation.

How Does the Model Actually Work?

This question is generally asked by someone in the organization who is technically proficient and isn’t satisfied with the answer to the previous question. They want to understand details about how the model will work in practice. By way of example, we will discuss two recent projects where clients asked this question. In the first project we worked with the client’s expert, who had a PhD in statistics and had developed a very sophisticated statistical framework for predictive maintenance. While working with our team and asking detailed questions, he deepened his understanding of machine learning and of techniques such as lasso regression. When we discussed the details of how lasso regression works, he said that he couldn’t believe this was the first time he was learning about it. By going into the technical weeds with this project stakeholder, we built confidence in, and support for, the project.

In the second example project, we were working with a non-expert project champion who was in charge of a fraud investigation team. After several weeks on the project, we noticed some tension in her communication with our team. She had been asking how a random forest model works, and we had provided only high-level answers when she wanted the technical details. Her actual, unexpressed question was our fourth question (What is behind the model scores?), but we first answered her expressed question with a white paper on how decision trees (the building blocks of random forests) work. She was satisfied with our responsiveness, and the communication went back to normal. We still needed to answer her unexpressed question, but our earlier failure to answer her expressed question had given her the impression that we were talking down to her, which might eventually have derailed the project.

Open communication is vital to every analytics project. To assess model interpretability, stakeholders must communicate the business context and how they will use the model results. Data Scientists must understand stakeholder needs and articulate answers appropriate for the audience. The goal is transparency in the algorithm: Can a client representative look into the algorithm, understand it, and take action on the results?

What Did You Learn From the Trained Model?

Model training involves feature engineering and selection, avoiding bias and leaks from the future, and making tradeoffs based on business goals. After you’ve done all this, what have you learned that you can share with the project stakeholders? The question has two parts: one about model performance and one about what we have learned about the business.

First, how well will the model perform once it’s deployed? The answer requires rigorous testing, including techniques that Elder Research emphasizes, such as permutation tests (which we call Target Shuffling). The result is vital for a go/no-go decision, and the stakes rise dramatically once a “go” decision is made, so the client needs reliable performance expectations, not the overly optimistic ones that typical textbook procedures provide.
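As a rough illustration of the permutation-test idea, the sketch below shuffles the target many times and counts how often a score computed on shuffled data matches or beats the score on the real data. It is only a minimal sketch under stated assumptions: the data is synthetic, the score is a simple absolute correlation standing in for whatever model-quality measure a project uses, and the names abs_correlation and target_shuffle are hypothetical, not Elder Research's implementation.

```python
import random


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; stands in for any model-quality score."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov) / (var_x * var_y) ** 0.5


def target_shuffle(score_fn, inputs, target, n_shuffles=1000, seed=0):
    """Return (observed score, share of shuffles scoring at least as well)."""
    rng = random.Random(seed)
    observed = score_fn(inputs, target)
    shuffled = list(target)  # copy so the caller's data is left untouched
    as_good = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real link between inputs and target
        if score_fn(inputs, shuffled) >= observed:
            as_good += 1
    # Add-one correction keeps the estimate away from an impossible zero.
    return observed, (as_good + 1) / (n_shuffles + 1)


# Synthetic data (an assumption for illustration): the target depends on the input.
data_rng = random.Random(42)
inputs = [float(i) for i in range(30)]
target = [2.0 * x + data_rng.gauss(0.0, 3.0) for x in inputs]

observed, p_value = target_shuffle(abs_correlation, inputs, target)
print(f"observed score = {observed:.3f}, shuffle p-value = {p_value:.4f}")
```

A small share means the real result is hard to reproduce by chance alone; a large share suggests the apparent performance may be luck.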

The second question is: What did you learn from the trained model that tells us something new about our business? Which features are major drivers, which features are positive or negative influences, and which features turn out to be less influential than we thought? These are what we might call “white paper” results, where the model is used to extract insights into the business.
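One way to read drivers and their direction out of a trained model is to inspect the coefficients of a logistic regression fitted on features that share a common scale. The sketch below is illustrative only: the data is synthetic, the feature names are hypothetical, and fit_logistic is a bare-bones gradient-descent fit written for this example rather than a production routine.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, labels, lr=0.1, epochs=500):
    """Fit logistic regression by plain batch gradient descent."""
    n, n_features = len(rows), len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for row, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * x for w, x in zip(weights, row))) - y
            for j, x in enumerate(row):
                grad_w[j] += err * x
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Synthetic data with hypothetical, already standardized features:
# the first raises the outcome, the second lowers it, the third is pure noise.
rng = random.Random(7)
names = ["claim_amount_z", "customer_tenure_z", "random_noise_z"]
rows, labels = [], []
for _ in range(300):
    row = [rng.gauss(0.0, 1.0) for _ in names]
    signal = 2.0 * row[0] - 1.5 * row[1] + rng.gauss(0.0, 0.5)
    rows.append(row)
    labels.append(1 if signal > 0 else 0)

weights, bias = fit_logistic(rows, labels)

# Rank features by coefficient size; the sign gives the direction of influence.
drivers = sorted(zip(names, weights), key=lambda nw: abs(nw[1]), reverse=True)
for name, w in drivers:
    direction = "raises" if w > 0 else "lowers"
    print(f"{name}: {w:+.2f} ({direction} the predicted probability)")
```

Coefficient magnitudes are only comparable here because all features are on the same scale; with raw units, or with strongly correlated features, this reading can mislead.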

What is Behind the Model Scores?

When actually using an analytical system, the next question users may ask is: Where do I look to confirm and follow up on the model’s results? For example, Elder Research developed a model to score and rank high-risk cases for a team of fraud investigators. The client had a hard time making progress on certain high-risk cases that had been rated by black-box machine learning techniques. The investigators didn’t know where to look in the data to confirm the risk and to gather risk-related case details for the investigation. This is especially important in situations where a legal case must be made based on the data. A judge would likely not accept an algorithm score alone as a reason to justify an investigation, so it is necessary to understand the major drivers that caused a case to score high. This “explainability” is somewhat different from the “transparency” discussed earlier: it concerns why a particular case received its score, not how the algorithm works in general.
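For a score that is a weighted sum of features, the drivers behind a single case can be listed directly: each feature's contribution is its weight times how far the case sits from a baseline. The sketch below assumes such an additive score; the weights, feature names, and case values are hypothetical and do not come from any real fraud model. Black-box models need separate attribution methods to produce a comparable breakdown.

```python
def explain_score(weights, baseline, case):
    """Rank features by how far they push this case's score from the baseline."""
    contributions = {
        name: weights[name] * (case[name] - baseline[name]) for name in weights
    }
    return sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical weights of an additive risk score (not from a real model).
weights = {
    "billed_amount_z": 1.2,
    "claims_per_month_z": 0.8,
    "provider_tenure_z": -0.5,
}
# Baseline: the average case, which is zero for standardized features.
baseline = {name: 0.0 for name in weights}
# One hypothetical high-scoring case.
case = {
    "billed_amount_z": 0.5,
    "claims_per_month_z": 3.0,
    "provider_tenure_z": -1.0,
}

drivers = explain_score(weights, baseline, case)
for name, contribution in drivers:
    print(f"{name}: {contribution:+.2f}")
```

An investigator reading this output would start with the feature at the top of the list, since it moved the score the most for this particular case.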


The effective interpretability of a statistical model depends on who is interpreting it and for what purpose. Before building a model, we Data Scientists need to listen to those who will interpret and take action on the model results. We need to make sure we understand the business context for the model, and that we properly understand and address client questions about our approach. If users only need to follow the model’s decision and don’t have to justify it, most interpretability issues fade in importance. In cases requiring interpretation of the scoring details or justification of results, the tools we can use from our modeling toolbox may be more limited. Interpretability requirements may even force us to choose between building a simpler model and building a more complex one.