Data Science

Machine Learning Model Interpretation

Whether a company is model-driven or catching up with the industry's rapid adoption of AI, machine learning model interpretation has become a key factor in deciding whether to promote models into the business. This is not an easy task; imagine trying to explain a mathematical theory to your parents. Yet business owners should always be curious about these models, and some questions come up naturally:

  • How do machine learning models work?
  • How reliable are the predictions made from these models?
  • How carefully do we have to monitor these models?

As the need for and importance of model interpretation become better understood, data scientists should build a model interpretation strategy at the beginning of the model-building process, one that helps them answer the questions above and supports needs such as auditing. Such a strategy requires a structure aligned with the scope of interpretability:

  • the model algorithm and its hyperparameters,
  • the model training process,
  • model interpretability metrics,
  • human-friendly explanations.

At this point, we are treating a model as a business asset. Producing predictions is only the most basic function of a model; we also need to make sure its results can be understood in business terms. Good model interpretation requires a thorough understanding of the model life cycle as well as a clear, intuitive demonstration at each step.

Interpretable Models

One of the easier ways to handle model interpretation is to start with an interpretable model, such as linear regression, logistic regression, or a decision tree. High interpretability and efficiency are among the main reasons these models are widely used across different industries. Their algorithms and training processes are relatively simple, which makes them easy to monitor and adjust.

[Figures: feature importance visualizations from a Kaggle housing price model and from scikit-learn]
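
As a concrete illustration of starting from an interpretable model (the dataset and model settings below are assumptions for this sketch, not taken from the figures above), scikit-learn lets us fit such models and read their learned structure directly:

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# Logistic regression: each coefficient shows how a feature shifts the log-odds.
logit = LogisticRegression(max_iter=5000).fit(X, y)
top = sorted(zip(X.columns, logit.coef_[0]), key=lambda t: abs(t[1]), reverse=True)
for name, coef in top[:5]:
    print(f"{name:25s} {coef:+.3f}")

# Decision tree: the fitted rules print out as plain if/else logic.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree, feature_names=list(X.columns)))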

Feature importance is a very intuitive score that is implemented in many packages. A feature with higher importance not only contributes more to the model's predictions but also accounts for a larger share of the prediction error. Because importance is a numerical score, it lends itself to simple visualizations of model training results. It also provides a straightforward understanding of the features and a reliable metric for dimension reduction (a short scikit-learn sketch follows the list below).

  • Filter methods score features with statistical tests such as linear discriminant analysis, ANOVA, or chi-square and select a subset before any model is trained.
  • Wrapper methods repeatedly train a model on candidate subsets of features, with feature importance serving as the criterion for selecting or eliminating features.
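
The sketch below reads a random forest's feature_importances_ directly and then lets recursive feature elimination (RFE) keep the strongest half; the dataset and parameter choices are illustrative assumptions, not a recipe:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# Impurity-based feature importance straight from the fitted model.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
ranked = sorted(zip(X.columns, forest.feature_importances_), key=lambda t: t[1], reverse=True)
print("top features by importance:", [name for name, _ in ranked[:5]])

# Wrapper method: RFE repeatedly drops the weakest feature and refits the model.
selector = RFE(RandomForestClassifier(n_estimators=100, random_state=0),
               n_features_to_select=X.shape[1] // 2).fit(X, y)
print("features kept by RFE:", list(X.columns[selector.support_]))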

Neural Networks

People have seen how powerful some neural network models can be, and good visualizations of how these models work go a long way toward helping people understand them. One example is highlighting the "focus" of deep neural networks in image recognition. Because of the layered structure of these networks, it is not hard to trace back through several layers of neurons and figure out what the model is focusing on when it makes a prediction.

[Figure: visualizing the "focus" of a deep image-recognition network]

Source: OpenAI

However, even when a neural network demo looks nice and informative, it may not show feature importance information, i.e., which factors drive the model to make a given prediction. The classic TensorFlow MNIST model built with a convolutional neural network allows an easy trace-back of which pixels are driving the prediction:

[Figure: per-pixel contributions to an MNIST digit prediction]

Source: Not another MNIST tutorial with TensorFlow
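
That tutorial's exact code is not reproduced here, but a minimal saliency-map sketch in TensorFlow 2 captures the idea; the trained Keras classifier model and the single 28x28x1 image x are assumed to exist already:

import tensorflow as tf

def saliency_map(model, x):
    """Gradient of the top class score with respect to the input pixels."""
    image = tf.convert_to_tensor(x[None, ...], dtype=tf.float32)  # add a batch dimension
    with tf.GradientTape() as tape:
        tape.watch(image)
        predictions = model(image)
        top_score = tf.reduce_max(predictions[0])  # score of the predicted class
    grads = tape.gradient(top_score, image)          # how each pixel moves that score
    return tf.reduce_max(tf.abs(grads), axis=-1)[0]  # collapse channels -> (28, 28) map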

More recently, the tf-explain package, which builds on TensorFlow, helps generate such trace-backs for image recognition as well:


# Grad-CAM highlights the image regions most responsible for the chosen class.
from tf_explain.callbacks.grad_cam import GradCAMCallback

callbacks = [
    GradCAMCallback(
        validation_data=(x_val, y_val),
        layer_name="activation_1",
        class_index=0,
        output_dir=output_dir,
    )
]

model.fit(x_train, y_train, batch_size=2, epochs=2, callbacks=callbacks)

!tensorboard --logdir logs

[Figure: Grad-CAM outputs displayed in TensorBoard]

Source: tf-explain

Keep in mind that each example above uses methods designed for a specific type of model or problem. There is a lot of exploratory work and hyperparameter tuning to do before such a polished demonstration can be made.

LIME and Shapley Values

Over the years, data scientists and researchers have developed tools to help with model interpretation. Local Interpretable Model-Agnostic Explanations (LIME) is one of them, based on the paper "Why Should I Trust You?": Explaining the Predictions of Any Classifier. The goal is obvious from the title. LIME gives a meaningful estimate of feature importance for a single test point by fitting a simple, interpretable surrogate model in the neighborhood of that point. A closely related approach, Shapley values, comes from game theory and assigns each feature a weight according to its contribution to the final prediction.

[Figure: LIME explanation example]
Source: UC Business Analytics R Programming Guide
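
As a rough sketch of how this looks in practice (the classifier, dataset, and num_features choice are assumptions for illustration, not taken from the paper), the Python lime package explains one prediction at a time:

from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)

# LIME perturbs this one observation, fits a weighted linear surrogate nearby,
# and reports the locally most influential features with their weights.
explanation = explainer.explain_instance(data.data[0], model.predict_proba, num_features=5)
print(explanation.as_list())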

However, in contrast to a global surrogate model, which is an interpretable approximation of the black-box model as a whole, being local is one of LIME's biggest disadvantages. The LIME result for a particular data point, which could well be an abnormal one, does not tell us much about the model as a whole, and the correct interpretation of "local" is itself hard to pin down. To understand the model better, we would need to run the explainer on a large number of data points, and going from local observations to a global picture can cost a lot of computation. Also, a local surrogate model works well when the black-box model is built from smooth functions, but less well when it contains many discrete (categorical) variables.

Concluding Remarks

Even with all these tools, model interpretability remains one of the main challenges in machine learning, and our models keep growing in complexity. From a business perspective, it is one of the most important attributes of a model asset. Researchers are working hard to open up the mysterious yet fascinating "black box" of AI models; only after we understand our models at a deeper level can we take them to a whole new level.

Resources

For readers interested in more detail, Interpretable Machine Learning by Christoph Molnar is a very good online book on this topic.
