Explain NLP models with LIME & SHAP


join(map(str, exp.as_list(label=8))))

This document clearly has the strongest explanation for the label sql.

We also notice that the positive and negative signs are relative to a particular label: for example, the word “sql” is positive towards class sql but negative towards class python, and vice versa.

We are going to generate explanations for the top 2 classes for this document.

exp = explainer.explain_instance(X_test[idx], c.predict_proba, num_features=6, top_labels=2)
print(exp.available_labels())

It gives us sql and python.


exp.show_in_notebook(text=False)

Figure 1

Let me try to explain this visualization:

For this document, the word “sql” has the highest positive score for class sql.

Our model predicts that this document should be labeled as sql with a probability of 100%.

If we remove the word “sql” from the document, we would expect the model to predict label sql with a probability of roughly 100% - 65% = 35%.
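The subtraction above treats the LIME weight as the expected change in probability when the word is removed. A minimal sketch of that arithmetic, together with a direct check that actually deletes the word and re-scores the text, might look like the following. The classifier here (toy_predict_proba) and its weights are made up for illustration and only stand in for the article's c.predict_proba; the 100% and 65% figures are the ones from the example above.

```python
import math


def toy_predict_proba(text):
    """Hypothetical stand-in for a trained pipeline: returns [P(python), P(sql)]."""
    # Made-up per-word weights; a real model would learn these.
    weights = {"sql": 2.0, "select": 1.0, "range": -0.5}
    score = sum(weights.get(tok, 0.0) for tok in text.lower().split())
    p_sql = 1.0 / (1.0 + math.exp(-score))
    return [1.0 - p_sql, p_sql]


def drop_word(text, word):
    """Remove every occurrence of `word` (case-insensitive) from `text`."""
    return " ".join(tok for tok in text.split() if tok.lower() != word.lower())


def lime_estimate_without(prob, weight):
    """LIME-style estimate: predicted probability minus the word's weight."""
    return prob - weight


# The arithmetic from the text: 100% - 65% = 35%.
estimate = lime_estimate_without(1.00, 0.65)

# Direct check on the toy model: delete the word and re-score.
doc = "select rows from sql table"
p_before = toy_predict_proba(doc)[1]
p_after = toy_predict_proba(drop_word(doc, "sql"))[1]
print(f"estimate={estimate:.2f}, before={p_before:.3f}, after={p_after:.3f}")
```

Because LIME fits a local linear approximation, the estimate and the re-scored probability will generally differ somewhat; the weight is a guide to the direction and rough size of the effect.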

On the other hand, the word “sql” is negative for class python, and our model has learned that the word “range” has a small positive score for class python.

We may want to zoom in and study the explanations for class sql, as well as the document itself.


exp.show_in_notebook(text=y_test[idx], labels=(4,))

Figure 2

Interpreting text predictions with SHAP

The following process was learned from this tutorial.


After the model is trained, we use the first 200 training documents as our background data set to integrate over, and to create a SHAP explainer object.

We get the attribution values for individual predictions on a subset of the test set.

Transform the index to words.

Use SHAP’s summary_plot method to show the top features impacting model predictions.
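The index-to-word step deserves a closer look, because the feature names passed to the plot must line up with the column positions of the input matrix. Here is a small sketch, assuming the word index maps each word to a 1-based integer id (as Keras-style tokenizers do) with id 0 reserved for padding. The word_index dictionary and the build_word_lookup helper are hypothetical and not part of the original notebook.

```python
# Hypothetical word index: word -> integer id starting at 1.
# Deliberately listed out of id order to show the lookup does not
# depend on dictionary ordering.
word_index = {"sql": 2, "the": 1, "python": 3}


def build_word_lookup(word_index):
    """Return a list where position i holds the word with id i.

    Position 0 is left as '' because id 0 is assumed to be padding.
    """
    size = max(word_index.values(), default=0) + 1
    lookup = [""] * size
    for word, idx in word_index.items():
        lookup[idx] = word
    return lookup


word_lookup = build_word_lookup(word_index)
print(word_lookup)
```

Placing each word at its own id, rather than appending keys in iteration order, keeps the names aligned even if the dictionary is not sorted by id.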

attrib_data = X_train[:200]
explainer = shap.DeepExplainer(model, attrib_data)
num_explanations = 20
shap_vals = explainer.shap_values(X_test[:num_explanations])

words = processor.word_index
word_lookup = list()
for i in words.keys():
    word_lookup.append(i)
word_lookup = [''] + word_lookup
shap.summary_plot(shap_vals, feature_names=word_lookup, class_names=tag_encoder.classes_)

Figure 3

The word “want” is the biggest signal word used by our model, contributing most to class jquery predictions.

The word “php” is the 4th biggest signal word used by our model, contributing most to class php, of course.

On the other hand, the word “php” is likely to be a negative signal for the other classes, because the word “php” is unlikely to appear in a python document.
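To make the sign convention concrete, here is a toy sketch that computes exact Shapley values for a tiny hand-written scoring function. This is not what DeepExplainer does internally (it approximates these values for deep networks); it only illustrates the quantity being estimated. The words, scores, and function names are all made up for illustration.

```python
from itertools import permutations


def shapley_values(value_fn, n_features):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = [0.0] * n_features
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        present = set()
        prev = value_fn(frozenset(present))
        for feature in order:
            present.add(feature)
            cur = value_fn(frozenset(present))
            totals[feature] += cur - prev
            prev = cur
    return [total / len(orderings) for total in totals]


# Hypothetical vocabulary: feature i is "word i appears in the document".
WORDS = ["php", "want", "array"]


def php_score(present):
    """Made-up score for class php given the set of word ids present."""
    score = 0.1                      # base value when no word is present
    if 0 in present:
        score += 0.6                 # "php" pushes toward class php
    if 1 in present:
        score -= 0.1                 # "want" pushes away from class php
    if 0 in present and 2 in present:
        score += 0.2                 # "php" and "array" together add more
    return score


def python_score(present):
    """Made-up complementary score for a second class."""
    return 1.0 - php_score(present)


php_values = shapley_values(php_score, len(WORDS))
python_values = shapley_values(python_score, len(WORDS))
for word, a, b in zip(WORDS, php_values, python_values):
    print(f"{word}: php={a:+.3f} python={b:+.3f}")
```

In this two-class toy, every word's attribution for one class is the mirror image of its attribution for the other, and the attributions for a class sum to the difference between the full score and the base value.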

There is a lot to learn in terms of machine learning interpretability with LIME & SHAP.

I have only covered a tiny piece for NLP.

The Jupyter notebook can be found on GitHub.

Enjoy the fun!