Prototypes & Criticisms - Explain Black Box Models and Underlying Data Distributions Effectively

Post by R&D Team


How do humans make complex decisions? According to Cohen (1996) and Simon & Newell (1972), humans rely on exemplar-based reasoning. Numerous studies of human reasoning have shown that examples are fundamental to developing effective strategies for complex decision making: we justify our reasoning by recalling experiences that are similar to, or completely different from, the situation at hand. In machine learning, in a supervised learning setting, example-based classifiers have been shown to achieve performance comparable to non-interpretable methods while offering a condensed view of a dataset (Bien and Tibshirani, 2011).

We call these examples prototypes: data points that best summarize and represent the underlying dataset or its distribution, so that they communicate meaningful insights. For example, given a group of loyal customers, we would like to identify a non-loyal customer whose profile closely matches theirs, so we can try to convert that specific customer into a loyal one. Or, in a classification setting, we would like to understand the model behavior and the predictions that are most representative of the underlying data.

An example of over-generalization (Been et al)

However, relying on examples alone is not enough, because it can lead to over-generalization and misunderstanding. Therefore, we couple prototypes with criticisms. Criticisms are the counterpart of prototypes: instances that are not well represented by the set of prototypes. They help us identify where we can expect divergences within our dataset. We’ll get to criticisms in a later article.

Prototypes & Criticisms (Been et al)

As data scientists, we should be able to use prototypes and criticisms independently of any machine learning model to describe the data, but also to create an interpretable model or to make a black-box model interpretable. The explainX framework comes with the ProtoDash algorithm built in: a fast data selection method that selects prototypes together with importance weights.

If you’re curious about the mathematical framework behind the algorithm, we highly recommend reading the original paper.
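As a rough orientation before opening the paper, the selection problem can be summarized as below. The notation is ours and is chosen only for this sketch: x_1..x_n are the points we want to represent, z_1..z_c are the candidate prototypes, k is a kernel such as the Gaussian kernel, and m is the number of prototypes we are willing to keep.

```latex
\mu_j = \frac{1}{n}\sum_{i=1}^{n} k(x_i, z_j), \qquad
K_{jl} = k(z_j, z_l), \qquad j, l = 1, \dots, c

\max_{w \in \mathbb{R}^{c}} \; l(w) = w^{\top}\mu - \tfrac{1}{2}\, w^{\top} K w
\quad \text{subject to} \quad w \ge 0, \;\; \lVert w \rVert_0 \le m

% Greedy step: with S the set of prototypes chosen so far,
% add the candidate with the largest gradient of l at the current w
j^{\ast} = \arg\max_{j \notin S} \, \bigl(\mu - K w\bigr)_j
```

The non-zero entries of w are the importance weights: a larger weight means that prototype accounts for a larger share of the data being summarized.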

Prototypes in Action

So let’s look at a real-life example where prototypical analysis is useful.

Here we consider an e-commerce dataset from a large retailer with two years of customer data, from 2015 to the end of 2016. The dataset covers roughly 80 million customers, of whom 2 million are loyal customers. We also know that 10,000 customers who were regular customers in 2015 became loyal customers in 2016. The goal is to accurately predict a customer's total expenditure and to evaluate whether being a loyal or a regular customer affects behavior independently of the other attributes in the dataset: the number of online visits, geographic region or zip code, average time per visit, average number of pages viewed per visit, brand affinities, and color and finish affinities.

We train a machine learning model on the 2016 data and then evaluate it on the 2015 data for the 10,000 customers who were regular customers in 2015 and loyal customers in 2016. As a proof of concept, we measure how accurately a model built from the 2016 data can predict what these 10,000 customers spent in 2015.

All loyal customers were included in the training set. Because the dataset is so imbalanced, with few instances of converted customers, what can we do to increase the accuracy of our machine learning model? To address this problem, we can use the ProtoDash algorithm to choose prototypes from the regular customer base that best represent the loyalty group.
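To make the idea concrete, here is a minimal sketch of greedy, weighted prototype selection in the spirit of ProtoDash. It is a simplified illustration, not the explainX implementation: the function names, the Gaussian kernel width, the projected-gradient solver, and the synthetic "loyal" and "regular" arrays are all our assumptions for this example.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def solve_nonneg_qp(K, mu, iters=500):
    """Approximately maximize w.mu - 0.5 * w.K.w subject to w >= 0
    using projected gradient ascent (a simple stand-in for a QP solver)."""
    step = 1.0 / np.linalg.eigvalsh(K)[-1]  # 1 / largest eigenvalue of K
    w = np.zeros(len(mu))
    for _ in range(iters):
        w = np.maximum(w + step * (mu - K @ w), 0.0)
    return w

def select_prototypes(target, candidates, m, sigma=1.0):
    """Greedily pick up to m rows of `candidates` that best represent `target`.
    Returns the chosen row indices and their non-negative importance weights."""
    mu = gaussian_kernel(candidates, target, sigma).mean(axis=1)
    K = gaussian_kernel(candidates, candidates, sigma)
    chosen, w = [], np.zeros(0)
    for _ in range(m):
        grad = mu.copy()
        if chosen:
            grad -= K[:, chosen] @ w   # gradient of the objective at current w
            grad[chosen] = -np.inf     # never pick the same candidate twice
        j = int(np.argmax(grad))
        if grad[j] <= 0:               # no remaining candidate improves the fit
            break
        chosen.append(j)
        w = solve_nonneg_qp(K[np.ix_(chosen, chosen)], mu[chosen])
    return np.array(chosen), w

# Synthetic stand-in data (hypothetical, two features per customer):
# most regular customers look nothing like the loyal group, a few do.
rng = np.random.default_rng(0)
loyal = rng.normal(5.0, 0.5, size=(100, 2))
regular = np.vstack([
    rng.normal(0.0, 0.5, size=(200, 2)),  # rows 0..199: unlike loyal customers
    rng.normal(5.0, 0.5, size=(20, 2)),   # rows 200..219: similar to loyal customers
])

chosen, weights = select_prototypes(loyal, regular, m=3)
print("prototype rows:", chosen, "weights:", np.round(weights, 3))
```

In this toy setup the first prototype comes from the small group of regular customers that resembles the loyal group, which is the behavior we want when building a training set for the converted customers.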

After selecting prototypes this way, we compare the Root Mean Squared Error (RMSE) and see that building the training set with ProtoDash leads to a lower RMSE. It also helps reduce bias, because the model is now trained on data that is more representative of the underlying distribution. This supports consistency, fairer AI, and improved accuracy.

ProtoDash performance results in lower RMSE of the regression model.
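For reference, the comparison above boils down to training on one year and scoring on the other with RMSE. The sketch below shows that evaluation loop under our own assumptions: a plain least-squares linear model stands in for whatever regressor is actually used, the optional sample weights are where prototype importance weights could be plugged in, and the tiny arrays are made-up numbers rather than the retailer's data.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error between two equally long sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def fit_linear(X, y, sample_weight=None):
    """Least-squares linear model with an intercept.
    sample_weight (optional) can carry prototype importance weights."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])
    if sample_weight is not None:
        sw = np.sqrt(np.asarray(sample_weight, dtype=float))
        A, y = A * sw[:, None], y * sw
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Apply the coefficients returned by fit_linear."""
    X = np.asarray(X, dtype=float)
    return coef[0] + X @ coef[1:]

# Made-up example: one feature (online visits) and total spend.
X_2016 = [[1.0], [2.0], [3.0], [4.0]]
y_2016 = [3.0, 5.0, 7.0, 9.0]
X_2015 = [[5.0], [6.0]]
y_2015 = [11.0, 13.0]

coef = fit_linear(X_2016, y_2016)           # train on 2016
holdout_rmse = rmse(y_2015, predict_linear(coef, X_2015))  # score on 2015
print("RMSE on 2015 hold-out:", holdout_rmse)
```

Running the same loop once per candidate training set (for example, a random sample of regular customers versus ProtoDash prototypes) gives the RMSE values to compare.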

The algorithm can also be used in a loan approval application. After a machine learning model predicts a customer’s loan approval status, a loan officer needs to understand how and why the model reached that prediction in order to make an informed and trusted decision. Prototypical analysis works with existing predictive models, so the officer can clearly show how the current customer compares to other customers who had a similar profile and repayment record and received the same prediction.

For instance, if the AI model recommended that Bob be denied the loan, prototypical analysis takes that prediction and identifies customers with similar profiles: Alan and Alicia, who share with Bob many important features that lenders consider red flags. This information helps evaluate the applicant’s risk and helps the loan officer identify areas that need improvement. With prototypical analysis, a loan officer can build a repeatable decision-making system in which predictions from the AI model are fed in and then confirmed with confidence by the human in the loop.
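The Bob example can be illustrated with a very small lookup. Note that this sketch uses a plain nearest-neighbour search over standardized features as a stand-in for prototype selection, and that the feature names, the numbers, and the extra customers Carol and Dave are invented for illustration.

```python
import numpy as np

def similar_profiles(applicant, customers, predictions, applicant_prediction, k=2):
    """Return the indices (and distances) of the k past customers who received
    the same model prediction as the applicant and are closest to the applicant
    in standardized feature space."""
    customers = np.asarray(customers, dtype=float)
    applicant = np.asarray(applicant, dtype=float)
    predictions = np.asarray(predictions)
    same = np.flatnonzero(predictions == applicant_prediction)
    scale = customers.std(axis=0)
    scale[scale == 0] = 1.0                      # guard against constant features
    dist = np.linalg.norm((customers[same] - applicant) / scale, axis=1)
    order = np.argsort(dist)[:k]
    return same[order], dist[order]

# Hypothetical features: [debt-to-income ratio, late payments in the last year]
names = ["Alan", "Alicia", "Carol", "Dave"]
customers = [[0.60, 4], [0.65, 5], [0.10, 0], [0.20, 1]]
predictions = ["denied", "denied", "approved", "denied"]  # model outputs

bob = [0.62, 4]
idx, dist = similar_profiles(bob, customers, predictions, "denied", k=2)
for i, d in zip(idx, dist):
    print(f"{names[i]} (distance {d:.2f})")
```

With these toy numbers the two closest denied customers are Alan and Alicia, which is the kind of evidence a loan officer could review alongside the model's prediction.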

Prototypical analysis with explainX

explainX uses an optimized version of the ProtoDash algorithm that works with any black-box machine learning model to identify similar prototypes and strengthen your decision making. Interested in giving it a try? Try out our open-source library.

Looking to implement xAI in your organization? Contact us now
