
explainX.ai
Welcome!

explainX.ai is an end-to-end explainable AI framework for model developers to fully understand, explain & debug their machine learning models.


Your Sample Use Case

Any data scientist can easily follow these instructions line-by-line to build their first use case. You just need to open up your Jupyter notebook to get started.

Please note that this example will only work on localhost. If you are running your notebook in the cloud, please visit our documentation for a step-by-step example.

Open up your terminal and install the explainx library. As we are using a Random Forest model for this example, make sure you have the scikit-learn library installed as well.
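For example, with pip (scikit-learn is the package that provides sklearn):

```bash
pip install explainx scikit-learn
```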

Once you are done installing the library, open up your Jupyter notebook and import the required modules.
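A minimal sketch of a first cell, following the quick-start style of the explainX README:

```python
# explainX plus the scikit-learn pieces used in this example
from explainx import *
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
```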

Let's load the HELOC Loan Application Approval dataset by FICO that is already available in the explainX library.
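A sketch, assuming the bundled dataset loader from the explainX quick start (explainx.dataset_heloc()):

```python
# X_data: pandas DataFrame of features, Y_data: labels
X_data, Y_data = explainx.dataset_heloc()
```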

Let's also split our dataset into training and testing.
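Using scikit-learn's train_test_split, holding out 30% of the data for testing (the split ratio is just an example):

```python
X_train, X_test, Y_train, Y_test = train_test_split(
    X_data, Y_data, test_size=0.3, random_state=0
)
```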

As this is a classification problem, we will be training a simple RandomForestClassifier.
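For instance:

```python
# Train a simple Random Forest classifier on the training split
model = RandomForestClassifier()
model.fit(X_train, Y_train)
```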

After the training is complete, we just need to pass the model and our test datasets into the explainx function.
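A sketch, assuming the explainx.ai() entry point from the quick start (the model_name value below is the example used there):

```python
# Start the explainX dashboard for the trained model and the test data
explainx.ai(X_test, Y_test, model, model_name="randomforest")
```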

The function will take a few seconds or minutes, depending on the size of the data.
Once it has executed successfully, simply click on the link to view the dashboard and start explaining the model.

Analyzing the explainX.ai Dashboard

Congratulations on successfully running the explainX module. We now have access to the dashboard, where we can explain and debug our black-box models.

1. View Your Dataset

Get a snapshot of your test data.

2. Feature Influence on a Global Level

Right off the bat, explainX offers feature attributions that use techniques like SHAP, LIME & Integrated Gradients on the backend.
By utilizing this information, users can easily uncover how much each variable contributed to the overall model output.

On the right, we have the insights generator, which explains what this graph means to the model developer. By exposing the model developer to concepts like average probability, it helps them better quantify the impact each variable has on the model output.

Aggregate Feature Importance

The global feature importance identifies the top variables with the highest impact, according to the model, on the overall prediction.

Global Feature Impact

The global feature impact identifies whether each variable had an overall positive or an overall negative impact, according to the model, on the overall prediction.

3. Regional Level Explanation Using SQL

We can figure out how our model behaves on different subsets of our data. For example, the model might behave differently on customers in NYC than it does on customers in LA.
Using a SQL query to add these rules and explore behavior on a subset of the data therefore becomes extremely valuable, as in the sketch below.
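As a purely hypothetical example (ExternalRiskEstimate is a real column in the HELOC dataset, but the exact query syntax the dashboard accepts may differ; check the docs):

```sql
-- Restrict the analysis to applicants with a higher external risk estimate
SELECT * FROM data WHERE ExternalRiskEstimate > 70
```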

4. Local Level Explanation for a Single Prediction Point

Next, you can zoom in on a single instance and examine what the model predicts for this input, and explain why.
The model might behave differently for each prediction point, i.e. each new customer, and it is important to identify that.

One of the cool things about explainX is that our local level explanation is coupled with a what-if analysis form. The purpose of this form is to enable data scientists and business users to simulate various scenarios and explore how the model behaves.

Local Feature Impact

Local feature impact narrows down the global feature impact graph: a decision plot shows how much each feature contributed to the model's prediction for a specific data point.

Similar Prototypes

We also understand that humans think through similar examples, so to dig deeper into how the model arrived at a prediction, users can generate profiles from within the dataset that have similar attributes. These similar profiles show how the model's predictions line up with the ground-truth values.

5. Feature Interactions

If a machine learning model makes a prediction based on two features, we can decompose the prediction into four terms: a constant term, a term for the first feature, a term for the second feature, and a term for the interaction between the two features, as written out below.
The interaction between two features is the change in the prediction that occurs by varying the features jointly, after accounting for the individual feature effects.
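In standard notation (this is the general two-feature decomposition, not notation specific to explainX):

```latex
f(x_1, x_2) = \beta_0 + f_1(x_1) + f_2(x_2) + f_{12}(x_1, x_2)
% \beta_0: constant term; f_1, f_2: individual feature effects;
% f_{12}: the interaction left over after the individual effects
```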

Partial Dependence Plot (PDP)

The partial dependence plot (PDP or PD plot for short) shows the marginal effect one or two features have on the predicted outcome of a machine learning model. In this example, we can see that as AGE increases, its impact on the predicted outcome also increases, resulting in a higher prediction (mean value of the house), represented by the increasing color contrast.
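The dashboard draws this plot for you, but as a rough standalone sketch of the same idea, scikit-learn ships its own PDP utility (not part of explainX; the feature name is just an example column from the HELOC data):

```python
from sklearn.inspection import PartialDependenceDisplay

# Partial dependence of the model's prediction on a single feature
# (requires scikit-learn >= 1.0)
PartialDependenceDisplay.from_estimator(model, X_test, features=["ExternalRiskEstimate"])
```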

Summary Plot

The summary plot combines feature importance with feature effects. Each point on the summary plot is an impact value for a feature and an instance. The position on the y-axis is determined by the feature and on the x-axis by the impact value. The color represents the value of the feature from low to high.
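A comparable plot can be produced directly with the shap library (a sketch, assuming the tree-based Random Forest trained above):

```python
import shap

# Compute SHAP values for the tree-based model
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# For a binary classifier, shap may return one array per class;
# if so, plot the beeswarm summary for the positive class
if isinstance(shap_values, list):
    shap_values = shap_values[1]
shap.summary_plot(shap_values, X_test)
```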

6. Data Distributions

Histograms & Violin Plots

Use the histogram to get the distribution of your numerical or categorical variables.
Use joint violin plots to get a basic statistical summary, such as the mean, median, mode, and quartiles (Q1, Q3), for your variables. You can also overlay the distribution of the predicted variable on top of your other variables to examine a joint distribution.
