Analyze
Analysis and model explainability functions in PyCaret
plot_model
This function analyzes the performance of a trained model on the hold-out set. It may require re-training the model in certain cases.
Example
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')
# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')
# creating a model
lr = create_model('lr')
# plot model
plot_model(lr, plot = 'auc')
Change the scale
The resolution scale of the figure can be changed with the scale parameter.
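A minimal sketch of the scale parameter, assuming lr is the classifier trained in the example above. The wrapper name plot_auc_scaled and the value 3 are illustrative choices, not part of PyCaret.

```python
def plot_auc_scaled(model, scale=3):
    """Draw the AUC plot of a trained classifier at a higher figure resolution."""
    # Lazy import: PyCaret is only needed when the function is actually called.
    from pycaret.classification import plot_model

    # scale multiplies the figure resolution (PyCaret's default is 1).
    return plot_model(model, plot="auc", scale=scale)
```

After setup and create_model, calling plot_auc_scaled(lr) should render the same AUC plot at a higher resolution.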

Save the plot
You can save the plot as a png file using the save parameter.
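As a hedged illustration, saving the AUC plot could look like the sketch below. The helper name save_auc_plot is hypothetical; only plot_model and its save parameter come from PyCaret.

```python
def save_auc_plot(model):
    """Write the AUC plot of a trained classifier to disk."""
    # Lazy import: PyCaret is only needed when the function is actually called.
    from pycaret.classification import plot_model

    # save=True asks PyCaret to save the figure as an image file.
    return plot_model(model, plot="auc", save=True)
```

With save=True the file is expected to land in the current working directory; check your installed version if you need a different location.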

Customize the plot
PyCaret uses Yellowbrick for most of the plotting work. Any argument accepted by the corresponding Yellowbrick visualizer can be passed through the plot_kwargs parameter.
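The sketch below shows one way plot_kwargs might be used: it forwards percent=True, an option of Yellowbrick's ConfusionMatrix visualizer, to the confusion matrix plot. The helper name is hypothetical.

```python
def plot_confusion_matrix_percent(model):
    """Draw the confusion matrix with cell values shown as percentages."""
    # Lazy import: PyCaret is only needed when the function is actually called.
    from pycaret.classification import plot_model

    # plot_kwargs is handed to the underlying Yellowbrick visualizer;
    # percent is a keyword of Yellowbrick's ConfusionMatrix.
    return plot_model(
        model,
        plot="confusion_matrix",
        plot_kwargs={"percent": True},
    )
```

Keywords that the chosen visualizer does not recognize will most likely raise an error, so the dictionary should match the plot type.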



Use train data
To draw a plot on the training data instead of the hold-out set, pass use_train_data=True to the plot_model function.
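To compare the two views, a sketch along these lines draws the same plot twice. It assumes your installed PyCaret version accepts use_train_data in plot_model, as the text above describes; the helper name is illustrative.

```python
def plot_auc_train_and_holdout(model):
    """Draw the AUC plot on the training data, then on the hold-out set."""
    # Lazy import: PyCaret is only needed when the function is actually called.
    from pycaret.classification import plot_model

    # First plot: scored on the training data (assumes use_train_data is supported).
    plot_model(model, plot="auc", use_train_data=True)
    # Second plot: default behaviour, scored on the hold-out set.
    plot_model(model, plot="auc")
```

A large gap between the two curves is a common sign of overfitting.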

Plot on train data vs. hold-out data


Examples by module
Classification
| Plot Name | Plot |
| --- | --- |
| Area Under the Curve | 'auc' |
| Discrimination Threshold | 'threshold' |
| Precision Recall Curve | 'pr' |
| Confusion Matrix | 'confusion_matrix' |
| Class Prediction Error | 'error' |
| Classification Report | 'class_report' |
| Decision Boundary | 'boundary' |
| Recursive Feature Selection | 'rfe' |
| Learning Curve | 'learning' |
| Manifold Learning | 'manifold' |
| Calibration Curve | 'calibration' |
| Validation Curve | 'vc' |
| Dimension Learning | 'dimension' |
| Feature Importance (top 10) | 'feature' |
| Feature Importance (all) | 'feature_all' |
| Model Hyperparameter | 'parameter' |
| Lift Curve | 'lift' |
| Gain Curve | 'gain' |
| KS Statistic Plot | 'ks' |
Regression
| Name | Plot |
| --- | --- |
| Residuals Plot | 'residuals' |
| Prediction Error Plot | 'error' |
| Cook's Distance Plot | 'cooks' |
| Recursive Feature Selection | 'rfe' |
| Learning Curve | 'learning' |
| Validation Curve | 'vc' |
| Manifold Learning | 'manifold' |
| Feature Importance (top 10) | 'feature' |
| Feature Importance (all) | 'feature_all' |
| Model Hyperparameter | 'parameter' |
Clustering
| Name | Plot |
| --- | --- |
| Cluster PCA Plot (2d) | 'cluster' |
| Cluster t-SNE (3d) | 'tsne' |
| Elbow Plot | 'elbow' |
| Silhouette Plot | 'silhouette' |
| Distance Plot | 'distance' |
| Distribution Plot | 'distribution' |
Anomaly Detection
| Name | Plot |
| --- | --- |
| t-SNE (3d) Dimension Plot | 'tsne' |
| UMAP Dimensionality Plot | 'umap' |


evaluate_model
The evaluate_model function displays a user interface for analyzing the performance of a trained model. It calls the plot_model function internally.
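A minimal sketch of the call, assuming a classifier trained with create_model in an active setup. The wrapper name open_evaluation_ui is hypothetical, and the interface is expected to render only in a notebook environment.

```python
def open_evaluation_ui(model):
    """Open PyCaret's interactive evaluation interface for a trained model."""
    # Lazy import: PyCaret is only needed when the function is actually called.
    from pycaret.classification import evaluate_model

    # The interface lets you switch between the plots that plot_model can draw.
    return evaluate_model(model)
```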

interpret_model
This function analyzes the predictions generated by a trained model. Most of its plots are based on SHAP (SHapley Additive exPlanations). For more information, see https://shap.readthedocs.io/en/latest/
Example
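The sketch below shows the simplest call. It assumes the shap package is installed and that model is a tree-based estimator, for instance one returned by create_model('xgboost'); the helper name shap_summary is illustrative.

```python
def shap_summary(model):
    """Draw the default SHAP summary plot for a trained model."""
    # Lazy import: PyCaret is only needed when the function is actually called.
    from pycaret.classification import interpret_model

    # With no plot argument, interpret_model falls back to its default plot type.
    return interpret_model(model)
```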

Save the plot
You can save the plot as a png file using the save parameter.
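Saving works the same way as for plot_model. The following is a sketch under that assumption, with a hypothetical helper name.

```python
def save_shap_summary(model):
    """Save the default interpretation plot of a trained model as an image."""
    # Lazy import: PyCaret is only needed when the function is actually called.
    from pycaret.classification import interpret_model

    # save=True asks PyCaret to write the figure to disk.
    return interpret_model(model, save=True)
```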
Change plot type
Several plot types are available; select one with the plot parameter.
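One way to switch plot types safely is to validate the name before calling PyCaret, as in this sketch. The tuple of names reflects the plot types covered in this section and is an assumption about your installed version; interpret_with is a hypothetical helper.

```python
# Plot names this sketch accepts. The list is an assumption based on the plot
# types described in this section and may differ between PyCaret versions.
INTERPRET_PLOTS = ("summary", "correlation", "pdp", "msa", "pfi", "reason")


def interpret_with(model, plot="summary"):
    """Call interpret_model with a plot type checked against INTERPRET_PLOTS."""
    if plot not in INTERPRET_PLOTS:
        raise ValueError(
            f"Unknown plot {plot!r}; expected one of {INTERPRET_PLOTS}"
        )
    # Lazy import: PyCaret is only needed when the function is actually called.
    from pycaret.classification import interpret_model

    return interpret_model(model, plot=plot)
```

For example, interpret_with(model, plot="msa") would request the Morris sensitivity analysis and interpret_with(model, plot="pfi") the permutation feature importance.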
Correlation

By default, PyCaret uses the first feature in the dataset, but this can be changed using the feature parameter.
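A sketch of choosing the feature explicitly. The helper name dependence_plot is hypothetical, and the feature name must match a column of the dataset passed to setup.

```python
def dependence_plot(model, feature, plot="correlation"):
    """Draw a feature-specific interpretation plot for the given column."""
    # Lazy import: PyCaret is only needed when the function is actually called.
    from pycaret.classification import interpret_model

    # feature selects the column to analyze instead of the default first one.
    return interpret_model(model, plot=plot, feature=feature)
```

With the diabetes data, dependence_plot(model, "Age (years)") would be one possible call, assuming that column name is unchanged after setup. Passing plot="pdp" reuses the same helper for a partial dependence plot.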

Partial Dependence Plot

By default, PyCaret uses the first available feature in the dataset but this can be changed using the feature parameter.

Morris Sensitivity Analysis

Permutation Feature Importance

Reason Plot

When you generate a reason plot without passing a specific index from the test data, an interactive plot is displayed that lets you select the x-axis and y-axis. This works only in Jupyter Notebook or an equivalent environment. To see the plot for a specific observation, pass its index in the observation parameter.
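A minimal sketch of the single-observation case, with a hypothetical helper whose default is observation=1:

```python
def reason_plot(model, observation=1):
    """Draw the reason plot for one observation of the test set."""
    # Lazy import: PyCaret is only needed when the function is actually called.
    from pycaret.classification import interpret_model

    # observation is the index of the row to explain.
    return interpret_model(model, plot="reason", observation=observation)
```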

Here, observation = 1 refers to index 1 of the test set.
Use train data
By default, all plots are generated on the test dataset. To generate plots on the training dataset instead (not recommended), use the use_train_data parameter.
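For completeness, a sketch of the training-data variant. It assumes interpret_model in your installed version accepts use_train_data; the helper name is illustrative.

```python
def shap_summary_on_train(model):
    """Draw the default interpretation plot using the training data."""
    # Lazy import: PyCaret is only needed when the function is actually called.
    from pycaret.classification import interpret_model

    # use_train_data=True switches from the test set to the training set.
    return interpret_model(model, use_train_data=True)
```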

dashboard
The dashboard function generates an interactive dashboard for a trained model. The dashboard is implemented using ExplainerDashboard (explainerdashboard.readthedocs.io).
Dashboard Example
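A minimal sketch, assuming the explainerdashboard package is installed and model was trained in an active classification setup. The wrapper name launch_dashboard is hypothetical.

```python
def launch_dashboard(model):
    """Start the interactive ExplainerDashboard for a trained model."""
    # Lazy import: PyCaret is only needed when the function is actually called.
    from pycaret.classification import dashboard

    # Builds the explainer for the model and serves the dashboard.
    return dashboard(model)
```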



check_fairness
There are many approaches to conceptualizing fairness. The check_fairness function follows the approach known as group fairness, which asks: which groups of individuals are at risk of experiencing harm? check_fairness provides fairness-related metrics between different groups (also called subpopulations).
Check Fairness Example
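The sketch below passes the grouping columns through the sensitive_features argument. The helper name fairness_report is hypothetical, and the column names in the usage note are placeholders for whatever demographic columns your dataset contains.

```python
def fairness_report(model, sensitive_features):
    """Compute group fairness metrics for a trained model."""
    # Lazy import: PyCaret is only needed when the function is actually called.
    from pycaret.classification import check_fairness

    # sensitive_features is a list of column names that define the groups.
    return check_fairness(model, sensitive_features=list(sensitive_features))
```

On a dataset with demographic columns, a call might look like fairness_report(model, ["sex", "race"]).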


get_leaderboard
This function returns the leaderboard of all models trained in the current setup.

You can also access the trained Pipeline with this.
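A sketch of both steps: fetching the leaderboard and pulling a trained pipeline out of it. It assumes the leaderboard is a pandas DataFrame whose 'Model' column holds the trained pipelines; the helper name is hypothetical.

```python
def first_leaderboard_pipeline():
    """Return the leaderboard and the pipeline stored in its first row."""
    # Lazy import: PyCaret is only needed when the function is actually called.
    from pycaret.classification import get_leaderboard

    # One row per model trained in the current setup.
    leaderboard = get_leaderboard()
    # Assumption: the 'Model' column contains the trained pipeline objects.
    pipeline = leaderboard.iloc[0]["Model"]
    return leaderboard, pipeline
```

The first row is not necessarily the best model, so sort the leaderboard by your metric of choice before indexing if that matters.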

assign_model
This function assigns labels to the training dataset using the trained model. It is available for Clustering, Anomaly Detection, and NLP modules.
Clustering
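A minimal sketch for the clustering module, assuming model comes from pycaret.clustering.create_model (for example a k-means model). The helper name label_clusters is illustrative.

```python
def label_clusters(model):
    """Return the training data with the cluster label assigned to each row."""
    # Lazy import: PyCaret is only needed when the function is actually called.
    from pycaret.clustering import assign_model

    # The result is the training dataset with the assigned labels added.
    return assign_model(model)
```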

Anomaly Detection
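The anomaly detection module exposes the same function. This sketch assumes model comes from pycaret.anomaly.create_model (for example an isolation forest); the helper name is hypothetical.

```python
def label_anomalies(model):
    """Return the training data with anomaly labels assigned to each row."""
    # Lazy import: PyCaret is only needed when the function is actually called.
    from pycaret.anomaly import assign_model

    # The result is the training dataset with the anomaly labels added.
    return assign_model(model)
```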
