# Analyze

## plot\_model

This function analyzes the performance of a trained model on the hold-out set. It may require re-training the model in certain cases.

### **Example**

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# creating a model
lr = create_model('lr')

# plot model
plot_model(lr, plot = 'auc')
```

{% endcode %}

![Output from plot\_model(lr, plot = 'auc')](/files/RC4MdwOXcYUm6Hsq6vLm)

### **Change the scale**

The resolution scale of the figure can be changed with the `scale` parameter.

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# creating a model
lr = create_model('lr')

# plot model
plot_model(lr, plot = 'auc', scale = 3)
```

{% endcode %}

![Output from plot\_model(lr, plot = 'auc', scale = 3)](/files/aV03M0PGnDPDxCRBHKvS)
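As a rough guide, `scale` acts as a multiplier on the figure's render resolution. Assuming matplotlib's default figure size and DPI (an assumption; your PyCaret and matplotlib defaults may differ), the output pixel dimensions can be estimated as:

```python
# Estimate output pixel size for plot_model(..., scale=3), assuming the
# default matplotlib figure of 6.4 x 4.8 inches at 100 dpi (assumed values).
width_in, height_in, dpi, scale = 6.4, 4.8, 100, 3

width_px = int(width_in * dpi * scale)    # 1920
height_px = int(height_in * dpi * scale)  # 1440
print(width_px, height_px)
```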

### Save the plot

You can save the plot as a `png` file using the `save` parameter.

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# creating a model
lr = create_model('lr')

# plot model
plot_model(lr, plot = 'auc', save = True)
```

{% endcode %}

![Output from plot\_model(lr, plot = 'auc', save = True)](/files/rZ11AjT5BJlPjdy5lckJ)
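`save=True` writes the figure to the current working directory. Below is a small follow-up sketch for collecting saved plots into one folder; the filename `AUC.png` is an assumption (it varies by plot and PyCaret version), and a dummy file stands in for the real output so the sketch runs on its own:

```python
from pathlib import Path
import shutil

# plot_model(lr, plot='auc', save=True) drops the figure into the working
# directory; 'AUC.png' is the assumed filename for the AUC plot.
saved = Path('AUC.png')
saved.write_bytes(b'')  # stand-in for the real PNG so this sketch is runnable

# collect plots into a dedicated folder under a descriptive name
plots_dir = Path('plots')
plots_dir.mkdir(exist_ok=True)
target = plots_dir / 'lr_auc.png'
shutil.move(str(saved), str(target))
```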

### Customize the plot

PyCaret uses [Yellowbrick](https://www.scikit-yb.org/en/latest/) for most of its plotting. Any argument that is accepted by the Yellowbrick visualizers can be passed through the `plot_kwargs` parameter.

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# creating a model
lr = create_model('lr')

# plot model
plot_model(lr, plot = 'confusion_matrix', plot_kwargs = {'percent' : True})
```

{% endcode %}

![Output from plot\_model(lr, plot = 'confusion\_matrix', plot\_kwargs = {'percent' : True})](/files/2UQ5spY7vdHUnMVLBO5w)

{% tabs %}
{% tab title="Before Customization" %}
![](/files/0dE7c9Bb4UwcdJnKZgR7)
{% endtab %}

{% tab title="After Customization" %}
![](/files/Ys5y5vBT0Q5PS65f4vGI)
{% endtab %}
{% endtabs %}

### Use train data

If you want to generate the plot on the training data instead of the hold-out set, pass `use_train_data=True` to the `plot_model` function.

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# creating a model
lr = create_model('lr')

# plot model
plot_model(lr, plot = 'auc', use_train_data = True)
```

{% endcode %}

![Output from plot\_model(lr, plot = 'auc', use\_train\_data = True)](/files/cWUyFJHrRYNIbVckY86m)

#### Plot on train data vs. hold-out data

{% tabs %}
{% tab title="Train Data" %}
![](/files/bkmlEbPcEFqSjfTxb4pD)
{% endtab %}

{% tab title="Hold-out Data" %}
![](/files/ucV4FmAWxTrxrj7UjZ6y)
{% endtab %}
{% endtabs %}

### **Examples by module**

#### **Classification**

| **Plot Name**               | **Plot**            |
| --------------------------- | ------------------- |
| Area Under the Curve        | 'auc'               |
| Discrimination Threshold    | 'threshold'         |
| Precision-Recall Curve      | 'pr'                |
| Confusion Matrix            | 'confusion\_matrix' |
| Class Prediction Error      | 'error'             |
| Classification Report       | 'class\_report'     |
| Decision Boundary           | 'boundary'          |
| Recursive Feature Selection | 'rfe'               |
| Learning Curve              | 'learning'          |
| Manifold Learning           | 'manifold'          |
| Calibration Curve           | 'calibration'       |
| Validation Curve            | 'vc'                |
| Dimension Learning          | 'dimension'         |
| Feature Importance (top 10) | 'feature'           |
| Feature Importance (all)    | 'feature\_all'      |
| Model Hyperparameter        | 'parameter'         |
| Lift Curve                  | 'lift'              |
| Gain Curve                  | 'gain'              |
| KS Statistic Plot           | 'ks'                |

{% tabs %}
{% tab title="auc" %}
![](/files/eHGfcQZQaHH8XkYiEgRQ)
{% endtab %}

{% tab title="confusion\_matrix" %}
![](/files/p0xMHQ2KCqFwxJp3pSRP)
{% endtab %}

{% tab title="threshold" %}
![](/files/E8VAzVutosUeogvMLrqH)
{% endtab %}

{% tab title="pr" %}
![](/files/N8VfJYMsJD57BhycycZi)
{% endtab %}

{% tab title="error" %}
![](/files/h7LxvTSYKuaTSMPIsc9Y)
{% endtab %}

{% tab title="class\_report" %}
![](/files/j36PMiMVs8XAcxshsYiq)
{% endtab %}

{% tab title="rfe" %}
![](/files/fH5PoDJyejpUurhSUYEa)
{% endtab %}

{% tab title="learning" %}
![](/files/9WjZai6DOL1xtLhmYTa8)
{% endtab %}

{% tab title="vc" %}
![](/files/vNy8V1V6bPJ24iGaQimt)
{% endtab %}
{% endtabs %}

{% tabs %}
{% tab title="feature" %}
![](/files/9eOJJxxrlpBfH5DuTLXv)
{% endtab %}

{% tab title="manifold" %}
![](/files/rHSOdbYCFHfsU2tcug95)
{% endtab %}

{% tab title="calibration" %}
![](/files/e3gOioH8AzznvYh5fqP1)
{% endtab %}

{% tab title="dimension" %}
![](/files/MDPV7o70ZpJajac2LaJY)
{% endtab %}

{% tab title="boundary" %}
![](/files/B9wc3ythtLOJ1HbPh2IR)
{% endtab %}

{% tab title="lift" %}
![](/files/Wnd6VPIzqGkfyDqdzeSI)
{% endtab %}

{% tab title="gain" %}
![](/files/eKQPBezNjHaZUjOdqmHl)
{% endtab %}

{% tab title="ks" %}
![](/files/GD2iE4aYnMnihJda8vGC)
{% endtab %}

{% tab title="parameter" %}
![](/files/0aEpcckQ5VJPpw1fiIts)
{% endtab %}
{% endtabs %}

#### Regression

| **Name**                    | **Plot**       |
| --------------------------- | -------------- |
| Residuals Plot              | 'residuals'    |
| Prediction Error Plot       | 'error'        |
| Cooks Distance Plot         | 'cooks'        |
| Recursive Feature Selection | 'rfe'          |
| Learning Curve              | 'learning'     |
| Validation Curve            | 'vc'           |
| Manifold Learning           | 'manifold'     |
| Feature Importance (top 10) | 'feature'      |
| Feature Importance (all)    | 'feature\_all' |
| Model Hyperparameter        | 'parameter'    |

{% tabs %}
{% tab title="residuals" %}
![](/files/FB8hR2S8g2L0myv6GwwX)
{% endtab %}

{% tab title="error" %}
![](/files/58wiBFOyQNHGcWtos13D)
{% endtab %}

{% tab title="cooks" %}
![](/files/4dLIRugOxSzOnvWe2RsB)
{% endtab %}

{% tab title="rfe" %}
![](/files/iI0sp4XZkuh9UKEZ2bMl)
{% endtab %}

{% tab title="feature" %}
![](/files/rqFVB0udUqwQjCKiQmIT)
{% endtab %}

{% tab title="learning" %}
![](/files/B1hxSfngcRSNEsyI4ho6)
{% endtab %}

{% tab title="vc" %}
![](/files/OB2hawOpKNantFlhXN1p)
{% endtab %}

{% tab title="manifold" %}
![](/files/upGsA1HVBr2xYfDpSQyQ)
{% endtab %}
{% endtabs %}

#### Clustering

| **Name**                | **Plot**       |
| ----------------------- | -------------- |
| Cluster PCA Plot (2d)   | 'cluster'      |
| Cluster t-SNE Plot (3d) | 'tsne'         |
| Elbow Plot              | 'elbow'        |
| Silhouette Plot         | 'silhouette'   |
| Distance Plot           | 'distance'     |
| Distribution Plot       | 'distribution' |

{% tabs %}
{% tab title="cluster" %}
![](/files/hHV2BAS2IOJsdO1JPyLf)
{% endtab %}

{% tab title="tsne" %}
![](/files/o805cFHCXld6VZrV8HxQ)
{% endtab %}

{% tab title="elbow" %}
![](/files/lOCq4aryXb0PetEIJlPZ)
{% endtab %}

{% tab title="silhouette" %}
![](/files/GD5uO6TYvUNnmblRDoWX)
{% endtab %}

{% tab title="distance" %}
![](/files/2CTYMpac6i9qF1Rx6S0i)
{% endtab %}

{% tab title="distribution" %}
![](/files/w7jiN5Sro6JMtViHU2X6)
{% endtab %}
{% endtabs %}

#### Anomaly Detection

| **Name**                  | **Plot** |
| ------------------------- | -------- |
| t-SNE (3d) Dimension Plot | 'tsne'   |
| UMAP Dimensionality Plot  | 'umap'   |

{% tabs %}
{% tab title="tsne" %}
![](/files/htuqkWx3HZX66FprSE6I)
{% endtab %}

{% tab title="umap" %}
![](/files/k8r7Yo44MaeV1vdB0ZLG)
{% endtab %}
{% endtabs %}

## evaluate\_model

The `evaluate_model` function displays a user interface for analyzing the performance of a trained model. It calls the [plot\_model](#plot_model) function internally.

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
juice = get_data('juice')

# init setup
from pycaret.classification import *
exp_name = setup(data = juice,  target = 'Purchase')

# create model
lr = create_model('lr')

# launch evaluate widget
evaluate_model(lr)
```

{% endcode %}

![Output from evaluate\_model(lr)](/files/5wWdSsTivrAUuysdLqyl)

{% hint style="info" %}
**NOTE:** This function only works in Jupyter Notebook or an equivalent environment.
{% endhint %}

## interpret\_model

This function analyzes the predictions generated by a trained model. Most plots in this function are based on SHAP (SHapley Additive exPlanations). For more information, see <https://shap.readthedocs.io/en/latest/>.
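A property worth keeping in mind when reading these plots: SHAP values are additive. For any single observation, the base (expected) value plus the per-feature SHAP values reproduces the model's raw output for that observation. The numbers below are made up purely to illustrate the identity:

```python
# Illustrative (made-up) SHAP decomposition for one observation:
base_value = -0.35                     # model's expected output over the data
shap_values = {'Age (years)': 0.42,    # per-feature contributions
               'BMI': 0.18,
               'Plasma glucose': -0.05}

# additivity: base value + contributions == raw model output for this row
raw_output = base_value + sum(shap_values.values())
print(round(raw_output, 2))  # 0.2
```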

### Example

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# creating a model
xgboost = create_model('xgboost')

# interpret model
interpret_model(xgboost)
```

{% endcode %}

![Output from interpret\_model(xgboost)](/files/Zz4yzva7R2lnrhXZtAKB)

### Save the plot

You can save the plot as a `png` file using the `save` parameter.

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# creating a model
xgboost = create_model('xgboost')

# interpret model
interpret_model(xgboost, save = True)
```

{% endcode %}

{% hint style="info" %}
**NOTE:** When `save=True`, the plot is not displayed in the Notebook.
{% endhint %}

### Change plot type

Several plot types are available and can be selected with the `plot` parameter.

#### Correlation

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# creating a model
xgboost = create_model('xgboost')

# interpret model
interpret_model(xgboost, plot = 'correlation')
```

{% endcode %}

![Output from interpret\_model(xgboost, plot = 'correlation')](/files/vGXq1NC7IoWwDPhdvic0)

By default, PyCaret uses the first feature in the dataset, but this can be changed using the `feature` parameter.

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# creating a model
xgboost = create_model('xgboost')

# interpret model
interpret_model(xgboost, plot = 'correlation', feature = 'Age (years)')
```

{% endcode %}

![Output from interpret\_model(xgboost, plot = 'correlation', feature = 'Age (years)')](/files/VnViQMiUKr36517CS829)

#### Partial Dependence Plot

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# creating a model
xgboost = create_model('xgboost')

# interpret model
interpret_model(xgboost, plot = 'pdp')
```

{% endcode %}

![Output from interpret\_model(xgboost, plot = 'pdp')](/files/zd0QeWAk8VCDlE7GQLL8)

By default, PyCaret uses the first available feature in the dataset but this can be changed using the `feature` parameter.

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# creating a model
xgboost = create_model('xgboost')

# interpret model
interpret_model(xgboost, plot = 'pdp', feature = 'Age (years)')
```

{% endcode %}

![Output from interpret\_model(xgboost, plot = 'pdp', feature = 'Age (years)')](/files/0cVKAasb6Qt5JtPXXh4J)

#### Morris Sensitivity Analysis

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# creating a model
xgboost = create_model('xgboost')

# interpret model
interpret_model(xgboost, plot = 'msa')
```

{% endcode %}

![Output from interpret\_model(xgboost, plot = 'msa')](/files/hF9xkFpCVRqWLYCVq8t9)

#### Permutation Feature Importance

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# creating a model
xgboost = create_model('xgboost')

# interpret model
interpret_model(xgboost, plot = 'pfi')
```

{% endcode %}

![Output from interpret\_model(xgboost, plot = 'pfi')](/files/v3EUqehe8CCxOeuNeVtP)

#### Reason Plot

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# creating a model
xgboost = create_model('xgboost')

# interpret model
interpret_model(xgboost, plot = 'reason')
```

{% endcode %}

![Output from interpret\_model(xgboost, plot = 'reason')](/files/E3g4VqFuvPXfD10hOmaz)

When you generate a `reason` plot without passing a specific index of the test data, you get an interactive plot with the ability to select the x and y axes. This is only possible in Jupyter Notebook or an equivalent environment. If you want to see the plot for a specific observation, pass its index to the `observation` parameter.

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# creating a model
xgboost = create_model('xgboost')

# interpret model
interpret_model(xgboost, plot = 'reason', observation = 1)
```

{% endcode %}

![](/files/VTggbGHmK8zMoV79oVap)

Here, `observation = 1` refers to index 1 of the test set.
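To see which row `observation = 1` refers to, you can look up the hold-out features by position. The sketch below uses a toy frame in place of the real hold-out set (in PyCaret you would fetch it, e.g. with `get_config('X_test')`; name assumed from the PyCaret API):

```python
import pandas as pd

# toy stand-in for the hold-out set; the original row labels survive the split
X_test = pd.DataFrame(
    {'Age (years)': [50, 31, 22], 'BMI': [33.6, 26.6, 23.3]},
    index=[668, 324, 624],
)

row = X_test.iloc[1]   # positional lookup, matching observation = 1
print(row.name)        # 324 -> label of that row in the original dataset
```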

### Use train data

By default, all the plots are generated on the test dataset. If you want to generate plots using the training data set (not recommended), you can use the `use_train_data` parameter.

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# creating a model
xgboost = create_model('xgboost')

# interpret model
interpret_model(xgboost, use_train_data = True)
```

{% endcode %}

![Output from interpret\_model(xgboost, use\_train\_data = True)](/files/EhPuc21TOapTCSavydBK)

## dashboard

The `dashboard` function generates an interactive dashboard for a trained model. The dashboard is implemented using ExplainerDashboard ([explainerdashboard.readthedocs.io](https://explainerdashboard.readthedocs.io)).

#### Dashboard Example

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
juice = get_data('juice')

# init setup
from pycaret.classification import *
exp_name = setup(data = juice,  target = 'Purchase')

# train model
lr = create_model('lr')

# launch dashboard
dashboard(lr)
```

{% endcode %}

![Dashboard (Classification Metrics)](/files/5LHRj3p6E4KaFryQtzIZ)

![Dashboard (Individual Predictions)](/files/v3LoMWRmyL5OaSNmOCoL)

![Dashboard (What-if analysis)](/files/YBUzck68bOjWGl1V266l)

#### Video:

{% embed url="https://www.youtube.com/watch?v=FZ5-GtdYez0" %}

## check\_fairness

There are many approaches to conceptualizing fairness. The `check_fairness` function follows the approach known as group fairness, which asks which groups of individuals are at risk of experiencing harm. `check_fairness` provides fairness-related metrics between different groups (also called sub-populations).
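A minimal sketch of the idea behind group fairness, computed by hand with pandas: the selection rate (share of positive predictions) per sensitive group. The toy data and the choice of metric are illustrative; `check_fairness` computes a broader set of such metrics:

```python
import pandas as pd

# toy predictions with one sensitive feature
df = pd.DataFrame({
    'sex': ['Male', 'Male', 'Female', 'Female'],
    'prediction': [1, 0, 1, 1],
})

# selection rate per group: share of positive predictions
selection_rate = df.groupby('sex')['prediction'].mean()
print(selection_rate.to_dict())  # {'Female': 1.0, 'Male': 0.5}
```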

#### Check Fairness Example

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
income = get_data('income')

# init setup
from pycaret.classification import *
exp_name = setup(data = income,  target = 'income >50K')

# train model
lr = create_model('lr')

# check model fairness
lr_fairness = check_fairness(lr, sensitive_features = ['sex', 'race'])
```

{% endcode %}

![](/files/bKKmMpFfmwmx951ZbR77)

![](/files/vhZLzLgpeG2rY2OInAzg)

#### Video:

{% embed url="https://www.youtube.com/watch?v=mjhDKuLRpM0" %}

## get\_leaderboard

This function returns the leaderboard of all models trained in the current setup.

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# compare models
top3 = compare_models(n_select = 3)

# tune top 3 models
tuned_top3 = [tune_model(i) for i in top3]

# ensemble top 3 tuned models
ensembled_top3 = [ensemble_model(i) for i in tuned_top3]

# blender
blender = blend_models(tuned_top3)

# stacker
stacker = stack_models(tuned_top3)

# check leaderboard
get_leaderboard()
```

{% endcode %}

![Output from get\_leaderboard()](/files/Xr72d1ymFChVB4yHVDWl)

You can also access the trained Pipeline from the leaderboard.

{% code lineNumbers="true" %}

```python
# check leaderboard
lb = get_leaderboard()

# select top model
lb.iloc[0]['Model']
```

{% endcode %}

![Output from lb.iloc\[0\]\['Model'\]](/files/yEypd4toIZZDWck9HpKl)
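Since the leaderboard is a regular pandas DataFrame, you can also sort it by a metric before picking a model. The column names below (`Model Name`, `Accuracy`) are illustrative; check `lb.columns` in your own session:

```python
import pandas as pd

# toy stand-in for the frame returned by get_leaderboard()
lb = pd.DataFrame({
    'Model Name': ['Logistic Regression', 'Random Forest', 'Stacking Classifier'],
    'Accuracy': [0.76, 0.79, 0.81],
})

# pick the row with the highest Accuracy
best = lb.sort_values('Accuracy', ascending=False).iloc[0]
print(best['Model Name'])  # Stacking Classifier
```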

## assign\_model

This function assigns labels to the training dataset using the trained model. It is available for [Clustering](/docs/get-started/modules.md), [Anomaly Detection](/docs/get-started/modules.md), and [NLP](/docs/get-started/modules.md) modules.

#### Clustering

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
jewellery = get_data('jewellery')

# init setup
from pycaret.clustering import *
clu1 = setup(data = jewellery)

# train a model
kmeans = create_model('kmeans')

# assign model
assign_model(kmeans)
```

{% endcode %}

![Output from assign\_model(kmeans)](/files/IoututWtAi5uZCJGfm9X)
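The frame returned by `assign_model` is ordinary pandas, so cluster sizes fall out of a `value_counts`. The toy frame below stands in for the real output; the label format `'Cluster 0'` matches what PyCaret emits, but verify it in your session:

```python
import pandas as pd

# toy stand-in for assign_model(kmeans) output
assigned = pd.DataFrame({
    'Age': [25, 47, 35, 52],
    'Cluster': ['Cluster 0', 'Cluster 1', 'Cluster 0', 'Cluster 2'],
})

counts = assigned['Cluster'].value_counts()
print(counts['Cluster 0'])  # 2
```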

#### Anomaly Detection

{% code lineNumbers="true" %}

```python
# load dataset
from pycaret.datasets import get_data
anomaly = get_data('anomaly')

# init setup
from pycaret.anomaly import *
ano1 = setup(data = anomaly)

# train a model
iforest = create_model('iforest')

# assign model
assign_model(iforest)
```

{% endcode %}

![Output from assign\_model(iforest)](/files/kxQsQdVAswcwxmdM6rCD)
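For anomaly detection, `assign_model` appends `Anomaly` (1 for outliers) and `Anomaly_Score` columns, so flagged rows can be filtered directly. A toy frame stands in for the real output below; verify the column names in your PyCaret version:

```python
import pandas as pd

# toy stand-in for assign_model(iforest) output
assigned = pd.DataFrame({
    'Col1': [0.26, 0.97, 0.56],
    'Anomaly': [0, 1, 0],
    'Anomaly_Score': [-0.04, 0.09, -0.07],
})

outliers = assigned[assigned['Anomaly'] == 1]
print(len(outliers))  # 1
```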

