Optimize
Optimization functions in PyCaret
The tune_model function tunes the hyperparameters of a model. Its output is a scoring grid with cross-validated scores by fold. The best model is selected based on the metric defined in the optimize parameter. Metrics evaluated during cross-validation can be accessed using the get_metrics function. Custom metrics can be added or removed using the add_metric and remove_metric functions.

# load dataset
from pycaret.datasets import get_data
boston = get_data('boston')

# init setup
from pycaret.regression import *
reg1 = setup(data = boston, target = 'medv')

# train model
dt = create_model('dt')

# tune model
tuned_dt = tune_model(dt)

Output from tune_model(dt)
To compare the hyperparameters before and after tuning:

# default model
print(dt)

# tuned model
print(tuned_dt)

Model hyperparameters before and after tuning
Hyperparameter tuning is ultimately an optimization constrained by the number of iterations, which in turn depends on how much time and compute you have available. The number of iterations is defined by the n_iter parameter. By default, it is set to 10.

# load dataset
from pycaret.datasets import get_data
boston = get_data('boston')

# init setup
from pycaret.regression import *
reg1 = setup(data = boston, target = 'medv')

# train model
dt = create_model('dt')

# tune model
tuned_dt = tune_model(dt, n_iter = 50)

Output from tune_model(dt, n_iter = 50)
Scoring grids compared: n_iter = 10 vs. n_iter = 50
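To see why a larger n_iter can only help the best score found (at the cost of more compute), here is a minimal sketch of a random search with a fixed budget. It is not PyCaret's implementation: random_search, toy_score, and the grid below are hypothetical names and values chosen for illustration.

```python
import random

def random_search(score_fn, grid, n_iter, seed=0):
    """Sample n_iter candidates from grid and return (best_score, best_params)."""
    rng = random.Random(seed)
    best_score, best_params = float("-inf"), None
    for _ in range(n_iter):
        # Draw one value per hyperparameter, independently and uniformly.
        params = {name: rng.choice(values) for name, values in grid.items()}
        score = score_fn(params)
        if score > best_score:
            best_score, best_params = score, params
    return best_score, best_params

# Hypothetical search space and toy objective (higher is better).
grid = {"max_depth": list(range(1, 16)), "min_samples_leaf": [2, 3, 4, 5, 6]}

def toy_score(params):
    return -abs(params["max_depth"] - 7) - abs(params["min_samples_leaf"] - 4)

score_10, params_10 = random_search(toy_score, grid, n_iter=10)
score_50, params_50 = random_search(toy_score, grid, n_iter=50)
print(score_10, params_10)
print(score_50, params_50)
```

Because both runs share a seed, the first 10 candidates of the 50-iteration run are the same as those of the 10-iteration run, so the larger budget can never return a lower best score.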


When you tune the hyperparameters of a model, you must know which metric to optimize for. This is defined with the optimize parameter. By default, it is set to Accuracy for classification experiments and R2 for regression.

# load dataset
from pycaret.datasets import get_data
boston = get_data('boston')

# init setup
from pycaret.regression import *
reg1 = setup(data = boston, target = 'medv')

# train model
dt = create_model('dt')

# tune model
tuned_dt = tune_model(dt, optimize = 'MAE')

Output from tune_model(dt, optimize = 'MAE')
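The choice of metric matters because different metrics can rank the same candidates differently. The sketch below uses made-up predictions from two hypothetical candidates, "a" and "b", and hand-written mae and r2 helpers (not PyCaret functions) to show the effect. Note also that MAE is minimized while R2 is maximized.

```python
def mae(y_true, y_pred):
    """Mean absolute error: lower is better."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: higher is better."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_true = [1, 2, 3, 4]
candidates = {
    "a": [1, 2, 3, 7],  # three exact predictions, one large miss
    "b": [2, 3, 4, 5],  # every prediction off by one
}

best_by_mae = min(candidates, key=lambda k: mae(y_true, candidates[k]))
best_by_r2 = max(candidates, key=lambda k: r2(y_true, candidates[k]))
print(best_by_mae, best_by_r2)
```

Candidate "a" wins on MAE because most of its predictions are exact, while candidate "b" wins on R2 because squared error penalizes the single large miss more heavily.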
PyCaret already defines a tuning grid for every model in the library. However, you can define your own search space by passing a custom grid using the custom_grid parameter.

# load dataset
from pycaret.datasets import get_data
boston = get_data('boston')

# init setup
from pycaret.regression import *
reg1 = setup(boston, target = 'medv')

# train model
dt = create_model('dt')

# define search space (randint needs integer bounds, hence the int() cast)
import numpy as np
params = {"max_depth": np.random.randint(1, int(len(boston.columns) * 0.85), 20),
          "max_features": np.random.randint(1, len(boston.columns), 20),
          "min_samples_leaf": [2, 3, 4, 5, 6]}

# tune model
tuned_dt = tune_model(dt, custom_grid = params)

Output from tune_model(dt, custom_grid = params)
PyCaret integrates seamlessly with many different libraries for hyperparameter tuning. This gives you access to many types of search algorithms, including random search, Bayesian optimization, TPE, and a few others, all by changing a parameter. By default, PyCaret uses random grid search from scikit-learn (RandomizedSearchCV). You can change that with the search_library and search_algorithm parameters of the tune_model function.

# load dataset
from pycaret.datasets import get_data
boston = get_data('boston')

# init setup
from pycaret.regression import *
reg1 = setup(boston, target = 'medv')

# train model
dt = create_model('dt')

# tune model sklearn
tune_model(dt)

# tune model optuna
tune_model(dt, search_library = 'optuna')

# tune model scikit-optimize
tune_model(dt, search_library = 'scikit-optimize')

# tune model tune-sklearn
tune_model(dt, search_library = 'tune-sklearn', search_algorithm = 'hyperopt')
Output from tune_model for each search library: scikit-learn, optuna, scikit-optimize, tune-sklearn




By default, PyCaret's tune_model function returns only the best model as selected by the tuner. Sometimes you may need access to the tuner object itself, as it may contain important attributes. In that case, use the return_tuner parameter.

# load dataset
from pycaret.datasets import get_data
boston = get_data('boston')

# init setup
from pycaret.regression import *
reg1 = setup(boston, target = 'medv')

# train model
dt = create_model('dt')

# tune model and return tuner
tuned_model, tuner = tune_model(dt, return_tuner=True)

Output from tune_model(dt, return_tuner=True)
type(tuned_model), type(tuner)

Output from type(tuned_model), type(tuner)
print(tuner)

Output from print(tuner)
Often, tune_model will not improve the model performance. In fact, it may end up making performance worse than that of the model with default hyperparameters. This can be a problem when you are not actively experimenting in a notebook but instead have a Python script that runs a workflow of create_model --> tune_model or compare_models --> tune_model. To overcome this issue, you can use choose_better. When set to True, it always returns the better-performing model, meaning that if hyperparameter tuning doesn't improve the performance, it returns the input model.

# load dataset
from pycaret.datasets import get_data
boston = get_data('boston')

# init setup
from pycaret.regression import *
reg1 = setup(boston, target = 'medv')

# train model
dt = create_model('dt')

# tune model
dt = tune_model(dt, choose_better = True)

Output from tune_model(dt, choose_better = True)
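The selection logic behind choose_better can be pictured as a simple comparison of two cross-validated scores. The following is a minimal sketch under that assumption, not PyCaret's actual code. The helper name, the model labels, and the scores are all hypothetical.

```python
def choose_better(base_model, base_score, new_model, new_score, greater_is_better=True):
    """Return whichever (model, score) pair is better on the chosen metric."""
    if greater_is_better:
        improved = new_score > base_score
    else:
        improved = new_score < base_score
    return (new_model, new_score) if improved else (base_model, base_score)

# Hypothetical cross-validated R2 scores: tuning made things worse here,
# so the input model is kept.
model, score = choose_better("default_dt", 0.71, "tuned_dt", 0.68)
print(model, score)

# With an error metric such as MAE, lower is better, so the tuned model wins.
model_mae, score_mae = choose_better(
    "default_dt", 3.1, "tuned_dt", 2.9, greater_is_better=False
)
print(model_mae, score_mae)
```

The direction of the comparison depends on the metric, which is why the sketch takes a greater_is_better flag.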
NOTE: choose_better doesn't affect the scoring grid that is displayed on the screen. The scoring grid always presents the performance of the best model as selected by the tuner, even when the output performance is lower than the input performance.

The ensemble_model function ensembles a given estimator. Its output is a scoring grid with CV scores by fold. Metrics evaluated during CV can be accessed using the get_metrics function. Custom metrics can be added or removed using the add_metric and remove_metric functions.

# load dataset
from pycaret.datasets import get_data
boston = get_data('boston')

# init setup
from pycaret.regression import *
reg1 = setup(boston, target = 'medv')

# train model
dt = create_model('dt')

# ensemble model
bagged_dt = ensemble_model(dt)

Output from ensemble_model(dt)
type(bagged_dt)
# >>> sklearn.ensemble._bagging.BaggingRegressor

print(bagged_dt)

Output from print(bagged_dt)
# load dataset
from pycaret.datasets import get_data
boston = get_data('boston')

# init setup
from pycaret.regression import *
reg1 = setup(boston, target = 'medv')

# train model
dt = create_model('dt')

# ensemble model
bagged_dt = ensemble_model(dt, fold = 5)

Output from ensemble_model(dt, fold = 5)
The model returned here is the same as above. However, the performance evaluation is done using 5-fold cross-validation.
Bagging, also known as Bootstrap aggregating, is a machine learning ensemble meta-algorithm designed to improve the stability and accuracy of machine learning algorithms used in statistical classification and regression. It also reduces variance and helps to avoid overfitting. Although it is usually applied to decision tree methods, it can be used with any type of method. Bagging is a special case of the model averaging approach.
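As a rough illustration of bootstrap aggregating, the sketch below fits several copies of a very simple base learner on bootstrap resamples and averages their predictions. It is a toy written for this page, not what BaggingRegressor does internally. The 1-nearest-neighbour base learner, the function names, and the data are all assumptions made for the example.

```python
import random

def fit_1nn(xs, ys):
    """Base learner: a 1-nearest-neighbour regressor on a single feature."""
    points = list(zip(xs, ys))
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]

def bagging_fit(xs, ys, n_estimators=10, seed=0):
    """Fit each base learner on a bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    learners = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        learners.append(fit_1nn([xs[i] for i in idx], [ys[i] for i in idx]))
    return learners

def bagging_predict(learners, x):
    """Aggregate by averaging the base learners' predictions."""
    return sum(learner(x) for learner in learners) / len(learners)

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0]
learners = bagging_fit(xs, ys, n_estimators=10)
prediction = bagging_predict(learners, 2.4)
print(prediction)
```

Each resample sees a slightly different version of the data, so individual learners disagree. Averaging them smooths out that disagreement, which is where the variance reduction comes from.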

Boosting is an ensemble meta-algorithm for primarily reducing bias and variance in supervised learning. Boosting is in the family of machine learning algorithms that convert weak learners to strong ones. A weak learner is defined to be a classifier that is only slightly correlated with the true classification (it can label examples better than random guessing). In contrast, a strong learner is a classifier that is arbitrarily well-correlated with the true classification.

There are two ways to ensemble a model with ensemble_model: Bagging (the default, as in the examples above) and Boosting. You choose between them with the method parameter.

# load dataset
from pycaret.datasets import get_data
boston = get_data('boston')

# init setup
from pycaret.regression import *
reg1 = setup(boston, target = 'medv')

# train model
dt = create_model('dt')

# ensemble model
boosted_dt = ensemble_model(dt, method = 'Boosting')

Output from ensemble_model(dt, method = 'Boosting')
type(boosted_dt)
# >>> sklearn.ensemble._weight_boosting.AdaBoostRegressor

print(boosted_dt)

Output from print(boosted_dt)
By default, PyCaret uses 10 estimators for both Bagging and Boosting. You can change that with the n_estimators parameter.

# load dataset
from pycaret.datasets import get_data
boston = get_data('boston')

# init setup
from pycaret.regression import *
reg1 = setup(boston, target = 'medv')

# train model
dt = create_model('dt')

# ensemble model
ensemble_model(dt, n_estimators = 100)

Output from ensemble_model(dt, n_estimators = 100)
Often, ensemble_model will not improve the model performance. In fact, it may end up making performance worse than that of the model without ensembling. This can be a problem when you are not actively experimenting in a notebook but instead have a Python script that runs a workflow of create_model --> ensemble_model or compare_models --> ensemble_model. To overcome this issue, you can use choose_better. When set to True, it always returns the better-performing model, meaning that if ensembling doesn't improve the performance, it returns the input model.

# load dataset
from pycaret.datasets import get_data
boston = get_data('boston')

# init setup
from pycaret.regression import *
reg1 = setup(boston, target = 'medv')

# train model
lr = create_model('lr')

# ensemble model
ensemble_model(lr, choose_better = True)

Output from ensemble_model(lr, choose_better = True)
Notice that with choose_better = True, the model returned from ensemble_model is a simple LinearRegression instead of a BaggingRegressor. This is because the performance of the model didn't improve after ensembling, so the input model is returned.

The blend_models function trains a soft voting / majority rule classifier for the models passed in the estimator_list parameter. Its output is a scoring grid with CV scores by fold. Metrics evaluated during CV can be accessed using the get_metrics function. Custom metrics can be added or removed using the add_metric and remove_metric functions.

# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# train a few models
lr = create_model('lr')
dt = create_model('dt')
knn = create_model('knn')

# blend models
blender = blend_models([lr, dt, knn])

Output from blend_models([lr, dt, knn])
type(blender)
# >>> sklearn.ensemble._voting.VotingClassifier

print(blender)

Output from print(blender)
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# train a few models
lr = create_model('lr')
dt = create_model('dt')
knn = create_model('knn')

# blend models
blender = blend_models([lr, dt, knn], fold = 5)

Output from blend_models([lr, dt, knn], fold = 5)
The model returned here is the same as above. However, the performance evaluation is done using 5-fold cross-validation.

You can also generate the list of input estimators automatically using the compare_models function. The benefit is that you do not have to change your script at all: every time it runs, the top N models are used as the input list.

# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# blend models
blender = blend_models(compare_models(n_select = 3))

Output from blend_models(compare_models(n_select = 3))
Notice what happens here. We passed compare_models(n_select = 3) as an input to blend_models. Internally, the compare_models function is executed first, and the top 3 models are then passed as input to the blend_models function.

print(blender)

Output from print(blender)
In this example, the top 3 models as evaluated by compare_models are LogisticRegression, LinearDiscriminantAnalysis, and RandomForestClassifier.

When method = 'soft', the blender predicts the class label based on the argmax of the sums of the predicted probabilities, which is recommended for an ensemble of well-calibrated classifiers.

# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# train a few models
lr = create_model('lr')
dt = create_model('dt')
knn = create_model('knn')

# blend models
blender_soft = blend_models([lr,dt,knn], method = 'soft')

Output from blend_models([lr,dt,knn], method = 'soft')
When method = 'hard', the blender uses the predictions (hard labels) from the input models instead of probabilities.

# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# train a few models
lr = create_model('lr')
dt = create_model('dt')
knn = create_model('knn')

# blend models
blender_hard = blend_models([lr,dt,knn], method = 'hard')

Output from blend_models([lr,dt,knn], method = 'hard')
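The difference between the two methods is easiest to see on a single sample where they disagree. The sketch below is a hand-rolled illustration with made-up probabilities, not the VotingClassifier implementation. hard_vote and soft_vote are hypothetical helper names.

```python
from collections import Counter

def hard_vote(probas):
    """Majority vote over each model's own predicted label."""
    labels = [max(range(len(p)), key=p.__getitem__) for p in probas]
    return Counter(labels).most_common(1)[0][0]

def soft_vote(probas):
    """Argmax of the summed class probabilities."""
    n_classes = len(probas[0])
    totals = [sum(p[c] for p in probas) for c in range(n_classes)]
    return max(range(n_classes), key=totals.__getitem__)

# Hypothetical [P(class 0), P(class 1)] outputs from three models for one sample.
probas = [[0.1, 0.9], [0.6, 0.4], [0.6, 0.4]]

hard_label = hard_vote(probas)
soft_label = soft_vote(probas)
print(hard_label, soft_label)
```

Two of the three models lean weakly towards class 0, so the hard vote picks class 0. The one confident model outweighs them once probabilities are summed, so the soft vote picks class 1.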
The default method is auto, which means it will try to use the soft method and fall back to hard if the former is not supported. This may happen when one of your input models does not support the predict_proba method.

By default, all input models are given equal weight when blending, but you can explicitly pass the weight to be given to each input model.

# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# train a few models
lr = create_model('lr')
dt = create_model('dt')
knn = create_model('knn')

# blend models
blender_weighted = blend_models([lr,dt,knn], weights = [0.5,0.2,0.3])

Output from blend_models([lr,dt,knn], weights = [0.5,0.2,0.3])
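To illustrate how weights can change the outcome of a soft vote, here is a small sketch with made-up probabilities. It assumes the weights simply scale each model's predicted probabilities before they are summed. weighted_soft_vote is a hypothetical helper, not a PyCaret function.

```python
def weighted_soft_vote(probas, weights):
    """Return the class with the largest weighted sum of probabilities."""
    n_classes = len(probas[0])
    totals = [
        sum(w * p[c] for w, p in zip(weights, probas))
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=totals.__getitem__)

# Hypothetical [P(class 0), P(class 1)] outputs from lr, dt, and knn.
probas = [[0.2, 0.8], [0.7, 0.3], [0.7, 0.3]]

equal = weighted_soft_vote(probas, [1, 1, 1])
weighted = weighted_soft_vote(probas, [0.5, 0.2, 0.3])
print(equal, weighted)
```

With equal weights the two models favouring class 0 win. Giving the first model half of the total weight flips the decision to class 1.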
You can also tune the weights of the blender using tune_model.

# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# train a few models
lr = create_model('lr')
dt = create_model('dt')
knn = create_model('knn')

# blend models
blender_weighted = blend_models([lr,dt,knn], weights = [0.5,0.2,0.3])

# tune blender
tuned_blender = tune_model(blender_weighted)

Output from tune_model(blender_weighted)
print(tuned_blender)

Output from print(tuned_blender)
Often, blend_models will not improve the model performance. In fact, the blender may end up performing worse than the input models do without blending. This can be a problem when you are not actively experimenting in a notebook but instead have a Python script that runs a workflow of compare_models --> blend_models. To overcome this issue, you can use choose_better. When set to True, it always returns the better-performing model, meaning that if blending the models doesn't improve the performance, it returns the single best-performing input model.

# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# train a few models
lr = create_model('lr')
dt = create_model('dt')
knn = create_model('knn')

# blend models
blend_models([lr,dt,knn], choose_better = True)

Output from blend_models([lr,dt,knn], choose_better = True)
Notice that because choose_better = True, the final model returned by this function is LogisticRegression instead of VotingClassifier: Logistic Regression performed best among all the input models and the blender.

The stack_models function trains a meta-model over the estimators passed in the estimator_list parameter. Its output is a scoring grid with CV scores by fold. Metrics evaluated during CV can be accessed using the get_metrics function. Custom metrics can be added or removed using the add_metric and remove_metric functions.

# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# train a few models
lr = create_model('lr')
dt = create_model('dt')
knn = create_model('knn')

# stack models
stacker = stack_models([lr, dt, knn])

Output from stack_models([lr, dt, knn])
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# train a few models
lr = create_model('lr')
dt = create_model('dt')
knn = create_model('knn')

# stack models
stacker = stack_models([lr, dt, knn], fold = 5)

Output from stack_models([lr, dt, knn], fold = 5)
The model returned here is the same as above. However, the performance evaluation is done using 5-fold cross-validation.

You can also generate the list of input estimators automatically using the compare_models function. The benefit is that you do not have to change your script at all: every time it runs, the top N models are used as the input list.

# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# stack models
stacker = stack_models(compare_models(n_select = 3))

Output from stack_models(compare_models(n_select = 3))
Notice what happens here. We passed compare_models(n_select = 3) as an input to stack_models. Internally, the compare_models function is executed first, and the top 3 models are then passed as input to the stack_models function.

print(stacker)

Output from print(stacker)
In this example, the top 3 models as evaluated by compare_models are LogisticRegression, RandomForestClassifier, and LGBMClassifier.

There are a few different methods you can explicitly choose for stacking, or you can pass auto to have the method determined automatically. When set to auto, it will invoke, for each model, predict_proba, decision_function, or predict, in that order. Alternatively, you can define the method explicitly.

# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# train a few models
lr = create_model('lr')
dt = create_model('dt')
knn = create_model('knn')

# stack models
stacker = stack_models([lr, dt, knn], method = 'predict')

Output from stack_models([lr, dt, knn], method = 'predict')
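The auto behaviour described above amounts to a fallback chain per model. The following sketch shows one plausible way to express that lookup. The resolve_stack_method helper and the three stand-in classes are hypothetical and only mimic the method names of fitted estimators.

```python
def resolve_stack_method(model, order=("predict_proba", "decision_function", "predict")):
    """Return the name of the first method in `order` that the model exposes."""
    for name in order:
        if callable(getattr(model, name, None)):
            return name
    raise AttributeError(f"{type(model).__name__} has none of {order}")

# Hypothetical stand-ins for fitted estimators.
class ProbabilisticModel:
    def predict_proba(self, X): ...
    def predict(self, X): ...

class MarginModel:
    def decision_function(self, X): ...
    def predict(self, X): ...

class LabelOnlyModel:
    def predict(self, X): ...

methods = [
    resolve_stack_method(m)
    for m in (ProbabilisticModel(), MarginModel(), LabelOnlyModel())
]
print(methods)
```

Each model can therefore contribute a different kind of output to the meta-model: probabilities where available, decision scores next, and plain labels as a last resort.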
When no meta_model is passed explicitly, LogisticRegression is used for classification experiments and LinearRegression is used for regression experiments. You can also pass a specific model to be used as the meta-model.

# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# train a few models
lr = create_model('lr')
dt = create_model('dt')
knn = create_model('knn')

# train meta-model
lightgbm = create_model('lightgbm')

# stack models
stacker = stack_models([lr, dt, knn], meta_model = lightgbm)

Output from stack_models([lr, dt, knn], meta_model = lightgbm)
print(stacker.final_estimator_)

Output from print(stacker.final_estimator_)
There are two ways to stack models: (i) only the predictions of the input models are used as training data for the meta-model, or (ii) the predictions as well as the original training data are used for training the meta-model. The restack parameter switches between them. Setting restack = False, as below, selects option (i).

# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# train a few models
lr = create_model('lr')
dt = create_model('dt')
knn = create_model('knn')

# stack models
stacker = stack_models([lr, dt, knn], restack = False)

Output from stack_models([lr, dt, knn], restack = False)
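The two stacking modes differ only in which columns the meta-model is trained on. Here is a minimal sketch of that difference, assuming restack = False means "predictions only" and restack = True means "original features plus predictions". The meta_features helper and all values are made up for illustration.

```python
def meta_features(X, base_predictions, restack):
    """Build the meta-model's training rows.

    X: list of original feature rows.
    base_predictions: one list of per-sample predictions for each base model.
    """
    rows = []
    for i, x in enumerate(X):
        preds = [model_preds[i] for model_preds in base_predictions]
        rows.append(list(x) + preds if restack else preds)
    return rows

X = [[5.1, 3.5], [6.2, 2.9]]
# Made-up predictions from three base models (e.g. lr, dt, knn) for two samples.
base_predictions = [[0.9, 0.2], [0.8, 0.4], [0.7, 0.1]]

only_preds = meta_features(X, base_predictions, restack=False)
with_original = meta_features(X, base_predictions, restack=True)
print(only_preds)
print(with_original)
```

With three base models and two original features, the meta-model sees three columns in the first mode and five in the second.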
The optimize_threshold function optimizes the probability threshold for a trained model. It evaluates performance metrics at different values of probability_threshold, with the step size defined in the grid_interval parameter. It displays a plot of the performance metrics at each probability threshold and returns the best model based on the metric defined in the optimize parameter.

# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# train a model
knn = create_model('knn')

# optimize threshold
optimized_knn = optimize_threshold(knn)

Output from optimize_threshold(knn)
print(optimized_knn)

Output from print(optimized_knn)
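Conceptually, the threshold search is a scan over a grid of candidate cut-offs. The sketch below shows that idea on made-up labels and probabilities, using F1 as the metric to optimize. The function names and the choice of F1 are assumptions for illustration, and the code does not reproduce PyCaret's implementation.

```python
def f1_at_threshold(y_true, y_proba, threshold):
    """F1 score when the positive class is predicted for P >= threshold."""
    y_pred = [1 if p >= threshold else 0 for p in y_proba]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def best_threshold(y_true, y_proba, grid_interval=0.1):
    """Scan thresholds from grid_interval up to (but excluding) 1.0."""
    steps = round(1 / grid_interval)
    thresholds = [round(i * grid_interval, 10) for i in range(1, steps)]
    return max(thresholds, key=lambda t: f1_at_threshold(y_true, y_proba, t))

# Made-up labels and predicted probabilities of the positive class.
y_true = [0, 0, 1, 0, 1, 1]
y_proba = [0.05, 0.35, 0.42, 0.45, 0.72, 0.95]

threshold = best_threshold(y_true, y_proba)
print(threshold, f1_at_threshold(y_true, y_proba, threshold))
```

On this toy data the default cut-off of 0.5 misses one positive sample, while a slightly lower threshold recovers it at the cost of one false positive, which gives a higher F1.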
The calibrate_model function calibrates the predicted probabilities of a given model using isotonic or logistic regression. Its output is a scoring grid with CV scores by fold. Metrics evaluated during CV can be accessed using the get_metrics function. Custom metrics can be added or removed using the add_metric and remove_metric functions.

# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# train a model
dt = create_model('dt')

# calibrate model
calibrated_dt = calibrate_model(dt)

Output from calibrate_model(dt)
print(calibrated_dt)

Output from print(calibrated_dt)
Calibration plots: before calibration vs. after calibration
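For intuition about the isotonic option, the sketch below fits a non-decreasing step function from raw scores to observed labels using the pool adjacent violators algorithm. It is a simplified, stand-alone illustration on made-up data, not the calibration code that PyCaret or scikit-learn run. isotonic_fit and isotonic_predict are hypothetical names.

```python
def isotonic_fit(scores, labels):
    """Pool adjacent violators: fit a non-decreasing step function."""
    blocks = []  # each block: [sum_of_labels, count, highest_score_in_block]
    for s, y in sorted(zip(scores, labels)):
        blocks.append([float(y), 1, s])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, top = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = top
    return [(top, total / count) for total, count, top in blocks]

def isotonic_predict(steps, score):
    """Map a raw score to the value of the first block whose top covers it."""
    for top, value in steps:
        if score <= top:
            return value
    return steps[-1][1]

# Made-up raw model scores and true labels.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

steps = isotonic_fit(scores, labels)
calibrated = isotonic_predict(steps, 0.35)
print(steps)
print(calibrated)
```

Wherever the labels are out of order relative to the scores, neighbouring points are pooled and replaced by their average, so the calibrated output never decreases as the raw score increases.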


Last modified 8mo ago