Deploy
MLOps and deployment-related functions in PyCaret
predict_model
This function generates the label using a trained model. When data is None, it predicts the label and score on the holdout set.

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# create a model
xgboost = create_model('xgboost')

# predict on hold-out
predict_model(xgboost)
```

Output from predict_model(xgboost)
When data is provided, it predicts the label and score on the new dataset.

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# create a model
xgboost = create_model('xgboost')

# predict on new data
new_data = diabetes.copy()
new_data.drop('Class variable', axis = 1, inplace = True)
predict_model(xgboost, data = new_data)
```

Output from predict_model(xgboost, data=new_data)
By default, only the score of the predicted class is returned. Set raw_score = True to return the probability of each class.

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# create a model
xgboost = create_model('xgboost')

# predict on new data
new_data = diabetes.copy()
new_data.drop('Class variable', axis = 1, inplace = True)
predict_model(xgboost, raw_score = True, data = new_data)
```

Output from predict_model(xgboost, raw_score = True, data = new_data)
probability_threshold sets the threshold for converting the predicted probability into class labels. Unless this parameter is set, it defaults to the value set during model creation; if that wasn't set either, the default is 0.5 for all classifiers. It is only applicable for binary classification.
```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# create a model
xgboost = create_model('xgboost')

# probability threshold 0.3
predict_model(xgboost, probability_threshold = 0.3)
```

Output from predict_model(xgboost, probability_threshold = 0.3)

probability threshold = 0.5 vs. probability threshold = 0.3
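Conceptually, the threshold simply decides how large the positive-class probability must be before a row is labeled positive. An illustrative sketch in plain Python (not PyCaret internals):

```python
# illustrative only: how a probability threshold produces class labels
probs = [0.25, 0.40, 0.62, 0.90]

labels_default = [int(p >= 0.5) for p in probs]  # probability_threshold = 0.5
labels_custom = [int(p >= 0.3) for p in probs]   # probability_threshold = 0.3

print(labels_default)  # [0, 0, 1, 1]
print(labels_custom)   # [0, 1, 1, 1]
```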
finalize_model
This function trains a given model on the entire dataset including the hold-out set.

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# create a model
rf = create_model('rf')

# finalize a model
finalize_model(rf)
```

Output from finalize_model(rf)
This function doesn't change any parameter of the model. It only refits on the entire dataset including the hold-out set.
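Since only the training data changes, a quick sanity check is possible. A minimal sketch, assuming the returned object is (or is a pipeline wrapping) a scikit-learn estimator that exposes get_params:

```python
# minimal sketch: confirm finalize_model leaves hyperparameters untouched
final_rf = finalize_model(rf)

# if a full pipeline is returned, compare against its last step
est = final_rf[-1] if hasattr(final_rf, 'steps') else final_rf
assert est.get_params() == rf.get_params()
```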
deploy_model
This function deploys the entire ML pipeline on the cloud.

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# create a model
lr = create_model('lr')

# finalize a model
final_lr = finalize_model(lr)

# deploy a model
deploy_model(final_lr, model_name = 'lr_aws', platform = 'aws', authentication = { 'bucket' : 'pycaret-test' })
```

Output from deploy_model(...)
Before deploying a model to AWS S3 ('aws'), environment variables must be configured using the command-line interface. To configure AWS environment variables, run aws configure in your terminal. The following information is required, which can be generated using the Identity and Access Management (IAM) portal of your Amazon console account:
- AWS Access Key ID
- AWS Secret Access Key
- Default Region Name (can be seen under Global settings on your AWS console)
- Default output format (must be left blank)
To deploy a model on Google Cloud Platform ('gcp'), the project must first be created using the command line or the GCP console. Once the project is created, you must create a service account and download the service account key as a JSON file to set environment variables in your local environment.
To deploy a model on Microsoft Azure ('azure'), the environment variable for the connection string must be set in your local environment. Go to the settings of your storage account on the Azure portal to access the required connection string.
- AZURE_STORAGE_CONNECTION_STRING (required as environment variable)
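If you prefer to configure credentials from Python rather than each provider's CLI, the standard environment variables can also be set before calling deploy_model. A minimal sketch; the variable names below are the providers' own conventions, and the values are placeholders:

```python
import os

# AWS: credentials picked up by boto3
os.environ['AWS_ACCESS_KEY_ID'] = 'your-access-key-id'
os.environ['AWS_SECRET_ACCESS_KEY'] = 'your-secret-access-key'
os.environ['AWS_DEFAULT_REGION'] = 'us-east-1'

# GCP: path to the downloaded service account key JSON file
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path/to/service-account-key.json'

# Azure: connection string from the storage account settings
os.environ['AZURE_STORAGE_CONNECTION_STRING'] = 'your-connection-string'
```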
save_model
This function saves the transformation pipeline and the trained model object into the current working directory as a pickle file for later use.

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# create a model
dt = create_model('dt')

# save pipeline
save_model(dt, 'dt_pipeline')
```

Output from save_model(dt, 'dt_pipeline')
load_model
This function loads a previously saved pipeline.

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# create a model
dt = create_model('dt')

# save pipeline
save_model(dt, 'dt_pipeline')

# load pipeline
load_model('dt_pipeline')
```

Output from load_model('dt_pipeline')
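The loaded pipeline can be used for scoring right away. A minimal sketch reusing the new_data pattern from the predict_model section above:

```python
# score unseen data with the loaded pipeline
loaded_pipeline = load_model('dt_pipeline')

new_data = diabetes.copy()
new_data.drop('Class variable', axis = 1, inplace = True)
predict_model(loaded_pipeline, data = new_data)
```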
save_experiment
The save_experiment function saves the experiment to a pickle file. The experiment is saved using cloudpickle to deal with lambda functions. The data and test data are NOT saved with the experiment and will need to be specified again when loading with load_experiment.

```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# save experiment
save_experiment('my_saved_experiment1')
```
load_experiment
The load_experiment function loads an experiment from a path or a file. The data (and test_data) are not saved with the experiment and need to be specified again at the time of loading.

```python
# load data
from pycaret.datasets import get_data
data = get_data('diabetes')

# load experiment
from pycaret.classification import load_experiment
clf2 = load_experiment('my_saved_experiment1', data = data)
```

check_drift
The check_drift function generates a drift report file using the evidently library.

```python
# load dataset
from pycaret.datasets import get_data
data = get_data('insurance')

# generate drift report
check_drift(reference_data = data.head(500), current_data = data.tail(500), target = 'charges')
```

It will generate an HTML report locally.


convert_model
This function transpiles the trained machine learning model's decision function into different programming languages such as Python, C, Java, Go, C#, etc. It is very useful if you want to deploy models into environments where you can't install your normal Python stack to support model inference.

```python
# load dataset
from pycaret.datasets import get_data
juice = get_data('juice')

# init setup
from pycaret.classification import *
exp_name = setup(data = juice, target = 'Purchase')

# train a model
lr = create_model('lr')

# convert a model
convert_model(lr, 'java')
```

Output from convert_model(lr, 'java')
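The transpiled decision function can then be written out for compilation in the target language. A minimal sketch, assuming convert_model returns the code as a plain string; the file name is arbitrary:

```python
# write the transpiled model to a .java source file
java_code = convert_model(lr, 'java')

with open('LogisticRegressionModel.java', 'w') as f:
    f.write(java_code)
```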
create_api
This function takes an input model and creates a POST API for inference. It only creates the API and doesn't run it automatically. To run the API, you must run the Python file using !python.

```python
# load dataset
from pycaret.datasets import get_data
juice = get_data('juice')

# init setup
from pycaret.classification import *
exp_name = setup(data = juice, target = 'Purchase')

# train a model
lr = create_model('lr')

# create api
create_api(lr, 'lr_api')

# run api
!python lr_api.py
```

Output from create_api(lr, 'lr_api')
Once you initialize the API with the !python command, you can see the server at localhost:8000/docs.
FastAPI server hosted on localhost
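Once the server is running, the API can be called from any HTTP client. A minimal sketch using requests; the /predict route and its input format are assumptions here, so check the generated lr_api.py (or the localhost:8000/docs page) for the exact signature:

```python
import requests

# hypothetical call; the endpoint and parameters depend on the
# code that create_api generated for your model
response = requests.post(
    'http://localhost:8000/predict',
    params = {'WeekofPurchase': 237, 'StoreID': 1},  # placeholder features
)
print(response.json())
```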
create_docker
This function creates a Dockerfile and requirements.txt for productionalizing the API endpoint.

```python
# load dataset
from pycaret.datasets import get_data
juice = get_data('juice')

# init setup
from pycaret.classification import *
exp_name = setup(data = juice, target = 'Purchase')

# train a model
lr = create_model('lr')

# create api
create_api(lr, 'lr_api')

# create docker
create_docker('lr_api')
```

Output from create_docker('lr_api')
You can see that two files are created for you.

%load requirements.txt

%load DockerFile
create_app
This function creates a basic gradio app for inference. It will later be expanded for other app types such as Streamlit.

```python
# load dataset
from pycaret.datasets import get_data
juice = get_data('juice')

# init setup
from pycaret.classification import *
exp_name = setup(data = juice, target = 'Purchase')

# train a model
lr = create_model('lr')

# create app
create_app(lr)
```

Output from create_app(lr)