```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# create a model
xgboost = create_model('xgboost')

# predict on new data
new_data = diabetes.copy()
new_data.drop('Class variable', axis = 1, inplace = True)
predict_model(xgboost, data = new_data)
```
Probability by class
NOTE: This is only applicable to classification use cases.
```python
# load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# create a model
xgboost = create_model('xgboost')

# predict on new data with probabilities for all classes
new_data = diabetes.copy()
new_data.drop('Class variable', axis = 1, inplace = True)
predict_model(xgboost, raw_score = True, data = new_data)
```
Setting probability threshold
NOTE: This is only applicable to binary classification use cases.
The probability_threshold parameter sets the cutoff for converting predicted probabilities into class labels. If it is not passed, it defaults to the value set during model creation; if no value was set there either, the default is 0.5 for all classifiers. It is only applicable to binary classification.
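For example, the cutoff can be raised so that an observation is only labeled positive when the model is more confident. The sketch below assumes the xgboost model and new_data frame created earlier.

```python
# classify an observation as positive only when the predicted
# probability of the positive class is at least 0.75
predict_model(xgboost, probability_threshold = 0.75, data = new_data)
```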
AWS
Before deploying a model to AWS S3 ('aws'), environment variables must be configured using the command-line interface. To configure them, run aws configure from your command line. The following information is required, and it can be generated from the Identity and Access Management (IAM) portal of your Amazon console account (an example deploy_model call is sketched after the list):
AWS Access Key ID
AWS Secret Access Key
Default Region Name (can be seen under Global settings on your AWS console)
Default output format (must be left blank)
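Once the credentials are configured, deploy_model can push the trained model to S3. This is a minimal sketch, assuming the xgboost model from earlier; the bucket name is a placeholder for your own S3 bucket.

```python
# deploy the trained model to an S3 bucket
# 'my-pycaret-bucket' is a hypothetical bucket name
deploy_model(xgboost, model_name = 'xgboost-aws',
             platform = 'aws',
             authentication = {'bucket': 'my-pycaret-bucket'})
```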
GCP
To deploy a model on Google Cloud Platform ('gcp'), the project must be created using the command-line or GCP console. Once the project is created, you must create a service account and download the service account key as a JSON file to set environment variables in your local environment.
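A minimal sketch of a GCP deployment, assuming the service account key has already been downloaded; the key path, project name, and bucket name are placeholders.

```python
import os

# point the Google Cloud client to the downloaded service account key
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path-to-service-account-key.json'

# deploy the trained model to a GCP storage bucket
deploy_model(xgboost, model_name = 'xgboost-gcp',
             platform = 'gcp',
             authentication = {'project': 'my-gcp-project', 'bucket': 'my-gcp-bucket'})
```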
Azure
To deploy a model on Microsoft Azure ('azure'), the environment variable for the connection string must be set in your local environment. Go to the settings of your storage account in the Azure portal to retrieve the required connection string.
AZURE_STORAGE_CONNECTION_STRING (required as environment variable)
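A minimal sketch of an Azure deployment, assuming the connection string has been copied from the portal; the container name is a placeholder.

```python
import os

# the connection string copied from the Azure portal
os.environ['AZURE_STORAGE_CONNECTION_STRING'] = 'your-connection-string'

# deploy the trained model to an Azure blob container
deploy_model(xgboost, model_name = 'xgboost-azure',
             platform = 'azure',
             authentication = {'container': 'my-azure-container'})
```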
The save_experiment function saves the experiment to a pickle file. The experiment is saved using cloudpickle to deal with lambda functions. The data or test data is NOT saved with the experiment and will need to be specified again when loading using load_experiment.
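For example, assuming the setup from earlier, the current experiment can be saved to a file; 'my_experiment' is a placeholder file name.

```python
# save the current experiment (pipeline, settings, trained models) to a pickle file
save_experiment('my_experiment')
```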
The load_experiment function loads an experiment from the path or a file. The data (and test_data) is not saved with the experiment and will need to be specified again at the time of loading.
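A sketch of reloading that experiment; the original data must be passed again because it is not stored in the pickle file.

```python
from pycaret.classification import load_experiment

# reload the saved experiment, supplying the dataset again
exp = load_experiment('my_experiment', data = diabetes)
```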
This function transpiles the trained machine learning model's decision function into different programming languages such as Python, C, Java, Go, and C#. It is very useful if you want to deploy a model in an environment where you can't install your normal Python stack to support model inference.
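A minimal sketch, assuming a simple logistic regression model ('lr') created in the same experiment and transpiled to Java.

```python
# create a simple model and transpile its decision function to Java source code
lr = create_model('lr')
convert_model(lr, language = 'java')
```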
This function takes an input model and creates a POST API for inference. It only creates the API and doesn't run it automatically. To run the API, you must run the generated Python file, e.g. with !python from a notebook cell.
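A minimal sketch, assuming the xgboost model from earlier; 'xgboost_api' is a placeholder name for the generated script.

```python
# generate a Python script that exposes the model through a POST endpoint
create_api(xgboost, api_name = 'xgboost_api')

# in a notebook, the generated file can then be started with:
# !python xgboost_api.py
```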