PyCaret Official


Scale and Transform

Normalize

Normalization is a technique often applied as part of data preparation for machine learning. The goal of normalization is to rescale the values of numeric columns in the dataset without distorting differences in the ranges of values or losing information. Several normalization methods are available; by default, PyCaret uses `zscore`.

**normalize: bool, default = False**

When set to `True`, the feature space is transformed using the method defined under the `normalize_method` parameter.

**normalize_method: string, default = 'zscore'**

Defines the method to be used for normalization. By default, the method is set to `zscore`. The available options are:

- `zscore`: the standard z-score, calculated as z = (x - u) / s, where u is the mean of the feature and s is its standard deviation.
- `minmax`: scales and translates each feature individually such that it is in the range of 0 to 1.
- `maxabs`: scales and translates each feature individually such that the maximal absolute value of each feature will be 1.0. It does not shift or center the data and thus does not destroy any sparsity.
- `robust`: scales and translates each feature according to the interquartile range. When the dataset contains outliers, the robust scaler often gives better results.
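The four normalization methods described above can be sketched in plain Python for a single numeric column. This is only an illustration of the arithmetic, not PyCaret's implementation: the helper names (`zscore`, `minmax`, `maxabs`, `robust`) and the sample values are made up for this example, and the sketch assumes the usual scikit-learn conventions (population standard deviation for the z-score, median centering for the robust scaler).

```python
import statistics

# Hypothetical helpers for illustration; each rescales one numeric column.
# None of them guards against a constant column (division by zero).

def zscore(values):
    # z = (x - u) / s, with u the mean and s the population standard deviation
    u = statistics.fmean(values)
    s = statistics.pstdev(values)
    return [(x - u) / s for x in values]

def minmax(values):
    # Maps the smallest value to 0 and the largest to 1
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]

def maxabs(values):
    # Divides by the largest absolute value; no shifting, so zeros stay zero
    m = max(abs(x) for x in values)
    return [x / m for x in values]

def robust(values):
    # Centers on the median and scales by the interquartile range (Q3 - Q1)
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(x - median) / (q3 - q1) for x in values]

values = [-2.0, 0.0, 2.0, 4.0, 6.0]  # made-up sample column
for name, fn in [("zscore", zscore), ("minmax", minmax),
                 ("maxabs", maxabs), ("robust", robust)]:
    print(name, [round(v, 3) for v in fn(values)])
```

Note that `maxabs` only divides and never subtracts, which is why it preserves zeros in sparse data, whereas the other three shift the column.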

```python
# load dataset
from pycaret.datasets import get_data
pokemon = get_data('pokemon')

# init setup
from pycaret.classification import *
clf1 = setup(data = pokemon, target = 'Legendary', normalize = True)
```

Effect of Normalization:

Feature Transform

While **normalization** rescales the data within new limits to reduce the impact of magnitude in the variance, feature transformation is a more radical technique. Transformation changes the shape of the distribution such that the transformed data can be represented by a normal or approximately normal distribution. There are two methods available for transformation: `yeo-johnson` and `quantile`.

**transformation: bool, default = False**

When set to `True`, a power transformer is applied to make the data more normal / Gaussian-like. This is useful for modeling issues related to heteroscedasticity or other situations where normality is desired. The optimal parameter for stabilizing variance and minimizing skewness is estimated through maximum likelihood.

**transformation_method: string, default = 'yeo-johnson'**

Defines the method for transformation. By default, the transformation method is set to `yeo-johnson`. The other available option is `quantile` transformation. Both methods transform the feature set to follow a Gaussian-like or normal distribution. The quantile transformer is non-linear and may distort linear correlations between variables measured at the same scale.
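To make the two transformation methods more concrete, here is a minimal sketch in plain Python. It is not PyCaret's code: `yeo_johnson` applies the standard Yeo-Johnson formula for a fixed, hand-picked `lmbda` (the maximum-likelihood estimation of that parameter mentioned above is left out), and `quantile_to_normal` is a simplified rank-based stand-in for a quantile transformer that assumes all values are distinct. Both function names and the sample data are hypothetical.

```python
import math
from statistics import NormalDist

def yeo_johnson(x, lmbda):
    # Standard Yeo-Johnson power transform for one value and a fixed lambda
    if x >= 0:
        if lmbda == 0:
            return math.log1p(x)
        return ((x + 1.0) ** lmbda - 1.0) / lmbda
    if lmbda == 2:
        return -math.log1p(-x)
    return -(((-x + 1.0) ** (2.0 - lmbda)) - 1.0) / (2.0 - lmbda)

def quantile_to_normal(values):
    # Simplified quantile transform: rank each value, turn the rank into a
    # probability strictly between 0 and 1, then map it through the inverse
    # CDF of the standard normal distribution. Ties are not handled.
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    out = [0.0] * n
    for rank, i in enumerate(order):
        out[i] = std_normal.inv_cdf((rank + 0.5) / n)
    return out

skewed = [0.1, 0.5, 1.0, 3.0, 50.0]  # made-up right-skewed column

# With lmbda = 0 the transform is log(x + 1) for non-negative inputs,
# which pulls the large outlier toward the rest of the data.
print([round(yeo_johnson(x, 0.0), 3) for x in skewed])

# The quantile version depends only on ranks, so it is strongly non-linear.
print([round(v, 3) for v in quantile_to_normal(skewed)])
```

Because the quantile version discards the spacing between values and keeps only their order, it illustrates why that method can distort linear relationships between features.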

```python
# load dataset
from pycaret.datasets import get_data
pokemon = get_data('pokemon')

# init setup
from pycaret.classification import *
clf1 = setup(data = pokemon, target = 'Legendary', transformation = True)
```

Dataframe view before transformation

Dataframe view after transformation

Effect of Feature Transformation:

Target Transform

There are two methods available for target transformation: `box-cox` and `yeo-johnson`.

**transform_target: bool, default = False**

When set to `True`, the target variable is transformed using the method defined in the `transform_target_method` parameter. Target transformation is applied separately from feature transformations.

**transform_target_method: string, default = 'box-cox'**

`box-cox` requires input data to be strictly positive, while `yeo-johnson` supports both positive and negative data. When `transform_target_method` is `box-cox` and the target variable contains negative values, the method is internally forced to `yeo-johnson` to avoid any exceptions.
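The positivity requirement and the fallback described above can be sketched as follows. This is an illustration only, not PyCaret's internal logic: `box_cox` and `resolve_target_method` are hypothetical helpers, the lambda value is fixed by hand, and the sketch also falls back when the target contains zeros, which is an assumption based on `box-cox` needing strictly positive input (the text only mentions negative values).

```python
import math

def box_cox(y, lmbda):
    # Standard Box-Cox transform for one value; undefined for y <= 0
    if y <= 0:
        raise ValueError("box-cox requires strictly positive input")
    if lmbda == 0:
        return math.log(y)
    return (y ** lmbda - 1.0) / lmbda

def resolve_target_method(target, requested="box-cox"):
    # Hypothetical helper: switch to yeo-johnson when box-cox cannot be applied.
    # Treating zeros as invalid is an assumption made for this sketch.
    if requested == "box-cox" and any(y <= 0 for y in target):
        return "yeo-johnson"
    return requested

positive_target = [326.0, 554.0, 2757.0]   # made-up strictly positive values
mixed_target = [-1.0, 2.0, 5.0]            # made-up values with a negative

print(resolve_target_method(positive_target))  # box-cox can be used
print(resolve_target_method(mixed_target))     # falls back to yeo-johnson
print([round(box_cox(y, 0.0), 3) for y in positive_target])
```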

Example

```python
# load dataset
from pycaret.datasets import get_data
diamond = get_data('diamond')

# init setup
from pycaret.regression import *
reg1 = setup(data = diamond, target = 'Price', transform_target = True)
```

Before

Dataframe view before target transformation

After

Dataframe view after target transformation

`pycaret.classification module.`
