Comment on page
Machine Learning in SQL using PyCaret Part I

This post is a step-by-step tutorial on how to train and deploy an Unsupervised Machine Learning Clustering model in SQL Server using **PyCaret(a low-code ML library in Python).**
Things we will cover in this article:
- 1.How to download and install SQL Server for free
- 2.How to create a new database and importing data into a database
- 3.How to enable and use Python scripting in database
- 4.How to train a clustering algorithm in order to assign cluster labels to each observation in the dataset
The go-to tools/ environments for performing ML experiments are Command-Line, IDEs, or Notebooks. However, such tools/environments may pose limitations when the data size gets very large, or when the ML model is required to be put in production. There has been a dire need to have the ability to programme and train models where data reside. MS SQL Server introduced this capability in their SQL Server version 2019. The distinct advantages of using SQL Server for Machine Learning are:
i. Extracting a large amount of data from the system is tedious and time-consuming. Conducting ML experiments on a server brings the code to data, rather than taking data to the code
ii. ML experiments are executed mostly in computer/cpu memory. Most of the machines hit a performance ceiling when training an ML algorithm on large data sets. ML on the SQL Server database avoids this
iii. It is easy to integrate and deploy ML Pipelines along with other ETL processes
SQL Server is a Microsoft relational database management system. As a database server, it performs the primary function of storing and retrieving data as requested by different applications. In this tutorial, we will use **SQL Server 2019 ****Developer** for machine learning by importing PyCaret library into SQL Server.
If you have used SQL Server before, it is likely that you have it installed and have access to the database. If not, **click here** to download SQL Server 2019 Developer or other edition.

Before using PyCaret functionality into SQL Server, you’ll need to install SQL Server and PyCaret. This is a multi-step process:
Download the SQL Server 2019 Developer Edition file “SQL2019-SSEI-Dev.exe”

Open the file and follow the instructions to install (recommended to use Custom install option)

Choose New SQL Server stand-alone installation

In the Instance Features option, select the features including “Python” under Machine Learning Services and Language Extensions and Machine Learning Server (Standalone)

Click “Accept” to provide consent to install Python

Installation may take 15–20 minutes
**Click here** or Open SQL Server Installation Center to download “SQL Server Management Tools” file “SSMS-Setup-ENU.exe”

Open “SSMS-Setup-ENU.exe” file to start the installation

Installation may take 5–10 minutes
Once you have everything installed, you will need to start an instance of the server. To do so, start SSMS. At the login stage, you’ll be asked to the name of the SQL Server that you can choose from the drop-down menu. Once a connection is established, you can see all the objects from the server. If you have downloaded SQL Server for the first time and you do not have a database to work with, you will need to create a new database first.
In the Object Explorer panel, right-click on Databases, and choose New Database

Enter the database name and other information
The setup may take 2–3 minutes including creating a database, user and setting ownership
You will now have to import a CSV file into a database using SQL Server Management Studio.
Create a table “jewellery