ML Model Training to Model in Production with a Few Lines of Code
Did you know?
- In the United States, someone has a heart attack every 40 seconds.
- Every year, about 805,000 people in the United States have a heart attack.
The major symptoms of a heart attack are:
- Chest pain or discomfort. Most heart attacks involve discomfort in the center or left side of the chest that lasts for more than a few minutes, or that goes away and then comes back. The discomfort may feel like pain, pressure, squeezing, or fullness.
- Feeling weak, light-headed, or faint.
- Pain or discomfort in the jaw, neck, or back.
- Pain or discomfort in one or both arms or shoulders.
- Shortness of breath often comes with chest discomfort, but the shortness of breath can also happen before chest discomfort.
Call 9–1–1 immediately if you notice symptoms of a heart attack.
Enough about heart attacks… Let’s talk about what we are all here for: Machine Attacks… oops, no… Learning… Machine Learning!
In this blog post, we are going to:
- Perform EDA on the Heart Attack Dataset from Kaggle
- Create multiple ML Models with LuciferML
- Deploy our best model using TrueFoundry
Let’s start!
Introduction
Today we are going to analyze people’s health records and use them to predict whether a person has a heart attack or not.
We will use LuciferML to build our models, and TrueFoundry for experiment tracking and deployment of those models.
For those who don’t know:
LuciferML is an AutoML library for creating models. It takes away the hassle by doing the hard work for you: it preprocesses your data and trains different models on it. The library is still in development.
TrueFoundry is an MLOps platform that gives Data Scientists (DS) and Machine Learning Engineers (MLEs) a fast post-model pipeline framework, and lets you set up monitoring for a trained model in about 15 minutes. With MLFoundry, its experiment tracking library, you can track experiments with a few lines of code, compare and visualize experiments on a rich dashboard, and log experiment artifacts.
About Dataset
This dataset from Kaggle has 14 columns, of which the “output” column is the one we have to predict.
It records, based on a person’s other health factors, whether that person has a heart attack or not.
Installation
!pip install lucifer-ml mlfoundry servicefoundry gradio
Importing Libraries
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
plt.style.use('dark_background')
import numpy as np
import pandas as pd
import seaborn as sns
# Importing LuciferML
from luciferml.supervised.classification import Classification
import mlfoundry as mlf
import warnings
import servicefoundry.core as sfy
warnings.simplefilter(action='ignore', category=Warning)
Loading Dataset
dataset = pd.read_csv('../input/heart-attack-analysis-prediction-dataset/heart.csv')
Exploratory Data Analysis
dataset.head()
In this dataset, people aged between 40 and 68 are more likely to have a heart attack, and heart attacks are most frequent at the age of 58.
Data Preparation
features = dataset.iloc[:, 0:-1]
labels = dataset.iloc[:, -1]
Training a lot of Models…
classifier = Classification(predictor = 'all', lda = 'y',smote = 'y')
classifier.fit(features, labels)
result = classifier.result_df
Leaderboard
result = result.sort_values(by = 'KFold Accuracy', ascending = False).reset_index(drop = True)
result.iloc[0]
The K-Nearest Neighbors classifier is the best model, with an accuracy of 87.09%.
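The leaderboard ranks models by K-fold accuracy. As a rough illustration of what that metric measures, here is a minimal, dependency-free sketch of k-fold cross-validation around a tiny nearest-neighbour classifier. The helper names (`predict_1nn`, `k_fold_accuracy`) and the toy data are made up for this example, and this is not a description of how LuciferML computes its scores internally.

```python
# Illustrative sketch only: the toy data and helper names are hypothetical.

def predict_1nn(train_X, train_y, point):
    """Return the label of the training point closest to `point`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = min(range(len(train_X)), key=lambda i: sq_dist(train_X[i], point))
    return train_y[nearest]


def k_fold_accuracy(X, y, k):
    """Average accuracy over k folds; each fold is held out once for testing."""
    n = len(X)
    fold_scores = []
    for fold in range(k):
        # Every k-th sample (offset by `fold`) forms the held-out test set.
        test_idx = list(range(fold, n, k))
        test_set = set(test_idx)
        train_idx = [i for i in range(n) if i not in test_set]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(
            predict_1nn(train_X, train_y, X[i]) == y[i] for i in test_idx
        )
        fold_scores.append(correct / len(test_idx))
    return sum(fold_scores) / k


# Two well-separated toy clusters: label 0 near (0, 0), label 1 near (5, 5).
X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
y = [0, 0, 0, 1, 1, 1]

accuracy = k_fold_accuracy(X, y, k=3)
print(accuracy)
```

Because every sample is used for testing exactly once, the averaged score is usually a steadier estimate than the accuracy from a single train/test split.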
Experiment Tracking
Experiment tracking is the process of saving and logging all the information related to every experiment and test you run.
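To make the idea concrete, here is a minimal sketch of the kind of information an experiment tracker records for each run. The `RunLog` class and its method names are hypothetical and exist only for this illustration. They are not part of MLFoundry.

```python
import json
import time


class RunLog:
    """Hypothetical in-memory record of a single experiment run."""

    def __init__(self, project_name, run_name):
        self.record = {
            "project": project_name,
            "run": run_name,
            "started_at": time.time(),
            "params": {},
            "metrics": {},
        }

    def log_params(self, **params):
        # Hyperparameters and settings used for this run.
        self.record["params"].update(params)

    def log_metric(self, name, value):
        # Keep every logged value so a metric can be followed over time.
        self.record["metrics"].setdefault(name, []).append(value)

    def to_json(self):
        # Serialise the run so it can be saved and compared with other runs.
        return json.dumps(self.record, indent=2, sort_keys=True)


run = RunLog(project_name="heart", run_name="example-run")
run.log_params(model="K-Nearest Neighbors", smote=True)
run.log_metric("kfold_accuracy", 0.8709)
print(run.to_json())
```

A hosted tracker adds storage, comparison, and visualisation on top of records like this one.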
a) Login to TrueFoundry
sfy.login(api_key)
You can get the API Key from here:
b) Creating a new project
mlf_api = mlf.get_client(api_key=api_key)
mlf_run = mlf_api.create_run(project_name='heart', run_name='heart-run-4')
c) Logging Dataset
mlf_run.log_dataset("features", features)
mlf_run.log_dataset("labels", labels)
d) Logging Model
mlf_run.log_model(name = 'Best Model', model = result.iloc[0]['Model'], framework = 'sklearn', description = 'My Model')
e) Capturing System Metrics
MLFoundry automatically captures system metrics… isn’t that great?
Deploying model in 3..2..1…
I wish I could write a blog as quickly and easily as I can deploy a model. :(
Kudos to the TrueFoundry Team for this ease of deployment.
We will deploy our model as a Gradio WebApp.
To deploy a model, we need two files: deploy.py, which contains the deployment code, and requirements.txt.
Directory Structure
.
├── deploy.py
└── requirements.txt
First, we will write the deployment code to deploy.py using the %%writefile magic function.
%%writefile deploy.py
import mlfoundry as mlf
import gradio as gr
import pandas as pd
import numpy as np

# Placeholder: replace with your own TrueFoundry API key
api_key = '<your-api-key>'
mlf_client = mlf.get_client(api_key=api_key)

# Take the first run returned for the 'heart' project, then load its model and features
runs = mlf_client.get_all_runs('heart')
run = mlf_client.get_run(runs['run_id'][0])
model = run.get_model()
df = run.get_dataset('features')
df = pd.DataFrame(df.features)

# Build one Gradio input per feature, pre-filled with the values of the first row
inputs = []
i = 0
sample = df.iloc[0:1].values.tolist()[0]
for x in df.columns:
    if df[x].dtype == 'object':
        inputs.append(gr.Textbox(label=x, value=sample[i]))
    elif df[x].dtype == 'float64' or df[x].dtype == 'int64':
        inputs.append(gr.Number(label=x, value=sample[i]))
    i += 1


def predict(*val):
    # Gradio passes one positional argument per input; turn them into a single-row 2D array
    val = np.array(val)
    if val.ndim == 1:
        val = val.reshape(1, -1)
    pred = model.predict(val)
    return pred.tolist()[0]


app = gr.Interface(fn=predict, inputs=inputs,
                   outputs=gr.Textbox(label='Output'))
app.launch(server_name="0.0.0.0", server_port=8080)
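The `predict` function receives one positional argument per input widget, while the model expects a batch of rows. The sketch below walks through that conversion using plain Python lists and a stand-in model, so it runs without Gradio, NumPy, or a trained model. `ThresholdModel` and the feature values are invented for this illustration.

```python
class ThresholdModel:
    """Hypothetical stand-in for a trained classifier with a predict() method."""

    def predict(self, rows):
        # Predict 1 when the first feature exceeds 50, otherwise 0.
        return [1 if row[0] > 50 else 0 for row in rows]


model = ThresholdModel()


def predict(*val):
    # `val` arrives as a tuple with one entry per input widget.
    rows = [list(val)]          # wrap it as a batch containing a single row
    pred = model.predict(rows)  # the model returns one prediction per row
    return pred[0]              # unwrap the single prediction for display


result = predict(58, 1, 130)
print(result)
```

The deployment code does the same wrapping with `np.array(val).reshape(1, -1)`, since scikit-learn style models expect a 2D array of shape (rows, features).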
Next, we will write the requirements.txt
requirements = sfy.gather_requirements("deploy.py")
# Pin each gathered package to its detected version
reqs = []
for name in requirements:
    reqs.append('{}=={}'.format(name, requirements[name]))
# Write one pinned requirement per line
with open('requirements.txt', 'w') as f:
    for line in reqs:
        f.write(line)
        f.write('\n')
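For reference, each line of requirements.txt pins one package to an exact version in the form `name==version`. The short sketch below shows that formatting step on its own. The `format_requirements` helper is hypothetical, and the versions in `example_requirements` are placeholders, not the versions used in this project.

```python
def format_requirements(requirements):
    """Turn a {package: version} mapping into pinned requirement lines."""
    return ['{}=={}'.format(name, requirements[name]) for name in sorted(requirements)]


# Placeholder versions for illustration only.
example_requirements = {"pandas": "1.0.0", "gradio": "2.0.0"}

lines = format_requirements(example_requirements)
print("\n".join(lines))
```

Pinning exact versions means the deployed service installs the same package versions that were used while developing the notebook.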
Creating service and deploying it on our workspace
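# Assumption: these classes are exported by the servicefoundry package
from servicefoundry import Build, PythonBuild, Resources, Service

# `workspace` must hold the FQN of your TrueFoundry workspace
service = Service(
    name="heart-service-1",
    image=Build(
        build_spec=PythonBuild(
            command="python deploy.py",
        ),
    ),
    ports=[{"port": 8080}],
    resources=Resources(memory_limit="1.5Gi", memory_request="1Gi"),
)
service.deploy(workspace_fqn=workspace)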
Deployed Model
Model Deployed Here: https://heart-service-1-arsh-dev.tfy-ctl-euwe1-develop.develop.truefoundry.tech