This project aims to predict the customer satisfaction score for the next order or purchase based on a customer's historical data. I use the Brazilian E-Commerce Public Dataset by Olist to analyze 100,000+ orders across various marketplaces from 2016 to 2018. By predicting satisfaction scores from features like order status, price, and reviews, the project demonstrates how to build production-ready ML pipelines using ZenML and MLflow.
The pipeline trains a machine learning model that predicts customer satisfaction and deploys it using MLflow, enabling continuous monitoring and deployment.
Given a customer's order history, we predict the review score for their next purchase based on features such as:
- Order status
- Price
- Payment method
- Freight performance
- Customer reviews
I create a ZenML pipeline that trains, evaluates, and deploys a machine learning model to predict customer satisfaction. By integrating MLflow, we can track model performance and ensure continuous deployment for future predictions.
- ZenML: Framework for building and deploying production-grade ML pipelines.
- MLflow: Tool for tracking experiments and model deployment.
- Python 3.8+: Core programming language for model development.
- Data Ingestion: Load and preprocess data from the Brazilian e-commerce dataset.
- Data Cleaning: Remove unwanted columns and outliers.
- Model Training: Train a machine learning model and log results using MLflow autologging.
- Model Evaluation: Evaluate the model's performance based on the Mean Squared Error (MSE).
- Model Deployment: Deploy the model using MLflow, enabling real-time predictions.
- Continuous Deployment Pipeline: Automate retraining and redeployment when model accuracy improves beyond a set threshold.
Follow the steps below to clone and set up the repository:
# Clone the repository
git clone https://github.com/Mayankpratapsingh022/Customer-Satisfaction-MLOps.git
# Navigate to the project folder
cd Customer-Satisfaction-MLOps
# Install dependencies
pip install -r requirements.txt
# Set up ZenML with MLflow
pip install zenml[server]
zenml up
zenml integration install mlflow -y
The training pipeline consists of several steps:
- Ingest Data: Load the dataset and create a pandas DataFrame.
- Clean Data: Perform data cleaning and feature selection.
- Train Model: Train a machine learning model and log performance with MLflow autologging.
- Evaluate Model: Calculate metrics such as MSE to evaluate the model.
The deployment pipeline handles continuous model deployment with MLflow:
- Deployment Trigger: Check if the newly trained model meets accuracy requirements.
- Model Deployer: Deploy the model if the evaluation metric passes the threshold.
The pipeline automatically updates the MLflow deployment server with the latest model version when performance improves.
- Dataset: Brazilian E-Commerce Public Dataset by Olist
- ZenML Documentation: ZenML
- MLflow Documentation: MLflow
Using this framework, you can continuously predict customer satisfaction, deploy models in production, and monitor the performance for future improvements.