This project analyzes customer churn data to build predictive models and calculate net profits associated with different customer retention strategies. Using machine learning and financial analysis, it aims to provide actionable insights into reducing customer churn and optimizing profit through realistic retention strategies.
- Project Overview
- Dataset
- Project Structure
- Machine Learning Models
- Financial Analysis
- Setup and Usage
- Dependencies
- Results
The main goal of this project is to predict customer churn and evaluate the financial impact of customer retention strategies. We use customer behavioral and demographic data to train machine learning models that predict whether a customer will churn, then run a profit analysis of potential retention strategies and their costs.
The dataset includes the following main columns:
- Demographic data: `gender`, `SeniorCitizen`, `Partner`, `Dependents`
- Service usage data: `PhoneService`, `MultipleLines`, `InternetService`, `OnlineSecurity`, `OnlineBackup`, `DeviceProtection`, `TechSupport`, `StreamingTV`, `StreamingMovies`
- Account information: `Contract`, `PaperlessBilling`, `PaymentMethod`, `MonthlyCharges`, `TotalCharges`
- Target: `Churn` (whether the customer left)
The project includes the following stages:
- Data Preprocessing: Cleaning and encoding data, handling missing values, and preparing data for modeling.
- Feature Engineering: Selecting and creating new features to enhance model accuracy.
- Machine Learning Modeling: Training multiple models to predict churn.
- Financial Analysis: Evaluating profit based on churn predictions and calculating net profit under different retention strategies.
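As a rough illustration of the preprocessing stage, the sketch below cleans a tiny made-up sample that reuses column names from the dataset. The sample rows, the median imputation, and the one-hot encoding choices are assumptions for illustration, not necessarily what the notebook does.

```python
import pandas as pd

# Hypothetical four-row sample; only the column names come from the dataset.
raw = pd.DataFrame({
    "gender": ["Female", "Male", "Male", "Female"],
    "Contract": ["Month-to-month", "One year", "Two year", "Month-to-month"],
    "MonthlyCharges": [29.85, 56.95, 53.85, 70.70],
    "TotalCharges": ["29.85", "1889.5", " ", "151.65"],  # blank string = missing
    "Churn": ["No", "No", "No", "Yes"],
})


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Clean and encode a churn DataFrame (illustrative choices only)."""
    out = df.copy()
    # Coerce TotalCharges to numeric; unparseable entries become NaN.
    out["TotalCharges"] = pd.to_numeric(out["TotalCharges"], errors="coerce")
    # Assumed imputation strategy: fill missing totals with the column median.
    out["TotalCharges"] = out["TotalCharges"].fillna(out["TotalCharges"].median())
    # Encode the target as 0 (stayed) / 1 (churned).
    out["Churn"] = out["Churn"].map({"No": 0, "Yes": 1})
    # One-hot encode the categorical columns, dropping one level per column.
    return pd.get_dummies(out, columns=["gender", "Contract"], drop_first=True)


clean = preprocess(raw)
```

The full dataset would need every categorical column listed in the `get_dummies` call; only two are shown here to keep the sketch short.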
Several machine learning models were developed and evaluated using metrics such as ROC-AUC and accuracy. Key models include:
- Random Forest Classifier
- Decision Tree Classifier
- XGBoost Classifier
The best-performing model was chosen for the final churn predictions.
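A comparison of this kind might look like the sketch below. It uses synthetic data from `make_classification` rather than the churn dataset, and the hyperparameters and the choice of ROC-AUC as the selection criterion are assumptions. XGBoost is included only if it is installed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the real features and churn labels.
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.73, 0.27], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hyperparameters are placeholders, not tuned values from the project.
models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=0),
}
try:
    from xgboost import XGBClassifier

    models["xgboost"] = XGBClassifier(n_estimators=100, random_state=0)
except ImportError:
    pass  # Skip XGBoost when the package is not available.

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    churn_proba = model.predict_proba(X_test)[:, 1]  # P(churn) per customer
    scores[name] = {
        "roc_auc": roc_auc_score(y_test, churn_proba),
        "accuracy": accuracy_score(y_test, model.predict(X_test)),
    }

# Pick the model with the highest ROC-AUC (assumed selection rule).
best_name = max(scores, key=lambda n: scores[n]["roc_auc"])
```

ROC-AUC is used for selection here because accuracy alone can look good on an imbalanced target even when few churners are identified.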
A detailed financial analysis was added to calculate the net profit for different customer retention strategies. This includes:
- Average Monthly Revenue Calculation: Using actual customer `MonthlyCharges` data to calculate potential yearly revenue per customer.
- Retention Scenarios: Simulating revenue based on different customer retention rates (20%, 50%, 100%).
- Retention Costs: Assessing the net profit associated with various retention costs per customer (e.g., $50, $100, up to $300).
- Visualization: Plotting net profit outcomes across various retention rates and costs to provide a realistic financial outlook.
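The scenario grid described above can be sketched in a few lines. The formula is an assumed simplification: every targeted customer incurs the retention cost, a fraction of them stay, and each retained customer contributes twelve months of the average monthly charge. The customer count, the average charge of 70.0, and the intermediate cost steps between $50 and $300 are hypothetical values.

```python
def net_profit(n_targeted, avg_monthly_charge, retention_rate, cost_per_customer):
    """Net profit of a retention campaign under a simple linear model."""
    annual_revenue = 12 * avg_monthly_charge  # yearly revenue per retained customer
    retained_revenue = n_targeted * retention_rate * annual_revenue
    total_cost = n_targeted * cost_per_customer  # cost is paid for everyone targeted
    return retained_revenue - total_cost


retention_rates = [0.20, 0.50, 1.00]
retention_costs = [50, 100, 150, 200, 250, 300]  # intermediate steps are assumed

# Net profit for every (rate, cost) pair, for 100 hypothetical predicted churners.
grid = {
    (rate, cost): net_profit(100, 70.0, rate, cost)
    for rate in retention_rates
    for cost in retention_costs
}
```

In practice `avg_monthly_charge` would come from the mean of the `MonthlyCharges` column, and `grid` could be passed to a plotting routine to produce one line per retention rate.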
- Clone the repository: `git clone https://github.com/adamw80/Customer-Churn-and-Profit-Analysis.git`, then `cd Customer-Churn-and-Profit-Analysis`
- Install dependencies: Ensure you have Python 3.6+, then install the required packages with `pip install -r requirements.txt`
- Run the analysis: Open and run the Jupyter Notebook with `jupyter notebook Customer\ Churn\ and\ Profit\ Analysis.ipynb`
The project relies on the following packages:
- Pandas: Data manipulation
- NumPy: Numerical computations
- Scikit-Learn: Machine learning model building
- XGBoost: Gradient boosting model
- Matplotlib/Seaborn: Data visualization
- Imbalanced-learn: Handling imbalanced datasets for model training
The analysis shows that specific retention strategies can significantly affect net profit, depending on the retention cost and targeted retention rate. Key findings include:
- Churn Prediction: The model provided accurate predictions for customer churn, enabling targeted retention.
- Financial Impact: Retention costs have a critical threshold beyond which net profit decreases. Visualizations provide insights into optimal retention costs per customer to maximize profitability.
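One way to read the cost threshold is as a break-even point. Assuming a simple linear model in which each targeted customer costs a fixed amount and a fraction of them stay for a further twelve months, the campaign stops being profitable once the cost per customer exceeds the expected yearly revenue recovered per targeted customer. The function name and the average charge of 70.0 below are hypothetical.

```python
def break_even_cost(avg_monthly_charge: float, retention_rate: float) -> float:
    """Cost per targeted customer at which net profit is exactly zero.

    Assumes net profit = n * (retention_rate * 12 * avg_monthly_charge - cost).
    """
    return retention_rate * 12 * avg_monthly_charge


# Break-even cost for each retention scenario, with a hypothetical average charge.
thresholds = {rate: break_even_cost(70.0, rate) for rate in (0.20, 0.50, 1.00)}
```

Under these assumptions, a lower expected retention rate leaves proportionally less room for retention spending per customer.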
This project is licensed under the MIT License.
For questions or feedback, feel free to reach out:
- GitHub: adamw80