Built a few forecasting models to determine the demand for a particular product.
- Set up a data science project structure in a new git repository in your GitHub account
- Download the product demand data set from https://www.kaggle.com/felixzhao/productdemandforecasting
- Load the data set into pandas DataFrames
- Use exploratory data analysis to formulate one or two ideas on how feature engineering could add value to the data set
- Build one or more forecasting models to determine the demand for a particular product using the other columns as features
- Document your process and results
- Commit your notebook, source code, visualizations and other supporting files to the git repository in GitHub
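
The loading and feature engineering steps above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the helper names `load_demand` and `add_calendar_features` are hypothetical, the sample rows are made up, and calendar features are only one possible feature engineering idea.

```python
import io
import pandas as pd

def load_demand(source):
    """Read the demand CSV (path or file-like object) into a DataFrame."""
    df = pd.read_csv(source)
    # Convert Date to datetime; unparseable or missing dates become NaT.
    df["Date"] = pd.to_datetime(df["Date"], errors="coerce")
    # Rows without a date cannot be placed on a timeline, so drop them.
    return df.dropna(subset=["Date"]).reset_index(drop=True)

def add_calendar_features(df):
    """Add simple calendar features derived from the Date column."""
    out = df.copy()
    out["year"] = out["Date"].dt.year
    out["month"] = out["Date"].dt.month
    out["day_of_week"] = out["Date"].dt.dayofweek  # Monday is 0
    return out

# Made-up sample rows standing in for the downloaded Kaggle file.
sample_csv = io.StringIO(
    "Product_Code,Warehouse,Product_Category,Date,Order_Demand\n"
    "Product_A,Whse_X,Category_1,2012-01-27,100\n"
    "Product_A,Whse_X,Category_1,2012-02-03,250\n"
    "Product_B,Whse_Y,Category_2,,50\n"
)
demand = add_calendar_features(load_demand(sample_csv))
print(demand[["Product_Code", "Date", "year", "month", "day_of_week"]])
```

With the real data set, `load_demand` would be called with the path of the downloaded CSV file instead of the in-memory sample.
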
- Product_Code: The encoded product name
- Warehouse: The encoded warehouse name
- Product_Category: The encoded product category for each Product_Code
- Date: The date the customer needs the product
- Order_Demand: The quantity of a single order
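
Because Order_Demand holds the quantity of a single order, a forecasting model usually needs these orders aggregated into a regular time series first. The sketch below shows one way to do that with monthly totals for a single product. The function name `monthly_demand`, the monthly granularity, and the sample rows are assumptions made for illustration; the notebook may aggregate differently.

```python
import pandas as pd

def monthly_demand(df, product_code):
    """Sum single-order quantities into one demand value per month."""
    rows = df[df["Product_Code"] == product_code].copy()
    rows["Date"] = pd.to_datetime(rows["Date"], errors="coerce")
    # Coerce quantities to numbers; entries that cannot be parsed become NaN.
    rows["Order_Demand"] = pd.to_numeric(rows["Order_Demand"], errors="coerce")
    rows = rows.dropna(subset=["Date", "Order_Demand"])
    # Group by calendar month and add up the order quantities.
    return rows.groupby(rows["Date"].dt.to_period("M"))["Order_Demand"].sum()

# Made-up orders; one quantity is deliberately unparseable.
orders = pd.DataFrame({
    "Product_Code": ["Product_A", "Product_A", "Product_A", "Product_A", "Product_B"],
    "Date": ["2012-01-05", "2012-01-20", "2012-02-11", "2012-02-15", "2012-01-07"],
    "Order_Demand": ["100", "250", "30", "n/a", "75"],
})
series_a = monthly_demand(orders, "Product_A")
print(series_a)
```

Months without any orders do not appear in the result, so a real pipeline may want to reindex the series and fill those months with zero.
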
- Train time series models to forecast product demand
- It performed worst on the test data
- It did not perform well even on the training data
- Training R2 score 0.00798308424338956
- Testing R2 score -0.001335919802511798
- Much better fit on the training data than the LR-based model, although the negative testing R2 score points to overfitting
- Training R2 score 0.852648878481129
- Testing R2 score -0.6673676260758077
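
The R2 scores reported above can be negative because R2 compares a model's squared error with that of a baseline that always predicts the mean of the actual values. The sketch below computes R2 by hand on made-up numbers to show the three cases: a perfect fit, the mean baseline, and predictions that are worse than the baseline. It illustrates the metric only and does not reproduce the notebook's evaluation code.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Squared error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of always predicting the mean of the actual values.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

actual = [10.0, 20.0, 30.0, 40.0]       # made-up demand values
mean_baseline = [25.0] * len(actual)    # always predict the mean
off_target = [40.0, 30.0, 20.0, 10.0]   # predictions with the wrong trend

r2_perfect = r2_score(actual, actual)
r2_baseline = r2_score(actual, mean_baseline)
r2_bad = r2_score(actual, off_target)
print(r2_perfect, r2_baseline, r2_bad)  # 1.0 0.0 -3.0
```

A score near 0 therefore means the model is about as good as predicting the mean, and a score below 0 means it is worse than that baseline on the evaluated data.
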
- Linear Regression model performed worst
- Gradient Boosting model was good
- As a further improvement, recurrent neural networks (RNNs) could be tried, since they are designed for sequential data such as time series and may perform better in such scenarios.
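
A sequence model such as an RNN is typically trained on fixed-length windows of past values paired with the next value. The project does not describe how an RNN would be set up, so the following is only a sketch of that data preparation step; the function name `make_windows`, the window length, and the sample series are all hypothetical.

```python
def make_windows(series, window):
    """Split a series into (past window, next value) training pairs."""
    inputs, targets = [], []
    for start in range(len(series) - window):
        # The model sees `window` consecutive values...
        inputs.append(series[start:start + window])
        # ...and learns to predict the value that follows them.
        targets.append(series[start + window])
    return inputs, targets

# Made-up monthly demand values.
monthly = [120, 135, 150, 160, 155, 170]
X, y = make_windows(monthly, window=3)
for past, nxt in zip(X, y):
    print(past, "->", nxt)
```

The same windowed layout can also serve as lag features for the regression models, which makes it a convenient common input format when comparing approaches.
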
- Project description
- Contains link to dataset
- Jupyter Notebook for exploratory data analysis, visualization, feature engineering and demand forecasting.
- Plot: Warehouse sample count
- Plot: Demand of Product 1359
- Plot: Linear Regression model predictions on testing data
- Plot: Linear Regression model predictions on complete data
- Plot: Gradient Boosting model predictions on testing data
- Plot: Gradient Boosting model predictions on complete data
- Information about the tools, frameworks and libraries required to reproduce the workflow