This project focuses on forecasting financial time series data, specifically the VWAP (Volume Weighted Average Price), using the ARIMA (AutoRegressive Integrated Moving Average) model. The objective is to build a predictive model that forecasts VWAP values accurately and to evaluate its performance using key statistical metrics.
- Forecast VWAP using historical time series data.
- Evaluate model performance using statistical error metrics such as RMSE and MAE.
- Provide visual insights through plots and graphs for better understanding.
- The dataset was imported and prepared for analysis by:
  - Handling missing values (if any).
  - Parsing time-series indices to ensure temporal alignment.
  - Normalizing and scaling the VWAP column for model optimization.
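The preparation steps above can be sketched roughly as follows. This is a minimal illustration on a small synthetic frame, not the notebook's actual code: the column names `Date` and `VWAP`, the forward-fill strategy, and the min-max scaling are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data: unsorted daily rows with one missing VWAP value.
raw = pd.DataFrame({
    "Date": ["2020-01-03", "2020-01-01", "2020-01-02", "2020-01-06"],
    "VWAP": [103.0, 100.0, np.nan, 106.0],
})

def prepare(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Parse the date column and use it as a sorted time index.
    out["Date"] = pd.to_datetime(out["Date"])
    out = out.set_index("Date").sort_index()
    # Fill gaps with the last known value (one of several possible strategies).
    out["VWAP"] = out["VWAP"].ffill()
    # Min-max scale to [0, 1]; the original column is kept for reporting.
    vmin, vmax = out["VWAP"].min(), out["VWAP"].max()
    out["VWAP_scaled"] = (out["VWAP"] - vmin) / (vmax - vmin)
    return out

clean = prepare(raw)
print(clean)
```

When a train/test split is used, the scaling bounds would normally be computed on the training portion only, so that no information from the test period leaks into the model inputs.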
- Time series data was visualized to uncover trends, seasonality, and anomalies.
- Stationarity was tested using methods like the Augmented Dickey-Fuller (ADF) test.
- ARIMA was chosen due to its effectiveness for univariate time series forecasting.
- Hyperparameters for the ARIMA model (the order p, d, q) were tuned using the `auto_arima` function, which selects the model that minimizes an information criterion such as AIC or BIC.
- The ARIMA model was trained on the training data.
- Forecasts were generated for the test data.
- Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) were computed to assess model performance.
- Root Mean Square Error (RMSE): 222.02
- Mean Absolute Error (MAE): 160.31
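The model tracks VWAP trends reasonably well, judging by the RMSE and MAE values. Both metrics are expressed in the same units as VWAP, so they are best interpreted relative to the typical VWAP level in the test period.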
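One way to put such error values in context is to express them relative to the price level and to compare them with a naive baseline. The sketch below does this in plain NumPy; the function name `error_report`, the choice of a last-value baseline, and the numbers are made up for illustration and are not the project's results.

```python
import numpy as np

def error_report(actual, predicted, last_train_value):
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = actual - predicted
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    # Naive baseline: repeat the last training value over the whole horizon.
    naive_mae = float(np.mean(np.abs(actual - last_train_value)))
    return {
        "mae": mae,
        "rmse": rmse,
        # MAE as a percentage of the average actual level.
        "mae_pct_of_mean": 100 * mae / float(np.mean(actual)),
        "naive_mae": naive_mae,
    }

# Made-up numbers purely for illustration.
report = error_report(
    actual=[100, 110, 120, 130],
    predicted=[102, 108, 123, 127],
    last_train_value=98,
)
print(report)
```

A model whose MAE is clearly below the naive baseline's MAE is adding value; an MAE that is only a few percent of the mean price level is easier to call "low" than a raw number on its own.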
df['VWAP'].plot(figsize=(14, 7), title='VWAP Time Series')
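import numpy as np
from pmdarima import auto_arima
model = auto_arima(train_data['VWAP'], seasonal=False, stepwise=True)  # train_data: the training split (assumed name); fit on it only
forecast = model.predict(n_periods=len(test_data))  # forecast one value per test row
test_data['Forecast_ARIMA'] = np.asarray(forecast)  # store by position so it lines up with the test rows
from sklearn.metrics import mean_absolute_error, mean_squared_error
rmse = np.sqrt(mean_squared_error(test_data['VWAP'], test_data['Forecast_ARIMA']))
mae = mean_absolute_error(test_data['VWAP'], test_data['Forecast_ARIMA'])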
The ARIMA model successfully forecasts VWAP values with reasonable accuracy, making it a valuable tool for financial analysis. However, there is potential to enhance the model by:
- Incorporating exogenous variables such as trading volume or market sentiment.
- Exploring other machine learning models, e.g., LSTM or Prophet, for better accuracy.
- Extending the analysis to capture seasonal and cyclical patterns in data.
- Integration of external factors (macroeconomic indicators, sentiment analysis).
- Application of advanced deep learning methods for time series forecasting.
- Comprehensive backtesting to validate model robustness under real-world scenarios.
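As a starting point for the backtesting item above, a rolling-origin (walk-forward) evaluation could be sketched as follows. The function names and the naive placeholder forecaster are assumptions for illustration; in practice `forecast_fn` would wrap an ARIMA fit on each training window.

```python
import numpy as np

def walk_forward(series, initial_train, horizon, forecast_fn):
    """Rolling-origin evaluation on an expanding training window."""
    series = np.asarray(series, dtype=float)
    fold_mae = []
    start = initial_train
    while start + horizon <= len(series):
        train, test = series[:start], series[start:start + horizon]
        pred = forecast_fn(train, horizon)
        fold_mae.append(float(np.mean(np.abs(test - pred))))
        start += horizon  # move the forecast origin forward
    return fold_mae

def naive_forecast(train, horizon):
    # Placeholder model: repeat the last observed value.
    return np.full(horizon, train[-1])

data = np.arange(10, dtype=float)  # 0, 1, ..., 9
scores = walk_forward(data, initial_train=4, horizon=2, forecast_fn=naive_forecast)
print(scores)  # one MAE per fold
```

Looking at the spread of the per-fold errors, rather than a single train/test split, gives a better sense of how stable the model is across different market periods.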
- `data/`: Contains the dataset used for analysis.
- `notebooks/`: Includes the Jupyter notebook with all code and analysis steps.
- `images/`: Contains visualizations and plots generated during the project.
- Clone the repository:
git clone https://github.com/your-username/finance-time-series-forecasting.git
- Navigate to the project directory:
cd finance-time-series-forecasting
- Install dependencies:
pip install -r requirements.txt
- Run the Jupyter notebook:
jupyter notebook notebooks/finance_time_series_analysis.ipynb
- Python 3.x
- pandas
- numpy
- matplotlib
- seaborn
- statsmodels
- pmdarima
- scikit-learn
This project is inspired by real-world financial forecasting challenges. Special thanks to the open-source contributors who maintain the tools and libraries used in this analysis.
Your Name
Your GitHub Profile