Feat/probabilistic ensemble #1692
…m_samples to the ensemble() method
Codecov Report
@@            Coverage Diff             @@
##           master    #1692      +/-   ##
==========================================
- Coverage   94.27%   94.16%   -0.12%
==========================================
  Files         125      125
  Lines       11607    11622      +15
==========================================
+ Hits        10943    10944       +1
- Misses        664      678      +14
Simplified the definition of probabilistic … Having probabilistic models is allowed; they will be trained as such, but their prediction will contain only one sample when passed to the regression model at fitting and prediction time (the first one, due to the way …).
Attaching two code snippets to demonstrate how to get probabilistic forecasts with …
from darts.models import LinearRegressionModel, LightGBMModel, RegressionEnsembleModel
from darts.utils.timeseries_generation import sine_timeseries, gaussian_timeseries
import matplotlib.pyplot as plt
# creating synthetic data
start = 10
end = 400
sin_series = sine_timeseries(start=start, end=end, value_amplitude=10)
gaus_series = gaussian_timeseries(mean=10, start=start, end=end)
tmp = sin_series + gaus_series
train, val = tmp.split_after(0.8)
quantiles = [0.25, 0.5, 0.75]
# probabilistic ensembling model
ensemble_lin_reg = LinearRegressionModel(quantiles=quantiles,
                                         lags_future_covariates=[0],
                                         likelihood="quantile")
# probabilistic models
lgbm_model = LightGBMModel(quantiles=quantiles, lags=4, likelihood="quantile")
linreg_model = LinearRegressionModel(quantiles=quantiles, lags=4, likelihood="quantile")
ensemble = RegressionEnsembleModel([lgbm_model, linreg_model],
                                   regression_train_n_points=140,
                                   regression_model=ensemble_lin_reg
                                   )
ensemble.fit(train)
pred = ensemble.predict(len(val), num_samples=1000)
val.plot(label="test")
pred.plot(label="prediction")
plt.show()

# deterministic models
lgbm_model = LightGBMModel(lags=4)
linreg_model = LinearRegressionModel(lags=4)
ensemble = RegressionEnsembleModel([lgbm_model, linreg_model],
                                   regression_train_n_points=140,
                                   regression_model=ensemble_lin_reg
                                   )
ensemble.fit(train)
pred = ensemble.predict(len(val), num_samples=1000)
val.plot(label="test")
pred.plot(label="prediction")
plt.show()
…ut probabilistic ensembles
…with probabilistic forecasting models, NaiveEnsembleModel also properly ensembles such probabilistic models (taking n_samples into account).
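A minimal sketch of what ensembling probabilistic forecasts "naively" can look like, assuming the ensemble averages the models sample by sample so that the output keeps the same number of samples. The function name `naive_ensemble` and the nested-list layout are hypothetical and only illustrate the idea; this is not the darts implementation.

```python
def naive_ensemble(predictions):
    """Average forecasts across models, sample by sample.

    predictions[m][t][s] is the value from model m at time step t in sample s.
    All models are assumed to share the same number of steps and samples.
    """
    n_models = len(predictions)
    n_steps = len(predictions[0])
    n_samples = len(predictions[0][0])
    return [
        [sum(model[t][s] for model in predictions) / n_models for s in range(n_samples)]
        for t in range(n_steps)
    ]


# two hypothetical models, 2 time steps, 3 samples each
model_a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
model_b = [[3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
combined = naive_ensemble([model_a, model_b])
print(combined)
```

The result still has three samples per time step, so the combined forecast remains probabilistic.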
…in ensemble tests
Extended the feature: if the … These samples are then reduced component-wise (one for each forecasting model) using either the … Also, …
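As a rough illustration of reducing samples component-wise, the sketch below collapses the sample axis of each forecasting model's prediction into one value per time step. The reductions available in the actual feature are not named in this comment; `mean` and `median` are used here purely as assumptions, and `reduce_samples` is a hypothetical helper.

```python
from statistics import mean, median

# assumed reduction options, for illustration only
REDUCTIONS = {"mean": mean, "median": median}


def reduce_samples(samples_per_step, reduction="median"):
    """Collapse the sample axis: samples_per_step[t] holds the samples at step t."""
    reduce_fn = REDUCTIONS[reduction]
    return [reduce_fn(step_samples) for step_samples in samples_per_step]


# hypothetical predictions: 2 time steps, 3 samples per step
forecasts = {
    "model_a": [[1.0, 2.0, 6.0], [2.0, 4.0, 9.0]],
    "model_b": [[0.0, 1.0, 2.0], [3.0, 3.0, 6.0]],
}
reduced_median = {name: reduce_samples(s, "median") for name, s in forecasts.items()}
reduced_mean = {name: reduce_samples(s, "mean") for name, s in forecasts.items()}
print(reduced_median)
print(reduced_mean)
```

Each forecasting model ends up contributing a single deterministic series, which is the form a regression model can consume as one feature column per model.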
…8co/darts into feat/probabilistic_ensemble
…el, updated the tests accordingly
dennisbader commented on 2023-05-19T14:46:01Z (via ReviewNB): To make the RegressionEnsembleModel probabilistic, we simply have to use a probabilistic regression model.
Very nice @madtoinou, thanks a lot 🚀 👍
There are one or two points where we could simplify a bit.
Regarding the non-reduction of the samples, I thought more about stacking the points vertically to generate more rows, instead of adding the dimensions as columns.
Maybe we can discuss this again?
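To make the two layouts in this comment concrete, here is a small sketch under stated assumptions: `as_columns` puts every (model, sample) pair in its own column, while `as_rows` stacks the samples vertically so that each sample becomes an extra training row and the target is repeated accordingly. Both helpers and the nested-list layout are hypothetical, not code from the PR.

```python
def as_columns(preds):
    """preds[m][t][s] -> one row per time step, one column per (model, sample)."""
    n_steps = len(preds[0])
    return [[value for model in preds for value in model[t]] for t in range(n_steps)]


def as_rows(preds, target):
    """preds[m][t][s] -> one row per (time step, sample), one column per model.

    The target value of a time step is repeated once for each sample.
    """
    n_steps, n_samples = len(preds[0]), len(preds[0][0])
    features, targets = [], []
    for t in range(n_steps):
        for s in range(n_samples):
            features.append([model[t][s] for model in preds])
            targets.append(target[t])
    return features, targets


preds = [
    [[1.0, 2.0], [3.0, 4.0]],  # hypothetical model A: 2 steps x 2 samples
    [[5.0, 6.0], [7.0, 8.0]],  # hypothetical model B: 2 steps x 2 samples
]
target = [10.0, 20.0]

wide = as_columns(preds)
long_x, long_y = as_rows(preds, target)
print(wide)
print(long_x, long_y)
```

With vertical stacking the number of features stays equal to the number of forecasting models regardless of how many samples are drawn, whereas the column layout grows with the sample count.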
darts/tests/models/forecasting/test_regression_ensemble_model.py
Co-authored-by: Dennis Bader <[email protected]>
… regression cannot generate probabilistic forecast
Nice, thanks a lot ! :)
We're close! Mainly added a couple of minor suggestions. I think we could allow a mix of deterministic and probabilistic forecasting_models for RegressionEnsembleModel, as internally we reduce the probabilistic ones anyway to a deterministic forecast. WDYT?
@@ -26,10 +27,24 @@ class EnsembleModel(GlobalForecastingModel):
        ----------
        models
            List of forecasting models whose predictions to ensemble

            .. note::
                if all the models are probabilistic, the `EnsembleModel` will also be probabilistic.
This is true for naive ensemble but not for RegressionEnsembleModel, right?
Correct, the docstring is different in RegressionEnsembleModel. This note could probably be removed since EnsembleModel cannot be instantiated anyway.
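The distinction discussed in this thread can be summarised in a short sketch: a naive ensemble is treated as probabilistic when all of its forecasting models are, while a regression ensemble depends on its regression model. This reflects the reading of the review comments above, not the library source; `StubModel` and both functions are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class StubModel:
    """Hypothetical stand-in for a forecasting or regression model."""
    name: str
    is_probabilistic: bool


def naive_ensemble_is_probabilistic(forecasting_models):
    # naive ensembling: every forecasting model must be probabilistic
    return all(m.is_probabilistic for m in forecasting_models)


def regression_ensemble_is_probabilistic(forecasting_models, regression_model):
    # regression ensembling: only the regression model decides
    # (forecasting_models is deliberately ignored in this sketch)
    return regression_model.is_probabilistic


mixed = [StubModel("lgbm", True), StubModel("linreg", False)]
quantile_regressor = StubModel("quantile_linreg", True)

print(naive_ensemble_is_probabilistic(mixed))
print(regression_ensemble_is_probabilistic(mixed, quantile_regressor))
```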
@@ -69,6 +92,52 @@ def __init__(
                f"{regression_model.lags}",
            )

        raise_if(
Can we move all those tests to the base class?
Sure, there will be small discrepancies since the names of the arguments/attributes are slightly different between the two classes (regression_ is used as a prefix in RegressionEnsembleModel).
Co-authored-by: Dennis Bader <[email protected]>
Great work @madtoinou, thanks a lot! 🚀
One more test about the mixed proba model and we're good to go!
# forecasting models are a mix of probabilistic and deterministic, probabilistic regressor
ensemble_mixproba = RegressionEnsembleModel(
    forecasting_models=[
        self.get_probabilistic_global_model([-1, -3], quantiles),
can you check also mixed proba model with train_num_samples > 1 and a reduction?
had to move the reduction to _make_multiple_prediction() so that they could be stacked properly
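One way to see why the order matters: a probabilistic model yields several samples per time step and a deterministic one yields a single value, so their raw predictions have different shapes. The sketch below reduces each model's prediction first and only then stacks the results into one column per model. It is an illustrative assumption about the motivation, with a hypothetical `reduce_then_stack` helper and `median` chosen arbitrarily as the reduction.

```python
from statistics import median


def reduce_then_stack(predictions, reduce_fn=median):
    """Reduce each model's samples, then stack the models as columns.

    predictions[m][t] is the list of samples of model m at time step t
    (a single-element list for a deterministic model).
    Returns one row per time step with one column per model.
    """
    reduced = [[reduce_fn(step) for step in model] for model in predictions]
    return [list(row) for row in zip(*reduced)]


probabilistic = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # 2 steps x 3 samples
deterministic = [[10.0], [20.0]]                    # 2 steps x 1 sample
features = reduce_then_stack([probabilistic, deterministic])
print(features)
```

After the reduction every model contributes exactly one value per time step, so the mixed predictions line up into a regular feature matrix.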
Perfect 💯 Great work @madtoinou!
Fixes #1682, fixes #1600.
Summary
The num_samples argument is properly passed to regression_model.predict() in RegressionEnsembleModel.ensemble().
In order to obtain probabilistic predictions, all the models of the RegressionEnsembleModel need to be probabilistic, including its regression_model.
It is, however, still possible to obtain deterministic predictions if the regression_model is not probabilistic but the others are, or by using num_samples=1.
Other Information
The definition of a probabilistic EnsembleModel should be clarified. At the moment, it requires that all its models are probabilistic. However, for a RegressionEnsembleModel, we might also want the regression model itself to be probabilistic? Or, conversely, is a probabilistic regression_model enough to make a RegressionEnsembleModel probabilistic even if all the other models are deterministic?