[BUG] Error in running model.predict with RegressionEnsembleModel #1340

TheNumbersAI · 2022-11-06T20:45:52Z

First of all, I really love the DARTS software! Magnificent! But I think there is an issue with RegressionEnsembleModel. Any help or guidance would be appreciated.

Describe the bug
When using past_covariates (and no future_covariates) with a RegressionEnsembleModel, I get an error such as this:

ERROR: ValueError: The corresponding future_covariate of the series at index 0 isn't sufficiently long. Given horizon n=1, min(lags_future_covariates)=0, max(lags_future_covariates)=0 and output_chunk_length=1
the future_covariate has to range from 2022-10-03 00:00:00 until 2022-10-03 00:00:00 (inclusive), but it ranges only from 2022-10-10 00:00:00 until 2022-10-10 00:00:00.

To Reproduce
Here is a code snippet. Uploading sample data is difficult, so I've simplified the code and removed some of the training details.

series is time series data
train_transformed is time series data sampled from start = 0.5
past_covariates is also time series data
both series and past_covariates have the same range, and are transformed via a scaler per darts documentation

window = 5
lags = [-1, -2, -5]
my_model1 = LinearRegressionModel(lags=lags, output_chunk_length=window, lags_past_covariates=lags)
my_model2 = LightGBMModel(lags=lags, output_chunk_length=window, lags_past_covariates=lags)
my_model1.fit(train_transformed, past_covariates=past_covariates)
my_model2.fit(train_transformed, past_covariates=past_covariates)
# Strangely, when I set regression_train_n_points to "window", I get an error during backtesting or prediction; doubling it at least bypasses the backtesting error.
my_ensemble_model = RegressionEnsembleModel([my_model1, my_model2], regression_train_n_points=2 * window)
backtest = my_ensemble_model.historical_forecasts(series, start=0.5, last_points_only=True, forecast_horizon=window, stride=1, verbose=True, past_covariates=past_covariates)
prediction = my_ensemble_model.predict(n=window, series=series, past_covariates=past_covariates)

Expected behavior
I expect to get a time series of prediction data, as with any other DARTS model. The underlying components of the RegressionEnsembleModel (LinearRegressionModel and LightGBMModel) work as I have defined them, but the ensemble raises the ValueError above. Trying different values for n in predict gives the same error.

System (please complete the following information):

Python 3.10.6
darts==0.22.0

Additional context
I have no problem getting the NaiveEnsembleModel to work with the same parameters when I instantiate it and use it to make a prediction. Only RegressionEnsembleModel fails. I'm stumped!

Detailed stack trace:

2022-11-06 13:26:54 main_logger ERROR: ValueError: The corresponding future_covariate of the series at index 0 isn't sufficiently long. Given horizon `n=1`, `min(lags_future_covariates)=0`, `max(lags_future_covariates)=0` and `output_chunk_length=1`
the future_covariate has to range from 2022-10-03 00:00:00 until 2022-10-03 00:00:00 (inclusive), but it ranges only from 2022-10-10 00:00:00 until 2022-10-10 00:00:00.

ValueError Traceback (most recent call last)
Cell In [36], line 1
----> 1 prediction = model.predict(n=1, series=series, past_covariates=all_covariates) # still need to set n=window
2 o_prediction = scaler.inverse_transform(prediction)

File ~/Documents/beachhome/lib/python3.10/site-packages/darts/models/forecasting/ensemble_model.py:172, in EnsembleModel.predict(self, n, series, past_covariates, future_covariates, num_samples)
163 predictions = self._make_multiple_predictions(
164 n=n,
165 series=series,
(...)
168 num_samples=num_samples,
169 )
171 if self.is_single_series:
--> 172 return self.ensemble(predictions)
173 else:
174 return self.ensemble(predictions, series)

File ~/Documents/beachhome/lib/python3.10/site-packages/darts/models/forecasting/regression_ensemble_model.py:161, in RegressionEnsembleModel.ensemble(self, predictions, series)
158 predictions = [predictions]
159 series = [series]
--> 161 ensembled = [
162 self.regression_model.predict(
163 n=len(prediction), series=serie, future_covariates=prediction
164 )
165 for serie, prediction in zip(series, predictions)
166 ]
168 return ensembled[0] if self.is_single_series else ensembled

File ~/Documents/beachhome/lib/python3.10/site-packages/darts/models/forecasting/regression_ensemble_model.py:162, in <listcomp>(.0)
158 predictions = [predictions]
159 series = [series]
161 ensembled = [
--> 162 self.regression_model.predict(
163 n=len(prediction), series=serie, future_covariates=prediction
164 )
165 for serie, prediction in zip(series, predictions)
166 ]
168 return ensembled[0] if self.is_single_series else ensembled

File ~/Documents/beachhome/lib/python3.10/site-packages/darts/models/forecasting/regression_model.py:553, in RegressionModel.predict(self, n, series, past_covariates, future_covariates, num_samples, **kwargs)
550 last_req_ts = last_pred_ts + lags[-1] * ts.freq
552 # check for sufficient covariate data
--> 553 raise_if_not(
554 cov.start_time() <= first_req_ts
555 and cov.end_time() >= last_req_ts,
556 f"The corresponding {cov_type}_covariate of the series at index {idx} isn't sufficiently long. "
557 f"Given horizon n={n}, min(lags_{cov_type}_covariates)={lags[0]}, "
558 f"max(lags_{cov_type}_covariates)={lags[-1]} and "
559 f"output_chunk_length={self.output_chunk_length}\n"
560 f"the {cov_type}_covariate has to range from {first_req_ts} until {last_req_ts} (inclusive), "
561 f"but it ranges only from {cov.start_time()} until {cov.end_time()}.",
562 )
564 # Note: we use slice() rather than the [] operator because
565 # for integer-indexed series [] does not act on the time index.
566 last_req_ts = (
567 # For range indexes, we need to make the end timestamp inclusive here
568 last_req_ts + ts.freq
569 if ts.has_range_index
570 else last_req_ts
571 )

File ~/Documents/beachhome/lib/python3.10/site-packages/darts/logging.py:78, in raise_if_not(condition, message, logger)
76 if not condition:
77 logger.error("ValueError: " + message)
---> 78 raise ValueError(message)

ValueError: The corresponding future_covariate of the series at index 0 isn't sufficiently long. Given horizon n=1, min(lags_future_covariates)=0, max(lags_future_covariates)=0 and output_chunk_length=1
the future_covariate has to range from 2022-10-03 00:00:00 until 2022-10-03 00:00:00 (inclusive), but it ranges only from 2022-10-10 00:00:00 until 2022-10-10 00:00:00.


dennisbader · 2022-11-13T10:56:51Z

Hey @Jason-Merkoski, and thanks for raising this issue.
It is indeed a bug where we reused the training series in RegressionEnsembleModel.predict() if only a single target series was used at fitting time. I will work on a fix.
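
To illustrate how reusing the training series can produce the date mismatch in the error message, here is a minimal standard-library sketch. It is not the darts implementation. It assumes weekly data (inferred from the 7-day gap in the error), hypothetical series end dates chosen to match the reported timestamps, and a simplified required-range rule modeled on the `last_req_ts = last_pred_ts + lags[-1] * ts.freq` line in the traceback, valid for `output_chunk_length=1`. The covariate being checked stands for the sub-models' forecast.

```python
from datetime import datetime, timedelta

FREQ = timedelta(days=7)  # assumption: weekly frequency


def required_range(series_end, n, lags, freq=FREQ):
    """Covariate span needed to predict n steps after series_end.

    Simplified for output_chunk_length=1; the first_req_ts rule is an
    assumption mirroring the last_req_ts line shown in the traceback.
    """
    first_pred_ts = series_end + freq
    last_pred_ts = series_end + n * freq
    return first_pred_ts + lags[0] * freq, last_pred_ts + lags[-1] * freq


def covers(cov_start, cov_end, first_req_ts, last_req_ts):
    """Same condition as the raise_if_not check in the traceback."""
    return cov_start <= first_req_ts and cov_end >= last_req_ts


train_end = datetime(2022, 9, 26)    # hypothetical end of the training series
predict_end = datetime(2022, 10, 3)  # hypothetical end of the series passed to predict()

# The sub-models forecast one step past the series passed to predict().
forecast_start = forecast_end = predict_end + FREQ  # 2022-10-10

from_training_series = required_range(train_end, n=1, lags=[0])
from_passed_series = required_range(predict_end, n=1, lags=[0])

print(covers(forecast_start, forecast_end, *from_training_series))  # False
print(covers(forecast_start, forecast_end, *from_passed_series))    # True
```

Under these assumptions, anchoring the required range on the training series asks for 2022-10-03, one step before the forecast that is actually available, which matches the reported message.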

The future covariates mentioned in the error are specific to the ensemble models: we use the outputs of your two sub-models as future covariates.
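
As a rough illustration of what using sub-model outputs as covariates means, the sketch below fits weights that combine two forecasts into one by least squares. It is a conceptual stand-in written with the standard library only, not the darts code: the function names, the no-intercept linear combiner, and the toy numbers are all assumptions made for the example.

```python
def fit_combination_weights(forecast_a, forecast_b, target):
    """Least-squares weights (no intercept) for target ~ w_a * a + w_b * b.

    Solves the 2x2 normal equations directly.
    """
    s_aa = sum(a * a for a in forecast_a)
    s_bb = sum(b * b for b in forecast_b)
    s_ab = sum(a * b for a, b in zip(forecast_a, forecast_b))
    s_ay = sum(a * y for a, y in zip(forecast_a, target))
    s_by = sum(b * y for b, y in zip(forecast_b, target))
    det = s_aa * s_bb - s_ab * s_ab
    if det == 0:
        raise ValueError("forecasts are collinear; weights are not unique")
    w_a = (s_ay * s_bb - s_ab * s_by) / det
    w_b = (s_aa * s_by - s_ab * s_ay) / det
    return w_a, w_b


def combine(forecast_a, forecast_b, weights):
    """Apply the fitted weights to new sub-model forecasts."""
    w_a, w_b = weights
    return [w_a * a + w_b * b for a, b in zip(forecast_a, forecast_b)]


# Toy data: the sub-model forecasts play the role of covariates,
# and the target is built as a known mix of the two.
forecast_a = [1.0, 2.0, 3.0, 4.0]
forecast_b = [2.0, 1.0, 4.0, 3.0]
target = [0.25 * a + 0.75 * b for a, b in zip(forecast_a, forecast_b)]

weights = fit_combination_weights(forecast_a, forecast_b, target)
ensembled = combine([5.0], [6.0], weights)
print(weights, ensembled)
```

Because the combiner consumes forecasts that cover the prediction horizon, those forecasts must span the same time steps the combiner is asked to predict, which is where the range check in the error comes from.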

TheNumbersAI · 2022-11-14T17:07:27Z

@dennisbader could you reopen this and take a look? I deleted my darts installation, purged the pip cache, and rebuilt from scratch to pull the changes from "main", but I still cannot use historical_forecasts or predict. Maybe I'm missing something? My code is unchanged from above.

File "/lib/python3.10/site-packages/darts/utils/utils.py", line 172, in sanitized_method
return method_to_sanitize(self, *only_args.values(), **only_kwargs)
File "/lib/python3.10/site-packages/darts/models/forecasting/forecasting_model.py", line 500, in historical_forecasts
forecast = self._predict_wrapper(
File "/lib/python3.10/site-packages/darts/models/forecasting/forecasting_model.py", line 1228, in _predict_wrapper
return self.predict(
File "/lib/python3.10/site-packages/darts/models/forecasting/ensemble_model.py", line 172, in predict
return self.ensemble(predictions)
File "/lib/python3.10/site-packages/darts/models/forecasting/regression_ensemble_model.py", line 161, in ensemble
ensembled = [
File "/lib/python3.10/site-packages/darts/models/forecasting/regression_ensemble_model.py", line 162, in <listcomp>
self.regression_model.predict(
File "/lib/python3.10/site-packages/darts/models/forecasting/regression_model.py", line 613, in predict
covariate_matrices[cov_type][
IndexError: index 1 is out of bounds for axis 1 with size 1

dennisbader · 2022-11-14T18:03:06Z

It works on my end.
I wonder why this line is actually not crashing for you:

my_ensemble_model = RegressionEnsembleModel([my_model1, my_model2], regression_train_n_points=2 * window)

According to your example, you fit my_model1 and my_model2 before creating the ensemble model. This should raise an error like the following:

ValueError: Cannot instantiate EnsembleModel with trained/fitted models. Consider resetting all models with `my_model.untrained_model()`

I get your code working with some dummy timeseries:

from darts.models import LinearRegressionModel, LightGBMModel, RegressionEnsembleModel
from darts.utils.timeseries_generation import linear_timeseries
from darts.dataprocessing.transformers import Scaler

scaler = Scaler()
series = linear_timeseries(length=100)

ts_train, ts_val = series[:50], series[50:]
train_transformed = scaler.fit_transform(ts_train)

past_covariates = series

window = 5
lags = [-1, -2, -5]

my_model1 = LinearRegressionModel(lags=lags, output_chunk_length=window, lags_past_covariates=lags)
my_model2 = LightGBMModel(lags=lags, output_chunk_length=window, lags_past_covariates=lags)

# do not fit the models before creating the ensemble models
my_ensemble_model = RegressionEnsembleModel([my_model1, my_model2], regression_train_n_points=2*window)
backtest = my_ensemble_model.historical_forecasts(
    series,
    start=0.5,
    last_points_only=True,
    forecast_horizon=window,
    stride=1,
    verbose=True,
    past_covariates=past_covariates
)

prediction = my_ensemble_model.predict(n=window, series=series, past_covariates=past_covariates)

TheNumbersAI · 2022-11-14T22:03:14Z

Thanks for the speedy response @dennisbader !

As for the fitting of each input model, my sample code for reproducing the issue missed that part; in my production code I do use a construct like models = [m.untrained_model() for m in models] as inputs.

The fix works now, much obliged, and I look forward to using your great software!

TheNumbersAI added the bug (Something isn't working) and triage (Issue waiting for triaging) labels Nov 6, 2022

dennisbader mentioned this issue Nov 13, 2022

Fix/ensemble predict with series #1357

Merged

dennisbader closed this as completed in #1357 Nov 13, 2022
