
return metric score along with untrained best models & params #822

Merged

Conversation

khanetor
Contributor

Fixes #776.

Summary

The gridsearch method now also returns the best metric score (i.e. the minimum error) along with the best model and parameters.
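
Below is a minimal sketch of how the updated return value could be used. The dataset, model, and parameter grid are illustrative assumptions and are not taken from the PR; the parameter names follow the darts gridsearch API at the time.

```python
from darts.datasets import AirPassengersDataset
from darts.models import Theta

series = AirPassengersDataset().load()
train, val = series[:-36], series[-36:]

# After this change, gridsearch is assumed to return a 3-tuple instead of a 2-tuple:
# (best untrained model, best parameters, best metric score).
best_model, best_params, best_score = Theta.gridsearch(
    parameters={"theta": [1, 2, 3]},
    series=train,
    val_series=val,
)
print(best_params, best_score)
```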

@codecov-commenter

codecov-commenter commented Feb 26, 2022

Codecov Report

Merging #822 (b48a8e6) into master (7ca7801) will not change coverage.
The diff coverage is 100.00%.


@@           Coverage Diff           @@
##           master     #822   +/-   ##
=======================================
  Coverage   91.33%   91.33%           
=======================================
  Files          69       69           
  Lines        6867     6867           
=======================================
  Hits         6272     6272           
  Misses        595      595           
Impacted Files                                   Coverage Δ
darts/models/forecasting/forecasting_model.py    96.66% <100.00%> (ø)


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@hrzn
Contributor

hrzn left a comment

This is a breaking change, but I think we can do it. The other option would be to introduce a parameter in the function to optionally return the best metric, but that would inflate the signature. So I think we could merge this change. Could you maybe just add a short unit test in test_backtesting.py that verifies the actual value being returned?

@khanetor
Contributor Author

> This is a breaking change, but I think we can do it. The other option would be to introduce a parameter in the function to optionally return the best metric, but that would inflate the signature. So I think we could merge this change. Could you maybe just add a short unit test in test_backtesting.py that verifies the actual value being returned?

There is a function compare_best_against_random in test_backtesting.py that compares the scores of the best params against random params. I was thinking that instead of recalculating best_score_1 and best_score_2, I could use the score returned by the new gridsearch function. Would that be a good approach in this case? One concern I have is that the gridsearch for best_model_1 uses a stride, while the backtest that computes best_score_1 uses the default stride, so the scores might differ. Do you think we should use the same stride in the subsequent backtest?

Another approach is to leave this as-is and create another test where I also manually backtest the best model and compare its score with the one returned by the gridsearch function.

@khanetor
Contributor Author

Out of curiosity: some tests are marked skipUnless TORCH_AVAILABLE, but I don't recall models such as RandomForest from sklearn requiring PyTorch.

@hrzn
Contributor

hrzn commented Feb 28, 2022

> This is a breaking change, but I think we can do it. The other option would be to introduce a parameter in the function to optionally return the best metric, but that would inflate the signature. So I think we could merge this change. Could you maybe just add a short unit test in test_backtesting.py that verifies the actual value being returned?

> There is a function compare_best_against_random in test_backtesting.py that compares the scores of the best params against random params. I was thinking that instead of recalculating best_score_1 and best_score_2, I could use the score returned by the new gridsearch function. Would that be a good approach in this case? One concern I have is that the gridsearch for best_model_1 uses a stride, while the backtest that computes best_score_1 uses the default stride, so the scores might differ. Do you think we should use the same stride in the subsequent backtest?

> Another approach is to leave this as-is and create another test where I also manually backtest the best model and compare its score with the one returned by the gridsearch function.

I think in case of doubt it is better to create a new unit test dedicated to testing this new functionality. My only concern is that ideally it shouldn't be too long-running; so maybe take a small series (e.g. air passengers), a model that is very quick to fit (e.g. Theta), and not too large a parameter space.
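
A minimal sketch of what such a dedicated test could look like, under the assumption that gridsearch now returns (best model, best params, best score) and using backtest parameter names from the darts API at the time; this is not the exact test added in the PR:

```python
from darts.datasets import AirPassengersDataset
from darts.metrics import mape
from darts.models import Theta


def test_gridsearch_metric_score():
    series = AirPassengersDataset().load()

    # Small series, quick-to-fit model, small parameter grid.
    best_model, best_params, best_score = Theta.gridsearch(
        parameters={"theta": [1, 2, 3]},
        series=series,
        forecast_horizon=12,
        stride=12,
        start=0.7,
        metric=mape,
    )

    # Re-run the same historical-forecast backtest on the best model and check
    # that it reproduces the score returned by gridsearch.
    recomputed_score = best_model.backtest(
        series=series,
        forecast_horizon=12,
        stride=12,
        start=0.7,
        metric=mape,
    )
    assert best_score == recomputed_score
```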

@hrzn
Contributor

hrzn commented Feb 28, 2022

> Out of curiosity: some tests are marked skipUnless TORCH_AVAILABLE, but I don't recall models such as RandomForest from sklearn requiring PyTorch.

RandomForest does not depend on PyTorch, but it used to (we used to rely on PyTorch Datasets to train these models, but that is no longer the case). It might be that we forgot to remove a couple of skipUnless clauses from the unit tests. If you find some and you are confident there are no PyTorch-based models involved, you can remove them in this PR and I'll try testing (it will be tested by our CI pipelines upon merging anyway).
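
For context, the guard pattern being discussed looks roughly like this (illustrative only, not the exact darts test code):

```python
import unittest

try:
    import torch  # noqa: F401

    TORCH_AVAILABLE = True
except ImportError:
    TORCH_AVAILABLE = False


class BacktestingTestCase(unittest.TestCase):
    # Tests guarded like this are skipped whenever PyTorch is not installed,
    # even if the models they exercise (e.g. RandomForest) no longer need it.
    @unittest.skipUnless(TORCH_AVAILABLE, "requires PyTorch")
    def test_torch_dependent_case(self):
        ...
```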

@khanetor khanetor requested a review from hrzn February 28, 2022 08:45
@khanetor
Contributor Author

khanetor commented Mar 1, 2022

Hello @hrzn. Anything else I should add? 😄

@hrzn
Contributor

hrzn left a comment

Thanks!

@hrzn
Contributor

hrzn commented Mar 1, 2022

No, it's good now, thanks for adding the test! I'll merge once the tests are done, and it will be released with the next version of Darts. Thanks!

@hrzn hrzn merged commit cb80cf3 into unit8co:master Mar 1, 2022
@khanetor khanetor deleted the feat/gridsearch-returns-backtest-score branch May 13, 2022 13:39
Development

Successfully merging this pull request may close these issues:

Gridsearch should return metric score along with the best untrained model and parameters. (#776)