-
Notifications
You must be signed in to change notification settings - Fork 314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retry metric store for more transport errors #538
Retry metric store for more transport errors #538
Conversation
Align Dockerfile with JAVAx_HOME environment variable requirements introduced in: elastic#518
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the improvement! Looks fine in general, I just left a few ideas for you to ponder.
tests/metrics_test.py
Outdated
|
||
# should return on first success | ||
operation = mock.Mock(side_effect=[gateway_timeout, gateway_timeout, "success", gateway_timeout]) | ||
operation = mock.Mock(side_effect=[bad_gateway, bad_gateway, "success", bad_gateway]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe randomize and vary the error type here? I think then we could also reduce the number of individual test cases?
esrally/metrics.py
Outdated
elif e.status_code == 429 and execution_count < max_execution_count: | ||
self.logger.debug("Execution rejected in attempt [%d/%d].", execution_count, max_execution_count) | ||
time.sleep(3) | ||
retriable_responses_with_sleep = {502: 1, 503: 1, 504: 1, 429: 3} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would also be ok if we settle on a single waiting period. Wdyt?
Dockerfile
Outdated
@@ -25,6 +25,7 @@ ENV HOME /home/${NEW_USER} | |||
ENV PYENV_ROOT=/home/${NEW_USER}/.pyenv | |||
ENV PATH=$PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH | |||
ENV JAVA_HOME=/opt/jdk-10 | |||
ENV JAVA10_HOME=/opt/jdk-10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unrelated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this came from this commit 9ddb393 to fix the it tests in Docker. I'll pull it out and submit separately.
and switch to fixed wait time between attempts (3s).
This will be addressed separately outside of the PR.
@danielmitterdorfer Thanks for the review! I changed the logic, as discussed, to use randomized tests and fixed retry periods. Would appreciate another look when you have time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look great. Thanks for the PR! LGTM.
Thank you for the review @danielmitterdorfer ! |
Improve the resiliency of Rally when using an Elasticsearch remote metrics store by retrying certain transport errors.
Also enhance existing tests to check that retries/debug output and sleep are executed properly.