Add gsutil upload retry helper function #6817
qa/common/util.sh
Outdated
rm -f gcs_upload.gsutil_cp.log
until gsutil -m cp -c -L gcs_upload.gsutil_cp.log -r $local_path $gcs_path; do
    sleep 1
done
Add max retry count?
Probably not, because if the upload still fails after the max retry count, there is nothing to do but fail the test. In that case, it would be better to keep retrying until it times out, which gives the test the best chance of passing.
What is the timeout? If you are referring to the CI timeout, that will be too long, right?
Ok, adding max retry.
If it fails:
+ gcs_upload models/ gs://<hidden>
+ local local_path=models/
+ local gcs_path=gs://<hidden>
+ local retry_left=30
+ local log_file=gcs_upload.gsutil_cp.log
+ rm -f gcs_upload.gsutil_cp.log
+ touch gcs_upload.gsutil_cp.log
+ gsutil -m cp -c -L gcs_upload.gsutil_cp.log -r models/ gs://<hidden>
Copying file://models/dummy [Content-Type=application/octet-stream]...
Error copying file://models/: NotFoundException: 404 The destination bucket gs://<hidden> does not exist or the write to the destination must be restarted
CommandException: 1 file/object could not be transferred.
+ sleep 1
+ (( retry_left-- ))
+ [[ 29 -eq 0 ]]
+ gsutil -m cp -c -L gcs_upload.gsutil_cp.log -r models/ gs://<hidden>
Copying file://models/dummy [Content-Type=application/octet-stream]...
Error copying file://models/: NotFoundException: 404 The destination bucket gs://<hidden> does not exist or the write to the destination must be restarted
CommandException: 1 file/object could not be transferred.
+ sleep 1
+ (( retry_left-- ))
+ [[ 28 -eq 0 ]]
...
+ [[ 0 -eq 0 ]]
+ cat gcs_upload.gsutil_cp.log
... <hidden> ...
+ echo -e '=== Failed upload to GCS bucket'
=== Failed upload to GCS bucket
+ return 121
If it succeeds:
+ gcs_upload models/ gs://<hidden>
+ local local_path=models/
+ local gcs_path=gs://<hidden>
+ local retry_left=30
+ local log_file=gcs_upload.gsutil_cp.log
+ rm -f gcs_upload.gsutil_cp.log
+ touch gcs_upload.gsutil_cp.log
+ gsutil -m cp -c -L gcs_upload.gsutil_cp.log -r models/ gs://<hidden>
Copying file://models/dummy [Content-Type=application/octet-stream]...
/ [0/1 files][ 0.0 B/ 0.0 B]
/ [1/1 files][ 0.0 B/ 0.0 B]
Operation completed over 1 objects.
...
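For readers following along, the two traces above are consistent with a helper shaped roughly like the sketch below. It is reconstructed from the xtrace output only, so treat it as an assumption about the PR's code rather than a copy of it. In particular, the variable quoting and the `return 1` are choices made here; the trace shows `return 121`, and where that value comes from is not visible in this thread.

```shell
#!/bin/bash
# Sketch of a bounded-retry upload helper, inferred from the traces above.
# Assumption: the real gcs_upload in qa/common/util.sh may differ in detail.
gcs_upload() {
    local local_path=$1
    local gcs_path=$2
    local retry_left=30
    local log_file=gcs_upload.gsutil_cp.log

    # Start every upload with a fresh, empty log file.
    rm -f "$log_file"
    touch "$log_file"

    # `until` re-runs the copy for as long as gsutil exits non-zero.
    until gsutil -m cp -c -L "$log_file" -r "$local_path" "$gcs_path"; do
        sleep 1
        (( retry_left-- ))
        if [[ $retry_left -eq 0 ]]; then
            # Out of retries: print the log and report failure to the caller.
            cat "$log_file"
            echo -e '=== Failed upload to GCS bucket'
            return 1  # assumption; the trace above shows 121
        fi
    done
}
```

A caller would then fail the test explicitly when the helper gives up, for example `gcs_upload models/ "$BUCKET_URL" || exit 1`, where `BUCKET_URL` is a hypothetical variable name.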
qa/common/util.sh
Outdated
local local_path=$1
local gcs_path=$2
rm -f gcs_upload.gsutil_cp.log
until gsutil -m cp -c -L gcs_upload.gsutil_cp.log -r $local_path $gcs_path; do
I'm afraid that the -m flag implicitly sets the -c flag.
The issue is that gsutil will still return a non-zero exit code if any error happens:
-c
If an error occurs, continue attempting to copy the remaining files. If any copies are unsuccessful, gsutil's exit status is non-zero, even if this flag is set. This option is implicitly set when running gsutil -m cp....
https://cloud.google.com/storage/docs/gsutil/commands/cp
Maybe we should drop this flag in favor of the command below?
gsutil cp -L gcs_upload.gsutil_cp.log -r $local_path $gcs_path
I think the -c flag is exactly what we want, because the goal is to catch the error in a loop until the upload succeeds (or until a max retry count is reached, which I am looking into).
See the ticket for more details.
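To make the exit-status point concrete, here is a small self-contained sketch. `flaky_copy` is a hypothetical stand-in for the `gsutil -m cp -c ...` call, not a real command: it exits non-zero on its first two calls and zero on the third. The `until` loop only looks at that exit status, which is why a non-zero status from gsutil (with or without `-c`) is what triggers another attempt.

```shell
#!/bin/bash
# flaky_copy is a hypothetical stand-in for the gsutil copy command.
# It fails (non-zero exit status) twice, then succeeds.
attempts=0
flaky_copy() {
    attempts=$(( attempts + 1 ))
    [[ $attempts -ge 3 ]]
}

# The loop body runs only while the command's exit status is non-zero.
until flaky_copy; do
    echo "attempt $attempts failed, retrying"
done
echo "succeeded after $attempts attempts"
```

Running it prints two "failed, retrying" lines followed by `succeeded after 3 attempts`.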