
Backup PartiallyFailed #3087

Closed

jravetch opened this issue Nov 17, 2020 · 6 comments

@jravetch

What steps did you take and what happened:
[A clear and concise description of what the bug is, and what commands you ran.]
Backup PartiallyFailed, failing to back up secrets.

What did you expect to happen:
Full backup

The output of the following commands will help us better understand what's going on:
(Pasting long output into a GitHub gist or other pastebin is fine.)

  • velero backup logs velero-schedule-config-20201117120023 | grep -v 'level=info'

time="2020-11-17T12:01:42Z" level=error msg="Error listing items" backup=velero/velero-schedule-config-20201117120023 error="stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 44029; INTERNAL_ERROR" error.file="/go/src/github.com/vmware-tanzu/velero/pkg/backup/item_collector.go:294" error.function="github.com/vmware-tanzu/velero/pkg/backup.(*itemCollector).getResourceItems" group=v1 logSource="pkg/backup/item_collector.go:294" namespace= resource=secrets

Environment:

  • Velero version (use velero version): 1.5.1
  • Velero features (use velero client config get features): gcp plugin 1.1.0
  • Kubernetes version (use kubectl version): 1.16.13
  • Kubernetes installer & version: GKE
  • Cloud provider or hardware configuration: GKE
  • OS (e.g. from /etc/os-release): COS

Vote on this issue!

This is an invitation to the Velero community to vote on issues; you can see the project's top-voted issues listed here.
Use the "reaction smiley face" at the top right of this comment to vote.

  • 👍 for "I would like to see this bug fixed as soon as possible"
  • 👎 for "There are more important bugs to focus on right now"
@ashish-amarnath
Member

@jravetch The error is coming from the API server when Velero tries to list all the secrets.
Can you please confirm that your Kubernetes cluster is functional and that the API server is running and available?
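
For anyone trying to reproduce this outside Velero, the failing call is essentially a cluster-wide, unpaginated LIST of secrets. Below is a minimal client-go sketch of such a call (illustrative only, not Velero's actual code; it assumes a kubeconfig at the default path):

package main

import (
	"context"
	"fmt"
	"log"
	"path/filepath"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/util/homedir"
)

func main() {
	// Assumes a kubeconfig at the default location (adjust as needed).
	kubeconfig := filepath.Join(homedir.HomeDir(), ".kube", "config")
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		log.Fatal(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		log.Fatal(err)
	}
	// One unpaginated LIST across all namespaces: with thousands of
	// secrets this is a single very large response body, the kind of
	// read that can fail mid-stream with an HTTP/2 INTERNAL_ERROR.
	secrets, err := clientset.CoreV1().Secrets(metav1.NamespaceAll).List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("listed %d secrets\n", len(secrets.Items))
}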

@ashish-amarnath ashish-amarnath added the Needs info Waiting for information label Nov 18, 2020
@jravetch
Author

Yes, this cluster is functioning normally. It is a large cluster with over 100 nodes. Velero, and previously Ark, ran in this cluster without issue. Velero was also upgraded from version 1.0 to 1.5.

@ashish-amarnath
Member

@jravetch What version of Kubernetes are you running? Also, I found googleapis/google-cloud-go#784 reporting a similar error coming from the API server in a GKE cluster. Is this happening consistently?

@jravetch
Author

jravetch commented Nov 18, 2020

We are running 1.16.13-gke.401. Not sure how that issue is related, as I don't see GKE mentioned and it was closed over two years ago. We are not seeing this error outside of the Velero backup logs.

@jravetch
Author

I am also able to list all secrets via kubectl:

kubectl get secrets --all-namespaces | wc -l
4184
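
Note that kubectl get paginates large lists by default (--chunk-size defaults to 500), so a successful kubectl listing does not necessarily exercise the same single large LIST response that an unpaginated client issues.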

@carlisia carlisia added Info Received Needs investigation and removed Needs info Waiting for information labels Nov 30, 2020
@jravetch
Author

jravetch commented Dec 2, 2020

After cleaning up 3k+ unused secrets in the affected cluster, Velero was able to list and back up the secrets without issue. @ashish-amarnath are you aware of any limitation when backing up a large number of secrets?
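
A plausible explanation for why the cleanup helped: a cluster-wide LIST of ~4k secrets returns everything in one response, while chunked listing bounds the size of each response. Here is a paged variant of the earlier sketch, using the API server's chunking support via ListOptions.Limit and Continue (same imports as the sketch above; illustrative only, not Velero's actual code):

// listSecretsPaged issues the same LIST in fixed-size pages. A page
// size of 500 matches kubectl's default chunk size.
func listSecretsPaged(ctx context.Context, clientset *kubernetes.Clientset) (int, error) {
	total := 0
	opts := metav1.ListOptions{Limit: 500}
	for {
		page, err := clientset.CoreV1().Secrets(metav1.NamespaceAll).List(ctx, opts)
		if err != nil {
			return total, err
		}
		total += len(page.Items)
		if page.Continue == "" {
			// No more pages.
			return total, nil
		}
		opts.Continue = page.Continue
	}
}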

@vmware-tanzu vmware-tanzu locked and limited conversation to collaborators Jan 13, 2021

This issue was moved to a discussion.

You can continue the conversation there.
