GC fails when manifest not found #15822

Closed
dkulchinsky opened this issue Oct 19, 2021 · 3 comments · Fixed by #16094

@dkulchinsky
Contributor

dkulchinsky commented Oct 19, 2021

Expected behavior and actual behavior:
GC bails out when it receives a 404 "MANIFEST_UNKNOWN" from the registry. I expect the GC job to log this and continue to the next manifest/blob rather than fail entirely.

The log message also suggests that there was a retry, which doesn't seem to make sense if the error code is a 404.

2021-10-19T15:57:45Z [INFO] [/jobservice/job/impl/gc/garbage_collection.go:261]: delete the manifest with registry v2 API: <project>/<repo>/releasectl, application/vnd.docker.distribution.manifest.v2+json, sha256:e92fd04beb144a930978f41c34d9af463d262890b0d6138fa1f5825ff565e409
2021-10-19T15:58:46Z [ERROR] [/jobservice/job/impl/gc/garbage_collection.go:264]: failed to delete manifest with v2 API, <project>/<repo>/releasectl, sha256:e92fd04beb144a930978f41c34d9af463d262890b0d6138fa1f5825ff565e409, retry timeout: http status code: 404, body: {"errors":[{"code":"MANIFEST_UNKNOWN","message":"manifest unknown"}]}
2021-10-19T15:58:46Z [ERROR] [/jobservice/job/impl/gc/garbage_collection.go:166]: failed to execute GC job at sweep phase, error: failed to delete manifest with v2 API: <project>/<repo>/releasectl, sha256:e92fd04beb144a930978f41c34d9af463d262890b0d6138fa1f5825ff565e409: retry timeout: http status code: 404, body: {"errors":[{"code":"MANIFEST_UNKNOWN","message":"manifest unknown"}]}

registry logs:

time="2021-10-19T15:58:45.52245501Z" level=info msg="authorized request" go.version=go1.15.12 http.request.host="harbor-registry:5000" http.request.id=cf340697-f870-4145-b45f-21427c32b3e0 http.request.method=DELETE http.request.remoteaddr="127.0.0.1:32952" http.request.uri="/v2/<project>/<repo>/releasectl/manifests/sha256:e92fd04beb144a930978f41c34d9af463d262890b0d6138fa1f5825ff565e409" http.request.useragent=harbor-registry-client vars.name="<project>/<repo>/releasectl" vars.reference="sha256:e92fd04beb144a930978f41c34d9af463d262890b0d6138fa1f5825ff565e409"
time="2021-10-19T15:58:45.597635172Z" level=error msg="response completed with error" auth.user.name="harbor_registry_user" err.code="manifest unknown" err.message="manifest unknown" go.version=go1.15.12 http.request.host="harbor-registry:5000" http.request.id=cf340697-f870-4145-b45f-21427c32b3e0 http.request.method=DELETE http.request.remoteaddr="127.0.0.1:32952" http.request.uri="/v2/<project>/<repo>/releasectl/manifests/sha256:e92fd04beb144a930978f41c34d9af463d262890b0d6138fa1f5825ff565e409" http.request.useragent=harbor-registry-client http.response.contenttype="application/json; charset=utf-8" http.response.duration=141.276028ms http.response.status=404 http.response.written=70 vars.name="<project>/<repo>/releasectl" vars.reference="sha256:e92fd04beb144a930978f41c34d9af463d262890b0d6138fa1f5825ff565e409" 

This issue started after upgrading from v2.3.2 to v2.3.3, as far as we can tell.

Steps to reproduce the problem:
I'm not sure exactly how we ended up in this situation. In v2.3.2 we had occasional GC failures, but the logs would either show no errors or show a failure due to a retry timeout. After upgrading to v2.3.3 we started hitting this issue.

Versions:
Please specify the versions of the following systems.

  • harbor version: v2.3.3
  • docker engine version: N/A
  • docker-compose version: N/A

Additional context:

N/A

@dkulchinsky
Contributor Author

@heww any chance GC can skip 404 (MANIFEST_UNKNOWN) errors instead of failing the GC entirely? With v2.3.3 we never seem to be able to complete a GC run and have to keep re-firing it manually several times a day.

@dkulchinsky
Contributor Author

@wy65701436 any thoughts on this? 🙏🏼

Our GC executions fail all the time when hitting this 404, and the number of blobs/manifests awaiting GC is growing rapidly. I'm very concerned about hitting the performance issues described in #15807 and #12948.

@Vad1mo
Member

Vad1mo commented Nov 3, 2021

related #15469
