Storage gets corrupted after podman pull is killed #14003
Comments
@mtrmac would your fixes in c/storage also address this issue?
The issue description links to containers/storage#1136, and seems consistent with that at a short glance (I didn’t try to reproduce). The two PRs that are waiting for review target inconsistent overlay driver state, and don’t fix this locking issue.
A friendly reminder that this issue had no activity for 30 days.
Since those two PRs were merged and podman has revendored storage, I am assuming this is fixed. Reopen if I am mistaken.
I'm hitting this also; rebooting did not fix the issue. Update:
A friendly reminder that this issue had no activity for 30 days.
@mtrmac Should this still be open?
AFAIK, containers/storage#1136 is still outstanding (and I’m not working on it). As for whether Podman needs to track this separately from the c/storage issue, I don’t have a strong opinion.
OK, I am going to close this issue, and we can follow it in containers/storage.
In order to avoid a podman issue [1] causing layer corruption when an image pull is killed midway, let's move the image pull outside of the timeout command. The timeout was recently reduced to 20 seconds with [2], making the issue more likely to happen. [1] containers/podman#14003 [2] openshift#3271
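A minimal sketch of the workaround that commit message describes, assuming a provisioning script that previously ran the pull under `timeout`; the image name and the follow-up `podman run` step are illustrative, not from the source:

```bash
# Hypothetical image reference, for illustration only.
IMAGE=registry.example.com/some/image:latest

# Before: timeout may SIGKILL podman mid-pull, leaving an incomplete layer
# behind and corrupting storage (containers/podman#14003).
#timeout 20 sh -c "podman pull $IMAGE && podman run --rm $IMAGE true"

# After: pull outside the timeout so it can never be killed mid-layer,
# and keep the 20-second timeout only around the interruptible step.
podman pull "$IMAGE"
timeout 20 podman run --rm "$IMAGE" true
```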
I am experiencing this bug on CentOS Stream 9. Is there a way to fix my podman host without wiping everything?
Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug
Description
Podman storage gets corrupted if Podman is killed when a layer is incomplete.
Steps to reproduce the issue:
1. `podman pull gcr.io/tensorflow-testing/nosla-cuda11.2-cudnn8.1-ubuntu18.04-manylinux2010-multipython@sha256:5102e2651975df6c131c4f0cb22454b81d509a7be2a3d98351a876d3f85ef2b8`
2. Kill the pull process when one of the layers is incomplete, by monitoring /var/lib/containers/storage/overlay-layers/layers.json (a reproduction sketch follows).
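A rough reproduction script for the two steps above. It assumes that in-progress layers show up in layers.json with an "incomplete" marker; that detail of the on-disk format is an assumption, so adjust the grep if it differs:

```bash
#!/usr/bin/env bash
# Sketch: kill podman pull while a layer is still incomplete.
LAYERS=/var/lib/containers/storage/overlay-layers/layers.json

podman pull gcr.io/tensorflow-testing/nosla-cuda11.2-cudnn8.1-ubuntu18.04-manylinux2010-multipython@sha256:5102e2651975df6c131c4f0cb22454b81d509a7be2a3d98351a876d3f85ef2b8 &
pid=$!

# Poll until an incomplete layer is recorded, then kill the pull hard.
until grep -q incomplete "$LAYERS" 2>/dev/null; do
  sleep 0.2
done
kill -9 "$pid"
wait "$pid" 2>/dev/null || true
```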
Describe the results you received:
All three instances of podman pull returned the following error:
Note: since at this point the image is not downloaded yet, running `podman rmi` will not recover; only `podman system reset` helps.

In addition, if after step 2 (killing podman pull when a layer is incomplete) I run only one podman pull, that pull will succeed. However, `podman image inspect` will then fail for this image with the error `Error: layer not known`, and `podman run` will also fail with a readlink error as described in containers/storage#1136.
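A short illustration of that recovery behavior, with `$IMAGE` standing in for the digest-pinned reference from step 1:

```bash
# After the interrupted pull, per-image cleanup does not recover:
podman rmi "$IMAGE"     # does not help; the image was never fully recorded
# Only a full reset recovers, at the cost of wiping ALL local storage:
podman system reset
```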
Describe the results you expected:
I expected the following podman pulls to succeed.
Additional information you deem important (e.g. issue happens only occasionally):
Output of `podman version`:

Output of `podman info --debug`:

Package info (e.g. output of `rpm -q podman` or `apt list podman`):

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)
Yes
Additional environment details (AWS, VirtualBox, physical, etc.):