SubPath unmount fails with "directory not empty" after smb-server Service IP changes #222

Closed
drigz opened this issue Feb 3, 2021 · 3 comments · Fixed by #268
Labels
kind/bug Categorizes issue or PR as related to a bug.

drigz commented Feb 3, 2021

What happened:

After the smb-server Service IP changes, I am unable to delete pods that have mounted smb volumes. The pods get stuck in "Terminating", and the kubelet logs contain:

Feb 03 18:40:58  kubelet[862425]: W0203 18:40:58.568146  862425 mount_helper_common.go:33] Warning: Unmount skipped because path does not exist: /var/lib/kubelet/pods/ddc7ea95-c27f-4b3c-9136-cb6acea5f0f6/volume-subpaths/volume/volume/1
Feb 03 18:41:19  kubelet[862425]: W0203 18:41:19.048156  862425 mount_helper_common.go:33] Warning: Unmount skipped because path does not exist: /var/lib/kubelet/pods/ddc7ea95-c27f-4b3c-9136-cb6acea5f0f6/volume-subpaths/volume/volume/2
Feb 03 18:41:19  kubelet[862425]: E0203 18:41:19.048366  862425 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/smb.csi.k8s.io^volume podName:ddc7ea95-c27f-4b3c-9136-cb6acea5f0f6 nodeName:}" failed. No retries permitted until 2021-02-03 18:43:21.048305415 +0100 CET m=+12623.734651277 (durationBeforeRetry 2m2s). Error: "error cleaning subPath mounts for volume \"volume\" (UniqueName: \"kubernetes.io/csi/smb.csi.k8s.io^volume\") pod \"ddc7ea95-c27f-4b3c-9136-cb6acea5f0f6\" (UID: \"ddc7ea95-c27f-4b3c-9136-cb6acea5f0f6\") : error deleting /var/lib/kubelet/pods/ddc7ea95-c27f-4b3c-9136-cb6acea5f0f6/volume-subpaths/volume/volume: remove /var/lib/kubelet/pods/ddc7ea95-c27f-4b3c-9136-cb6acea5f0f6/volume-subpaths/volume/volume: directory not empty"

The "path does not exist" warning is misleading. The path exists, but accessing it fails with "Host is down":

$ sudo ls /var/lib/kubelet/pods/ddc7ea95-c27f-4b3c-9136-cb6acea5f0f6/volume-subpaths/volume/volume/1
ls: cannot access '/var/lib/kubelet/pods/ddc7ea95-c27f-4b3c-9136-cb6acea5f0f6/volume-subpaths/volume/volume/1': Host is down
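
To illustrate the distinction, here is a minimal Python sketch (not the kubelet's actual Go code) that separates "the path is gone" (ENOENT) from "the path is there but its backing mount is unreachable" (for example EHOSTDOWN, which is what `ls` reports above). The function names and the exact set of errno values treated as unreachable are assumptions made for this example.

```python
import errno
import os

# errno values that suggest the path exists but its backing mount is broken.
# This particular set is an assumption chosen for the sketch.
UNREACHABLE_ERRNOS = {errno.EHOSTDOWN, errno.ENOTCONN, errno.ESTALE, errno.EIO}


def classify_stat_error(exc):
    """Map an OSError raised by os.stat to 'absent' or 'unreachable'."""
    if exc.errno == errno.ENOENT:
        return "absent"
    if exc.errno in UNREACHABLE_ERRNOS:
        return "unreachable"
    # Anything else (permissions, ENOTDIR, ...) is left to the caller.
    raise exc


def classify_path(path):
    """Return 'present', 'absent', or 'unreachable' for a mount path."""
    try:
        os.stat(path)
    except OSError as exc:
        return classify_stat_error(exc)
    return "present"
```

With a check like this, a subPath that returns "Host is down" would be classified as unreachable and still be a candidate for unmounting, rather than being skipped as if it did not exist.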

What you expected to happen:

Pods can be deleted and recreated.

How to reproduce it:

  1. Create a volume based on smb-server.yaml.
  2. Mount the volume in a pod.
  3. Delete and recreate the smb-server Service to change its IP. (I think this can happen in other ways; I did this only to reproduce an error we encountered on another cluster.)
  4. Restart the smb-server process. (Not sure why this is required; maybe to break existing connections?)
  5. Try to delete the pod.
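
Steps 3 to 5 could be scripted roughly as follows. This is a sketch under assumptions: it presumes that the Service and the Deployment are both named `smb-server`, that re-applying smb-server.yaml recreates the deleted Service, and that rolling the Deployment is an acceptable way to restart the server process. The helper names `repro_commands` and `run` are hypothetical.

```python
import subprocess


def repro_commands(pod_name, manifest="smb-server.yaml"):
    """Build kubectl invocations for steps 3 to 5 of the reproduction."""
    return [
        # Step 3: delete and recreate the Service so it gets a new ClusterIP.
        # The resource name "smb-server" is an assumption.
        ["kubectl", "delete", "service", "smb-server"],
        ["kubectl", "apply", "-f", manifest],
        # Step 4: restart the SMB server by rolling its Deployment (assumed name).
        ["kubectl", "rollout", "restart", "deployment/smb-server"],
        # Step 5: request pod deletion without blocking on completion.
        ["kubectl", "delete", "pod", pod_name, "--wait=false"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
```

Calling `run(repro_commands("my-pod"))` only prints the commands; pass `dry_run=False` to execute them against the current kubectl context.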

Anything else we need to know?:

#64 seems similar but the error is different.
kubernetes/kubernetes#97031 seems to be a similar codepath (and the fix may be the same?) but with a different cause.

Environment:

  • CSI Driver version: 0.4.0
  • Kubernetes version (use kubectl version): 1.18.10
  • OS (e.g. from /etc/os-release): Debian testing
  • Kernel (e.g. uname -a): 5.7.17
  • Install tools: manually copied YAMLs from this repo
  • Others:

drigz commented Feb 3, 2021

I also have a question: if changing the service IP addresses breaks existing mounts, would it be better to specify a clusterIP in the smb-server Service?
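
For concreteness, pinning the address means setting `spec.clusterIP` on the Service. The Python sketch below builds such a manifest as JSON (which `kubectl apply -f` also accepts) and checks that the chosen address lies inside the cluster's service CIDR. The Service name, selector label, example address, and CIDR are assumptions for illustration; only the `clusterIP` field and SMB port 445 are standard.

```python
import ipaddress
import json


def smb_service_manifest(cluster_ip, service_cidr):
    """Return a Service manifest with a fixed ClusterIP."""
    # A fixed ClusterIP must fall inside the cluster's service CIDR.
    if ipaddress.ip_address(cluster_ip) not in ipaddress.ip_network(service_cidr):
        raise ValueError(f"{cluster_ip} is outside service CIDR {service_cidr}")
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": "smb-server"},  # assumed name
        "spec": {
            "clusterIP": cluster_ip,
            "selector": {"app": "smb-server"},  # assumed label
            "ports": [{"port": 445}],  # standard SMB port
        },
    }


if __name__ == "__main__":
    # Example values only; use an unused address from your own service CIDR.
    print(json.dumps(smb_service_manifest("10.96.0.50", "10.96.0.0/12"), indent=2))
```

A fixed address survives deleting and recreating the Service, provided no other Service claims it in between.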


drigz commented Feb 3, 2021

Note: if you try to create a new pod instead of deleting an old one, you get the error described in #164: MountVolume.MountDevice failed for volume "workcell-spec" : stat /var/lib/kubelet/plugins/kubernetes.io/csi/pv/workcell-spec/globalmount: host is down. I have also seen that error in our live clusters, although I don't know whether it correlated with an IP change in those cases.

andyzhangx (Member) commented

Looks like it's related to #164 and needs an upstream fix. I will take a look later.

andyzhangx added the kind/bug label Feb 19, 2021
andyzhangx added a commit to andyzhangx/csi-driver-smb that referenced this issue Aug 11, 2023