Commit 28f198d

tgross authored and lgfa29 committed
csi: fix handling of garbage collected node in node unpublish (#12350)
When a node is garbage collected, we assume that the volume is no longer attached to it and ignore the `ErrUnknownNode` error. But we used `errors.Is` to check for a wrapped error, and RPC flattens the errors during serialization. This results in an error check that works in automated testing but not in real clusters. Use a string contains check instead.
1 parent 0553297 commit 28f198d

2 files changed, 7 additions and 2 deletions


.changelog/12350.txt (+3)

@@ -0,0 +1,3 @@
+```release-note:bug
+csi: Fixed a bug where garbage collected nodes would block releasing a volume
+```

nomad/csi_endpoint.go (+4 -2)

@@ -1,8 +1,8 @@
 package nomad
 
 import (
-	"errors"
 	"fmt"
+	"strings"
 	"time"
 
 	metrics "github.com/armon/go-metrics"
@@ -666,7 +666,9 @@ func (v *CSIVolume) nodeUnpublishVolumeImpl(vol *structs.CSIVolume, claim *struc
 		// we should only get this error if the Nomad node disconnects and
 		// is garbage-collected, so at this point we don't have any reason
 		// to operate as though the volume is attached to it.
-		if !errors.Is(err, structs.ErrUnknownNode) {
+		// note: errors.Is cannot be used because the RPC call breaks
+		// error wrapping.
+		if !strings.Contains(err.Error(), structs.ErrUnknownNode.Error()) {
 			return fmt.Errorf("could not detach from node: %w", err)
 		}
 	}
