Commit 73afdea

csi: fix handling of garbage collected node in node unpublish (#12350)
When a node is garbage collected, we assume that the volume is no longer attached to it and ignore the `ErrUnknownNode` error. But we used `errors.Is` to check for a wrapped error, and RPC flattens the errors during serialization. This results in an error check that works in automated testing but not in real clusters. Use a string contains check instead.
1 parent 717053a commit 73afdea

2 files changed: +7 -2 lines changed

.changelog/12350.txt

+3
@@ -0,0 +1,3 @@
+```release-note:bug
+csi: Fixed a bug where garbage collected nodes would block releasing a volume
+```

nomad/csi_endpoint.go

+4 -2
@@ -1,9 +1,9 @@
 package nomad
 
 import (
-	"errors"
 	"fmt"
 	"net/http"
+	"strings"
 	"time"
 
 	metrics "github.com/armon/go-metrics"
@@ -741,7 +741,9 @@ func (v *CSIVolume) nodeUnpublishVolumeImpl(vol *structs.CSIVolume, claim *struc
 			// we should only get this error if the Nomad node disconnects and
 			// is garbage-collected, so at this point we don't have any reason
 			// to operate as though the volume is attached to it.
-			if !errors.Is(err, structs.ErrUnknownNode) {
+			// note: errors.Is cannot be used because the RPC call breaks
+			// error wrapping.
+			if !strings.Contains(err.Error(), structs.ErrUnknownNode.Error()) {
 				return fmt.Errorf("could not detach from node: %w", err)
 			}
 		}
