add --capacity-controller-deployment-mode=local
Producing CSIStorageCapacity objects for a node uses the same code.
The only difference is that there is just a single topology segment
that the external-provisioner needs to iterate over.

Also, that segment is fixed. Therefore we can use the simple mock
informer that previously was only used for testing.
pohly committed Dec 15, 2020
1 parent 7b17f2c commit 3e7ea65
Showing 5 changed files with 162 additions and 141 deletions.
16 changes: 12 additions & 4 deletions README.md
@@ -74,7 +74,7 @@ Note that the external-provisioner does not scale with more replicas. Only one e

See the [storage capacity section](#capacity-support) below for details.

-* `--capacity-controller-deployment-mode=central`: Setting this enables producing CSIStorageCapacity objects with capacity information from the driver's GetCapacity call. 'central' is currently the only supported mode. Use it when there is just one active provisioner in the cluster. The default is to not produce CSIStorageCapacity objects.
+* `--capacity-controller-deployment-mode=central|local`: Setting this enables producing CSIStorageCapacity objects with capacity information from the driver's GetCapacity call. Use `central` when there is just one active external-provisioner in the cluster. Use `local` when deploying external-provisioner on each node with distributed provisioning. The default is to not produce CSIStorageCapacity objects.

* `--capacity-ownerref-level <levels>`: The level indicates the number of objects that need to be traversed starting from the pod identified by the POD_NAME and POD_NAMESPACE environment variables to reach the owning object for CSIStorageCapacity objects: 0 for the pod itself, 1 for a StatefulSet, 2 for a Deployment, etc. Defaults to `1` (= StatefulSet).
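The level semantics described for `--capacity-ownerref-level` can be illustrated with a minimal Go sketch. It assumes a simplified in-memory `object` type with a single owner link; that type and the `ownerAtLevel` helper are hypothetical and stand in for following `metadata.ownerReferences` through the API server, which is not shown here.

```go
package main

import (
	"errors"
	"fmt"
)

// object is a hypothetical stand-in for a Kubernetes object that has at
// most one controlling owner. It is not a type from external-provisioner.
type object struct {
	Kind  string
	Name  string
	Owner *object
}

// ownerAtLevel follows `level` owner links upward, starting at the pod.
// Level 0 returns the pod itself.
func ownerAtLevel(pod *object, level int) (*object, error) {
	if level < 0 {
		return nil, errors.New("level must not be negative")
	}
	current := pod
	for i := 0; i < level; i++ {
		if current.Owner == nil {
			return nil, fmt.Errorf("%s %q has no owner", current.Kind, current.Name)
		}
		current = current.Owner
	}
	return current, nil
}

func main() {
	// Pod -> ReplicaSet -> Deployment, so level 2 reaches the Deployment.
	deployment := &object{Kind: "Deployment", Name: "csi-provisioner"}
	replicaSet := &object{Kind: "ReplicaSet", Name: "csi-provisioner-rs", Owner: deployment}
	pod := &object{Kind: "Pod", Name: "csi-provisioner-pod", Owner: replicaSet}

	for level := 0; level <= 2; level++ {
		owner, err := ownerAtLevel(pod, level)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("level %d: %s/%s\n", level, owner.Kind, owner.Name)
	}
}
```

With a pod owned directly by a StatefulSet, level 1 would reach the StatefulSet, which matches the documented default.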

@@ -151,7 +151,7 @@ determine with the `POD_NAME/POD_NAMESPACE` environment variables and
the `--capacity-ownerref-level` parameter. Other solutions will be
added in the future.

-To enable this feature in a driver deployment (see also the
+To enable this feature in a driver deployment with a central controller (see also the
[`deploy/kubernetes/storage-capacity.yaml`](deploy/kubernetes/storage-capacity.yaml)
example):

@@ -167,7 +167,7 @@ example):
fieldRef:
fieldPath: metadata.name
```
-- Add `--enable-capacity=central` to the command line flags.
+- Add `--capacity-controller-deployment-mode=central` to the command line flags.
- Add `StorageCapacity: true` to the CSIDriver information object.
Without it, external-provisioner will publish information, but the
Kubernetes scheduler will ignore it. This can be used to first
@@ -182,7 +182,7 @@ example):
with `--capacity-threads`.
- Optional: enable producing information also for storage classes that
use immediate volume binding with
-`--enable-capacity=immediate-binding`. This is usually not needed
+`--capacity-for-immediate-binding`. This is usually not needed
because such volumes are created by the driver without involving the
Kubernetes scheduler and thus the published information would just
be ignored.
@@ -232,6 +232,14 @@ CSIStorageCapacity objects, so in theory a malfunctioning or malicious
driver deployment could also publish incorrect information about some
other driver.

+The deployment with [distributed
+provisioning](#distributed-provisioning) is almost the same as above,
+with some minor changes:
+- Add `--capacity-controller-deployment-mode=local` to the command line flags.
+- Use `--capacity-ownerref-level=0` and the `POD_NAMESPACE/POD_NAME`
+variables to make the pod that contains the external-provisioner
+the owner of CSIStorageCapacity objects for the node.
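The constraint that `local` mode only makes sense together with `--node-deployment` can be sketched with the standard `flag` package. This is a minimal illustration, not the actual external-provisioner code: the `deploymentMode` type, the `parseArgs` helper, and the treatment of `--node-deployment` as a boolean flag are assumptions made for the sketch.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
)

// deploymentMode is a hypothetical flag value type for this sketch.
type deploymentMode string

const (
	modeUnset   deploymentMode = ""
	modeCentral deploymentMode = "central"
	modeLocal   deploymentMode = "local"
)

// String and Set make *deploymentMode satisfy flag.Value.
func (m *deploymentMode) String() string {
	if m == nil {
		return ""
	}
	return string(*m)
}

func (m *deploymentMode) Set(value string) error {
	switch deploymentMode(value) {
	case modeCentral, modeLocal:
		*m = deploymentMode(value)
		return nil
	default:
		return fmt.Errorf("invalid mode %q, expected central or local", value)
	}
}

// parseArgs parses the two flags and rejects local mode without
// node deployment. Leaving the mode unset disables capacity objects.
func parseArgs(args []string) (deploymentMode, bool, error) {
	fs := flag.NewFlagSet("sketch", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep the sketch quiet on parse errors
	mode := modeUnset
	fs.Var(&mode, "capacity-controller-deployment-mode", "central or local")
	nodeDeployment := fs.Bool("node-deployment", false, "assumed boolean for this sketch")
	if err := fs.Parse(args); err != nil {
		return modeUnset, false, err
	}
	if mode == modeLocal && !*nodeDeployment {
		return modeUnset, false, errors.New("local mode requires --node-deployment")
	}
	return mode, *nodeDeployment, nil
}

func main() {
	mode, nodeDeployment, err := parseArgs([]string{
		"--capacity-controller-deployment-mode=local",
		"--node-deployment",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("mode=%s nodeDeployment=%t\n", mode, nodeDeployment)
}
```

Validating the combination at startup means a misconfigured pod fails immediately instead of silently publishing no capacity information.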

### CSI error and timeout handling
The external-provisioner invokes all gRPC calls to CSI driver with timeout provided by `--timeout` command line argument (15 seconds by default).

32 changes: 24 additions & 8 deletions cmd/csi-provisioner/csi-provisioner.go
@@ -390,7 +390,8 @@ func main() {
)

var capacityController *capacity.Controller
-if *capacityMode == capacity.DeploymentModeCentral {
+if *capacityMode == capacity.DeploymentModeCentral ||
+*capacityMode == capacity.DeploymentModeLocal {
podName := os.Getenv("POD_NAME")
namespace := os.Getenv("POD_NAMESPACE")
if podName == "" || namespace == "" {
@@ -407,13 +408,28 @@ func main() {
}
klog.Infof("using %s/%s %s as owner of CSIStorageCapacity objects", controller.APIVersion, controller.Kind, controller.Name)

-topologyInformer := topology.NewNodeTopology(
-provisionerName,
-clientset,
-factory.Core().V1().Nodes(),
-factory.Storage().V1().CSINodes(),
-workqueue.NewNamedRateLimitingQueue(rateLimiter, "csitopology"),
-)
+var topologyInformer topology.Informer
+if *capacityMode == capacity.DeploymentModeCentral {
+topologyInformer = topology.NewNodeTopology(
+provisionerName,
+clientset,
+factory.Core().V1().Nodes(),
+factory.Storage().V1().CSINodes(),
+workqueue.NewNamedRateLimitingQueue(rateLimiter, "csitopology"),
+)
+} else {
+var segment topology.Segment
+if nodeDeployment == nil {
+klog.Fatal("--capacity-controller-deployment-mode=local is only valid in combination with --node-deployment")
+}
+if nodeDeployment.NodeInfo.AccessibleTopology != nil {
+for key, value := range nodeDeployment.NodeInfo.AccessibleTopology.Segments {
+segment = append(segment, topology.SegmentEntry{Key: key, Value: value})
+}
+}
+klog.Infof("producing CSIStorageCapacity objects with fixed topology segment %s", segment)
+topologyInformer = topology.NewFixedNodeTopology(&segment)
+}

// We only need objects from our own namespace. The normal factory would give
// us an informer for the entire cluster.
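The `else` branch in this hunk turns the node's topology map into a single fixed segment. The conversion can be sketched in isolation as follows. The `segmentEntry` and `segment` types and the `fixedSegment` helper are hypothetical local stand-ins, not the `topology` package types, and the example keys are made up. Sorting by key is an addition of this sketch: Go map iteration order is unspecified, so sorting gives a stable order for logging and comparison. Whether the real `topology.Segment` requires a particular order is not stated in this commit.

```go
package main

import (
	"fmt"
	"sort"
)

// segmentEntry and segment are hypothetical stand-ins for the
// key/value pairs that make up one topology segment.
type segmentEntry struct {
	Key   string
	Value string
}

type segment []segmentEntry

// fixedSegment converts a topology map into a segment with entries
// ordered by key. A nil map yields an empty segment, mirroring the
// nil check on AccessibleTopology in the diff above.
func fixedSegment(labels map[string]string) segment {
	seg := make(segment, 0, len(labels))
	for key, value := range labels {
		seg = append(seg, segmentEntry{Key: key, Value: value})
	}
	sort.Slice(seg, func(i, j int) bool { return seg[i].Key < seg[j].Key })
	return seg
}

func main() {
	seg := fixedSegment(map[string]string{
		"topology.example.com/zone": "zone-a",
		"topology.example.com/node": "node-1",
	})
	for _, entry := range seg {
		fmt.Printf("%s=%s\n", entry.Key, entry.Value)
	}
}
```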