feat: support dynamic scaling of stable ReplicaSet as inverse of canary weight

Signed-off-by: Jesse Suen <[email protected]>
jessesuen committed Aug 28, 2021
1 parent 89062e3 commit c522218
Showing 23 changed files with 2,074 additions and 445 deletions.
34 changes: 33 additions & 1 deletion docs/features/canary.md
@@ -66,7 +66,7 @@ If no `duration` is specified for a pause step, the rollout will be paused indef
kubectl argo rollouts promote <rollout>
```

-## Controlling Canary Scale
+## Dynamic Canary Scale (with Traffic Routing)

By default, the rollout controller will scale the canary to match the current trafficWeight of the
current step. For example, if the current weight is 25%, and there are four replicas, then the
@@ -116,6 +116,38 @@ If no `duration` is specified for a pause step, the rollout will be paused indef
kubectl argo rollouts promote <rollout>
```

## Dynamic Stable Scale (with Traffic Routing)

!!! important
    Available since v1.1

When using traffic routing, the stable ReplicaSet stays scaled to 100% during the update by default.
The advantage is that if an abort occurs, traffic can be shifted back to the stable ReplicaSet
immediately, without delay.

Alternatively, the stable ReplicaSet can be scaled down during the update as the traffic weight
shifts to the canary. This is desirable when the Rollout runs many pods and keeping both
ReplicaSets fully scaled would be costly. Set `dynamicStableScale` to enable it:

```yaml
spec:
  strategy:
    canary:
      dynamicStableScale: true
```
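
The inverse relationship between canary weight and stable scale can be illustrated with a small sketch. This is not the controller's implementation: the function name `desired_replicas` and the round-up rule are assumptions made for illustration, and the real controller may round or sequence scaling differently.

```python
from typing import Tuple


def desired_replicas(total_replicas: int, canary_weight: int) -> Tuple[int, int]:
    """Return (canary, stable) replica counts for a canary weight in percent.

    Assumption: both counts are rounded up so that neither side is
    under-provisioned for its share of traffic. Illustrative rule only.
    """
    if total_replicas < 0:
        raise ValueError("total_replicas must be non-negative")
    if not 0 <= canary_weight <= 100:
        raise ValueError("canary_weight must be between 0 and 100")
    stable_weight = 100 - canary_weight
    # Integer ceiling division avoids floating-point rounding surprises.
    canary = (total_replicas * canary_weight + 99) // 100
    stable = (total_replicas * stable_weight + 99) // 100
    return canary, stable


if __name__ == "__main__":
    for weight in (0, 25, 50, 100):
        canary, stable = desired_replicas(4, weight)
        print(f"canary weight {weight:3d}%: canary={canary} stable={stable}")
```

Under these assumptions, four replicas at a 25% canary weight yield one canary pod and three stable pods, rather than one canary pod alongside a stable ReplicaSet that is still fully scaled to four.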

NOTE: If `dynamicStableScale` is set and the rollout is aborted, the canary ReplicaSet will
scale down as traffic shifts back to stable. To leave the canary ReplicaSet scaled up while
aborting, set an explicit value for `abortScaleDownDelaySeconds`:

```yaml
spec:
  strategy:
    canary:
      dynamicStableScale: true
      abortScaleDownDelaySeconds: 30
```
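
One way to read the note above together with the `AbortScaleDownDelaySeconds` field description in this commit (default 30 seconds, 0 means canary pods are not scaled down) is sketched below. The function `canary_abort_plan` and its mode strings are hypothetical names for illustration, and the branching is an interpretation of the documentation, not the controller's code.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

# Default taken from the field description: "Default is 30 seconds."
DEFAULT_ABORT_DELAY_SECONDS = 30


def canary_abort_plan(
    aborted_at: datetime,
    dynamic_stable_scale: bool,
    abort_delay_seconds: Optional[int] = None,
) -> Tuple[str, Optional[datetime]]:
    """Return (mode, deadline) describing what happens to canary pods on abort."""
    if dynamic_stable_scale and abort_delay_seconds is None:
        # No explicit delay: the canary shrinks as traffic shifts back to stable.
        return "follow-traffic", None
    if abort_delay_seconds is None:
        delay = DEFAULT_ABORT_DELAY_SECONDS
    else:
        delay = abort_delay_seconds
    if delay < 0:
        raise ValueError("abort_delay_seconds must be non-negative")
    if delay == 0:
        # Per the field description, 0 means canary pods are not scaled down.
        return "keep", None
    return "scale-down-after-delay", aborted_at + timedelta(seconds=delay)


if __name__ == "__main__":
    now = datetime(2021, 8, 28, 12, 0, 0)
    print(canary_abort_plan(now, dynamic_stable_scale=True))
    print(canary_abort_plan(now, dynamic_stable_scale=True, abort_delay_seconds=30))
```

The sketch only separates the three cases the documentation distinguishes: following traffic, keeping the canary, and scaling it down after a fixed delay.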


## Mimicking Rolling Update
If the `steps` field is omitted, the canary strategy will mimic the rolling update behavior. Similar to the deployment, the canary strategy has the `maxSurge` and `maxUnavailable` fields to configure how the Rollout should progress to the new version.
40 changes: 40 additions & 0 deletions manifests/crds/rollout-crd.yaml
@@ -305,6 +305,8 @@ spec:
  type: object
canaryService:
  type: string
dynamicStableScale:
  type: boolean
maxSurge:
  anyOf:
  - type: integer
@@ -2776,6 +2778,44 @@ spec:
      - name
      - status
      type: object
    weights:
      properties:
        additional:
          items:
            properties:
              podTemplateHash:
                type: string
              serviceName:
                type: string
              weight:
                format: int32
                type: integer
            type: object
          type: array
        canary:
          properties:
            podTemplateHash:
              type: string
            serviceName:
              type: string
            weight:
              format: int32
              type: integer
          type: object
        stable:
          properties:
            podTemplateHash:
              type: string
            serviceName:
              type: string
            weight:
              format: int32
              type: integer
          type: object
      required:
      - canary
      - stable
      type: object
  type: object
collisionCount:
  format: int32
40 changes: 40 additions & 0 deletions manifests/install.yaml
@@ -10003,6 +10003,8 @@ spec:
  type: object
canaryService:
  type: string
dynamicStableScale:
  type: boolean
maxSurge:
  anyOf:
  - type: integer
@@ -12474,6 +12476,44 @@ spec:
      - name
      - status
      type: object
    weights:
      properties:
        additional:
          items:
            properties:
              podTemplateHash:
                type: string
              serviceName:
                type: string
              weight:
                format: int32
                type: integer
            type: object
          type: array
        canary:
          properties:
            podTemplateHash:
              type: string
            serviceName:
              type: string
            weight:
              format: int32
              type: integer
          type: object
        stable:
          properties:
            podTemplateHash:
              type: string
            serviceName:
              type: string
            weight:
              format: int32
              type: integer
          type: object
      required:
      - canary
      - stable
      type: object
  type: object
collisionCount:
  format: int32
40 changes: 40 additions & 0 deletions manifests/namespace-install.yaml
@@ -10003,6 +10003,8 @@ spec:
  type: object
canaryService:
  type: string
dynamicStableScale:
  type: boolean
maxSurge:
  anyOf:
  - type: integer
@@ -12474,6 +12476,44 @@ spec:
      - name
      - status
      type: object
    weights:
      properties:
        additional:
          items:
            properties:
              podTemplateHash:
                type: string
              serviceName:
                type: string
              weight:
                format: int32
                type: integer
            type: object
          type: array
        canary:
          properties:
            podTemplateHash:
              type: string
            serviceName:
              type: string
            weight:
              format: int32
              type: integer
          type: object
        stable:
          properties:
            podTemplateHash:
              type: string
            serviceName:
              type: string
            weight:
              format: int32
              type: integer
          type: object
      required:
      - canary
      - stable
      type: object
  type: object
collisionCount:
  format: int32
43 changes: 42 additions & 1 deletion pkg/apiclient/rollout/rollout.swagger.json
@@ -701,9 +701,13 @@
        "currentExperiment": {
          "type": "string",
          "title": "CurrentExperiment indicates the running experiment"
        },
        "weights": {
          "$ref": "#/definitions/github.com.argoproj.argo_rollouts.pkg.apis.rollouts.v1alpha1.TrafficWeights",
          "title": "Weights records the weights which have been set on traffic provider. Only valid when using traffic routing"
        }
      },
-     "title": "CanaryStatus status fields that only pertain to the canary rollout"
+     "title": "CanaryStatus status fields that only pertain to the b rollout"
    },
    "github.com.argoproj.argo_rollouts.pkg.apis.rollouts.v1alpha1.CanaryStep": {
      "type": "object",
@@ -792,6 +796,10 @@
          "type": "integer",
          "format": "int32",
          "title": "AbortScaleDownDelaySeconds adds a delay in second before scaling down the canary pods when update\nis aborted for canary strategy with traffic routing (not applicable for basic canary).\n0 means canary pods are not scaled down.\nDefault is 30 seconds.\n+optional"
        },
        "dynamicStableScale": {
          "type": "boolean",
          "description": "DynamicStableScale is a traffic routing feature which dynamically scales the stable and canary\nReplicaSets to minimize total pods which are running during an update. This is calculated by\nscaling down the stable as traffic is increased to canary. When disabled (the default behavior)\nthe stable ReplicaSet remains fully scaled to support instantaneous aborts."
        }
      },
      "title": "CanaryStrategy defines parameters for a Replica Based Canary"
@@ -1412,6 +1420,39 @@
      },
      "description": "TLSRoute holds the information on the virtual service's TLS/HTTPS routes that are desired to be matched for changing weights."
    },
    "github.com.argoproj.argo_rollouts.pkg.apis.rollouts.v1alpha1.TrafficWeights": {
      "type": "object",
      "properties": {
        "canary": {
          "$ref": "#/definitions/github.com.argoproj.argo_rollouts.pkg.apis.rollouts.v1alpha1.WeightDestination"
        },
        "stable": {
          "$ref": "#/definitions/github.com.argoproj.argo_rollouts.pkg.apis.rollouts.v1alpha1.WeightDestination"
        },
        "additional": {
          "type": "array",
          "items": {
            "$ref": "#/definitions/github.com.argoproj.argo_rollouts.pkg.apis.rollouts.v1alpha1.WeightDestination"
          }
        }
      },
      "title": "TrafficWeights describes the current status of how traffic has been split"
    },
    "github.com.argoproj.argo_rollouts.pkg.apis.rollouts.v1alpha1.WeightDestination": {
      "type": "object",
      "properties": {
        "weight": {
          "type": "integer",
          "format": "int32"
        },
        "serviceName": {
          "type": "string"
        },
        "podTemplateHash": {
          "type": "string"
        }
      }
    },
    "google.protobuf.Any": {
      "type": "object",
      "properties": {
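
For readers consuming the new `weights` status field, the `TrafficWeights` and `WeightDestination` definitions above can be mirrored client-side. The sketch below is an assumption-laden illustration: the Python class and function names are hypothetical, the sample values are made up, and only the field names (`weight`, `serviceName`, `podTemplateHash`, `canary`, `stable`, `additional`) and the required `canary`/`stable` keys come from the schema in this commit.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class WeightDestination:
    weight: int = 0
    service_name: str = ""
    pod_template_hash: str = ""


@dataclass
class TrafficWeights:
    canary: WeightDestination
    stable: WeightDestination
    additional: List[WeightDestination] = field(default_factory=list)


def parse_destination(raw: Dict[str, Any]) -> WeightDestination:
    """Build a WeightDestination from a decoded JSON/YAML mapping."""
    return WeightDestination(
        weight=int(raw.get("weight", 0)),
        service_name=raw.get("serviceName", ""),
        pod_template_hash=raw.get("podTemplateHash", ""),
    )


def parse_traffic_weights(raw: Dict[str, Any]) -> TrafficWeights:
    """Build TrafficWeights, enforcing the keys the CRD marks as required."""
    missing = [key for key in ("canary", "stable") if key not in raw]
    if missing:
        raise ValueError("missing required field(s): " + ", ".join(missing))
    return TrafficWeights(
        canary=parse_destination(raw["canary"]),
        stable=parse_destination(raw["stable"]),
        additional=[parse_destination(item) for item in raw.get("additional", [])],
    )


if __name__ == "__main__":
    # Hypothetical status fragment; service names and hashes are invented.
    sample = {
        "canary": {"weight": 20, "serviceName": "canary-svc", "podTemplateHash": "abc123"},
        "stable": {"weight": 80, "serviceName": "stable-svc", "podTemplateHash": "def456"},
    }
    weights = parse_traffic_weights(sample)
    print(weights.canary.weight, weights.stable.weight)
```

Defaulting absent scalar fields to zero values is a design choice of this sketch; a stricter client could reject destinations that omit `weight`.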
1 change: 1 addition & 0 deletions pkg/apis/api-rules/violation_exceptions.list
@@ -23,5 +23,6 @@ API rule violation: list_type_missing,github.com/argoproj/argo-rollouts/pkg/apis
API rule violation: list_type_missing,github.com/argoproj/argo-rollouts/pkg/apis/rollouts/v1alpha1,RolloutStatus,Conditions
API rule violation: list_type_missing,github.com/argoproj/argo-rollouts/pkg/apis/rollouts/v1alpha1,RolloutStatus,PauseConditions
API rule violation: list_type_missing,github.com/argoproj/argo-rollouts/pkg/apis/rollouts/v1alpha1,TLSRoute,SNIHosts
API rule violation: list_type_missing,github.com/argoproj/argo-rollouts/pkg/apis/rollouts/v1alpha1,TrafficWeights,Additional
API rule violation: list_type_missing,github.com/argoproj/argo-rollouts/pkg/apis/rollouts/v1alpha1,WebMetric,Headers
API rule violation: names_match,github.com/argoproj/argo-rollouts/pkg/apis/rollouts/v1alpha1,RolloutStatus,HPAReplicas