As a developer, I want to be notified when autoscaler kicks in during the workspaces startup #22598
Comments
A potentially easier alternative that we also discussed is enabling an option on the CheCluster that can be used to more easily configure the operators for working with auto-scaling. This could be done more quickly, in case detecting the "in-autoscale" state is tricky.
@ibuziuk I tried creating a DWOC as you mentioned above. It does not appear to work.

DWOC:

apiVersion: controller.devfile.io/v1alpha1
kind: DevWorkspaceOperatorConfig
metadata:
  name: scaling-workspace-config
  namespace: openshift-devspaces
config:
  workspace:
    ignoredUnrecoverableEvents:
      - FailedScheduling
    progressTimeout: 600s

devfile snippet:

schemaVersion: 2.2.0
attributes:
  controller.devfile.io/storage-type: per-workspace
  controller.devfile.io/devworkspace-config: {"name": "scaling-workspace-config", "namespace": "openshift-devspaces"}
metadata:
  name: che-workspace
components:
  - name: dev-tools
    container:
      image: quay.io/cgruver0/che/che-dev-image:latest
etc...

The resulting DevWorkspace object inherits the attributes as expected, but still fails to start immediately rather than waiting for node scaling.

kind: DevWorkspace
spec:
  contributions:
    - kubernetes:
        name: che-code-che-workspace
      name: editor
  routingClass: che
  started: true
  template:
    attributes:
      controller.devfile.io/devworkspace-config:
        name: devworkspace-config
        namespace: openshift-devspaces
      controller.devfile.io/scc: container-build
      controller.devfile.io/storage-type: per-workspace
    projects:
      - git:
          remotes:
            origin: https://github.com/cgruver/my-che-workspace.git
        name: my-che-workspace
etc...

Error:

Error creating DevWorkspace deployment: Detected unrecoverable event FailedScheduling: 0/9 nodes are available: 2 Insufficient memory, 3 Insufficient cpu, 3 node(s) had untolerated taint {node-role.kubernetes.io/infra: }, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/9 nodes are available: 3 No preemption victims found for incoming pod, 6 Preemption is not helpful for scheduling...
@cgruver Che (and Dev Spaces) have their own custom DevWorkspaceOperatorConfigs that are used in place of the one you created:

# In your DevWorkspace object
controller.devfile.io/devworkspace-config:
  name: devworkspace-config
  namespace: openshift-devspaces

You could try to edit that DWOC, or, alternatively, configure the DevWorkspace Operator itself by creating your DWOC in DWO's install namespace (
I think adding this as a parameter in the CheCluster CRD is a great idea.
ping? I created a PoC. Is it acceptable?
@monaka
@tolusha Thanks for checking.
I suppose this can be closed as eclipse-che/che-operator#1864 was merged. @ibuziuk
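For readers following along, here is a rough sketch of what the CheCluster-level configuration discussed above might look like. It assumes the CheCluster v2 fields `spec.devEnvironments.ignoredUnrecoverableEvents` and `spec.devEnvironments.startTimeoutSeconds`; the thread itself does not spell out which fields the merged change introduced, so verify them against your che-operator version. The object name and namespace are placeholders.

```yaml
# Sketch only: field names are an assumption, not confirmed by this thread.
apiVersion: org.eclipse.che/v2
kind: CheCluster
metadata:
  name: eclipse-che          # placeholder name
  namespace: eclipse-che     # placeholder namespace
spec:
  devEnvironments:
    # Allow extra time for a new node to be provisioned (600 s mirrors
    # the progressTimeout value quoted earlier in the thread).
    startTimeoutSeconds: 600
    # Do not fail workspace startup on scheduling failures, so the
    # autoscaler has a chance to add capacity.
    ignoredUnrecoverableEvents:
      - FailedScheduling
```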
@monaka thanks, but I do not think we can close this issue, since the user dashboard does not yet show an appropriate notification.
@ibuziuk @dkwon17 I have tried to set up the cluster autoscaler, but unfortunately I did not manage to trigger the provisioning of a new node. However, I noticed that scaling up a machine set does cause a new node to be provisioned. It means that we can intercept the
@vinokurig yes, I think it is a good idea to use
Is your enhancement related to a problem? Please describe
Currently, to enable machine autoscaler support, an admin needs to create a DWOC object on the cluster:
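The original snippet is not preserved at this point. A minimal sketch of such a DWOC, assuming the same fields as the configuration quoted in the comments (`ignoredUnrecoverableEvents` and `progressTimeout`), could look like this. The object name, namespace, and timeout value are taken from that comment and are placeholders rather than required values.

```yaml
# Sketch based on the DWOC quoted in the comments; adjust to your cluster.
apiVersion: controller.devfile.io/v1alpha1
kind: DevWorkspaceOperatorConfig
metadata:
  name: scaling-workspace-config    # placeholder name
  namespace: openshift-devspaces    # placeholder namespace
config:
  workspace:
    # Events listed here no longer cause startup to fail immediately.
    ignoredUnrecoverableEvents:
      - FailedScheduling
    # Give the workspace up to 10 minutes to start while a node is added.
    progressTimeout: 600s
```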
Basically, this means that FailedScheduling events are ignored during workspace startup, and that workspace startup takes longer (depending on the infrastructure, it takes around 10 minutes for a new node to be provisioned).
Describe the solution you'd like
Ideally, DWO should detect when the autoscaler kicks in and update the DWOC accordingly. On the workspace startup screen, a notification banner should be shown informing the user that workspace startup will take longer because a new node is being provisioned.
Describe alternatives you've considered
Properly document DWOC config for autoscaler support
Additional context
Initial implementation of the Machine Autoscaler support - https://issues.redhat.com/browse/CRW-4072