Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding Scale Out functionality #613

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

radez
Copy link
Collaborator

@radez radez commented Feb 20, 2025

- Add nodes to worker inventory section and update vars in scaleout.yml to add nodes to the existing cluster.
- https://docs.openshift.com/container-platform/4.17/nodes/nodes/nodes-nodes-adding-node-iso.html
@openshift-ci openshift-ci bot requested review from akrzos and rsevilla87 February 20, 2025 13:04
Copy link

openshift-ci bot commented Feb 20, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: radez

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@josecastillolema
Copy link
Collaborator

/test ?

Copy link

openshift-ci bot commented Feb 20, 2025

@josecastillolema: The following commands are available to trigger required jobs:

/test deploy-5nodes
/test deploy-5nodes-dev
/test deploy-sno
/test deploy-sno-dev

In response to this:

/test ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@josecastillolema
Copy link
Collaborator

/test deploy-sno

@akrzos
Copy link
Member

akrzos commented Feb 24, 2025

/test deploy-5nodes

@josecastillolema
Copy link
Collaborator

/test deploy-5nodes

The test failed because of:

   * could not run steps: step deploy-5nodes failed: failed to create credentials: could not read source credential: secrets "perfscale-metal-bastion" not found

Let me take care of this tomorrow morning, we need to update the secrets.
ps. Even with the updated secrets we will have the route issue :/
cc @akrzos

@akrzos
Copy link
Member

akrzos commented Feb 24, 2025

/test deploy-5nodes

The test failed because of:

   * could not run steps: step deploy-5nodes failed: failed to create credentials: could not read source credential: secrets "perfscale-metal-bastion" not found

Let me take care of this tomorrow morning, we need to update the secrets. ps. Even with the updated secrets we will have the route issue :/ cc @akrzos

Ok I was going to look into the route issue more this afternoon also.

@josecastillolema
Copy link
Collaborator

/test deploy-5nodes

The test failed because of:

   * could not run steps: step deploy-5nodes failed: failed to create credentials: could not read source credential: secrets "perfscale-metal-bastion" not found

Let me take care of this tomorrow morning, we need to update the secrets. ps. Even with the updated secrets we will have the route issue :/ cc @akrzos

Ok I was going to look into the route issue more this afternoon also.

Should be fixed when openshift/release#62015 merges

@akrzos
Copy link
Member

akrzos commented Feb 25, 2025

/test deploy-5nodes

Copy link
Member

@akrzos akrzos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we could include some sort of "limit" on initial deployment of a cluster such that say someone had a 200 node allocation given to them, and they ran create inventory, they would have 196 workers in the worker section. I am thinking we could make mno-deploy only deploy say 120 of the worker nodes and you'd have to use the mno-scale-out.yml playbook to increase the count of workers above this initial threshold. Do you think it is worth it to implement that into mno-deploy?

@@ -4,29 +4,29 @@
- name: Boot iso on dell hardware
include_tasks: dell.yml
with_items:
- "{{ groups[inventory_group][:index|int] }}"
- "{{ groups[inventory_group][offset:index|int] }}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does offset need the cast to an int here (Line 7, 13, 19, and 25) like line 31 has?

hosts: bastion
vars_files:
- vars/scale_out.yml
- vars/all.yml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the all.yml needed for scaling up the cluster?

- name: Approve pending CSRs
shell: |
KUBECONFIG={{ bastion_cluster_config_dir }}/kubeconfig oc adm certificate approve {{ item.metadata.name }}
with_items: "{{ oc_get_csr.stdout | from_json | json_query(qry) }}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we still have a lot of with_items throughout the playbooks and roles, but I think we should strive for new tasks to use loop instead.

@akrzos
Copy link
Member

akrzos commented Feb 27, 2025

/test deploy-5nodes

Copy link

openshift-ci bot commented Feb 27, 2025

@radez: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/deploy-sno 7bb5d2b link true /test deploy-sno
ci/prow/deploy-5nodes 7bb5d2b link true /test deploy-5nodes

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@akrzos
Copy link
Member

akrzos commented Feb 27, 2025

This needs to be rebased to pick up the fix in #619 for CI to work

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants