-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use openshift-master-api and openshift-master-controllers setup always #4832
Use openshift-master-api and openshift-master-controllers setup always #4832
Conversation
8096eaa
to
78cb4f8
Compare
aos-ci-test |
Changes required to support this in upgrades:
|
This does not include the change to move controllers to static pods because I realized we need to deal with some fundamental ordering problems before that (the node needs to be configured and ready ahead of time, instead of masters first). Requires the work @kwoodson will be doing to split out node preparation from node config - once that is set up, we may actually want to run node prep on masters, then remove the service units for both api and controller and replace them with static pods, and use the kubelet in run-once mode as well. I don't see much value in system containers for api or controllers, but there is a lot of value in those as static pods, especially with crio present |
78cb4f8
to
8961e7f
Compare
aos-ci-test |
Tsk tsk, jobs not setting etcd hosts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multi-line WHEN
statements
tasks: | ||
- fail: | ||
msg: No etcd hosts defined. Running an all-in-one master is deprecated and will no longer be supported in a future upgrade. | ||
when: groups.oo_etcd_to_config | default([]) | length == 0 and not openshift_master_unsupported_all_in_one | default(False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please put AND
d when
statements on multiple lines in a list format, e.g. groups.oo_etcd_to_config | default([]) | length == 0 and not openshift_master_unsupported_all_in_one | default(False)
should be more like this:
when:
- groups.oo_etcd_to_config | default([]) | length == 0
- not openshift_master_unsupported_all_in_one | default(False)
- name: restart master api | ||
systemd: name={{ openshift.common.service_type }}-master-api state=restarted | ||
when: (openshift.master.ha is defined and openshift.master.ha | bool) and (not (master_api_service_status_changed | default(false) | bool)) and openshift.master.cluster_method == 'native' | ||
when: (not (master_api_service_status_changed | default(false) | bool)) and openshift.master.cluster_method == 'native' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multi-line WHEN
notify: Verify API Server | ||
|
||
- name: restart master controllers | ||
systemd: name={{ openshift.common.service_type }}-master-controllers state=restarted | ||
when: (openshift.master.ha is defined and openshift.master.ha | bool) and (not (master_controllers_service_status_changed | default(false) | bool)) and openshift.master.cluster_method == 'native' | ||
when: (not (master_controllers_service_status_changed | default(false) | bool)) and openshift.master.cluster_method == 'native' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multi-line WHEN
- name: restart master api | ||
systemd: name={{ openshift.common.service_type }}-master-api state=restarted | ||
when: (openshift.master.ha is defined and openshift.master.ha | bool) and (not (master_api_service_status_changed | default(false) | bool)) and openshift.master.cluster_method == 'native' | ||
when: (not (master_api_service_status_changed | default(false) | bool)) and openshift.master.cluster_method == 'native' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multi-line WHEN
notify: Verify API Server | ||
|
||
- name: restart master controllers | ||
systemd: name={{ openshift.common.service_type }}-master-controllers state=restarted | ||
when: (openshift.master.ha is defined and openshift.master.ha | bool) and (not (master_controllers_service_status_changed | default(false) | bool)) and openshift.master.cluster_method == 'native' | ||
when: (not (master_controllers_service_status_changed | default(false) | bool)) and openshift.master.cluster_method == 'native' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multi-line WHEN
@@ -275,14 +228,14 @@ | |||
delay: 1 | |||
run_once: true | |||
changed_when: false | |||
when: openshift_master_ha | bool and openshift.master.cluster_method == 'native' and master_api_service_status_changed | bool | |||
when: openshift.master.cluster_method == 'native' and master_api_service_status_changed | bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multi-line WHEN
|
||
- name: Start and enable master controller on first master | ||
systemd: | ||
name: "{{ openshift.common.service_type }}-master-controllers" | ||
enabled: yes | ||
state: started | ||
when: openshift_master_ha | bool and openshift.master.cluster_method == 'native' and inventory_hostname == openshift_master_hosts[0] | ||
when: openshift.master.cluster_method == 'native' and inventory_hostname == openshift_master_hosts[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multi-line WHEN
|
||
- name: Start and enable master controller on all masters | ||
systemd: | ||
name: "{{ openshift.common.service_type }}-master-controllers" | ||
enabled: yes | ||
state: started | ||
when: openshift_master_ha | bool and openshift.master.cluster_method == 'native' and inventory_hostname != openshift_master_hosts[0] | ||
when: openshift.master.cluster_method == 'native' and inventory_hostname != openshift_master_hosts[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multi-line WHEN
notify: Verify API Server | ||
|
||
- name: restart master controllers | ||
systemd: name={{ openshift.common.service_type }}-master-controllers state=restarted | ||
when: (openshift.master.ha is defined and openshift.master.ha | bool) and (not (master_controllers_service_status_changed | default(false) | bool)) and openshift.master.cluster_method == 'native' | ||
when: (not (master_controllers_service_status_changed | default(false) | bool)) and openshift.master.cluster_method == 'native' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multi-line WHEN
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to do the multi-line when in a follow up, it would make this much harder to review and the re are way more references in these files than just my changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can do all the multi-line when in #4947 (comment)
- name: restart master api | ||
systemd: name={{ openshift.common.service_type }}-master-api state=restarted | ||
when: (openshift.master.ha is defined and openshift.master.ha | bool) and (not (master_api_service_status_changed | default(false) | bool)) and openshift.master.cluster_method == 'native' | ||
when: (not (master_api_service_status_changed | default(false) | bool)) and openshift.master.cluster_method == 'native' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multi-line WHEN
@smarterclayton they're in gerrit, i've updated them so that the master is an etcd host. |
8961e7f
to
8738627
Compare
aos-ci-test |
8738627
to
4be1576
Compare
aos-ci-test |
@@ -32,7 +32,7 @@ | |||
openshift_facts: | |||
role: master | |||
local_facts: | |||
cluster_method: "{{ openshift_master_cluster_method | default(None) }}" | |||
cluster_method: "{{ openshift_master_cluster_method | default('native') }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for that, no more openshift_master_cluster_method variable not set
after 30 minutes of installation.
We should not merge this PR until we have a code that supports both changes. Since if anyone tries to upgrade from an existing (e.g. 3.5 or 3.6) cluster to 3.7 with openshift-ansible-3.7.*, the master service will keep running and may cause disruptions during the upgrade. |
Prevents playbooks from accidentally restarting the master service.
8e28bbc
to
c71bed7
Compare
Updated with the upgrade script (which simply disables and masks origin-master) and also removes all additional places that start origin-master. The fedora job failure may be because the origin-master service is still started. Rerunning in case it picked up the older change. |
aos-ci-test |
[test] |
registry flake? |
aos-ci-test |
Looks like the tests passed in the one job. [test] again to see if it works this time. |
Evaluated for openshift ansible test up to c71bed7 |
continuous-integration/openshift-jenkins/test FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_openshift_ansible/424/) (Base Commit: 566731d) (PR Branch Commit: c71bed7) |
This went green on install_update. The fedora error seems to be happening on other PRs. The status_check failure on test was a flake in the job. Seems to be ok from a test perspective. |
[merge] |
Evaluated for openshift ansible merge up to c71bed7 |
continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/merge_pull_request_openshift_ansible/819/) (Base Commit: 5fe4c8c) (PR Branch Commit: c71bed7) |
\O/ congratz @smarterclayton |
Will do follow up for the other changes tomorrow or Friday - specifically the when changes |
This breaks things like the metrics deployer that assume non-ha uses the atomic-openshift-master system unit. Will start filing BZs for other areas to adopt this change. |
The installer now always configures two services on the master for the api server and the controllers. Native clustering is the default configuration mode, even when only one master is configured. A warning is added if there are no etcd groups configured. Finally, the new client based leader election code in the controllers is used as the default, which removes the need for the controllers to have access to etcd.
This will set the stage for running controllers and apiservers as static pods.
@sdodson