Protect against starting rke2-server "by accident" on rke2-agent nodes #1590

Closed
Martin-Weiss opened this issue Aug 10, 2021 · 6 comments

@Martin-Weiss

Is your feature request related to a problem? Please describe.
We have started rke2-server instead of rke2-agent "by accident" on a system. Among other things, this caused one more etcd member to be created, and it also changed the cluster CIDR settings for kube-proxy, because on agents the CIDR settings were not included in the config.yaml.

Describe the solution you'd like
We should have an "agent or server" config option in config.yaml and only one single service.
Or we should detect during server or agent start that the "other" type has already been enabled, or similar.
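For illustration only, a hypothetical config.yaml key (this option does not exist in rke2 today) could let a single unit pick its role and refuse the other:

# /etc/rancher/rke2/config.yaml (hypothetical "type" key, not a real rke2 option)
type: agent   # a single rke2.service would run as agent; starting it as server would fail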

Describe alternatives you've considered
Do not make "human" mistakes ;-).

@brandond
Member

brandond commented Aug 10, 2021

For the RPM packages at least, we only install the unit for one - whichever is selected during install:

rke2/install.sh

Lines 25 to 27 in cfa99d2

# - INSTALL_RKE2_TYPE
# Type of rke2 service. Can be either "server" or "agent".
# Default is "server".
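For example, the documented variable selects the agent unit at install time:

curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE="agent" sh -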
We could probably modify the tarball installer to not drop the unit for the other type.

The two units also conflict with each other, so you can't start them at the same time:

Conflicts=rke2-server.service

@Martin-Weiss
Author

Martin-Weiss commented Aug 10, 2021

But "systemctl stop rke2-agent; systemctl start rke2-server" does work and has caused a big problem.... maybe we could get a "type=[agent|server]" option for config.yaml?

@mstrent

mstrent commented Sep 2, 2021

Just accidentally ran "systemctl start rke2-server" on all of my agent nodes and Bad Things happened. Still working on recovering.

This is v1.21.4+rke2r2, installed via the "curl -sfL https://get.rke2.io | sh -" method.

@mstrent

mstrent commented Sep 2, 2021

I had to completely blow away and recreate the cluster. I didn't capture all the logs or issues or things I tried, but suffice it to say, doing this is very bad and should be more actively prevented.

@mstrent

mstrent commented Sep 2, 2021

In the meantime, I'm adding this to my Ansible deploy scripts for agent nodes:

# Accidentally starting rke2-server on an agent can totally b0rk the cluster.
- name: Delete rke2-server systemd unit files for safety
  file:
    path: "{{ item }}"
    state: absent
  with_items:
    - /usr/local/lib/systemd/system/rke2-server.env
    - /usr/local/lib/systemd/system/rke2-server.service
  register: server_service

- name: Reload systemd
  systemd:
    daemon_reload: yes
  when: server_service.changed
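An alternative sketch, instead of deleting the files: mask the unit so systemd refuses any start attempt (this assumes the tarball install left rke2-server.service on disk):

# Masking symlinks the unit to /dev/null, so "systemctl start rke2-server" fails outright
- name: Mask rke2-server so it cannot be started by accident
  systemd:
    name: rke2-server.service
    masked: yes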

@stale

stale bot commented Mar 2, 2022

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

@stale stale bot added the status/stale label Mar 2, 2022
@stale stale bot closed this as completed Mar 16, 2022