[wip,dnr,dnm] server,cli: bar decommissioned nodes from re-joining the cluster #54373

Closed

irfansharif
Contributor

*: persist a prevent startup file on decomm

Does not work when decommissioning non-live nodes, and the file is not
actually checked yet. Not sure whether we want to use a file as such or
a store-local key. We're also arbitrarily writing to the first store
(should we just write it to every store?).

Release note: None

*: consult gating file on start up

Also introduce a (broken) --force flag for `cockroach node decommission`.

Release note: None

@irfansharif irfansharif requested a review from a team as a code owner September 14, 2020 22:59
@cockroach-teamcity
Member

This change is Reviewable

@irfansharif irfansharif removed the request for review from a team September 14, 2020 23:00
@irfansharif irfansharif force-pushed the 200910.decomm-gate branch 2 times, most recently from 525a6eb to a8077fb Compare September 23, 2020 23:44
*: consult gating file on start up

Also introduce a --force flag for `cockroach node decommission`.

Release note: None
@irfansharif
Contributor Author

Abandoning this PR; @tbg is instead going to attempt the following approach:

We'll install a gossip listener near pkg/rpc that listens for changes to liveness records. When this listener learns that a node is fully decommissioned, it will persist that information to a store-local key or file. That information (also cached in memory) will be checked in our RPC layer when heartbeating currently open connections and when accepting new ones. This will effectively let us close out all connections to all fully decommissioned nodes. The file will also be read during startup to populate our cache and to maintain the running list of "nodes we shouldn't talk to anymore".

+cc @knz, this overlaps with areas I think you were planning on otherwise working on.

@irfansharif irfansharif deleted the 200910.decomm-gate branch September 24, 2020 16:22