gossip: remove frequent gossiping of gossip client connections #89613

Merged: 1 commit, Dec 21, 2022

Conversation

@a-robinson (Contributor) commented Oct 8, 2022

These gossip-clients keys make up two thirds or more of the gossip
traffic in various large clusters I've inspected, consuming almost an
entire CPU core in the worst case I've seen. They don't provide enough
value to justify that sort of ongoing cost, so this commit removes them
entirely as well as the periodic logging of the gossip network and the
crdb_internal.gossip_network table, both of which relied on them.

Release note (backward-incompatible change): We've stopped
supporting/populating the crdb_internal.gossip_network table. It was an
internal table with no API guarantees (so perhaps not meriting a release
note?).

Release note (performance improvement): Significantly reduced CPU usage
of the underlying gossip network in large clusters.


Informs #51838 (largely fixes it for practical purposes, although there's likely still more that could be done)

This is clearly going to break the gossip roachtest, but between @irfansharif kindly volunteering to fix that up separately and his existing TODO in that file I've left that out of this change.

I don't know if completely removing the gossip_network table is really the best idea or if it should just be left in and only populated with the clients from the local node. For example, when run in a mixed-version cluster, does debug zip run all of its SQL commands against the local node, or does it run some against remote nodes? If an old node ever tries to query the gossip_network table on a different node, it could have a bad time.

@irfansharif @kvoli
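
For a concrete sense of the consumer-side change (the real code is in pkg/sql/crdb_internal.go, quoted later in the review), here is a minimal before/after sketch; g, nodeID, alive, and now are illustrative names rather than the exact merged code:

// Before: a node counted as live if a gossip-clients key had been gossiped for it.
if _, err := g.GetInfo(gossip.MakeGossipClientsKey(nodeID)); err == nil {
	alive[nodeID] = true
}

// After: liveness is derived from the gossiped node-liveness record and its expiration.
var gossipLiveness livenesspb.Liveness
if err := g.GetInfoProto(gossip.MakeNodeLivenessKey(nodeID), &gossipLiveness); err == nil {
	if now.Before(gossipLiveness.Expiration.ToTimestamp().GoTime()) {
		alive[nodeID] = true
	}
}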

@a-robinson a-robinson requested review from a team as code owners October 8, 2022 18:51
@a-robinson a-robinson requested a review from a team October 8, 2022 18:51

blathers-crl bot commented Oct 8, 2022

Thank you for contributing to CockroachDB. Please ensure you have followed the guidelines for creating a PR.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

I have added a few people who may be able to assist in reviewing:

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added O-community Originated from the community X-blathers-triaged blathers was able to find an owner labels Oct 8, 2022
@cockroach-teamcity (Member) commented

This change is Reviewable

@a-robinson (Contributor, Author) commented

(Apologies in advance if some tests fail in CI -- my laptop is able to build cockroach but isn't really set up to run the full test suite, so I'm relying on CI to point out any issues.)


blathers-crl bot commented Oct 9, 2022

Thank you for updating your pull request.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@irfansharif irfansharif removed request for a team, knz and ajwerner October 10, 2022 13:41
@kvoli (Collaborator) left a comment

Thanks for the PR @a-robinson.

For example, when run in a mixed-version cluster, does debug zip run all of its SQL commands against the local node, or does it run some against remote nodes?

I'm not certain on this one. I think it's the local node, but I'm not familiar with this area. cc @irfansharif.

The failing test TestGossipPropagation seems to be due to no component actively removing expired infos. Previously the clientsTimer would tick in this test every 2 seconds, so it would eventually succeed.

I think this should be fine in a real cluster, since store gossip every ~10s means there's still something removing expired infos; it's just specific to this test. Triggering gossip with a key inside the retry loop should fix it:

mustAdd(remote, "remote-trigger", nil, 2 * time.Minute)
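
A hedged sketch of what that could look like in TestGossipPropagation, assuming a testutils.SucceedsSoon-style retry loop; mustAdd and remote come from the existing test, while local, the key names, and the errors usage are illustrative:

testutils.SucceedsSoon(t, func() error {
	// Re-gossip a short-lived trigger key on every retry so the receiving node
	// keeps processing incoming infos and, as a side effect, purges expired ones
	// (previously the 2s gossip-clients tick provided this nudge).
	mustAdd(remote, "remote-trigger", nil, 2*time.Minute)
	if _, err := local.GetInfo("short-lived-key"); err == nil {
		return errors.New("expired info still present")
	}
	return nil
})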

	if _, err := g.GetInfo(gossip.MakeGossipClientsKey(d.NodeID)); err == nil {
		alive[d.NodeID] = true
	var gossipLiveness livenesspb.Liveness
	if err := g.GetInfoProto(gossip.MakeNodeLivenessKey(d.NodeID), &gossipLiveness); err == nil {
Collaborator commented

Was this change needed to pass some tests?

@a-robinson (Contributor, Author) replied

I haven't actually tried running any tests without this. It just seemed like it would be pretty bad if crdb_internal.gossip_nodes suddenly switched to always saying that all nodes are not live (due to the gossip clients keys being missing). Especially since IIRC it's the table that's used by CLI commands like cockroach node status.

And basing it on node liveness seems like a reasonable alternative, although I'm open to other ideas.

@a-robinson (Contributor, Author) commented

I think this should be fine in a real cluster, since store gossip every ~10s means there's still something removing expired infos; it's just specific to this test. Triggering gossip with a key inside the retry loop should fix it.

Thanks, I suspect so too. I just haven't had a good chance to fix it up yet, but will do soon.

@rafiss (Collaborator) left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @a-robinson, @irfansharif, and @kvoli)


pkg/sql/crdb_internal.go line 4234 at r1 (raw file):

// crdbInternalGossipNetwork exposes the local view of the gossip network (i.e
// the gossip client connections from source_id node to target_id node).
var crdbInternalGossipNetworkTable = virtualSchemaTable{

drive-by comment: I'd be slightly more inclined to make this an empty table (or go with the idea of only showing the local node's clients) rather than removing the table outright.

We could also have it call p.BufferClientNotice to warn that this table is no longer reliable.
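
A minimal sketch of that suggestion, assuming the usual virtualSchemaTable populate signature and the pgnotice helpers; the column types, comment text, and notice wording are illustrative, not necessarily what was merged:

var crdbInternalGossipNetworkTable = virtualSchemaTable{
	comment: `locally known edges in the gossip network (no longer populated)`,
	schema: `
CREATE TABLE crdb_internal.gossip_network (
  source_id INT NOT NULL,  -- node initiating the connection
  target_id INT NOT NULL   -- node receiving the connection
)`,
	populate: func(ctx context.Context, p *planner, _ catalog.DatabaseDescriptor, _ func(...tree.Datum) error) error {
		// Surface a notice so clients learn the table is no longer reliable.
		p.BufferClientNotice(ctx, pgnotice.Newf(
			"crdb_internal.gossip_network is no longer populated and may be removed in a future release"))
		return nil
	},
}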

@irfansharif (Contributor) commented

I'll just take over this PR + fallout, through next week.

@irfansharif irfansharif self-assigned this Oct 14, 2022
@a-robinson (Contributor, Author) commented

I'll just take over this PR + fallout, through next week.

Thanks!

@a-robinson (Contributor, Author) commented

Congrats on getting 22.2 out! I'd hope this is something that will make it into v23.1?

Let me know if it'd help for me to clean something up here, although it's unlikely that I'm the best person to make the call on what to do with the crdb_internal.gossip_network table (or any other compatibility concerns) at this point.

@irfansharif irfansharif requested a review from a team as a code owner December 19, 2022 16:49
@irfansharif irfansharif requested review from herkolategan and smg260 and removed request for a team December 19, 2022 16:49
@irfansharif (Contributor) left a comment

I need a reviewer here. @kvoli and @tbg, up for it? Alex did most of the work, so you know it's good. Do review it with backportability to 22.2 (at the very least) in mind. I was thinking we'd let this bake for two-ish weeks pre-backport. If it doesn't feel eligible for a backport now that 22.2.0 has been published publicly, I'll be a bit regretful of dropping it earlier, but I'd understand.


I don't know if completely removing the gossip_network table is really the best idea or if it should just be left in and only populated with the clients from the local node.

I've re-added the table but left it empty. It just makes it slightly more palatable for a backport.

we could also have it do p.BufferClientNotice to warn that this table is no longer reliable

Done.

For example, when run in a mixed-version cluster, does debug zip run all of its SQL commands against the local node, or does it run some against remote nodes? If an old node ever tries to query the gossip_network table on a different node, it could have a bad time.

It runs them against the local node. But we do want to support an older point-release CRDB binary querying newer point-release CRDB servers, so keeping the table around seems OK. For mixed-version clusters where the cluster version hasn't yet been finalized, we generally want to be able to use the older binary; there too, keeping the table around seems helpful. In a future release we can get rid of it entirely.

The failing test TestGossipPropagation seems to be due to no component actively removing expired infos.

Fixed.

This is clearly going to break the gossip roachtest, but between @irfansharif kindly volunteering to fix that up separately and his existing TODO in that file I've left that out of this change.

Fixed.

It just seemed like it would be pretty bad if crdb_internal.gossip_nodes suddenly switched to always saying that all nodes are not live (due to the gossip clients keys being missing). Especially since IIRC it's the table that's used by CLI commands like cockroach node status. And basing it on node liveness seems like a reasonable alternative, although I'm open to other ideas.

Keeping gossip_nodes backed by the gossiped view of node liveness seems better than using the KV node liveness. It's generally good that these tables are highly available and don't need leaseholder participation. We could get that with liveness too by looking at the in-memory state on the node, but I'll keep it as is.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @a-robinson and @kvoli)

@irfansharif irfansharif requested a review from tbg December 19, 2022 22:13
@irfansharif irfansharif changed the title gossip: Remove frequent gossiping of gossip client connections gossip: remove frequent gossiping of gossip client connections Dec 19, 2022
@tbg (Member) left a comment

:lgtm: thanks for pushing this over the finish line!

Reviewed 5 of 22 files at r1, 8 of 17 files at r2, 1 of 2 files at r3, 3 of 11 files at r4, all commit messages.
Dismissed @kvoli from a discussion.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @a-robinson and @irfansharif)


-- commits line 9 at r5:
Still reading the PR, but is there a change in the ./cockroach node status data sources (see the cli roachtest change)? If so, it's worth pointing out.


pkg/cmd/roachtest/tests/cli.go line 112 at r5 (raw file):

	waitUntil([]string{
		"is_available is_live",
		"false false",

I am confused by this change, what's going on here?


pkg/sql/crdb_internal.go line 4108 at r5 (raw file):

			if err := g.GetInfoProto(gossip.MakeNodeLivenessKey(d.NodeID), &gossipLiveness); err == nil {
				if now.Before(gossipLiveness.Expiration.ToTimestamp().GoTime()) {
					alive[d.NodeID] = true

This works, but I wonder if this is the canonical way to do this sort of thing in unified architecture. I would've thought the tenant would have a liveness view of KV around somewhere (possibly powered by Gossip under the hood). Not to block the merge - after all this works - but maybe @knz can take a look and see if this is palatable long-term.

@kvoli (Collaborator) left a comment

Do review it with backportability to 22.2 (at the very least) in mind. I was thinking we'd let this bake for two-ish weeks pre-backport. If it doesn't feel eligible for a backport now that 22.2.0 has been published publicly, I'll be a bit regretful of dropping it earlier, but I'd understand.

I don't see an issue backporting.

:lgtm:

Reviewed 2 of 22 files at r1, 1 of 17 files at r2, 8 of 11 files at r4, all commit messages.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @a-robinson and @irfansharif)

@irfansharif (Contributor) left a comment

TFTR!

bors r+

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @a-robinson and @tbg)


-- commits line 9 at r5:

Previously, tbg (Tobias Grieger) wrote…

Still reading the PR, but is there a change in the ./cockroach node status data sources (see the cli roachtest change)? If so, it's worth pointing out.

There's no change.


pkg/cmd/roachtest/tests/cli.go line 112 at r5 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I am confused by this change, what's going on here?

It's because of the changes under pkg/sql/crdb_internal.go. Before, the is_live column was computed off the existence of a gossip-clients key, which for n1 in this test would've existed (n1 isn't taken down). But now that we no longer have these keys, is_live is backed by gossiped node liveness, and n1 can't heartbeat its liveness because we've killed 2 of the 3 nodes.


pkg/sql/crdb_internal.go line 4108 at r5 (raw file):

Previously, tbg (Tobias Grieger) wrote…

This works, but I wonder if this is the canonical way to do this sort of thing in unified architecture. I would've thought the tenant would have a liveness view of KV around somewhere (possibly powered by Gossip under the hood). Not to block the merge - after all this works - but maybe @knz can take a look and see if this is palatable long-term.

I'll defer to Rafa to sort out what a tenant-with-capabilities view of this would look like. I assume it's just going to be a table populated by data returned through the Connector interface, which in turn would do a very similar thing to what we're doing here.

@a-robinson (Contributor, Author) left a comment

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @irfansharif and @tbg)


pkg/cmd/roachtest/tests/cli.go line 112 at r5 (raw file):

Previously, irfansharif (irfan sharif) wrote…

It's because of the changes under pkg/sql/crdb_internal.go. Before, the is_live column was computed off the existence of a gossip-clients key, which for n1 in this test would've existed (n1 isn't taken down). But now that we no longer have these keys, is_live is backed by gossiped node liveness, and n1 can't heartbeat its liveness because we've killed 2 of the 3 nodes.

FWIW, it may be possible to use gossiped store descriptors instead of gossiped liveness, since my understanding is that those get refreshed every couple of minutes, but I don't know whether that would really be better overall. Using liveness seemed better to me at the time, but I didn't consider what the output should look like during cluster instability.
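
A rough sketch of that alternative, assuming a gossip store-key helper along the lines of gossip.MakeStoreKey and a hypothetical storeIDsForNode mapping (neither is taken from this PR); a node would count as live if any of its store descriptors is still present, i.e. not yet expired, in gossip:

for _, storeID := range storeIDsForNode[d.NodeID] { // storeIDsForNode is hypothetical
	var desc roachpb.StoreDescriptor
	if err := g.GetInfoProto(gossip.MakeStoreKey(storeID), &desc); err == nil {
		alive[d.NodeID] = true
		break
	}
}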

craig bot (Contributor) commented Dec 21, 2022

Build succeeded:


blathers-crl bot commented Dec 21, 2022

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


Backport to branch 22.2.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@irfansharif (Contributor) commented

blathers backport 22.2


blathers-crl bot commented Dec 21, 2022

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


Backport to branch 22.2 failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@erikgrinaker (Contributor) commented

I think the 22.2 backport here broke backwards compatibility with 22.1. In a mixed 22.2/22.1 cluster, running cockroach node status on a 22.1 node will report is_live = false for 22.2 nodes. I'll have a closer look at exactly why.

@erikgrinaker (Contributor) commented May 23, 2023

Ah, right, 22.1 uses GossipClientsKey to determine liveness, but 22.2 no longer gossips that key and relies on NodeLivenessKey instead. This should've been behind a version gate.

I don't know if there's much we can do about this now. 22.1 is out of support with no further releases planned. We can consider backporting a retroactive 22.2 version gate, but it's not clear that it's worth the risk and effort.

@erikgrinaker (Contributor) commented

I guess we'll have the same problem with mixed 22.2 patch releases before/after this backport, and with mixed 22.2/23.1 clusters before this backport. The right approach here would've been to add a 23.1 version gate and not backport to 22.2.

I think the best option here would be to backport a 23.1 version gate, and to semi-revert the 22.2 backport such that it still gossips the GossipClientsKey but doesn't actually use it for anything. Alternatively, live with a broken is_live in cockroach node status with certain combinations of 22.2 mixed-version clusters.

@erikgrinaker (Contributor) commented

For reference, this was initially released in v22.2.3.

@erikgrinaker (Contributor) commented May 23, 2023

Opened #103788 with a partial 22.2 revert. I'm not entirely convinced we want to do that, considering the cost of gossiping these, but breaking backwards compatibility isn't great either.

@irfansharif (Contributor) commented

This should've been behind a version gate.

This was intentional BTW, since we did want to backport it to 22.2 and backports can't carry version gates. The is_live breakage in mixed-version settings is slightly unfortunate, but IMO it matters less than the CPU usage the backport saves.

I'm not entirely convinced we want to do that, considering the cost of gossiping these

I'm with you. I feel like a release note pointing to newer patch releases is sufficient: #103788 (review).

@erikgrinaker (Contributor) commented May 23, 2023

We can't add version gates, but we can piggyback on the existing V22_2 version gate.

Of course, that still breaks patch release compat, but at least we stop the bleeding in mixed-major clusters.
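
A hedged sketch of that gating, assuming access to the cluster settings from the gossip code; the TTL, payload, and surrounding wiring are illustrative:

// Keep publishing the gossip-clients key until the V22_2 cluster version is
// active, i.e. until no pre-22.2 nodes (which derive is_live from this key)
// can remain in the cluster.
if !st.Version.IsActive(ctx, clusterversion.V22_2) {
	if err := g.AddInfo(gossip.MakeGossipClientsKey(nodeID), clientsBytes, 2*time.Minute); err != nil {
		log.Errorf(ctx, "error gossiping clients key: %v", err)
	}
}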
