Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug Report: cannot enable always-on heartbeats if on demand heartbeats are configured #17828

Open
deepthi opened this issue Feb 19, 2025 · 2 comments · May be fixed by #17838
Open

Bug Report: cannot enable always-on heartbeats if on demand heartbeats are configured #17828

deepthi opened this issue Feb 19, 2025 · 2 comments · May be fixed by #17838

Comments

@deepthi
Copy link
Member

deepthi commented Feb 19, 2025

Overview of the Issue

I tried to enable heartbeats on a shard where all tablets had --heartbeat_on_demand_duration=10s. Once all tablets had been rolled with the new flag --heartbeat_enable=true it was observed that no heartbeats were actually being written into the _vt.heartbeat table.

Workaround: set --heartbeat_on_demand_duration=0s.

Reproduction Steps

Run the operator example, with vttablet extraFlags --heartbeat_enable=true and --heartbeat_on_demand_duration=10s

Binary Version

v19

Operating System and Environment details

any

Log Fragments

@deepthi deepthi added Needs Triage This issue needs to be correctly labelled and triaged Type: Bug labels Feb 19, 2025
@deepthi
Copy link
Member Author

deepthi commented Feb 20, 2025

We should handle the config.ReplicationTracker.Mode == tabletenv.Heartbeat case before config.ReplicationTracker.HeartbeatOnDemand > 0. See https://github.com/vitessio/vitess/blob/main/go/vt/vttablet/tabletserver/repltracker/writer.go#L101-L110

@mattlord mattlord added Component: Cluster management and removed Needs Triage This issue needs to be correctly labelled and triaged labels Feb 20, 2025
@deepthi deepthi linked a pull request Feb 21, 2025 that will close this issue
5 tasks
@shlomi-noach
Copy link
Contributor

See also:

Right now the flags are working as documented and as designed. If we now switch the "override" priority such that --heartbeat_enable overrides --heartbeat_on_demand_duration, this will break the expected behavior, and setups that expect heartbeat to be on-demand only, will suddenly see a surge of heartbeats.

It's a bit of a mess and if we want to break this behavior we might need a two-version rollout? (1st rollout is to announce, warn; 2nd rollout is to possibly make the flags mutually exclusive, force the new behavior?).

From #17838 (comment):

The default for heartbeats is 1s, but you might want to make them more frequent when they are on-demand without affecting the normal frequency. It was probably not a good idea to use the same frequency for both kinds of heartbeats.

That doesn't play well with the way the heartbeat writer is designed. Not impossible. If we want to do this, we may wish to rewrite the heartbeat writer mechanism, something I've avoided thus far.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants