fix(diff): always resyncMonitors the first time after acquiring watch leader lease

When the watch leader lease is acquired for the first time during a
normal startup, discoveryResyncCh gets its initial signal from the first
cluster discovery sync, so resyncMonitors runs exactly once, as expected.

However, when the watch leader lease is lost and later re-acquired,
discoveryResyncCh does not generate a new signal if the cluster is
unchanged, so the watcher stays stuck in its initial state with no
monitors and the workers never receive any events. When the same
instance later acquires the diff writer lease, the diff writer does not
write any diffs, because the diff workers never received any events.

This is fixed by running resyncMonitors at the start of every iteration
in resyncMonitorsLoop, before waiting for the next signal, so each
resyncMonitorsLoop call (one per watch leader term) resyncs at least once.
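
The ordering change can be illustrated with a small standalone sketch. It is not the controller code: runTerm and simulateTerm are hypothetical names, signals stands in for discoveryResyncCh, and resync stands in for ctrl.resyncMonitors. The sketch assumes a term ends when its context is cancelled.

```go
package main

import (
	"context"
	"fmt"
)

// runTerm models one watch leader term (hypothetical helper).
// resyncFirst=true follows the fixed ordering: resync, then wait.
// resyncFirst=false follows the old ordering: wait, then resync.
func runTerm(ctx context.Context, signals <-chan struct{}, resyncFirst bool, resync func()) {
	for {
		if resyncFirst {
			resync()
		}

		select {
		case <-ctx.Done():
			return
		case <-signals:
		}

		if !resyncFirst {
			resync()
		}
	}
}

// simulateTerm runs a single term and returns how many times resync ran.
// pendingSignal says whether a discovery signal is already queued when the
// term starts (true on first startup, false after re-acquiring the lease
// on an unchanged cluster).
func simulateTerm(resyncFirst, pendingSignal bool) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	signals := make(chan struct{}, 1)
	if pendingSignal {
		signals <- struct{}{}
	} else {
		// No signal will ever arrive, so end the term right away
		// to keep the demo from blocking forever.
		cancel()
	}

	count := 0
	runTerm(ctx, signals, resyncFirst, func() {
		count++
		cancel() // end the term after the first resync
	})
	return count
}

func main() {
	fmt.Println("old ordering, first term:  ", simulateTerm(false, true))  // 1
	fmt.Println("old ordering, re-acquired: ", simulateTerm(false, false)) // 0
	fmt.Println("new ordering, re-acquired: ", simulateTerm(true, false))  // 1
}
```

With the old ordering, a term that starts without a queued signal never resyncs, which matches the stuck watcher described above. With the new ordering, a term that starts with a signal already queued may resync once more when that signal is consumed, so the sketch does not print that case.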
SOF3 committed Feb 14, 2025
1 parent f00842a commit 38ac4a9
Showing 1 changed file with 11 additions and 11 deletions.
22 changes: 11 additions & 11 deletions pkg/diff/controller/controller.go
@@ -306,6 +306,17 @@ func (ctrl *controller) resyncMonitorsLoop(ctx context.Context) {
 	go writeElector.RunLeaderMetricLoop(ctx)
 
 	for {
+		err := retry.OnError(retry.DefaultBackoff, func(_ error) bool { return true }, func() error {
+			err := ctrl.resyncMonitors(ctx)
+			if err != nil {
+				logger.WithError(err).Warn("resync monitors")
+			}
+			return err
+		})
+		if err != nil {
+			logger.WithError(err).Error("resync monitors failed")
+		}
+
 		stopped := false
 
 		select {
@@ -317,17 +328,6 @@ func (ctrl *controller) resyncMonitorsLoop(ctx context.Context) {
 		if stopped {
 			break
 		}
-
-		err := retry.OnError(retry.DefaultBackoff, func(_ error) bool { return true }, func() error {
-			err := ctrl.resyncMonitors(ctx)
-			if err != nil {
-				logger.WithError(err).Warn("resync monitors")
-			}
-			return err
-		})
-		if err != nil {
-			logger.WithError(err).Error("resync monitors failed")
-		}
 	}
 
 	ctrl.monitorsLock.Lock()