Fix sharded federation sender sometimes using 100% CPU.
We pull all destinations requiring catchup from the DB in batches.
However, if all those destinations get filtered out (due to the
federation sender being sharded), then the `last_processed` destination
doesn't get updated, and we keep requesting the same set repeatedly.
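
For illustration, here is a minimal, self-contained sketch of the failure mode and of the fix; `fetch_destinations_after` and `should_handle` are hypothetical stand-ins for the real store query and shard check, not the actual Synapse API.

from typing import List, Optional

ALL_DESTINATIONS = ["a.example", "b.example", "c.example"]


def fetch_destinations_after(last_processed: Optional[str], limit: int) -> List[str]:
    # Stand-in for the DB query that returns destinations needing catch-up,
    # in order, starting after `last_processed`.
    remaining = [
        d for d in ALL_DESTINATIONS if last_processed is None or d > last_processed
    ]
    return remaining[:limit]


def should_handle(destination: str) -> bool:
    # Stand-in for the shard check. Returning False for everything mimics the
    # case where every destination in the batch belongs to another instance.
    return False


def wake_destinations_needing_catchup() -> None:
    last_processed: Optional[str] = None
    while True:
        batch = fetch_destinations_after(last_processed, limit=25)
        if not batch:
            break

        # The fix: advance `last_processed` using the *unfiltered* batch, so
        # the next DB query always moves past it.
        last_processed = batch[-1]

        batch = [d for d in batch if should_handle(d)]

        # Before the fix, `last_processed` was only set as the loop variable
        # below, so an empty filtered batch left it unchanged and the same
        # destinations were fetched from the DB over and over.
        for destination in batch:
            print("waking", destination)

With `last_processed` taken from the unfiltered batch, the loop terminates even when this instance handles none of the destinations; previously it would spin on the same query indefinitely.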
erikjohnston committed Apr 8, 2021
1 parent 48d44ab commit 3a569fb
Showing 2 changed files with 5 additions and 2 deletions.
changelog.d/9770.bugfix (1 change: 1 addition, 0 deletions)
@@ -0,0 +1 @@
+Fix bug where sharded federation senders could get stuck repeatedly querying the DB in a loop, using lots of CPU.
synapse/federation/sender/__init__.py (6 changes: 4 additions, 2 deletions)
@@ -734,16 +734,18 @@ async def _wake_destinations_needing_catchup(self) -> None:
                 self._catchup_after_startup_timer = None
                 break
 
+            last_processed = destinations_to_wake[-1]
+
             destinations_to_wake = [
                 d
                 for d in destinations_to_wake
                 if self._federation_shard_config.should_handle(self._instance_name, d)
             ]
 
-            for last_processed in destinations_to_wake:
+            for destination in destinations_to_wake:
                 logger.info(
                     "Destination %s has outstanding catch-up, waking up.",
                     last_processed,
                 )
-                self.wake_destination(last_processed)
+                self.wake_destination(destination)
                 await self.clock.sleep(CATCH_UP_STARTUP_INTERVAL_SEC)
