Skip gradient averaging if there are no other peers (#440)
The optimizer now skips gradient averaging if there are no peers to average gradients with. Previously, it would invoke grad_averager.step and wait for up to averaging_timeout seconds even when training alone.

Co-authored-by: Qidong Su <[email protected]>
Co-authored-by: Alexander Borzunov <[email protected]>
3 people authored on Jan 4, 2022
1 parent cfc5200 commit c868989
Showing 1 changed file with 6 additions and 2 deletions.
hivemind/optim/optimizer.py:

```diff
@@ -524,8 +524,12 @@ def _begin_averaging_gradients(self, grad_scaler: Optional[GradScaler]) -> bool:
                 logger.exception(e)

         if not began_averaging_gradients and self.scheduled_grads is not None and not self.scheduled_grads.done():
-            logger.log(self.status_loglevel, f"Tagging along for a pre-scheduled gradient averaging round")
-            self._tag_along_with_zero_weight(self.scheduled_grads)
+            if self.tracker.global_progress.num_peers > 1:
+                logger.log(self.status_loglevel, f"Tagging along for a pre-scheduled gradient averaging round")
+                self._tag_along_with_zero_weight(self.scheduled_grads)
+            else:
+                logger.log(self.status_loglevel, f"Skipping pre-scheduled averaging round: there are no other peers")
+                self.scheduled_grads.cancel()
             self.scheduled_grads = None
         return began_averaging_gradients
```


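For context, below is a hedged sketch of the situation this fix targets: a peer that reaches a gradient averaging round while no other peers have joined the run. It follows the public hivemind.Optimizer API around the time of this commit; the run_id, toy model, and hyperparameter values are illustrative assumptions, not a reference setup.

```python
# Sketch: a lone peer training with hivemind.Optimizer (assumptions: hivemind ~1.0;
# run_id, model, and hyperparameters are made up for illustration).
import torch
import torch.nn.functional as F
import hivemind

dht = hivemind.DHT(start=True)  # fresh DHT: no other peers have joined this run
model = torch.nn.Linear(16, 2)

opt = hivemind.Optimizer(
    dht=dht,
    run_id="demo_run",            # identifier of the collaborative run
    batch_size_per_step=32,       # samples contributed per opt.step() call
    target_batch_size=256,        # collective samples per averaging round
    optimizer=torch.optim.SGD(model.parameters(), lr=0.1),
    averaging_timeout=60.0,       # before this fix, a lone peer could block
                                  # for up to this long inside grad_averager.step
    verbose=True,
)

for _ in range(8):
    opt.zero_grad()
    x = torch.randn(32, 16)
    loss = F.cross_entropy(model(x), torch.randint(0, 2, (32,)))
    loss.backward()
    opt.step()  # with num_peers == 1, the pre-scheduled averaging round is now
                # cancelled and skipped instead of waited on until the timeout
```

After this commit, such a lone peer logs "Skipping pre-scheduled averaging round: there are no other peers", cancels the scheduled round, and returns immediately rather than blocking for the averaging timeout.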