Hi!
Thanks for the accelerate project, looks very similar to my own implementation, so... I have a few proposals ;)
What do you think about:
accelerator = ...
accelerator.all_gather(obj)   # already implemented
accelerator.sum_reduce(obj)   # not yet
accelerator.mean_reduce(obj)  # not yet
so you could use accelerator as a backend for metrics computation.
As for examples,
you always have to use mean_reduce for simple batch-based metrics such as accuracy
you always have to use all_gather for distributed computation of dataset-based metrics like AUC or Precision/Recall/F1 with micro/macro/weighted aggregation.
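To make the difference between the two cases concrete, here is a minimal pure-Python sketch. It only simulates two ranks with nested lists; the `sum_reduce`, `mean_reduce` and `all_gather` functions below are hypothetical stand-ins written for this illustration, not the accelerate API. It assumes every rank holds the same number of samples, which is what makes averaging per-rank accuracy valid.

```python
# Simulation only: each inner list stands in for the data held by one rank.
# sum_reduce, mean_reduce and all_gather are hypothetical stand-ins for
# illustration, not the accelerate API.

def sum_reduce(per_rank_values):
    """Sum one scalar per rank."""
    return sum(per_rank_values)

def mean_reduce(per_rank_values):
    """Average one scalar per rank."""
    return sum_reduce(per_rank_values) / len(per_rank_values)

def all_gather(per_rank_lists):
    """Concatenate the per-rank lists in rank order."""
    return [item for shard in per_rank_lists for item in shard]

def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def f1(preds, labels):
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Two simulated ranks with equally sized shards.
preds_by_rank = [[1, 1, 0, 0], [1, 0, 0, 0]]
labels_by_rank = [[1, 0, 0, 0], [1, 1, 1, 0]]

all_preds = all_gather(preds_by_rank)
all_labels = all_gather(labels_by_rank)

# Accuracy: averaging the per-rank values matches the global value.
mean_acc = mean_reduce(
    [accuracy(p, y) for p, y in zip(preds_by_rank, labels_by_rank)]
)
global_acc = accuracy(all_preds, all_labels)

# F1: averaging the per-rank values does NOT match the global value,
# so the predictions have to be gathered first.
mean_f1 = mean_reduce(
    [f1(p, y) for p, y in zip(preds_by_rank, labels_by_rank)]
)
global_f1 = f1(all_preds, all_labels)

print(mean_acc, global_acc)
print(mean_f1, global_f1)
```

Accuracy is a plain average over samples, so it survives a mean over equally sized shards. F1 is a ratio of counts, so it has to be computed on the gathered predictions (or on sum-reduced TP/FP/FN counts).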
Another proposal - add something similar to sequential_ddp_run or ddp_sync_run, so you could run some selected code on rank 0 first. The use case is simple - dataset preparation - no one wants to download MNIST on all processes simultaneously ;)
Thanks for the issue, always appreciate having some feedback and ideas of new features! Having more primitives like sum_reduce and mean_reduce definitely sounds like a good idea. I'm probably not going to have time to dig into that before going on holiday, but I can look into that when I'm back the second week of January. If you want to work on some however, feel free to open a PR :-)
I think your second proposal is already implemented as a context manager:
with accelerator.main_process_first():
or the local_main_process_first version if it's something that needs to happen on every node.
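For intuition, here is a rough sketch of the "main process first" ordering for the dataset-download use case. It is a thread-based simulation, not accelerate's implementation: the `RankZeroFirst` class, the `prepare_dataset` function and the in-memory `cache` are all hypothetical names made up for this example.

```python
import threading
from contextlib import contextmanager

class RankZeroFirst:
    """Hypothetical helper: a thread-based stand-in for a
    'main process first' context manager (not the accelerate class)."""

    def __init__(self):
        self._main_done = threading.Event()

    @contextmanager
    def main_process_first(self, rank):
        if rank != 0:
            # Every other rank blocks until rank 0 has left its block.
            self._main_done.wait()
        try:
            yield
        finally:
            if rank == 0:
                # Rank 0 is finished; release the waiting ranks.
                self._main_done.set()

cache = {}   # stands in for a dataset cache directory on disk
log = []     # records (rank, action) in the order the blocks ran
log_lock = threading.Lock()

def prepare_dataset(rank, sync):
    with sync.main_process_first(rank):
        if "mnist" not in cache:
            cache["mnist"] = "downloaded"   # only the first rank gets here
            action = "download"
        else:
            action = "reuse"                # later ranks hit the cache
        with log_lock:
            log.append((rank, action))

sync = RankZeroFirst()
threads = [
    threading.Thread(target=prepare_dataset, args=(rank, sync))
    for rank in range(4)
]
# Start rank 0 last to show that the ordering is enforced, not accidental.
for t in reversed(threads):
    t.start()
for t in threads:
    t.join()

print(log)
```

Rank 0 always performs the download, and the other ranks only enter the block once the cache is populated, so they reuse it instead of downloading again.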