Feature Request: sum/mean reduce, sequential_ddp_run #215

Closed
Scitator opened this issue Dec 23, 2021 · 1 comment · Fixed by #326

@Scitator
Contributor

Hi!
Thanks for the accelerate project! It looks very similar to my own implementation, so... I have a few proposals ;)


What do you think about:

accelerator = ...
accelerator.all_gather(obj)  # already implemented
accelerator.sum_reduce(obj)  # not yet
accelerator.mean_reduce(obj)  # not yet 

so you could use accelerator as a backend for metrics computation.

As for examples,

  • you always have to use mean_reduce for such simple batch-based metrics as accuracy
  • you always have to use all_gather for distributed computation of dataset-based metrics like AUC or Precision/Recall/F1 micro/macro/weighted aggregations.
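The distinction between the two bullets above can be sketched without a real process group. The snippet below is only a simulation under stated assumptions: the shards are made-up data, and `simulated_mean_reduce` and `simulated_all_gather` are hypothetical stand-ins written for this illustration, not accelerate functions. It shows that averaging per-process accuracy matches the global accuracy when shards have equal size, while averaging per-process F1 does not match the F1 computed on the gathered predictions.

```python
# Simulated per-rank shards: (predictions, labels) held by each process.
# No real distributed backend is used; the helpers below are hypothetical stand-ins.
shards = [
    ([1, 0, 1, 1], [1, 0, 0, 1]),  # rank 0
    ([0, 0, 0, 1], [0, 1, 1, 1]),  # rank 1
]


def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


def f1(preds, labels):
    """Binary F1 for the positive class, computed from raw counts."""
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    return 2 * tp / (2 * tp + fp + fn)


def simulated_mean_reduce(per_rank_values):
    """Stand-in for a mean reduction: average one scalar per rank."""
    return sum(per_rank_values) / len(per_rank_values)


def simulated_all_gather(per_rank_lists):
    """Stand-in for all_gather: concatenate the per-rank lists in rank order."""
    gathered = []
    for chunk in per_rank_lists:
        gathered.extend(chunk)
    return gathered


# Batch-based metric: reducing the per-rank values is enough (equal shard sizes).
mean_reduced_accuracy = simulated_mean_reduce([accuracy(p, y) for p, y in shards])

# Dataset-based metric: gather everything first, then compute once.
all_preds = simulated_all_gather([p for p, _ in shards])
all_labels = simulated_all_gather([y for _, y in shards])
global_accuracy = accuracy(all_preds, all_labels)
global_f1 = f1(all_preds, all_labels)

# Averaging per-rank F1 gives a different number than the gathered F1.
mean_reduced_f1 = simulated_mean_reduce([f1(p, y) for p, y in shards])

print(mean_reduced_accuracy, global_accuracy)
print(mean_reduced_f1, global_f1)
```

Note that the accuracy result only matches because both shards hold the same number of samples; with uneven shards, summing the correct and total counts across processes and dividing once would be the safer choice.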

Another proposal: add something similar to sequential_ddp_run or ddp_sync_run, so you could run selected code on rank 0 first. The use case is simple: dataset preparation. No one wants to download MNIST on all processes simultaneously ;)

@sgugger
Collaborator

sgugger commented Dec 23, 2021

Thanks for the issue, always appreciate having some feedback and ideas for new features! Having more primitives like sum_reduce and mean_reduce definitely sounds like a good idea. I'm probably not going to have time to dig into that before going on holiday, but I can look into it when I'm back the second week of January. If you want to work on some of them in the meantime, however, feel free to open a PR :-)

I think your second proposition is already implemented as a context manager:

with accelerator.main_process_first():
    ...  # this block runs on the main process before the other processes

or the local_main_process_first version if it's something that needs to happen on every node.
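To make the "main process first" idea concrete, here is a minimal sketch that simulates ranks with threads and a barrier from the standard library. It is an illustration of the pattern only, not accelerate's code: `simulated_main_process_first`, `WORLD_SIZE`, and `worker` are hypothetical names invented for this example. Non-main ranks wait at the barrier before entering the block, and rank 0 reaches the same barrier only after its block has finished.

```python
import threading
from contextlib import contextmanager

WORLD_SIZE = 3  # hypothetical number of simulated processes
barrier = threading.Barrier(WORLD_SIZE)
order = []  # records the order in which ranks execute the guarded block
order_lock = threading.Lock()


@contextmanager
def simulated_main_process_first(rank):
    """Let rank 0 run the guarded block before any other rank (simulation only)."""
    if rank != 0:
        barrier.wait()  # non-main ranks block here until rank 0 has finished
    yield
    if rank == 0:
        barrier.wait()  # rank 0 arrives last and releases the waiting ranks


def worker(rank):
    with simulated_main_process_first(rank):
        # Stand-in for work such as downloading a dataset once.
        with order_lock:
            order.append(rank)


threads = [threading.Thread(target=worker, args=(r,)) for r in range(WORLD_SIZE)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(order[0])  # rank 0 is always the first entry
```

With a real `Accelerator`, the equivalent would be placing the dataset download inside the `accelerator.main_process_first()` block quoted above, so the other processes only start once the main process is done and can reuse its cached files.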

@muellerzr muellerzr self-assigned this Apr 26, 2022
@muellerzr muellerzr added the enhancement New feature or request label Apr 26, 2022
@muellerzr muellerzr linked a pull request Apr 26, 2022 that will close this issue