
DDPLoaderWrapper update #1385

Merged
merged 3 commits into master from data/ddp-loader-2 on Dec 23, 2021
Conversation

Scitator
Member

Pull Request FAQ

Description

Related Issue

Type of Change

  • Examples / docs / tutorials / contributors update
  • Bug fix (non-breaking change which fixes an issue)
  • Improvement (non-breaking change which improves an existing feature)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Checklist

  • Have you updated tests for the new functionality?
  • Have you added your new classes/functions to the docs?
  • Have you updated the CHANGELOG?
  • Have you run colab minimal CI/CD with latest and minimal requirements?
  • Have you checked XLA integration with single and multiple processes?

@pep8speaks

pep8speaks commented Dec 23, 2021

Hello @Scitator! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-12-23 20:06:08 UTC

@Scitator changed the title from "ddp sampler" to "DDPLoaderWrapper update" on Dec 23, 2021
# https://github.com/huggingface/accelerate/blob/main/src/accelerate/data_loader.py
class BatchSamplerShard(BatchSampler):
    """
    Wraps a PyTorch :obj:`BatchSampler` to generate batches for one of the processes only. Instances of this class will

[pep8] reported by reviewdog 🐶
E501 line too long (119 > 99 characters)
W505 doc line too long (119 > 99 characters)

class BatchSamplerShard(BatchSampler):
    """
    Wraps a PyTorch :obj:`BatchSampler` to generate batches for one of the processes only. Instances of this class will
    always yield a number of batches that is a round multiple of :obj:`num_processes` and that all have the same size.

[pep8] reported by reviewdog 🐶
E501 line too long (118 > 99 characters)
W505 doc line too long (118 > 99 characters)

    """
    Wraps a PyTorch :obj:`BatchSampler` to generate batches for one of the processes only. Instances of this class will
    always yield a number of batches that is a round multiple of :obj:`num_processes` and that all have the same size.
    Depending on the value of the :obj:`drop_last` attribute of the batch sampler passed, it will either stop the

[pep8] reported by reviewdog 🐶
E501 line too long (113 > 99 characters)
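The quoted docstring states the contract: every process yields the same number of batches, a round multiple of :obj:`num_processes`, and all batches have the same size. Read together with the `__len__` quoted further down (`len(self.batch_sampler) // self.num_processes`), one way to picture this is a round-robin distribution of whole batches. A minimal standalone sketch of that idea (a hypothetical simplification, not the actual `accelerate` implementation):

```python
def shard_batches(batches, num_processes, process_index):
    """Round-robin whole batches across processes.

    Hypothetical simplification of the BatchSamplerShard idea quoted
    above: process p receives batches p, p + num_processes, ..., so
    every process yields the same number of batches.
    """
    for idx, batch in enumerate(batches):
        if idx % num_processes == process_index:
            yield batch

# Four batches over two processes: each process sees two batches.
batches = [[0, 1], [2, 3], [4, 5], [6, 7]]
print(list(shard_batches(batches, num_processes=2, process_index=0)))  # [[0, 1], [4, 5]]
print(list(shard_batches(batches, num_processes=2, process_index=1)))  # [[2, 3], [6, 7]]
```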

        length = len(self.batch_sampler) // self.num_processes
        return length if self.drop_last else length + 1

    def __iter__(self):

[pep8] reported by reviewdog 🐶
D105 Missing docstring in magic method
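The `__len__` quoted above divides the wrapped sampler's batch count evenly across processes and, when `drop_last` is false, adds one batch for the circled-back remainder. A standalone check of that arithmetic with made-up numbers:

```python
def sharded_length(num_batches, num_processes, drop_last):
    # Mirrors the quoted __len__: an even share of batches per process,
    # plus one extra (padded) batch when the remainder is not dropped.
    length = num_batches // num_processes
    return length if drop_last else length + 1

# 10 batches across 4 processes:
print(sharded_length(10, 4, drop_last=True))   # 2 -> 8 batches consumed, 2 dropped
print(sharded_length(10, 4, drop_last=False))  # 3 -> remainder covered by circling back
```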

            # We gather the initial indices in case we need to circle back at the end.
            if not self.drop_last and idx < self.num_processes:
                initial_data += batch
            # We identify the batch to yield but wait until we ar sure every process gets a full batch before actually
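[pep8] reported by reviewdog 🐶
E501 line too long (118 > 99 characters)
W505 doc line too long (118 > 99 characters)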

            while len(initial_data) < self.num_processes * self.batch_size:
                initial_data += initial_data

        # If the last batch seen was of the proper size, it has been yielded by its process so we move to the next

[pep8] reported by reviewdog 🐶
E501 line too long (118 > 99 characters)
W505 doc line too long (118 > 99 characters)
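The `while` loop quoted above handles the tail of iteration: the indices saved in `initial_data` are doubled until they can fill one more full batch for every process. A standalone sketch of that padding step (`pad_initial_data` is a hypothetical helper for illustration, not part of the reviewed code):

```python
def pad_initial_data(initial_data, num_processes, batch_size):
    """Double the saved indices until one more full batch per process
    can be built, mirroring the quoted while-loop."""
    if not initial_data:
        raise ValueError("no indices to circle back with")
    while len(initial_data) < num_processes * batch_size:
        initial_data += initial_data  # doubles the list in place
    return initial_data

# 3 leftover indices, 2 processes with batch_size 4 -> need at least 8:
print(pad_initial_data([0, 1, 2], num_processes=2, batch_size=4))
# [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]
```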

            if not self.drop_last and idx < self.num_processes:
                initial_data += batch
            # We identify the batch to yield
            # but wait until we are sure every process gets a full batch

[pep8] reported by reviewdog 🐶
W291 trailing whitespace

@Scitator Scitator merged commit 9376f33 into master Dec 23, 2021
@mergify mergify bot deleted the data/ddp-loader-2 branch December 23, 2021 20:33