
Commit

Merge branch 'master' into bugfix/batch-device
awaelchli committed Jul 1, 2021
2 parents 5ddeaec + d51b0ae commit 88ca10d
Showing 96 changed files with 1,813 additions and 407 deletions.
2 changes: 1 addition & 1 deletion .circleci/config.yml
@@ -91,7 +91,7 @@ jobs:
docker:
- image: circleci/python:3.7
environment:
- XLA_VER: 1.7
- XLA_VER: 1.8
- MAX_CHECKS: 240
- CHECK_SPEEP: 5
steps:
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -36,6 +36,7 @@

# Specifics
/pytorch_lightning/trainer/connectors/logger_connector @tchaton @carmocca
/pytorch_lightning/trainer/progress.py @tchaton @awaelchli @carmocca

# Metrics
/pytorch_lightning/metrics/ @SkafteNicki @ananyahjha93 @justusschock
16 changes: 9 additions & 7 deletions .github/CONTRIBUTING.md
@@ -2,6 +2,8 @@

Welcome to the PyTorch Lightning community! We're building the most advanced research platform on the planet to implement the latest, best practices that the amazing PyTorch team rolls out!

If you are new to open source, check out [this blog to get started with your first Open Source contribution](https://devblog.pytorchlightning.ai/quick-contribution-guide-86d977171b3a).

## Main Core Value: One less thing to remember

Simplify the API as much as possible from the user perspective.
@@ -58,13 +60,13 @@ Have a favorite feature from other libraries like fast.ai or transformers? Those

## Contribution Types

We are always looking for help implementing new features or fixing bugs.
We are always open to contributions of new features or bug fixes.

A lot of good work has already been done in project mechanics (requirements.txt, setup.py, pep8, badges, ci, etc...) so we're in a good state there thanks to all the early contributors (even pre-beta release)!

### Bug Fixes:

1. If you find a bug please submit a github issue.
1. If you find a bug please submit a GitHub issue.

- Make sure the title explains the issue.
- Describe your setup, what you are trying to do, expected vs. actual behaviour. Please add configs and code samples.
@@ -79,12 +81,12 @@ A lot of good work has already been done in project mechanics (requirements.txt,

3. Submit a PR!

_**Note**, even if you do not find the solution, sending a PR with a test covering the issue is a valid contribution and we can help you or finish it with you :]_
_**Note**, even if you do not find the solution, sending a PR with a test covering the issue is a valid contribution, and we can help you or finish it with you :]_

### New Features:

1. Submit a github issue - describe what is the motivation of such feature (adding the use case or an example is helpful).
2. Let's discuss to determine the feature scope.
1. Submit a GitHub issue - describe what is the motivation of such feature (adding the use case, or an example is helpful).
2. Determine the feature scope with us.
3. Submit a PR! We recommend test driven approach to adding new features as well:

- Write a test for the functionality you want to add.
@@ -199,7 +201,7 @@ Note: if your computer does not have multi-GPU nor TPU these tests are skipped.
**GitHub Actions:** For convenience, you can also use your own GHActions building which will be triggered with each commit.
This is useful if you do not test against all required dependency versions.

**Docker:** Another option is utilize the [pytorch lightning cuda base docker image](https://hub.docker.com/repository/docker/pytorchlightning/pytorch_lightning/tags?page=1&name=cuda). You can then run:
**Docker:** Another option is to utilize the [pytorch lightning cuda base docker image](https://hub.docker.com/repository/docker/pytorchlightning/pytorch_lightning/tags?page=1&name=cuda). You can then run:

```bash
python -m pytest pytorch_lightning tests pl_examples -v
@@ -230,7 +232,7 @@ We welcome any useful contribution! For your convenience here's a recommended wo
- Make sure all tests are passing.
- Make sure you add a GitHub issue to your PR.
5. Use tags in PR name for following cases:
- **[blocked by #<number>]** if you work is depending on others changes.
- **[blocked by #<number>]** if your work is dependent on other PRs.
- **[wip]** when you start to re-edit your work, mark it so no one will accidentally merge it in meantime.

### Question & Answer
9 changes: 5 additions & 4 deletions .github/ISSUE_TEMPLATE/bug_report.md
@@ -41,13 +41,14 @@ wget https://raw.githubusercontent.com/PyTorchLightning/pytorch-lightning/master
python collect_env_details.py
```

- PyTorch Version (e.g., 1.0):
- OS (e.g., Linux):
- How you installed PyTorch (`conda`, `pip`, source):
- Build command you used (if compiling from source):
- PyTorch Lightning Version (e.g., 1.3.0):
- PyTorch Version (e.g., 1.8)
- Python version:
- OS (e.g., Linux):
- CUDA/cuDNN version:
- GPU models and configuration:
- How you installed PyTorch (`conda`, `pip`, source):
- If compiling from source, the output of `torch.__config__.show()`:
- Any other relevant information:

### Additional context
52 changes: 46 additions & 6 deletions CHANGELOG.md
@@ -30,9 +30,9 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- Added support for checkpointing based on a provided time interval during training ([#7515](https://github.com/PyTorchLightning/pytorch-lightning/pull/7515))


- Added dataclasses for progress tracking (
[#6603](https://github.com/PyTorchLightning/pytorch-lightning/pull/6603),
[#7574](https://github.com/PyTorchLightning/pytorch-lightning/pull/7574))
- Progress tracking
* Added dataclasses for progress tracking ([#6603](https://github.com/PyTorchLightning/pytorch-lightning/pull/6603), [#7574](https://github.com/PyTorchLightning/pytorch-lightning/pull/7574), [#8140](https://github.com/PyTorchLightning/pytorch-lightning/pull/8140))
* Add `{,load_}state_dict` to the progress tracking dataclasses ([#8140](https://github.com/PyTorchLightning/pytorch-lightning/pull/8140))


- Added support for passing a `LightningDataModule` positionally as the second argument to `trainer.{validate,test,predict}` ([#7431](https://github.com/PyTorchLightning/pytorch-lightning/pull/7431))
@@ -84,11 +84,14 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).


- Fault-tolerant training
* Add `{,load_}state_dict` to `ResultCollection` ([#7948](https://github.com/PyTorchLightning/pytorch-lightning/pull/7948))
* Checkpoint the loop results ([#7966](https://github.com/PyTorchLightning/pytorch-lightning/pull/7966))
* Added `{,load_}state_dict` to `ResultCollection` ([#7948](https://github.com/PyTorchLightning/pytorch-lightning/pull/7948))
* Added `{,load_}state_dict` to `Loops` ([#8197](https://github.com/PyTorchLightning/pytorch-lightning/pull/8197))


- Add `rank_zero_only` to `LightningModule.log` function ([#7966](https://github.com/PyTorchLightning/pytorch-lightning/pull/7966))
- Added `rank_zero_only` to `LightningModule.log` function ([#7966](https://github.com/PyTorchLightning/pytorch-lightning/pull/7966))


- Added `metric_attribute` to `LightningModule.log` function ([#7966](https://github.com/PyTorchLightning/pytorch-lightning/pull/7966))


- Added a warning if `Trainer(log_every_n_steps)` is a value too high for the training dataloader ([#7734](https://github.com/PyTorchLightning/pytorch-lightning/pull/7734))
@@ -115,9 +118,18 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- Add support for calling scripts using the module syntax (`python -m package.script`) ([#8073](https://github.com/PyTorchLightning/pytorch-lightning/pull/8073))


- Add support for optimizers and learning rate schedulers to `LightningCLI` ([#8093](https://github.com/PyTorchLightning/pytorch-lightning/pull/8093))


- Add torchelastic check when sanitizing GPUs ([#8095](https://github.com/PyTorchLightning/pytorch-lightning/pull/8095))


- Added XLA Profiler ([#8014](https://github.com/PyTorchLightning/pytorch-lightning/pull/8014))


- Added `max_depth` parameter in `ModelSummary` ([#8062](https://github.com/PyTorchLightning/pytorch-lightning/pull/8062))


### Changed


@@ -220,6 +232,9 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- `Trainer(resume_from_checkpoint=...)` now restores the model directly after `LightningModule.setup()`, which is before `LightningModule.configure_sharded_model()` ([#7652](https://github.com/PyTorchLightning/pytorch-lightning/pull/7652))


- Added a mechanism to detect `deadlock` for `DDP` when only 1 process triggers an `Exception`. The mechanism will `kill the processes` when it happens ([#8167](https://github.com/PyTorchLightning/pytorch-lightning/pull/8167))


### Deprecated


@@ -253,9 +268,15 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- Deprecated the use of `CheckpointConnector.hpc_load()` in favor of `CheckpointConnector.restore()` ([#7652](https://github.com/PyTorchLightning/pytorch-lightning/pull/7652))


- Deprecated `DDPPlugin.task_idx` in favor of `DDPPlugin.local_rank` ([#8203](https://github.com/PyTorchLightning/pytorch-lightning/pull/8203))


- Deprecated the `Trainer.train_loop` property in favor of `Trainer.fit_loop` ([#8025](https://github.com/PyTorchLightning/pytorch-lightning/pull/8025))


- Deprecated `mode` parameter in `ModelSummary` in favor of `max_depth` ([#8062](https://github.com/PyTorchLightning/pytorch-lightning/pull/8062))


### Removed

- Removed `ProfilerConnector` ([#7654](https://github.com/PyTorchLightning/pytorch-lightning/pull/7654))
@@ -285,6 +306,8 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
### Fixed


- Fixed SWA to also work with `IterableDataset` ([#8172](https://github.com/PyTorchLightning/pytorch-lightning/pull/8172))

- Fixed `lr_scheduler` checkpointed state by calling `update_lr_schedulers` before saving checkpoints ([#7877](https://github.com/PyTorchLightning/pytorch-lightning/pull/7877))


@@ -315,6 +338,23 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
- Fixed a DDP info message that was never shown ([#8111](https://github.com/PyTorchLightning/pytorch-lightning/pull/8111))


- Fixed metrics generated during `validation sanity checking` are cleaned on end ([#8171](https://github.com/PyTorchLightning/pytorch-lightning/pull/8171))


- Fixed a bug where an infinite recursion would be triggered when using the `BaseFinetuning` callback on a model that contains a `ModuleDict` ([#8170](https://github.com/PyTorchLightning/pytorch-lightning/pull/8170))


- Fixed NCCL error when selecting non-consecutive device ids ([#8165](https://github.com/PyTorchLightning/pytorch-lightning/pull/8165))


- Fixed `log_gpu_memory` metrics not being added to `logging` when nothing else is logged ([#8174](https://github.com/PyTorchLightning/pytorch-lightning/pull/8174))


- Fixed a bug where calling `log` with a `Metric` instance would raise an error if it was a nested attribute of the model ([#8181](https://github.com/PyTorchLightning/pytorch-lightning/pull/8181))


- Fixed a bug where using `precision=64` would cause buffers with complex dtype to be cast to real ([#8208](https://github.com/PyTorchLightning/pytorch-lightning/pull/8208))

- Fixes access to `callback_metrics` in ddp_spawn ([#7916](https://github.com/PyTorchLightning/pytorch-lightning/pull/7916))


4 changes: 3 additions & 1 deletion README.md
@@ -369,7 +369,9 @@ class LitAutoEncoder(pl.LightningModule):

The lightning community is maintained by
- [10+ core contributors](https://pytorch-lightning.readthedocs.io/en/latest/governance.html) who are all a mix of professional engineers, Research Scientists, and Ph.D. students from top AI labs.
- 400+ community contributors.
- 480+ active community contributors.

Want to help us build Lightning and reduce boilerplate for thousands of researchers? [Learn how to make your first contribution here](https://devblog.pytorchlightning.ai/quick-contribution-guide-86d977171b3a)

Lightning is also part of the [PyTorch ecosystem](https://pytorch.org/ecosystem/) which requires projects to have solid testing, documentation and support.

1 change: 1 addition & 0 deletions dockers/tpu-tests/tpu_test_cases.jsonnet
@@ -22,6 +22,7 @@ local tputests = base.BaseTest {
|||
cd pytorch-lightning
coverage run --source=pytorch_lightning -m pytest -v --capture=no \
tests/profiler/test_xla_profiler.py \
pytorch_lightning/utilities/xla_device.py \
tests/accelerators/test_tpu_backend.py \
tests/models/test_tpu.py
10 changes: 10 additions & 0 deletions docs/source/_templates/layout.html
@@ -0,0 +1,10 @@
{% extends "!layout.html" %}
<link rel="canonical" href="{{ theme_canonical_url }}{{ pagename }}.html" />

{% block footer %}
{{ super() }}
<script script type="text/javascript">
var collapsedSections = ['Best practices', 'Lightning API', 'Optional extensions', 'Tutorials', 'API References', 'Bolts', 'Examples', 'Common Use Cases', 'Partner Domain Frameworks', 'Community'];
</script>

{% endblock %}
2 changes: 2 additions & 0 deletions docs/source/_templates/theme_variables.jinja
@@ -14,5 +14,7 @@
'blog': 'https://www.pytorchlightning.ai/blog',
'resources': 'https://pytorch-lightning.readthedocs.io/en/latest/#community-examples',
'support': 'https://pytorch-lightning.rtfd.io/en/latest/',
'community': 'https://pytorch-lightning.slack.com',
'forums': 'https://pytorch-lightning.slack.com',
}
-%}
119 changes: 117 additions & 2 deletions docs/source/common/lightning_cli.rst
@@ -1,6 +1,7 @@
.. testsetup:: *
    :skipif: not _JSONARGPARSE_AVAILABLE

    import torch
    from unittest import mock
    from typing import List
    from pytorch_lightning.core.lightning import LightningModule
@@ -385,7 +386,7 @@ instantiating the trainer class can be found in :code:`self.config['trainer']`.


Configurable callbacks
~~~~~~~~~~~~~~~~~~~~~~
^^^^^^^^^^^^^^^^^^^^^^

As explained previously, any callback can be added by including it in the config via :code:`class_path` and
:code:`init_args` entries. However, there are other cases in which a callback should always be present and be
@@ -417,7 +418,7 @@ To change the configuration of the :code:`EarlyStopping` in the config it would
Argument linking
~~~~~~~~~~~~~~~~
^^^^^^^^^^^^^^^^

Another case in which it might be desired to extend :class:`~pytorch_lightning.utilities.cli.LightningCLI` is that the
model and data module depend on a common parameter. For example in some cases both classes require to know the
@@ -470,3 +471,117 @@ Instantiation links are used to automatically determine the order of instantiati
The linking of arguments can be used for more complex cases. For example to derive a value via a function that takes
multiple settings as input. For more details have a look at the API of `link_arguments
<https://jsonargparse.readthedocs.io/en/stable/#jsonargparse.core.ArgumentParser.link_arguments>`_.
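
A minimal sketch of what such a multi-setting link could look like is shown below. It is illustrative only: the
:code:`batch_size` and :code:`effective_batch_size` parameters are hypothetical, and it assumes the underlying
jsonargparse parser accepts a tuple of source keys together with a :code:`compute_fn`.

.. code-block:: python

    class MyLightningCLI(LightningCLI):

        def add_arguments_to_parser(self, parser):
            # Forward a single value from the data module to the model (hypothetical parameters).
            parser.link_arguments("data.batch_size", "model.batch_size")
            # Derive a value from several settings via a function.
            parser.link_arguments(
                ("data.batch_size", "trainer.accumulate_grad_batches"),
                "model.effective_batch_size",
                compute_fn=lambda batch_size, accumulate: batch_size * (accumulate or 1),
            )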


Optimizers and learning rate schedulers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Optimizers and learning rate schedulers can also be made configurable. The most common case is when a model only has a
single optimizer and optionally a single learning rate scheduler. In this case the model's
:class:`~pytorch_lightning.core.lightning.LightningModule` could be left without implementing the
:code:`configure_optimizers` method since it is normally always the same and just adds boilerplate. The following code
snippet shows how to implement it:

.. testcode::

    import torch
    from pytorch_lightning.utilities.cli import LightningCLI

    class MyLightningCLI(LightningCLI):

        def add_arguments_to_parser(self, parser):
            parser.add_optimizer_args(torch.optim.Adam)
            parser.add_lr_scheduler_args(torch.optim.lr_scheduler.ExponentialLR)

    cli = MyLightningCLI(MyModel)

With this the :code:`configure_optimizers` method is automatically implemented and in the config the :code:`optimizer`
and :code:`lr_scheduler` groups would accept all of the options for the given classes, in this example :code:`Adam` and
:code:`ExponentialLR`. Therefore, the config file would be structured like:

.. code-block:: yaml

    optimizer:
      lr: 0.01
    lr_scheduler:
      gamma: 0.2
    model:
      ...
    trainer:
      ...

And any of these arguments could be passed directly through command line. For example:

.. code-block:: bash

    $ python train.py --optimizer.lr=0.01 --lr_scheduler.gamma=0.2

There is also the possibility of selecting among multiple classes by giving them as a tuple. For example:

.. testcode::

    class MyLightningCLI(LightningCLI):

        def add_arguments_to_parser(self, parser):
            parser.add_optimizer_args((torch.optim.SGD, torch.optim.Adam))

In this case in the config the :code:`optimizer` group instead of having directly init settings, it should specify
:code:`class_path` and optionally :code:`init_args`. Sub-classes of the classes in the tuple would also be accepted.
A corresponding example of the config file would be:

.. code-block:: yaml

    optimizer:
      class_path: torch.optim.Adam
      init_args:
        lr: 0.01
    model:
      ...
    trainer:
      ...

And the same through command line:

.. code-block:: bash

    $ python train.py --optimizer='{class_path: torch.optim.Adam, init_args: {lr: 0.01}}'

The automatic implementation of :code:`configure_optimizers` can be disabled by linking the configuration group. An
example can be :code:`ReduceLROnPlateau` which requires to specify a monitor. This would be:

.. testcode::

    from pytorch_lightning.utilities.cli import instantiate_class, LightningCLI

    class MyModel(LightningModule):

        def __init__(self, optimizer_init: dict, lr_scheduler_init: dict):
            super().__init__()
            self.optimizer_init = optimizer_init
            self.lr_scheduler_init = lr_scheduler_init

        def configure_optimizers(self):
            optimizer = instantiate_class(self.parameters(), self.optimizer_init)
            scheduler = instantiate_class(optimizer, self.lr_scheduler_init)
            return {"optimizer": optimizer, "lr_scheduler": scheduler, "monitor": "metric_to_track"}

    class MyLightningCLI(LightningCLI):

        def add_arguments_to_parser(self, parser):
            parser.add_optimizer_args(
                torch.optim.Adam,
                link_to='model.optimizer_init',
            )
            parser.add_lr_scheduler_args(
                torch.optim.lr_scheduler.ReduceLROnPlateau,
                link_to='model.lr_scheduler_init',
            )

    cli = MyLightningCLI(MyModel)

For both possibilities of using :meth:`pytorch_lightning.utilities.cli.LightningArgumentParser.add_optimizer_args` with
a single class or a tuple of classes, the value given to :code:`optimizer_init` will always be a dictionary including
:code:`class_path` and :code:`init_args` entries. The function
:func:`~pytorch_lightning.utilities.cli.instantiate_class` takes care of importing the class defined in
:code:`class_path` and instantiating it using some positional arguments, in this case :code:`self.parameters()`, and the
:code:`init_args`. Any number of optimizers and learning rate schedulers can be added when using :code:`link_to`.
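
As a minimal sketch (with hand-written values standing in for what the CLI would normally inject, and a
hypothetical :code:`model` instance), the dictionary and the call would look roughly like:

.. code-block:: python

    from pytorch_lightning.utilities.cli import instantiate_class

    # Shape of the dictionary that link_to='model.optimizer_init' provides to the model
    # (values here are hypothetical examples).
    optimizer_init = {
        "class_path": "torch.optim.Adam",
        "init_args": {"lr": 0.01},
    }

    # Roughly equivalent to: torch.optim.Adam(model.parameters(), lr=0.01)
    optimizer = instantiate_class(model.parameters(), optimizer_init)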