Future 2/n: stand-alone examples #13294

Merged Jun 15, 2022 (7 commits)
9 changes: 5 additions & 4 deletions .azure-pipelines/gpu-tests.yml
```diff
@@ -109,10 +109,11 @@ jobs:
   - script: |
       set -e
-      python -m pytest pl_examples -v --maxfail=2 --durations=0
-      bash pl_examples/run_examples.sh --trainer.accelerator=gpu --trainer.devices=1
-      bash pl_examples/run_examples.sh --trainer.accelerator=gpu --trainer.devices=2 --trainer.strategy=ddp
-      bash pl_examples/run_examples.sh --trainer.accelerator=gpu --trainer.devices=2 --trainer.strategy=ddp --trainer.precision=16
+      bash run_ddp_examples.sh
+      bash run_pl_examples.sh --trainer.accelerator=gpu --trainer.devices=1
+      bash run_pl_examples.sh --trainer.accelerator=gpu --trainer.devices=2 --trainer.strategy=ddp
+      bash run_pl_examples.sh --trainer.accelerator=gpu --trainer.devices=2 --trainer.strategy=ddp --trainer.precision=16
+    workingDirectory: examples
     env:
       PL_USE_MOCKED_MNIST: "1"
     displayName: 'Testing: examples'
```
3 changes: 2 additions & 1 deletion .azure-pipelines/hpu-tests.yml
```diff
@@ -51,7 +51,8 @@ jobs:
   - bash: |
       export PYTHONPATH="${PYTHONPATH}:$(pwd)"
-      python "pl_examples/hpu_examples/simple_mnist/mnist.py"
+      python "pl_hpu/mnist_sample.py"
+    workingDirectory: examples
     displayName: 'Testing: HPU examples'

   - task: PublishTestResults@2
```
4 changes: 2 additions & 2 deletions .github/ISSUE_TEMPLATE/bug_report.md
```diff
@@ -16,11 +16,11 @@ assignees: ''
 Please reproduce using the BoringModel!

 You can use the following Colab link:
-https://colab.research.google.com/github/PytorchLightning/pytorch-lightning/blob/master/pl_examples/bug_report/bug_report_model.ipynb
+https://colab.research.google.com/github/PytorchLightning/pytorch-lightning/blob/master/examples/pl_bug_report/bug_report_model.ipynb
 IMPORTANT: has to be public.

 or this simple template:
-https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pl_examples/bug_report/bug_report_model.py
+https://github.com/PyTorchLightning/pytorch-lightning/blob/master/examples/pl_bug_report/bug_report_model.py

 If you could not reproduce using the BoringModel and still think there's a bug, please post here
 but remember, bugs with code are fixed faster!
```
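For orientation, the repro script linked above follows roughly this shape; this is a minimal sketch, not the template's exact contents:

```python
import torch
from pytorch_lightning import LightningModule, Trainer


class RandomDataset(torch.utils.data.Dataset):
    """A tiny synthetic dataset, so the repro needs no external data."""

    def __init__(self, size: int = 32, length: int = 64):
        self.data = torch.randn(length, size)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return len(self.data)


class BoringModel(LightningModule):
    """Close to the smallest possible LightningModule: one linear layer."""

    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        return self.layer(batch).sum()

    def configure_optimizers(self):
        return torch.optim.SGD(self.layer.parameters(), lr=0.1)


if __name__ == "__main__":
    # fast_dev_run executes a single batch, which is enough to surface most bugs.
    trainer = Trainer(fast_dev_run=True)
    trainer.fit(BoringModel(), torch.utils.data.DataLoader(RandomDataset()))
```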
23 changes: 12 additions & 11 deletions .github/workflows/ci_test-full.yml
```diff
@@ -83,8 +83,7 @@ jobs:
       run: |
         flag=$(python -c "print('--pre' if '${{matrix.release}}' == 'pre' else '')" 2>&1)
         url=$(python -c "print('test/cpu/torch_test.html' if '${{matrix.release}}' == 'pre' else 'cpu/torch_stable.html')" 2>&1)
-        pip install -r requirements.txt --upgrade $flag --find-links "https://download.pytorch.org/whl/${url}"
-        pip install -r requirements/test.txt --upgrade
+        pip install -e . -r requirements/test.txt --upgrade $flag --find-links "https://download.pytorch.org/whl/${url}"
         pip list
       shell: bash

@@ -123,21 +122,13 @@ jobs:
     - name: Sanity check
-      run: |
-        python requirements/check-avail-extras.py
+      run: python requirements/check-avail-extras.py

     - name: UnitTests
      run: |
         # NOTE: do not include coverage report here, see: https://github.com/nedbat/coveragepy/issues/1003
         coverage run --source pytorch_lightning -m pytest pytorch_lightning tests -v --durations=50 --junitxml=junit/test-results-${{ runner.os }}-py${{ matrix.python-version }}-${{ matrix.requires }}-${{ matrix.release }}.xml

-    - name: Examples
-      run: |
-        # adjust versions according installed Torch version
-        python ./requirements/adjust-versions.py requirements/examples.txt
-        pip install -r requirements/examples.txt --find-links https://download.pytorch.org/whl/cpu/torch_stable.html --upgrade
-        python -m pytest pl_examples -v --durations=10
-
     - name: Upload pytest results
       uses: actions/upload-artifact@v2
       with:
@@ -146,6 +137,16 @@ jobs:
         if-no-files-found: error
       if: failure()

+    - name: Prepare Examples
+      run: |
+        # adjust versions according to the installed Torch version
+        python ./requirements/adjust-versions.py requirements/examples.txt
+        pip install -r requirements/examples.txt --find-links https://download.pytorch.org/whl/cpu/torch_stable.html --upgrade
+
+    - name: Run Examples
+      working-directory: ./examples
+      run: python -m pytest test_pl_examples.py -v --durations=10
+
     - name: Statistics
       if: success()
       run: |
```
2 changes: 1 addition & 1 deletion Makefile
```diff
@@ -24,7 +24,7 @@ test: clean
 	pip install -r requirements/devel.txt
 	pip install -r requirements/strategies.txt
 	# run tests with coverage
-	python -m coverage run --source pytorch_lightning -m pytest pytorch_lightning tests pl_examples -v
+	python -m coverage run --source pytorch_lightning -m pytest pytorch_lightning tests -v
 	python -m coverage report

 docs: clean
```
2 changes: 1 addition & 1 deletion dockers/nvidia/Dockerfile
```diff
@@ -38,7 +38,7 @@ RUN \
     fi && \
     # save the examples
     mv pytorch-lightning/_notebooks/.notebooks/ notebooks && \
-    mv pytorch-lightning/pl_examples . && \
+    mv pytorch-lightning/examples . && \

     # Installations \
     pip install -q fire && \
```
2 changes: 1 addition & 1 deletion dockers/release/Dockerfile
```diff
@@ -27,7 +27,7 @@ COPY ./ /home/pytorch-lightning/
 RUN \
     cd /home && \
     mv pytorch-lightning/_notebooks notebooks && \
-    mv pytorch-lightning/pl_examples . && \
+    mv pytorch-lightning/examples . && \
     # replace by specific version if asked
     if [ ! -z "$LIGHTNING_VERSION" ] ; then \
         rm -rf pytorch-lightning ; \
```
2 changes: 1 addition & 1 deletion docs/source/accelerators/hpu_intermediate.rst
```diff
@@ -45,7 +45,7 @@ This enables advanced users to provide their own BF16 and FP32 operator list ins
     accelerator="hpu",
     devices=1,
     # Optional Habana mixed precision params to be set
-    # Checkout `pl_examples/hpu_examples/simple_mnist/ops_bf16_mnist.txt` for the format
+    # Checkout `examples/pl_hpu/ops_bf16_mnist.txt` for the format
     plugins=[
         HPUPrecisionPlugin(
             precision=16,
```
2 changes: 1 addition & 1 deletion docs/source/accelerators/ipu_basic.rst
```diff
@@ -60,7 +60,7 @@ Known limitations

 Currently, there are some known limitations that are being addressed in the near future to make the experience seamless when moving between different devices.

-Please see the `MNIST example <https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pl_examples/ipu_examples/mnist.py>`__ which displays most of the limitations and how to overcome them till they are resolved.
+Please see the `MNIST example <https://github.com/PyTorchLightning/pytorch-lightning/blob/master/examples/pl_ipu/mnist_sample.py>`__, which demonstrates most of the limitations and how to work around them until they are resolved.

 * ``self.log`` is not supported in the ``training_step``, ``validation_step``, ``test_step`` or ``predict_step``. This is due to the step function being traced and sent to the IPU devices. We're actively working on fixing this.
 * Multiple optimizers are not supported. ``training_step`` only supports returning one loss from the ``training_step`` function as a result.
```
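To make these limitations concrete, here is a minimal sketch of a module that stays within them: no `self.log` call in the step, a single optimizer, and exactly one loss returned (the model name and layer sizes are illustrative):

```python
import torch
import torch.nn.functional as F
from pytorch_lightning import LightningModule


class IPUFriendlyModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = F.cross_entropy(self.layer(x), y)
        # No self.log() here: the step is traced and shipped to the IPU,
        # so logging inside it is not supported yet.
        return loss  # exactly one loss, for the single supported optimizer

    def configure_optimizers(self):
        # A single optimizer only; multiple optimizers are not supported.
        return torch.optim.Adam(self.parameters(), lr=1e-3)
```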
2 changes: 1 addition & 1 deletion docs/source/clouds/cluster_advanced.rst
```diff
@@ -134,7 +134,7 @@ in a `HyperOptArgumentParser

 Here is an example where you run a grid search of 9 combinations of hyperparameters.
 See also the multi-node examples
-`here <https://github.com/PyTorchLightning/pytorch-lightning/tree/master/pl_examples/basic_examples>`__.
+`here <https://github.com/PyTorchLightning/pytorch-lightning/tree/master/examples/pl_basics>`__.

 .. code-block:: python
```
2 changes: 1 addition & 1 deletion docs/source/clouds/fault_tolerant_training_faq.rst
```diff
@@ -93,7 +93,7 @@ If you believe this to be useful, please open a `feature request <https://github

 What are the performance impacts?
 *********************************
 Fault-tolerant Training was tested on common and worst-case scenarios in order to measure the impact of the internal state tracking on the total training time.
-On tiny models like the `BoringModel and RandomDataset <https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pl_examples/bug_report/bug_report_model.py>`_
+On tiny models like the `BoringModel and RandomDataset <https://github.com/PyTorchLightning/pytorch-lightning/blob/master/examples/pl_bug_report/bug_report_model.py>`_
 which have virtually no data loading and processing overhead, we noticed up to 50% longer training time with fault tolerance enabled.
 In this worst-case scenario, fault tolerance adds an overhead that is noticeable in comparison to the compute time for the data loading itself.
 However, for more realistic training workloads where data loading and preprocessing are more expensive, the constant overhead of fault tolerance becomes less noticeable or not noticeable at all.
```
4 changes: 2 additions & 2 deletions docs/source/extensions/loops.rst
```diff
@@ -441,12 +441,12 @@ Advanced Examples

 * - Link to Example
   - Description
-* - `K-fold Cross Validation <https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pl_examples/loop_examples/kfold.py>`_
+* - `K-fold Cross Validation <https://github.com/PyTorchLightning/pytorch-lightning/blob/master/examples/pl_loops/kfold.py>`_
   - `KFold / Cross Validation <https://en.wikipedia.org/wiki/Cross-validation_(statistics)>`__ is a machine learning practice in which the training dataset is partitioned into ``num_folds`` complementary subsets.
     One cross-validation round performs fitting where one fold is left out for validation and the other folds are used for training.
     To reduce variability, once all rounds are performed using the different folds, the trained models are ensembled and their predictions are
     averaged when estimating the model's predictive performance on the test dataset.
-* - `Yielding Training Step <https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pl_examples/loop_examples/yielding_training_step.py>`_
+* - `Yielding Training Step <https://github.com/PyTorchLightning/pytorch-lightning/blob/master/examples/pl_loops/yielding_training_step.py>`_
   - This loop enables you to write the :meth:`~pytorch_lightning.core.module.LightningModule.training_step` hook
     as a Python Generator for automatic optimization with multiple optimizers, i.e., you can :code:`yield` loss
     values from it instead of returning them. This can enable more elegant and expressive implementations, as shown in the linked example.
```
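To make the second row concrete, a generator-style `training_step` can look roughly like the following sketch. It assumes a custom loop such as the linked `yielding_training_step.py` drives the generator and steps one optimizer per `yield`; the GAN-style helper methods are illustrative, not part of the linked file:

```python
def training_step(self, batch, batch_idx):
    # Lives on a LightningModule with two configured optimizers.
    # First yield: the loss for optimizer 0 (e.g. the generator).
    fake = self.generator(torch.randn(batch.size(0), self.latent_dim))
    yield self.generator_loss(fake)

    # Execution resumes here after the custom loop has stepped optimizer 0.
    # Second yield: the loss for optimizer 1 (e.g. the discriminator).
    yield self.discriminator_loss(batch, fake.detach())
```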
2 changes: 1 addition & 1 deletion docs/source/starter/lightning_lite.rst
```diff
@@ -123,7 +123,7 @@ Here are five required steps to convert to :class:`~pytorch_lightning.lite.Light
     Lite(...).run(args)


-That's all. You can now train on any kind of device and scale your training. Check out `this <https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pl_examples/basic_examples/mnist_examples/image_classifier_2_lite.py>`_ full MNIST training example with LightningLite.
+That's all. You can now train on any kind of device and scale your training. Check out `this <https://github.com/PyTorchLightning/pytorch-lightning/blob/master/examples/convert_from_pt_to_pl/image_classifier_2_lite.py>`_ full MNIST training example with LightningLite.

 :class:`~pytorch_lightning.lite.LightningLite` takes care of device management, so you don't have to.
 You should remove any device-specific logic within your code.
```
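As a refresher, those steps amount to something like this runnable sketch (toy model and data; the `accelerator`/`devices` arguments are just examples):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from pytorch_lightning.lite import LightningLite


class Lite(LightningLite):
    def run(self, epochs: int):
        model = torch.nn.Linear(32, 2)
        optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
        # setup() moves model and optimizer to the target device and wraps
        # them for the selected accelerator/strategy.
        model, optimizer = self.setup(model, optimizer)

        dataset = TensorDataset(torch.randn(64, 32), torch.randint(0, 2, (64,)))
        dataloader = self.setup_dataloaders(DataLoader(dataset, batch_size=16))

        for _ in range(epochs):
            for x, y in dataloader:
                optimizer.zero_grad()
                loss = torch.nn.functional.cross_entropy(model(x), y)
                self.backward(loss)  # replaces loss.backward()
                optimizer.step()


Lite(accelerator="cpu", devices=1).run(epochs=3)
```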
53 changes: 53 additions & 0 deletions examples/README.md
# Examples

Our most robust examples showing all sorts of implementations
can be found in our sister library [Lightning Bolts](https://pytorch-lightning.readthedocs.io/en/latest/ecosystem/bolts.html).

______________________________________________________________________

## MNIST Examples

5 MNIST examples showing how to gradually convert from pure PyTorch to PyTorch Lightning.

The transition from pure PyTorch through [LightningLite](https://pytorch-lightning.readthedocs.io/en/latest/starter/lightning_lite.html) is optional, but it can be helpful to learn about it. A sketch of the final `LightningModule` stage follows the list below.

- [MNIST with vanilla PyTorch](convert_from_pt_to_pl/image_classifier_1_pytorch.py)
- [MNIST with LightningLite](convert_from_pt_to_pl/image_classifier_2_lite.py)
- [MNIST LightningLite to LightningModule](convert_from_pt_to_pl/image_classifier_3_lite_to_lightning_module.py)
- [MNIST with LightningModule](convert_from_pt_to_pl/image_classifier_4_lightning_module.py)
- [MNIST with LightningModule + LightningDataModule](convert_from_pt_to_pl/image_classifier_5_lightning_datamodule.py)
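
For orientation, the end state of that progression looks roughly like the following sketch (illustrative layer sizes and hyperparameters, not a copy of the linked files):

```python
import torch
import torch.nn.functional as F
from pytorch_lightning import LightningModule


class ImageClassifier(LightningModule):
    def __init__(self, lr: float = 1e-3):
        super().__init__()
        self.save_hyperparameters()
        # Illustrative architecture; the linked examples use a small ConvNet.
        self.model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = F.cross_entropy(self.model(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.hparams.lr)
```

Training then reduces to `Trainer(max_epochs=1).fit(ImageClassifier(), train_loader)`; the loop, device placement, and checkpointing are handled by the `Trainer`.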

______________________________________________________________________

## Basic Examples

In this folder, we have 3 simple examples:

- [Image Classifier](pl_basics/backbone_image_classifier.py) (trains arbitrary datasets with arbitrary backbones)
- [Image Classifier + DALI](convert_from_pt_to_pl/image_classifier_4_dali.py) (defines the model inside the `LightningModule`)
- [Autoencoder](pl_basics/autoencoder.py)

______________________________________________________________________

## Domain Examples

This folder contains older examples. You should instead use the examples
in [Lightning Bolts](https://pytorch-lightning.readthedocs.io/en/latest/ecosystem/bolts.html)
for advanced use cases.

______________________________________________________________________

## Integration Examples

In this folder, we have 1 simple example:

- [Image Classifier + DALI](pl_integrations/dali_image_classifier.py) (defines the model inside the `LightningModule`)

______________________________________________________________________

## Loop Examples

Contains implementations leveraging [loop customization](https://pytorch-lightning.readthedocs.io/en/latest/extensions/loops.html) to enhance the Trainer with new optimization routines.

- [K-fold Cross Validation Loop](pl_loops/kfold.py): implementation of cross validation as a loop, together with a special datamodule (a skeleton follows below).
- [Yield Loop](pl_loops/yielding_training_step.py): enables yielding from the `training_step` like in a Python generator; useful for automatic optimization with multiple optimizers.
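
As a taste of the Loop API, the outer K-fold loop follows roughly this skeleton. It is a sketch that assumes the datamodule exposes a `setup_fold_index` hook for swapping in the current fold's splits (mirroring the linked `kfold.py`), not the full implementation:

```python
from pytorch_lightning.loops import Loop


class KFoldLoop(Loop):
    def __init__(self, num_folds: int, fit_loop):
        super().__init__()
        self.num_folds = num_folds
        self.fit_loop = fit_loop  # the Trainer's inner fit loop, run once per fold
        self.current_fold = 0

    @property
    def done(self) -> bool:
        return self.current_fold >= self.num_folds

    def reset(self) -> None:
        self.current_fold = 0

    def advance(self, *args, **kwargs) -> None:
        # Assumed datamodule hook that installs the train/val split
        # for the current fold before fitting.
        self.trainer.datamodule.setup_fold_index(self.current_fold)
        self.fit_loop.run()
        self.current_fold += 1
```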
examples/convert_from_pt_to_pl/image_classifier_1_pytorch.py (path inferred)

```diff
@@ -12,44 +12,20 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 import argparse
+from os import path

 import torch
 import torch.nn as nn
 import torch.nn.functional as F
 import torch.optim as optim
 import torchvision.transforms as T
 from torch.optim.lr_scheduler import StepLR

-from pl_examples.basic_examples.mnist_datamodule import MNIST
-
 # Credit to the PyTorch Team
 # Taken from https://github.com/pytorch/examples/blob/master/mnist/main.py and slightly adapted.
+from pytorch_lightning.demos.boring_classes import Net
+from pytorch_lightning.demos.mnist_datamodule import MNIST

-class Net(nn.Module):
-    def __init__(self):
-        super().__init__()
-        self.conv1 = nn.Conv2d(1, 32, 3, 1)
-        self.conv2 = nn.Conv2d(32, 64, 3, 1)
-        self.dropout1 = nn.Dropout(0.25)
-        self.dropout2 = nn.Dropout(0.5)
-        self.fc1 = nn.Linear(9216, 128)
-        self.fc2 = nn.Linear(128, 10)
-
-    def forward(self, x):
-        x = self.conv1(x)
-        x = F.relu(x)
-        x = self.conv2(x)
-        x = F.relu(x)
-        x = F.max_pool2d(x, 2)
-        x = self.dropout1(x)
-        x = torch.flatten(x, 1)
-        x = self.fc1(x)
-        x = F.relu(x)
-        x = self.dropout2(x)
-        x = self.fc2(x)
-        output = F.log_softmax(x, dim=1)
-        return output
+DATASETS_PATH = path.join(path.dirname(__file__), "..", "..", "Datasets")


 def run(hparams):
@@ -60,8 +36,8 @@ def run(hparams):
     device = torch.device("cuda" if use_cuda else "cpu")

     transform = T.Compose([T.ToTensor(), T.Normalize((0.1307,), (0.3081,))])
-    train_dataset = MNIST("./data", train=True, download=True, transform=transform)
-    test_dataset = MNIST("./data", train=False, transform=transform)
+    train_dataset = MNIST(DATASETS_PATH, train=True, download=True, transform=transform)
+    test_dataset = MNIST(DATASETS_PATH, train=False, transform=transform)
     train_loader = torch.utils.data.DataLoader(
         train_dataset,
         batch_size=hparams.batch_size,
```
examples/convert_from_pt_to_pl/image_classifier_2_lite.py (path inferred)

```diff
@@ -29,6 +29,7 @@
 """

 import argparse
+from os import path

 import torch
 import torch.nn.functional as F
@@ -37,11 +38,13 @@
 from torch.optim.lr_scheduler import StepLR
 from torchmetrics.classification import Accuracy

-from pl_examples.basic_examples.mnist_datamodule import MNIST
-from pl_examples.basic_examples.mnist_examples.image_classifier_1_pytorch import Net
 from pytorch_lightning import seed_everything
+from pytorch_lightning.demos.boring_classes import Net
+from pytorch_lightning.demos.mnist_datamodule import MNIST
 from pytorch_lightning.lite import LightningLite  # import LightningLite

+DATASETS_PATH = path.join(path.dirname(__file__), "..", "..", "Datasets")


 class Lite(LightningLite):
     def run(self, hparams):
@@ -51,10 +54,10 @@ def run(self, hparams):
         transform = T.Compose([T.ToTensor(), T.Normalize((0.1307,), (0.3081,))])
         # This is meant to ensure the data are downloaded only by 1 process.
         if self.is_global_zero:
-            MNIST("./data", download=True)
+            MNIST(DATASETS_PATH, download=True)
         self.barrier()
-        train_dataset = MNIST("./data", train=True, transform=transform)
-        test_dataset = MNIST("./data", train=False, transform=transform)
+        train_dataset = MNIST(DATASETS_PATH, train=True, transform=transform)
+        test_dataset = MNIST(DATASETS_PATH, train=False, transform=transform)
         train_loader = torch.utils.data.DataLoader(
             train_dataset,
             batch_size=hparams.batch_size,
```
examples/convert_from_pt_to_pl/image_classifier_3_lite_to_lightning_module.py (path inferred)

```diff
@@ -25,6 +25,7 @@
 """

 import argparse
+from os import path

 import torch
 import torch.nn.functional as F
@@ -33,11 +34,13 @@
 from torch.optim.lr_scheduler import StepLR
 from torchmetrics import Accuracy

-from pl_examples.basic_examples.mnist_datamodule import MNIST
-from pl_examples.basic_examples.mnist_examples.image_classifier_1_pytorch import Net
 from pytorch_lightning import seed_everything
+from pytorch_lightning.demos.boring_classes import Net
+from pytorch_lightning.demos.mnist_datamodule import MNIST
 from pytorch_lightning.lite import LightningLite

+DATASETS_PATH = path.join(path.dirname(__file__), "..", "..", "Datasets")


 class Lite(LightningLite):
     """Lite is starting to look like a LightningModule."""
@@ -135,14 +138,14 @@ def transform(self):
         return T.Compose([T.ToTensor(), T.Normalize((0.1307,), (0.3081,))])

     def prepare_data(self) -> None:
-        MNIST("./data", download=True)
+        MNIST(DATASETS_PATH, download=True)

     def train_dataloader(self):
-        train_dataset = MNIST("./data", train=True, download=False, transform=self.transform)
+        train_dataset = MNIST(DATASETS_PATH, train=True, download=False, transform=self.transform)
         return torch.utils.data.DataLoader(train_dataset, batch_size=self.hparams.batch_size)

     def test_dataloader(self):
-        test_dataset = MNIST("./data", train=False, download=False, transform=self.transform)
+        test_dataset = MNIST(DATASETS_PATH, train=False, download=False, transform=self.transform)
         return torch.utils.data.DataLoader(test_dataset, batch_size=self.hparams.batch_size)
```