[RFC] Future structure #13265
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,9 @@ | ||
#!/bin/bash | ||
# Run this script from the project root. | ||
URL="https://pl-public-data.s3.amazonaws.com/legacy/checkpoints.zip" | ||
mkdir -p legacy | ||
mkdir -p test/legacy | ||
# wget is simpler but does not work on Windows | ||
python -c "from urllib.request import urlretrieve; urlretrieve('$URL', 'legacy/checkpoints.zip')" | ||
ls -l legacy/ | ||
unzip -o legacy/checkpoints.zip -d legacy/ | ||
ls -l legacy/checkpoints/ | ||
python -c "from urllib.request import urlretrieve; urlretrieve('$URL', 'test/legacy/checkpoints.zip')" | ||
ls -l test/legacy/ | ||
unzip -o test/legacy/checkpoints.zip -d test/legacy/ | ||
ls -l test/legacy/checkpoints/ |
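The updated script above downloads the legacy checkpoints archive into test/legacy and unpacks it there. A minimal Python sketch of the same two steps follows. The helper name `pull_legacy_checkpoints` and its parameters are hypothetical and not part of this PR; only the URL and the target directory come from the script.

```python
import zipfile
from pathlib import Path
from urllib.request import urlretrieve

# URL taken from the shell script in this diff.
URL = "https://pl-public-data.s3.amazonaws.com/legacy/checkpoints.zip"


def pull_legacy_checkpoints(url: str = URL, target_dir: str = "test/legacy") -> Path:
    """Download the archive into ``target_dir`` and extract it in place.

    Hypothetical helper that mirrors the shell script; not code from the PR.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)  # equivalent of `mkdir -p test/legacy`
    archive = target / "checkpoints.zip"
    urlretrieve(url, str(archive))  # the script uses urlretrieve instead of wget
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)  # equivalent of `unzip -o ... -d test/legacy/`
    return target
```

Like the script, this is meant to be called from the project root, so the relative default path resolves to the new test/ folder.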
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -11,6 +11,7 @@ trigger: | |
include: | ||
- "master" | ||
- "release/*" | ||
- "future/*" | ||
- "refs/tags/*" | ||
|
||
pr: none | ||
|
@@ -34,8 +35,14 @@ jobs: | |
clean: all | ||
|
||
steps: | ||
- bash: | | ||
python -m pytest tests/benchmarks -v --durations=0 | ||
displayName: 'Testing: benchmarks' | ||
env: | ||
PL_RUNNING_BENCHMARKS: 1 | ||
|
||
- bash: | | ||
pip install -e . -r requirements/strategies.txt | ||
pip list | ||
displayName: 'Install package' | ||
Review comment on lines +39 to +42: This is not necessary.
||
|
||
- bash: python -m pytest unittests_pl/benchmarks -v --durations=0 | ||
env: | ||
PL_RUNNING_BENCHMARKS: 1 | ||
workingDirectory: test | ||
displayName: 'Testing: benchmarks' |
Original file line number | Diff line number | Diff line change | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
|
@@ -11,13 +11,15 @@ trigger: | |||||||||||
include: | ||||||||||||
- "master" | ||||||||||||
- "release/*" | ||||||||||||
- "future/*" | ||||||||||||
- "refs/tags/*" | ||||||||||||
pr: | ||||||||||||
- "master" | ||||||||||||
- "release/*" | ||||||||||||
- "future/*" | ||||||||||||
|
||||||||||||
jobs: | ||||||||||||
- job: pytest | ||||||||||||
- job: testing | ||||||||||||
strategy: | ||||||||||||
matrix: | ||||||||||||
'PyTorch - LTS': | ||||||||||||
|
@@ -28,15 +30,12 @@ jobs: | |||||||||||
timeoutInMinutes: "100" | ||||||||||||
# how much time to give 'run always even if cancelled tasks' before stopping them | ||||||||||||
cancelTimeoutInMinutes: "2" | ||||||||||||
|
||||||||||||
pool: azure-jirka-spot | ||||||||||||
|
||||||||||||
container: | ||||||||||||
image: $(image) | ||||||||||||
# default shm size is 64m. Increase it to avoid: | ||||||||||||
# 'Error while creating shared memory: unhandled system error, NCCL version 2.7.8' | ||||||||||||
options: "--runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all --shm-size=512m" | ||||||||||||
|
||||||||||||
workspace: | ||||||||||||
clean: all | ||||||||||||
|
||||||||||||
|
@@ -56,8 +55,9 @@ jobs: | |||||||||||
python -c "fname = 'requirements/strategies.txt' ; lines = [line for line in open(fname).readlines() if 'horovod' not in line] ; open(fname, 'w').writelines(lines)" | ||||||||||||
CUDA_VERSION_MM=$(python -c "import torch ; print(''.join(map(str, torch.version.cuda.split('.')[:2])))") | ||||||||||||
pip install "bagua-cuda$CUDA_VERSION_MM>=0.9.0" | ||||||||||||
pip install . --requirement requirements/devel.txt | ||||||||||||
pip install . --requirement requirements/strategies.txt | ||||||||||||
pip install -e . | ||||||||||||
pip install --requirement requirements/devel.txt | ||||||||||||
pip install --requirement requirements/strategies.txt | ||||||||||||
Review comment on lines +58 to +60: Suggested change
||||||||||||
pip list | ||||||||||||
displayName: 'Install dependencies' | ||||||||||||
|
||||||||||||
|
@@ -72,12 +72,16 @@ jobs: | |||||||||||
- bash: bash .actions/pull_legacy_checkpoints.sh | ||||||||||||
displayName: 'Get legacy checkpoints' | ||||||||||||
|
||||||||||||
- bash: | | ||||||||||||
python -m coverage run --source pytorch_lightning -m pytest pytorch_lightning tests --ignore tests/benchmarks -v --junitxml=$(Build.StagingDirectory)/test-results.xml --durations=50 | ||||||||||||
displayName: 'Testing: standard' | ||||||||||||
- bash: python -m coverage run --source pytorch_lightning -m pytest pytorch_lightning | ||||||||||||
workingDirectory: src | ||||||||||||
displayName: 'Testing: doctests' | ||||||||||||
|
||||||||||||
- bash: | | ||||||||||||
bash tests/standalone_tests.sh | ||||||||||||
- bash: python -m coverage run --source pytorch_lightning -m pytest unittests_pl --ignore unittests_pl/benchmarks -v --junitxml=$(Build.StagingDirectory)/test-results.xml --durations=50 | ||||||||||||
displayName: 'Testing: unittests' | ||||||||||||
workingDirectory: test | ||||||||||||
|
||||||||||||
- bash: bash run_standalone_tests.sh | ||||||||||||
workingDirectory: test | ||||||||||||
env: | ||||||||||||
PL_USE_MOCKED_MNIST: "1" | ||||||||||||
displayName: 'Testing: standalone' | ||||||||||||
|
@@ -86,8 +90,9 @@ jobs: | |||||||||||
python -m coverage report | ||||||||||||
python -m coverage xml | ||||||||||||
python -m coverage html | ||||||||||||
python -m codecov --token=$(CODECOV_TOKEN) --commit=$(Build.SourceVersion) --flags=gpu,pytest --name="GPU-coverage" --env=linux,azure | ||||||||||||
python -m codecov --token=$(CODECOV_TOKEN) --commit=$(Build.SourceVersion) --flags=gpu,unittest --name="GPU-coverage" --env=linux,azure | ||||||||||||
Review comment: Suggested change
||||||||||||
ls -l | ||||||||||||
workingDirectory: test | ||||||||||||
displayName: 'Statistics' | ||||||||||||
|
||||||||||||
- task: PublishTestResults@2 | ||||||||||||
|
@@ -109,14 +114,15 @@ jobs: | |||||||||||
|
||||||||||||
- script: | | ||||||||||||
set -e | ||||||||||||
python -m pytest pl_examples -v --maxfail=2 --durations=0 | ||||||||||||
bash pl_examples/run_examples.sh --trainer.accelerator=gpu --trainer.devices=1 | ||||||||||||
bash pl_examples/run_examples.sh --trainer.accelerator=gpu --trainer.devices=2 --trainer.strategy=ddp | ||||||||||||
bash pl_examples/run_examples.sh --trainer.accelerator=gpu --trainer.devices=2 --trainer.strategy=ddp --trainer.precision=16 | ||||||||||||
bash run_ddp_examples.sh | ||||||||||||
Review comment: Let's keep the test here with the other standalone tests.
||||||||||||
bash run_pl_examples.sh --trainer.accelerator=gpu --trainer.devices=1 | ||||||||||||
Review comment: This script should be examples/pytorch/run_examples.sh
||||||||||||
bash run_pl_examples.sh --trainer.accelerator=gpu --trainer.devices=2 --trainer.strategy=ddp | ||||||||||||
bash run_pl_examples.sh --trainer.accelerator=gpu --trainer.devices=2 --trainer.strategy=ddp --trainer.precision=16 | ||||||||||||
workingDirectory: examples | ||||||||||||
env: | ||||||||||||
PL_USE_MOCKED_MNIST: "1" | ||||||||||||
displayName: 'Testing: examples' | ||||||||||||
|
||||||||||||
- bash: | | ||||||||||||
python -m pytest tests/benchmarks -v --maxfail=2 --durations=0 | ||||||||||||
- bash: python -m pytest unittests_pl/benchmarks -v --maxfail=2 --durations=0 | ||||||||||||
workingDirectory: test | ||||||||||||
displayName: 'Testing: benchmarks' |
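The 'Install dependencies' step of the GPU pipeline above uses two inline Python one-liners: one drops every line mentioning horovod from requirements/strategies.txt, the other turns the CUDA version reported by torch into a major/minor tag for the bagua package name. A small sketch of that logic as plain functions follows. The function names and the sample requirement lines are made up for illustration, and the version string is passed in directly so the sketch does not need torch.

```python
from typing import Iterable, List


def cuda_version_mm(cuda_version: str) -> str:
    """Join the major and minor parts of a CUDA version string.

    In the pipeline the input would be ``torch.version.cuda``.
    """
    return "".join(cuda_version.split(".")[:2])


def drop_requirement(lines: Iterable[str], name: str) -> List[str]:
    """Keep only the requirement lines that do not mention ``name``."""
    return [line for line in lines if name not in line]


if __name__ == "__main__":
    # Illustrative inputs only; not taken from the repository.
    print(cuda_version_mm("11.3"))  # → 113
    print(drop_requirement(["horovod\n", "example-package\n"], "horovod"))  # → ['example-package\n']
```

With a tag of 113, the pipeline's `pip install "bagua-cuda$CUDA_VERSION_MM>=0.9.0"` resolves to the package name bagua-cuda113.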
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,21 +8,20 @@ trigger: | |
include: | ||
- "master" | ||
- "release/*" | ||
- "future/*" | ||
- "refs/tags/*" | ||
pr: | ||
- "master" | ||
- "release/*" | ||
- "future/*" | ||
|
||
jobs: | ||
- job: tests | ||
|
||
- job: testing | ||
# how long to run the job before automatically cancelling | ||
timeoutInMinutes: "10" | ||
# how much time to give 'run always even if cancelled tasks' before stopping them | ||
cancelTimeoutInMinutes: "2" | ||
|
||
pool: intel-hpus | ||
|
||
workspace: | ||
clean: all | ||
|
||
|
@@ -33,25 +32,32 @@ jobs: | |
displayName: 'Instance HW info' | ||
|
||
- bash: | | ||
pip install . --requirement requirements/extra.txt | ||
pip install -e .[extra] | ||
pip install . --requirement requirements/test.txt | ||
displayName: 'Install dependencies' | ||
|
||
- bash: | | ||
python -m pytest -sv tests/accelerators/test_hpu.py --forked --junitxml=hpu1_test-results.xml | ||
python -m pytest -sv unittests_pl/accelerators/test_hpu.py --forked --junitxml=hpu1_test-results.xml | ||
workingDirectory: test | ||
displayName: 'Single card HPU test' | ||
|
||
- bash: | | ||
python -m pytest -sv tests/accelerators/test_hpu.py --forked --hpus 8 --junitxml=hpu8_test-results.xml | ||
python -m pytest -sv unittests_pl/accelerators/test_hpu.py --forked --hpus 8 --junitxml=hpu8_test-results.xml | ||
workingDirectory: test | ||
displayName: 'Multi card(8) HPU test' | ||
|
||
- bash: | | ||
python -m pytest -sv tests/plugins/precision/hpu/test_hpu.py --hmp-bf16 'tests/plugins/precision/hpu/ops_bf16.txt' --hmp-fp32 'tests/plugins/precision/hpu/ops_fp32.txt' --forked --junitxml=hpu1_precision_test-results.xml | ||
python -m pytest -sv unittests_pl/plugins/precision/hpu/test_hpu.py --hmp-bf16 \ | ||
'unittests_pl/plugins/precision/hpu/ops_bf16.txt' --hmp-fp32 \ | ||
'unittests_pl/plugins/precision/hpu/ops_fp32.txt' --forked \ | ||
--junitxml=hpu1_precision_test-results.xml | ||
workingDirectory: test | ||
displayName: 'HPU precision test' | ||
|
||
- bash: | | ||
export PYTHONPATH="${PYTHONPATH}:$(pwd)" | ||
python "pl_examples/hpu_examples/simple_mnist/mnist.py" | ||
python "pl_hpu/mnist_sample.py" | ||
Review comment: The structure here should be
||
workingDirectory: examples | ||
displayName: 'Testing: HPU examples' | ||
|
||
- task: PublishTestResults@2 | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,23 +5,23 @@ trigger: | |
branches: | ||
include: | ||
- master | ||
- release/* | ||
- refs/tags/* | ||
- "release/*" | ||
- "future/*" | ||
- "refs/tags/*" | ||
pr: | ||
- master | ||
- release/* | ||
- "release/*" | ||
- "future/*" | ||
|
||
variables: | ||
- name: poplar_sdk | ||
value: "poplar_sdk-ubuntu_20_04-2.3.1+793-89796d462d" | ||
|
||
jobs: | ||
- job: tests | ||
|
||
- job: testing | ||
# how long to run the job before automatically cancelling | ||
timeoutInMinutes: "15" | ||
pool: graphcore-ipus | ||
|
||
workspace: | ||
clean: all | ||
|
||
|
@@ -55,7 +55,7 @@ jobs: | |
export GIT_TERMINAL_PROMPT=1 | ||
python ./requirements/adjust-versions.py requirements/extra.txt | ||
python ./requirements/adjust-versions.py requirements/examples.txt | ||
pip install . --requirement ./requirements/devel.txt | ||
pip install -e . --requirement ./requirements/devel.txt | ||
pip list | ||
displayName: 'Install dependencies' | ||
|
||
|
@@ -68,16 +68,23 @@ jobs: | |
set -eux | ||
source ${{ variables.poplar_sdk }}/poplar-ubuntu*/enable.sh | ||
source ${{ variables.poplar_sdk }}/popart-ubuntu*/enable.sh | ||
|
||
python -c "import poptorch; print(poptorch.__version__)" | ||
displayName: "Check poptorch installation" | ||
|
||
- bash: | | ||
source ${{ variables.poplar_sdk }}/poplar-ubuntu*/enable.sh | ||
source ${{ variables.poplar_sdk }}/popart-ubuntu*/enable.sh | ||
export POPTORCH_WAIT_FOR_IPU=1 | ||
export PL_RUN_IPU_TESTS=1 | ||
python -m coverage run --source pytorch_lightning -m pytest tests -vv --junitxml=$(Build.StagingDirectory)/test-results.xml --durations=50 | ||
cd src | ||
python -m pytest pytorch_lightning | ||
displayName: 'DocTests' | ||
Review comment: I wouldn't add doctesting to these accelerator jobs as it complicates the pipeline for contributors.
||
|
||
- bash: | | ||
source ${{ variables.poplar_sdk }}/poplar-ubuntu*/enable.sh | ||
source ${{ variables.poplar_sdk }}/popart-ubuntu*/enable.sh | ||
cd test | ||
python -m coverage run --source pytorch_lightning -m pytest unittests_pl -vv --durations=50 | ||
env: | ||
MKL_THREADING_LAYER: "GNU" | ||
displayName: 'Testing: standard' | ||
POPTORCH_WAIT_FOR_IPU: 1 | ||
PL_RUN_IPU_TESTS: 1 | ||
displayName: 'UnitTests' |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,48 +16,48 @@ | |
/docs/ @edenlightning @tchaton @borda @awaelchli @RobertLaurella | ||
/.github/*.md @edenlightning @williamfalcon @borda | ||
/.github/ISSUE_TEMPLATE/ @edenlightning @borda @tchaton | ||
/docs/source/conf.py @borda @awaelchli @carmocca | ||
/docs/source/index.rst @williamfalcon | ||
/docs/source/levels @williamfalcon @RobertLaurella | ||
/docs/source/expertise_levels @williamfalcon @RobertLaurella | ||
/docs/source-PL/conf.py @borda @awaelchli @carmocca | ||
Review comment: Why do we need this
||
/docs/source-PL/index.rst @williamfalcon | ||
/docs/source-PL/levels @williamfalcon @RobertLaurella | ||
/docs/source-PL/expertise_levels @williamfalcon @RobertLaurella | ||
|
||
# Packages | ||
/pytorch_lightning/accelerators @williamfalcon @tchaton @SeanNaren @awaelchli @justusschock @kaushikb11 | ||
/pytorch_lightning/callbacks @williamfalcon @tchaton @carmocca @borda @kaushikb11 | ||
/pytorch_lightning/core @tchaton @SeanNaren @borda @carmocca @justusschock @kaushikb11 | ||
/pytorch_lightning/distributed @williamfalcon @tchaton @awaelchli @kaushikb11 | ||
/pytorch_lightning/lite @tchaton @awaelchli @carmocca | ||
/pytorch_lightning/loggers @tchaton @awaelchli @borda | ||
/pytorch_lightning/loggers/wandb.py @borisdayma | ||
/pytorch_lightning/loggers/neptune.py @shnela @HubertJaworski @pkasprzyk @pitercl @Raalsky @aniezurawski @kamil-kaczmarek | ||
/pytorch_lightning/loops @tchaton @awaelchli @justusschock @carmocca | ||
/pytorch_lightning/overrides @tchaton @SeanNaren @borda | ||
/pytorch_lightning/plugins @tchaton @SeanNaren @awaelchli @justusschock | ||
/pytorch_lightning/profiler @williamfalcon @tchaton @borda @carmocca | ||
/pytorch_lightning/profiler/pytorch.py @nbcsm @guotuofeng | ||
/pytorch_lightning/strategies @tchaton @SeanNaren @awaelchli @justusschock @kaushikb11 | ||
/pytorch_lightning/trainer @williamfalcon @borda @tchaton @SeanNaren @carmocca @awaelchli @justusschock @kaushikb11 | ||
/pytorch_lightning/trainer/connectors @tchaton @SeanNaren @carmocca @borda | ||
/pytorch_lightning/tuner @SkafteNicki @borda @awaelchli | ||
/pytorch_lightning/utilities @borda @tchaton @SeanNaren @carmocca | ||
/src/pytorch_lightning/accelerators @williamfalcon @tchaton @SeanNaren @awaelchli @justusschock @kaushikb11 | ||
/src/pytorch_lightning/callbacks @williamfalcon @tchaton @carmocca @borda @kaushikb11 | ||
/src/pytorch_lightning/core @tchaton @SeanNaren @borda @carmocca @justusschock @kaushikb11 | ||
/src/pytorch_lightning/distributed @williamfalcon @tchaton @awaelchli @kaushikb11 | ||
/src/pytorch_lightning/lite @tchaton @awaelchli @carmocca | ||
/src/pytorch_lightning/loggers @tchaton @awaelchli @borda | ||
/src/pytorch_lightning/loggers/wandb.py @borisdayma | ||
/src/pytorch_lightning/loggers/neptune.py @shnela @HubertJaworski @pkasprzyk @pitercl @Raalsky @aniezurawski @kamil-kaczmarek | ||
/src/pytorch_lightning/loops @tchaton @awaelchli @justusschock @carmocca | ||
/src/pytorch_lightning/overrides @tchaton @SeanNaren @borda | ||
/src/pytorch_lightning/plugins @tchaton @SeanNaren @awaelchli @justusschock | ||
/src/pytorch_lightning/profiler @williamfalcon @tchaton @borda @carmocca | ||
/src/pytorch_lightning/profiler/pytorch.py @nbcsm @guotuofeng | ||
/src/pytorch_lightning/strategies @tchaton @SeanNaren @awaelchli @justusschock @kaushikb11 | ||
/src/pytorch_lightning/trainer @williamfalcon @borda @tchaton @SeanNaren @carmocca @awaelchli @justusschock @kaushikb11 | ||
/src/pytorch_lightning/trainer/connectors @tchaton @SeanNaren @carmocca @borda | ||
/src/pytorch_lightning/tuner @SkafteNicki @borda @awaelchli | ||
/src/pytorch_lightning/utilities @borda @tchaton @SeanNaren @carmocca | ||
|
||
# Specifics | ||
/pytorch_lightning/trainer/connectors/logger_connector @tchaton @carmocca | ||
/pytorch_lightning/trainer/progress.py @tchaton @awaelchli @carmocca | ||
/src/pytorch_lightning/trainer/connectors/logger_connector @tchaton @carmocca | ||
/src/pytorch_lightning/trainer/progress.py @tchaton @awaelchli @carmocca | ||
|
||
# API | ||
/pytorch_lightning/callbacks/base.py @williamfalcon @awaelchli @ananthsub @carmocca | ||
/pytorch_lightning/core/datamodule.py @williamFalcon @awaelchli @ananthsub @carmocca | ||
/pytorch_lightning/trainer/trainer.py @williamfalcon @tchaton @awaelchli | ||
/pytorch_lightning/core/hooks.py @williamfalcon @tchaton @awaelchli @ananthsub @carmocca | ||
/pytorch_lightning/core/lightning.py @williamfalcon @tchaton @awaelchli | ||
/src/pytorch_lightning/callbacks/base.py @williamfalcon @awaelchli @ananthsub @carmocca | ||
/src/pytorch_lightning/core/datamodule.py @williamFalcon @awaelchli @ananthsub @carmocca | ||
/src/pytorch_lightning/trainer/trainer.py @williamfalcon @tchaton @awaelchli | ||
/src/pytorch_lightning/core/hooks.py @williamfalcon @tchaton @awaelchli @ananthsub @carmocca | ||
/src/pytorch_lightning/core/lightning.py @williamfalcon @tchaton @awaelchli | ||
|
||
# Testing | ||
/tests/helpers/boring_model.py @williamfalcon @tchaton @borda | ||
|
||
/.github/CODEOWNERS @williamfalcon | ||
/.github/approve_config.yml @williamfalcon | ||
/SECURITY.md @williamfalcon | ||
/README.md @williamfalcon @edenlightning @borda | ||
/setup.py @williamfalcon @borda @carmocca | ||
/pytorch_lightning/__about__.py @williamfalcon @borda @carmocca | ||
/test/unittests_pl/helpers/boring_model.py @williamfalcon @tchaton @borda | ||
|
||
/.github/CODEOWNERS @williamfalcon | ||
/.github/approve_config.yml @williamfalcon | ||
/SECURITY.md @williamfalcon | ||
/README.md @williamfalcon @edenlightning @borda | ||
/setup.py @williamfalcon @borda @carmocca | ||
/src/pytorch_lightning/__about__.py @williamfalcon @borda @carmocca |
Review comment: This should be removed before merge, right?
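The CODEOWNERS changes above follow three mechanical path rules: package paths gain a /src prefix, test paths move under /test/unittests_pl, and docs paths move from /docs/source to /docs/source-PL (the last of which a reviewer questions). A rough sketch of those rules as a mapping function follows. The function name `relocate` is hypothetical and the rules are inferred from the diff rather than taken from any tooling in the PR.

```python
def relocate(path: str) -> str:
    """Map a pre-PR CODEOWNERS path to its location in the proposed layout.

    Rules inferred from the diff; paths that match none of them are unchanged.
    """
    if path.startswith("/pytorch_lightning/"):
        return "/src" + path
    if path.startswith("/tests/"):
        return "/test/unittests_pl/" + path[len("/tests/"):]
    if path.startswith("/docs/source/"):
        return "/docs/source-PL/" + path[len("/docs/source/"):]
    return path
```

For example, `relocate("/tests/helpers/boring_model.py")` gives `/test/unittests_pl/helpers/boring_model.py`, matching the updated entry in the Testing section of the file.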