FSDP integration #6152 (Closed)

77 commits:
- 78f1eb4 Add initial FSDP integration
- c36e00a Fix error in refactor
- 59dbb83 update
- 19a1440 Revert "update" (tchaton)
- 3b38615 Address reviews
- 5ff06ab Fix doc string
- 36434f0 Even moar code review
- c61a190 Add deprecation
- 1c4f011 Merge branch 'master' into feat/fsdp
- 02599e6 Fix name of test
- e79977a Integrate nesting, fix bugs across implementation
- d15d4b5 Merge branch 'master' into feat/fsdp
- ebf1818 Formatting types
- 290e8fd Add additional tests for accelerator model
- 5c5f762 Fix import
- d28438b Few test fixes, expose params
- ab591a8 Allow training_type_plugin to delay optimizer configure
- 23ccdb8 Merge branch 'feat/fsdp_2n' into feat/fsdp
- a60f2c0 Add missing references to trainer, add a CPU accelerator based test
- 3d4e6df Merge branch 'feat/fsdp_2n' into feat/fsdp
- 516bd04 Update for latest API changes to fairscale
- 9f8864f Add base hook for model parallel
- eac5344 fix callback signature
- 32df0cb Simplify hook (kaushikb11)
- 282a133 Add hook logic
- 7a94e72 add tests
- 8091481 add property setter (kaushikb11)
- 633fc77 add logic for being called once (kaushikb11)
- c99a36f Update changelog (kaushikb11)
- a68c8d7 Merge branch 'master' into feat/model_parallel_hook (kaushikb11)
- 9529a22 Fix (kaushikb11)
- 3c1c782 fix return type (kaushikb11)
- 7daba43 Merge branch 'master' into feat/fsdp (kaushikb11)
- 87ec222 Fix property name
- 966b2e5 Merge branch 'feat/model_parallel_hook' into feat/fsdp
- 5f6e039 Updaet wrapper, use latest fixes for hooks
- b512e72 Swap hook order
- 8ba82df Merge branch 'master' into feat/fsdp
- 1e5ca37 Small changes
- 936dc1a Fixes
- a6de18e Remove activation checkpointing
- 8684f94 Turn off auto wrap by default
- 76091ae Move to trainer.model
- 226d498 fix reference
- cd63c10 Merge branch 'master' into feat/fsdp
- b881e2f Remove flag
- e8959be Fix imports
- 52478ac Fix versions, update docs
- b7f1896 Fix clip gradients
- a62f8d8 Merge branch 'master' into feat/fsdp
- 69c33f1 Merge branch 'master' into feat/fsdp
- 9fa26c0 Fixes
- 56f23ce pull
- 9ca3f0c Few changes across the board
- b53ba36 Fix imports
- 0da5249 Set none
- 90c6479 Swap to warnings
- 69d8178 Remove fairscale from container
- a459d10 pull
- a7842d9 Update dockers/base-cuda/Dockerfile
- 48ee83f Add defaults, add test to ensure nested wrapper is set correctly
- 57a696c Remove deprecation as this will be removed completely
- 36889b8 Check for nested FSDP wrappers, and omit wrapping algorithm
- 89b8cb5 Merge branch 'master' into feat/fsdp
- 0c1d2de Update pytorch_lightning/trainer/connectors/accelerator_connector.py
- 592bb28 Address code review points
- 4e230c9 Merge branch 'master' into feat/fsdp
- ca8e586 Add back missing model that was removed from clipping signature
- 54f501d Do not pass model through, accelerator does it
- 02925cc Merge branch 'master' into feat/fsdp
- b67f1a9 Fix merge
- 132eb64 Fix imports
- e6ce3cf Changes to precision plugin
- 01153af Require 2 GPU for multi gpu test
- 6cfe57d Merge branch 'master' into feat/fsdp
- efa81ab Use callback in test, swap to DynamicLossScaler from fairscale to tes…
- 78d52b5 Disable loss scaler for now
Viewing changes from a single commit: Merge branch 'master' into feat/fsdp (a condensed view of this merge commit).
Review comment:
It's unclear to me what `use_ddp` stands for at this point, with all these distributed types supported.

Reply:
I think the primary usage is knowing whether to use a distributed sampler, but this logic should ideally be rewritten as a property of the training type plugin. I was hoping to get to that in #6090.
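The refactor suggested in the reply, letting each training type plugin declare whether it is distributed instead of the trainer checking a `use_ddp` flag, could be sketched roughly as follows. This is a hypothetical illustration, not the actual Lightning API: the class names, the `is_distributed` property, and `needs_distributed_sampler` are all made up for this sketch.

```python
class TrainingTypePlugin:
    """Base plugin: a single-device strategy, so no distributed sampler is needed."""

    @property
    def is_distributed(self) -> bool:
        return False


class DDPPlugin(TrainingTypePlugin):
    """A multi-process data-parallel strategy that shards batches across ranks."""

    @property
    def is_distributed(self) -> bool:
        return True


def needs_distributed_sampler(plugin: TrainingTypePlugin) -> bool:
    # The trainer queries the plugin rather than inspecting a `use_ddp`
    # flag, so a new strategy (e.g. an FSDP plugin) opts in simply by
    # overriding the property instead of being added to a flag check.
    return plugin.is_distributed
```

With this shape, the sampler decision lives with the strategy that requires it, which is the property-of-the-plugin design the reply proposes.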