[WIP] Further memory optimization of SPHINX series models #118
This PR currently introduces three changes so that SPHINX-13B in FP16 fits on 4×16GB GPUs:
`nvidia-smi` output with the model running on 4×V100-16GB after this PR:
![image](https://private-user-images.githubusercontent.com/53928811/286529531-c14a320d-5be0-4a4e-8346-79bd08394a77.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk0NTQwMzEsIm5iZiI6MTczOTQ1MzczMSwicGF0aCI6Ii81MzkyODgxMS8yODY1Mjk1MzEtYzE0YTMyMGQtNWJlMC00YTRlLTgzNDYtNzliZDA4Mzk0YTc3LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTMlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjEzVDEzMzUzMVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTZkMzZjNDhhNmRiZTI3ZGI0MzBiM2Q5NDc3YmViNTMzMTg0ZmIxOTZlZTdhYzZjNWJiMzk4Y2VjNDFmNGFhMzAmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.lTIF8lL9UbZS5eAkZwA_jGpfT9SpWGOfe7BBDATmEds)
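
For a rough sense of why 4×16GB is a tight target, here is a back-of-the-envelope sketch of the FP16 weight footprint alone. It assumes a nominal 13e9 parameters (ignoring any extra visual-encoder weights) and an even split of weights across the four GPUs; both are assumptions for illustration, not figures taken from this PR. The helper name `fp16_weight_gib` is hypothetical. Activations, the KV cache, and the CUDA context come on top of this number, so the actual per-GPU usage is what the screenshot above reports, not what this estimate gives.

```python
def fp16_weight_gib(num_params: float, num_gpus: int, bytes_per_param: int = 2) -> float:
    """Per-GPU weight footprint in GiB.

    Assumes the weights are sharded evenly across `num_gpus` (an
    illustrative assumption, not necessarily how this PR splits them).
    FP16 stores each parameter in 2 bytes.
    """
    total_bytes = num_params * bytes_per_param
    return total_bytes / num_gpus / 2**30


# Nominal 13B parameters on 4 GPUs (assumed values, see the note above).
per_gpu = fp16_weight_gib(13e9, 4)
print(f"{per_gpu:.2f} GiB of FP16 weights per GPU")  # prints "6.05 GiB of FP16 weights per GPU"
```

Under these assumptions the weights alone take roughly 6 GiB of each 16GB card, which leaves the remainder for everything else the model needs at runtime.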