
Upgrade to HF transformers 4.3.2 #16

Merged · 1 commit · Feb 28, 2021

Conversation

vblagoje (Contributor)
@lvwerra here is a proposal to update HF Transformers to 4.3.2. The only compatibility-breaking change I found was a parameter rename in the GPT2Model#forward function (see the sketch below). I also upgraded simpletransformers to a version that uses Transformers 4.3.2. I ran all the notebooks in the nbs directory to make sure everything works as before. I did not upgrade any other libs, as I am not that familiar with the nbdev environment.
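For context, a minimal sketch of the kind of rename involved, assuming it is the `past` → `past_key_values` change that Transformers 4.0 introduced for GPT-2; the prompt and the greedy decoding step below are illustrative, not the PR diff:

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

input_ids = tokenizer("The movie was", return_tensors="pt").input_ids
out = model(input_ids, use_cache=True)  # returns logits plus the key/value cache

# pick the greedy next token just to have something to feed back in
next_token = out.logits[:, -1, :].argmax(dim=-1, keepdim=True)

# transformers < 4.0 passed the cache as `past=out.past_key_values`;
# from 4.x on, the keyword is `past_key_values`:
out = model(next_token, past_key_values=out.past_key_values)
```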

04-gpt2-sentiment-ppo-training.ipynb works as expected and trains in about 2 hours. For 05-gpt2-sentiment-control.ipynb, however, I had to lower `batch_size` to 128 and `forward_batch_size` to 4 to avoid CUDA out-of-memory errors (see the sketch below); tqdm estimates a running time of about 5 hours.
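A sketch of the adjusted settings, assuming the notebook's dict-style PPO config; the remaining keys and values are unchanged and omitted here:

```python
config = {
    # lowered from the notebook defaults to fit in GPU memory
    "batch_size": 128,        # examples per PPO step
    "forward_batch_size": 4,  # chunk size for forward passes through the models
    # ... remaining PPO hyperparameters as in the notebook
}
```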

vblagoje (Contributor, Author) commented Feb 26, 2021

@lvwerra I tried this branch on both IMDB PPO notebooks (the basic PPO sentiment training and the controlled-sentiment PPO). Both work as expected; please try it as well. Let me know if any other checks should be done.

lvwerra (Member) commented Feb 26, 2021

Awesome! Did you also use Weights & Biases? In case you did, would you mind sharing the logs?

vblagoje (Contributor, Author)

Yes, I did, but I deleted the first report for 04-gpt2-sentiment-ppo-training.ipynb. Here is the report for 05-gpt2-sentiment-control.ipynb.

lvwerra merged commit 750f5fd into huggingface:master on Feb 28, 2021