Fix transformer smaller training vocab #3155

helpmefindaname
Member

This PR fixes the usage of the transformer smaller training vocab and improves the documentation:

  • Fix: the reduced vocabulary is now actually used during training, instead of the vocab size only being reduced temporarily before training starts.
  • Require a newer version of the library, which brings two changes:
    • Embedding parameters that are not trainable but are present in the optimizer now correctly stay non-trainable. Hence, running embeddings.model.embeddings.word_embeddings.requires_grad_(False) before training works with reduce=True.
    • The config now correctly sets the vocab size, so reduced models can be saved and later loaded as such.
  • The transformer smaller training vocab is now documented in the tutorials.
  • The ONNX tutorial is slightly improved to fit the newer structure.
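
For readers unfamiliar with the idea, the following is a minimal sketch of what "training with a smaller vocab" means conceptually. It is not the library's implementation and does not use its API: the helpers `reduce_vocab` and `restore_vocab` and the toy data are hypothetical, written in plain Python. It only illustrates the point of the first bullet, namely that the training step has to operate on the reduced table and that the updated rows are written back afterwards.

```python
from typing import Dict, List, Sequence, Tuple

Matrix = List[List[float]]


def reduce_vocab(full: Matrix, used_ids: Sequence[int]) -> Tuple[Matrix, Dict[int, int]]:
    """Copy only the embedding rows for token ids seen in the training data."""
    kept = sorted(set(used_ids))
    old_to_new = {old: new for new, old in enumerate(kept)}
    reduced = [list(full[old]) for old in kept]
    return reduced, old_to_new


def restore_vocab(full: Matrix, reduced: Matrix, old_to_new: Dict[int, int]) -> None:
    """Write the (possibly updated) reduced rows back into the full table."""
    for old, new in old_to_new.items():
        full[old] = list(reduced[new])


# Hypothetical toy data: a 6-token vocabulary with 2-dimensional embeddings.
full_table = [[float(i), float(i)] for i in range(6)]
train_token_ids = [4, 1, 4, 2]

reduced_table, mapping = reduce_vocab(full_table, train_token_ids)

# Stand-in for a training step: it updates the reduced table, not the full one.
for row in reduced_table:
    row[0] += 0.5

restore_vocab(full_table, reduced_table, mapping)
```

In this sketch, rows for unused token ids are never touched, which is where the memory and speed savings of a reduced vocabulary would come from in a real model.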

@alanakbik
Collaborator

Thanks for this @helpmefindaname! I tested on our cluster - it works with 'distilbert-base-uncased', but running a FLERT training script with 'xlm-roberta-large' throws a weird CUDA error (RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)`). Can you check?

@helpmefindaname
Member Author

> Thanks for this @helpmefindaname! I tested on our cluster - it works with 'distilbert-base-uncased', but running a FLERT training script with 'xlm-roberta-large' throws a weird CUDA error (RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)`). Can you check?

Thanks for pointing this out. This should be fixed with version 0.2.3.

@alanakbik
Collaborator

@helpmefindaname thanks, it works now!

@alanakbik alanakbik merged commit 25ebf38 into flairNLP:master Mar 29, 2023
@helpmefindaname helpmefindaname deleted the fix_transformer_smaller_training_vocab branch March 29, 2023 13:52
2 participants