Add Fine-Tunable Transformers to Flair #1492
Supporting longer texts (more than 512 subtokens) would be helpful, at least for prediction. My research shows that processing paragraphs rather than sentences decreases error by 10%.
Yes, good point - what is the 'standard' way of working around the 512-subtoken limitation of transformers? I guess the easiest would be to truncate the text to a max length of 512, but maybe there is a better way?
I have sequence tagging in mind, so truncating at prediction time is unacceptable. The text should instead be divided into splits with some overlapping context, and the predictions then reconstructed. For text classification there are some truncation strategies. In simple-transformers, however, the text is divided and each part is predicted separately; the mode of the per-split predictions is taken as the final result.
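A minimal sketch of the overlapping-split idea described above, in plain Python. The window size, overlap, and the rule for reconciling overlapping predictions are illustrative assumptions, not an existing Flair API:

```python
# Sketch: split a long token sequence into overlapping windows that each fit
# the 512-subtoken limit, tag each window separately, then reconstruct one
# prediction per token. Window size, overlap, and the merge rule are
# illustrative choices, not Flair's implementation.

def split_with_overlap(tokens, max_len=510, overlap=64):
    """Yield (start_index, window) pairs covering the whole sequence."""
    stride = max_len - overlap
    start = 0
    while True:
        yield start, tokens[start:start + max_len]
        if start + max_len >= len(tokens):
            break
        start += stride

def merge_window_tags(num_tokens, tagged_windows):
    """Reconstruct one tag per token; later windows overwrite the overlap,
    since their tokens were predicted with more left context."""
    merged = [None] * num_tokens
    for start, tags in tagged_windows:
        for i, tag in enumerate(tags):
            merged[start + i] = tag
    return merged
```

For text classification, the same splits could each be classified separately and the most frequent label taken as the final prediction, as described above.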
Thanks - yes for
Just for reference, some truncation strategies are evaluated in this paper.
GH-1492: added new BERT embeddings implementation
Fine-tuning is now part of Flair 0.5.
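For reference, a minimal sketch of how fine-tuning looks with the document-level transformer embeddings introduced in Flair 0.5. The corpus, model name, and hyperparameters below are illustrative assumptions, not a prescribed recipe:

```python
import torch
from flair.datasets import TREC_6
from flair.embeddings import TransformerDocumentEmbeddings
from flair.models import TextClassifier
from flair.trainers import ModelTrainer

# corpus choice is just an example; any Flair classification corpus works
corpus = TREC_6()
label_dict = corpus.make_label_dictionary()

# fine_tune=True keeps the transformer weights trainable instead of frozen
document_embeddings = TransformerDocumentEmbeddings('bert-base-uncased', fine_tune=True)

classifier = TextClassifier(document_embeddings, label_dictionary=label_dict)

# an Adam-style optimizer and a small learning rate are the usual choices for
# transformer fine-tuning; the exact values here are assumptions
trainer = ModelTrainer(classifier, corpus, optimizer=torch.optim.AdamW)
trainer.train('resources/taggers/trec-bert',
              learning_rate=3e-5,
              mini_batch_size=16,
              max_epochs=5)
```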
We currently support word embeddings from Huggingface's various transformer models (BERT, XLM, etc.), but two important features are missing: (1) we do not yet support sentence embeddings extracted directly from the transformer model using the [CLS] token, and (2) the transformers are not fine-tunable through Flair. This is a shame, since transformers really shine when sentence embeddings are extracted directly from a fine-tuned model.
So with this issue, we want to add both: (1) sentence embeddings extracted from the [CLS] token and (2) the ability to fine-tune transformer models directly in Flair.
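For context, feature (1) refers to taking the final hidden state of the [CLS] token as the sentence representation. A rough sketch of that extraction with the Huggingface transformers API (assuming a recent transformers version; the model name is chosen only for illustration, and no fine-tuning is involved here):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
model = AutoModel.from_pretrained('bert-base-uncased')

inputs = tokenizer("A sentence to embed.", return_tensors='pt',
                   truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)

# the hidden state of the first token ([CLS]) serves as the sentence embedding
cls_embedding = outputs[0][:, 0, :]   # shape: (1, hidden_size)
```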