Problem with max_sequence_length in BertEmbeddings #1519
Hi @ayushjaiswal we are in the process of refactoring the transformer-based embeddings classes. See #1494. Instead of separate classes for each transformer embedding, we will have a unified class that takes the transformer model name as a string in the constructor. So initialization will look like this:

# example sentence
sentence = Sentence('The grass is green')

# a BERT model
embeddings = TransformerWordEmbeddings(model="bert-base-uncased", layers="-1", pooling_operation='first')
embeddings.embed(sentence)

# a RoBERTa model
embeddings = TransformerWordEmbeddings(model="distilroberta-base", layers="-1", pooling_operation='first')
embeddings.embed(sentence)

There is now also a corresponding document-level embeddings class. We're also looking at different ways of handling overlong sequences as part of the refactoring. We will add handling for this soon.
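A rough workaround until built-in handling lands is to split an overlong text into several Sentence objects and embed each chunk separately. This is only a sketch, under the assumption that chunking at word boundaries is acceptable for the downstream task; the chunk size of 100 words is an arbitrary, conservative choice and not a flair default:

from flair.data import Sentence
from flair.embeddings import TransformerWordEmbeddings

embeddings = TransformerWordEmbeddings(model="bert-base-uncased", layers="-1", pooling_operation='first')

# any text whose subtoken count may exceed the transformer's input limit
long_text = "The grass is green . " * 300

# split into fixed-size word chunks so no single Sentence exceeds the limit;
# 100 words per chunk is a conservative, arbitrary choice
chunk_size = 100
words = long_text.split()
chunks = [Sentence(" ".join(words[i:i + chunk_size])) for i in range(0, len(words), chunk_size)]

for chunk in chunks:
    embeddings.embed(chunk)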
@alanakbik Thanks for the quick response! Great to hear about the refactoring and handling of overlong sequences.
@alanakbik Would love to see this feature in flair!
Thanks for the pointer - yes this looks promising, so we might integrate it!
Looking forward to this 😄
@alanakbik is there any update on this? 🙂
Unfortunately, we haven't gotten around to this yet. But you could try the recently added "longformer" models, which can handle longer sequences:

embeddings = TransformerWordEmbeddings('allenai/longformer-base-4096')
embeddings.embed(sentence)
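For completeness, a self-contained version of that snippet (the example sentence is illustrative; 'allenai/longformer-base-4096' accepts up to 4096 subtokens, so long inputs can be embedded without manual trimming):

from flair.data import Sentence
from flair.embeddings import TransformerWordEmbeddings

# Longformer accepts much longer inputs than BERT's 512 subtokens
embeddings = TransformerWordEmbeddings('allenai/longformer-base-4096')

sentence = Sentence('The grass is green .')
embeddings.embed(sentence)

# inspect the per-token embeddings
for token in sentence:
    print(token.text, token.embedding.shape)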
Currently, BertEmbeddings does not account for the maximum sequence length supported by the underlying (transformers) BertModel. Since BERT creates subtokens, it becomes somewhat challenging to check the sequence length and trim a sentence externally before feeding it to BertEmbeddings in flair.

I see a problem in https://github.com/flairNLP/flair/blob/master/flair/embeddings.py#L2678--L2687

This is passed to the code that sets max_sequence_length in https://github.com/flairNLP/flair/blob/master/flair/embeddings.py#L2620-L2622

But this does not account for or check the maximum sequence length supported by the BERT model, which is accessible in either of the above functions through self.model.config.max_position_embeddings.
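A possible external workaround, sketched below, is to count subtokens with the transformers tokenizer and trim the sentence before handing it to BertEmbeddings. The helper truncate_to_model_max is hypothetical (not part of flair), and reserving two positions for the [CLS] and [SEP] tokens is an assumption about how BertEmbeddings builds its input:

from flair.data import Sentence
from flair.embeddings import BertEmbeddings
from transformers import BertConfig, BertTokenizer

model_name = "bert-base-uncased"
tokenizer = BertTokenizer.from_pretrained(model_name)
# 512 for bert-base-uncased; this is the limit the issue refers to
max_len = BertConfig.from_pretrained(model_name).max_position_embeddings

def truncate_to_model_max(sentence: Sentence) -> Sentence:
    # hypothetical helper: drop trailing tokens once their subtokens would
    # exceed the model's maximum, reserving two slots for [CLS] and [SEP]
    kept, used = [], 2
    for token in sentence:
        n = len(tokenizer.tokenize(token.text))
        if used + n > max_len:
            break
        used += n
        kept.append(token.text)
    return Sentence(" ".join(kept))

embeddings = BertEmbeddings(model_name)
sentence = truncate_to_model_max(Sentence("some very long text " * 200))
embeddings.embed(sentence)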