Add new ELMoTransformerEmbeddings class #399
Conversation
I am getting the following error when running the script:

```
Traceback (most recent call last):
  File "/home/aakbik/PycharmProjects/flair/local_test_local.py", line 15, in <module>
    ELMoTransformerEmbeddings(model_file='/home/aakbik/Documents/Data/Embeddings/eu-elmo-transformer-model.tar.gz')
  File "/home/aakbik/PycharmProjects/flair/flair/embeddings.py", line 337, in __init__
    self.allen_nlp_utils = ELMoTransformerEmbeddings.AllenNlpUtils(model_file)
  File "/home/aakbik/PycharmProjects/flair/flair/embeddings.py", line 360, in __init__
    requires_grad=False
  File "/home/aakbik/.environments/flair/lib/python3.6/site-packages/allennlp/modules/token_embedders/bidirectional_language_model_token_embedder.py", line 68, in __init__
    archive = load_archive(archive_file, overrides=json.dumps(overrides))
  File "/home/aakbik/.environments/flair/lib/python3.6/site-packages/allennlp/models/archival.py", line 156, in load_archive
    cuda_device=cuda_device)
  File "/home/aakbik/.environments/flair/lib/python3.6/site-packages/allennlp/models/model.py", line 321, in load
    return cls.by_name(model_type)._load(config, serialization_dir, weights_file, cuda_device)
  File "/home/aakbik/.environments/flair/lib/python3.6/site-packages/allennlp/common/registrable.py", line 58, in by_name
    raise ConfigurationError("%s is not a registered name for %s" % (name, cls.__name__))
allennlp.common.checks.ConfigurationError: 'language_model is not a registered name for Model'
```

Since I haven't worked with allennlp much: any ideas where this error is coming from? |
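For background: `allennlp` resolves model types by name through its `Registrable` base class, and the error above means that no model class was registered under the name `language_model` in the installed version. The following is a minimal, self-contained sketch of how such a name registry works in general; it is illustrative only, not allennlp's actual implementation, and the class and model names are invented:

```python
# Minimal sketch of a name-based class registry, similar in spirit to
# allennlp's Registrable. Illustrative only; not allennlp's actual code.

class ConfigurationError(Exception):
    pass

class Registrable:
    _registry = {}  # maps base class -> {name: subclass}

    @classmethod
    def register(cls, name):
        def decorator(subclass):
            cls._registry.setdefault(cls, {})[name] = subclass
            return subclass
        return decorator

    @classmethod
    def by_name(cls, name):
        subclasses = cls._registry.get(cls, {})
        if name not in subclasses:
            # This mirrors the error seen in the traceback above.
            raise ConfigurationError(
                "%s is not a registered name for %s" % (name, cls.__name__)
            )
        return subclasses[name]

class Model(Registrable):
    pass

@Model.register("bidirectional_lm")
class BidirectionalLanguageModel(Model):
    pass

# A registered name resolves to its class:
assert Model.by_name("bidirectional_lm") is BidirectionalLanguageModel

# An unregistered name (e.g. 'language_model') raises ConfigurationError:
try:
    Model.by_name("language_model")
except ConfigurationError as e:
    print(e)  # language_model is not a registered name for Model
```

This is why the version of `allennlp` matters here: a model archive saved with a newer version can reference a registered name that simply does not exist in an older installation.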
@alanakbik You should use a recent `allennlp` version. |
Ah ok, I will! |
@alanakbik It would work with the latest `allennlp` master when the model is saved with:

```python
torch.save(model_state, str(model_file), pickle_module=dill)
```

Do you think we can add a new parameter for specifying the pickle module? |
@stefan-it yes I think that would be better since it would make it easier for people to use the class (just install allennlp and dill)! |
I rebased the code on latest Testing was done successfully on CPU, on GPU I'm currently not able to train a model, see #407. |
Mmhh.. I guess something went wrong with the rebasing, as the changes which are already in the master branch are shown as changes of this PR. Would you mind updating the branch again so that we only see the changes you actually made? Thanks! |
This supports different pickle modules (like the dill library). The SequenceTagger class gets a new variable for specifying a pickle module; the default is the standard pickle module. In order to use the new ELMoTransformerEmbeddings class, just use dill.
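As a rough sketch of what such a pluggable pickle module looks like (all names here are hypothetical, not flair's actual API): a save/load pair can accept a `pickle_module` parameter defaulting to the standard library's `pickle`, so that callers can pass `dill` instead, mirroring `torch.save(..., pickle_module=dill)`:

```python
import io
import pickle

# Sketch of a save/load pair with a pluggable pickle module, defaulting to the
# standard library's pickle. Callers whose objects the default pickler cannot
# handle (e.g. some closures) could pass dill instead, as with
# torch.save(..., pickle_module=dill). Function names here are hypothetical.

def save_model_state(model_state, file_obj, pickle_module=pickle):
    # pickle_module only needs dump()/load() functions, so any
    # pickle-compatible module (pickle, dill, cloudpickle) works.
    pickle_module.dump(model_state, file_obj)

def load_model_state(file_obj, pickle_module=pickle):
    return pickle_module.load(file_obj)

# Round-trip with the default (standard pickle):
buffer = io.BytesIO()
state = {"embedding_dim": 1024, "tag_type": "pos"}
save_model_state(state, buffer)
buffer.seek(0)
print(load_model_state(buffer))  # {'embedding_dim': 1024, 'tag_type': 'pos'}
```

Defaulting to standard `pickle` keeps existing saved models loadable, while `dill` only becomes a requirement for users of the new embeddings class.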
👍 Thanks! Looks good. |
Sorry for the confusion! |
Hi @stefan-it - I'm testing the current version locally with allennlp 0.8.1. When running this code:

```python
embeddings = ELMoTransformerEmbeddings('eu-elmo-transformer-model.tar.gz')
embeddings.embed(Sentence('I love Berlin'))
```

I get the error:

```
Traceback (most recent call last):
  File "/home/aakbik/PycharmProjects/stefan-flair/flair/train.py", line 21, in <module>
    embeddings = ELMoTransformerEmbeddings('eu-elmo-transformer-model.tar.gz')
  File "/home/aakbik/PycharmProjects/stefan-flair/flair/flair/embeddings.py", line 356, in __init__
    requires_grad=False
  File "/home/aakbik/.environments/stefan/lib/python3.6/site-packages/allennlp/modules/token_embedders/bidirectional_language_model_token_embedder.py", line 68, in __init__
    archive = load_archive(archive_file, overrides=json.dumps(overrides))
  File "/home/aakbik/.environments/stefan/lib/python3.6/site-packages/allennlp/models/archival.py", line 156, in load_archive
    cuda_device=cuda_device)
  File "/home/aakbik/.environments/stefan/lib/python3.6/site-packages/allennlp/models/model.py", line 321, in load
    return cls.by_name(model_type)._load(config, serialization_dir, weights_file, cuda_device)
  File "/home/aakbik/.environments/stefan/lib/python3.6/site-packages/allennlp/common/registrable.py", line 58, in by_name
    raise ConfigurationError("%s is not a registered name for %s" % (name, cls.__name__))
allennlp.common.checks.ConfigurationError: 'language_model is not a registered name for Model'
```

Any idea where this error comes from? |
0.8.1 is too old :( Please try the latest `allennlp` master. With the latest master, the following works:

```python
from flair.data import Sentence
from flair.embeddings import ELMoTransformerEmbeddings

sentence = Sentence("It is no longer snowing in Munich .")

embeddings = ELMoTransformerEmbeddings(model_file='eu-elmo-transformer-model.tar.gz')
embeddings.embed(sentence)

for token in sentence.tokens:
    print(token.embedding)
```

Output:

```
tensor([  6.3030, -25.0204,   3.3425,  ...,   4.0137, -29.3740,   7.8902])
tensor([ 12.4846, -19.3540,  -3.5304,  ...,   6.7518,  -6.8744,  -7.1798])
tensor([ 16.5951, -19.1923,  -6.8334,  ...,  12.1934,  -0.0000,  -0.0000])
tensor([  4.6706,  -0.0000,  -0.0000,  ...,   6.8242,  -8.0350,  -3.6316])
tensor([  5.2432, -15.5042,  -4.4524,  ...,   2.1873, -12.7269,  -5.6938])
tensor([ 20.1224, -21.2971,   2.6443,  ...,   9.2506,   1.6562,  -4.3060])
tensor([  2.4384,  -0.0000,  -7.1844,  ...,   0.0000,  -0.0000,   3.2789])
tensor([ 16.3643,  -0.0000,  -0.0000,  ..., -14.8744,  16.0890,  13.4172])
```
|
Thanks, this works - sorry, with the dill change I somehow thought that it would work with the pip-installed allennlp :) For now, we could include this class but advise people that it is experimental and requires checking out the current master branch of allennlp. As soon as a new version of allennlp that allows this class to work is pushed to pip, we could remove the experimental tag. What do you think? |
I fully agree with you :) Even in the
You can add me as a kind of maintainer for the `ELMoTransformerEmbeddings` class. |
Ok, sounds good! Will merge as soon as tests run through. |
Thanks for adding this - really look forward to seeing what people do with this and how it compares! |
Hi,

this PR introduces a new `ELMoTransformerEmbeddings` class. With the help from @brendan-ai2 it is possible to get embeddings from a transformer-based ELMo model. That model was proposed in Dissecting Contextual Word Embeddings: Architecture and Representation.

Embeddings from a transformer-based ELMo model can now be used in `flair`. The new CUDA semantics are also used. Training a model works both on CPU and GPU.

A pretrained transformer-based ELMo model for Basque can be downloaded from:

Downstream task example

To train a model for PoS tagging (in this example for Basque) with the new `ELMoTransformerEmbeddings` class, just follow these instructions:

Clone a recent version of `allennlp` and install it, e.g.:

I tested it with commit 4c5de57 of `allennlp`.

Then download a pretrained transformer-based ELMo model.

The training can be started with the following script: