GH-2640: tensor forward #2643
Conversation
The ONNX conversions can be tested via the following example code:

```python
import torch

from flair.data import Sentence
from flair.datasets import (
    CONLL_03,
    GLUE_MNLI,
    NEL_ENGLISH_AQUAINT,
    RE_ENGLISH_CONLL04,
    UD_ENGLISH,
)
from flair.embeddings import TransformerDocumentEmbeddings, WordEmbeddings
from flair.models import (
    DependencyParser,
    EntityLinker,
    RelationExtractor,
    SequenceTagger,
    TextClassifier,
    TextPairClassifier,
    WordTagger,
)
from flair.models.diagnosis.distance_prediction_model import DistancePredictor
from flair.models.text_regression_model import TextRegressor


def convert_text_regression():
    model = TextRegressor(
        TransformerDocumentEmbeddings("distilbert-base-uncased"),
    )
    example_sentence = Sentence("This is a sentence.")
    tensors = model._prepare_tensors([example_sentence])
    torch.onnx.export(
        model,
        tensors,
        "textregression.onnx",
        input_names=["text_embedding_tensor"],
        output_names=["scores"],
        opset_version=12,
        verbose=True,
    )


def convert_distance_predictor():
    model = DistancePredictor(WordEmbeddings("turian"))
    example_sentence = Sentence("This is a sentence.")
    tensors = model._prepare_tensors([example_sentence])
    torch.onnx.export(
        model,
        tensors,
        "distance_predictor.onnx",
        input_names=["text_embedding_tensor"],
        output_names=["label_scores"],
        opset_version=12,
        verbose=True,
    )


def convert_word_tagger():
    corpus = CONLL_03()
    dictionary = corpus.make_label_dictionary("ner")
    model = WordTagger(embeddings=WordEmbeddings("turian"), tag_dictionary=dictionary, tag_type="ner")
    example_sentence = corpus.train[0]
    longer_sentence = corpus.train[1]
    tensors = model._prepare_tensors([example_sentence, longer_sentence])
    torch.onnx.export(
        model,
        tensors,
        "word_tagger.onnx",
        input_names=["embedded_tokens"],
        output_names=["scores"],
        opset_version=12,
        verbose=True,
    )


def convert_text_classifier():
    model = TextClassifier.load("de-offensive-language")
    example_sentence = Sentence("This is a sentence.")
    tensors = model._prepare_tensors([example_sentence])
    torch.onnx.export(
        model,
        tensors,
        "textclassifier.onnx",
        input_names=["text_embedding_tensor"],
        output_names=["scores"],
        opset_version=12,
        verbose=True,
    )


def convert_sequence_tagger():
    model = SequenceTagger.load("ner-fast")
    example_sentence = Sentence("This is a sentence.")
    longer_sentence = Sentence("This is a way longer sentence to ensure varying lengths work with LSTM.")
    tensors = model._prepare_tensors([example_sentence, longer_sentence])
    torch.onnx.export(
        model,
        tensors,
        "sequencetagger.onnx",
        input_names=["sentence_tensor"],
        output_names=["scores"],
        opset_version=12,
        verbose=True,
    )


def convert_dependency_parser():
    corpus = UD_ENGLISH()
    dictionary = corpus.make_label_dictionary("dependency")
    model = DependencyParser(token_embeddings=WordEmbeddings("turian"), relations_dictionary=dictionary)
    example_sentence = Sentence("This is a sentence.")
    longer_sentence = Sentence("This is a way longer sentence to ensure varying lengths work with LSTM.")
    tensors = model._prepare_tensors([example_sentence, longer_sentence])
    torch.onnx.export(
        model,
        tensors,
        "dependencyparser.onnx",
        input_names=["sentence_tensor", "lengths"],
        output_names=["score_arc", "score_rel"],
        opset_version=12,
        verbose=True,
    )


def convert_entity_linker():
    corpus = NEL_ENGLISH_AQUAINT()
    dictionary = corpus.make_label_dictionary("nel")
    model = EntityLinker(word_embeddings=WordEmbeddings("turian"), label_dictionary=dictionary)
    example_sentence = corpus.train[0]
    longer_sentence = corpus.train[1]
    tensors = model._prepare_tensors([example_sentence, longer_sentence])
    torch.onnx.export(
        model,
        tensors,
        "entity_linker.onnx",
        input_names=["text_embedding_tensor"],
        output_names=["scores"],
        opset_version=12,
        verbose=True,
    )


def convert_text_pair_classifier():
    corpus = GLUE_MNLI()
    dictionary = corpus.make_label_dictionary("entailment")
    model = TextPairClassifier(
        document_embeddings=TransformerDocumentEmbeddings("distilbert-base-uncased"),
        label_type="entailment",
        label_dictionary=dictionary,
    )
    example_sentence = corpus.train[0]
    longer_sentence = corpus.train[1]
    tensors = model._prepare_tensors([example_sentence, longer_sentence])
    torch.onnx.export(
        model,
        tensors,
        "textpair_classifier.onnx",
        input_names=["text_pair_embedding_tensor"],
        output_names=["scores"],
        opset_version=12,
        verbose=True,
    )


def convert_extractor_model():
    model = RelationExtractor.load("relations")
    corpus = RE_ENGLISH_CONLL04()
    example_sentence = corpus.train[0]
    longer_sentence = corpus.train[1]
    tensors = model._prepare_tensors([example_sentence, longer_sentence])
    torch.onnx.export(
        model,
        tensors,
        # use a distinct file name so the text pair classifier export is not overwritten
        "relation_extractor.onnx",
        input_names=["text_pair_embedding_tensor"],
        output_names=["scores"],
        opset_version=12,
        verbose=True,
    )


if __name__ == "__main__":
    convert_distance_predictor()
    convert_word_tagger()
    convert_text_regression()
    convert_extractor_model()
    convert_text_pair_classifier()
    convert_entity_linker()
    convert_dependency_parser()
    convert_sequence_tagger()
    convert_text_classifier()
```
Any chance of this making it into the main tree? I love Flair, and ONNX models are a must for production code!
Hello @bratao, yes, we are looking into this now. Sorry @helpmefindaname for taking so long; I first wanted to release Flair 0.11 (done yesterday) before making bigger changes.
@helpmefindaname I started reviewing, but it will take some time to think through all the changes. The logic of splitting the tensor and non-tensor parts of the forward pass makes sense for ONNX, but I worry about code readability now that the logic is distributed across many methods and different parent classes.

There is also a small problem with models that require candidates and get passed sentences without candidates, see:

```python
from flair.data import Dictionary, Sentence
from flair.embeddings import TransformerWordEmbeddings
from flair.models import EntityLinker

# init a random linker for testing
linker: EntityLinker = EntityLinker(
    TransformerWordEmbeddings(model="distilbert-base-uncased"),
    label_dictionary=Dictionary(),
)

# sentence with candidate label - works
sentence = Sentence("I live in Berlin")
sentence[3:4].add_label("nel", "LOC")
linker.predict(sentence)
print(sentence)

# sentence without candidate - fails
sentence = Sentence("I live in Berlin")
linker.predict(sentence)
```
Hi @alanakbik, I made an attempt to simplify the interfaces; you can see it by looking at this single commit. I understand that it might be easy to add labels and embeddings that diverge. Now the Model itself only needs to implement 3 methods, each of which can be implemented within 4 LOC (when kept simple), plus two additional ones. What do you think about that?
Hello @helpmefindaname, from a first look-through I like this structure a lot! Some suggestions:

That would reduce the code needed to 6 methods for a standard default classifier with no extras, with all the actually important logic in the first three methods, which is cool.
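As a rough illustration of the split being discussed (a toy sketch only: plain lists stand in for tensors, and none of these names or weights are Flair's real API):

```python
class ToyClassifier:
    """Toy sketch of the interface split: _prepare_tensors does all
    non-tensor work, forward sees only numeric data."""

    def __init__(self, label_names):
        self.label_names = label_names
        self.weights = [0.5, -0.5]  # invented: one weight per label

    def _prepare_tensors(self, sentences):
        # non-tensor work: turn data points into plain numeric features
        features = [[float(len(s.split()))] for s in sentences]
        return (features,)

    def forward(self, features):
        # tensor-only work: this is the part that could be traced/exported
        return [[f[0] * w for w in self.weights] for f in features]

    def predict(self, sentences):
        scores = self.forward(*self._prepare_tensors(sentences))
        return [
            self.label_names[max(range(len(s)), key=s.__getitem__)]
            for s in scores
        ]


model = ToyClassifier(["POS", "NEG"])
predictions = model.predict(["short one", "a slightly longer sentence"])
# predictions == ["POS", "POS"]
```

The point of the split is that `predict` (and `forward_loss`) become thin compositions of the two core methods, so an exported graph only needs to capture `forward`.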
A small change to the interface: the RelationExtractor logic could be simplified, as its logic for extracting labels duplicated logic that already exists elsewhere.
```python
if labels.size(0) == 0:
    return torch.tensor(0.0, requires_grad=True, device=flair.device), 1

embedded_tensor = self._prepare_tensors(sentences)
```
This could cause some problems, I think. Four lines above, the `predict_data_points` are extracted using `self._get_prediction_data_points(sentences)`, and the labels are extracted from them. But here the `sentences` are passed into `self._prepare_tensors`, which first applies a filter and then again calls `self._get_prediction_data_points(sentences)`. So it is possible that the data points extracted after the filter diverge from the data points used to get the labels.

Maybe this `_prepare_tensors` could take `predict_data_points` as input instead of `sentences`? The filtering could be done beforehand as the first line in `forward_loss`.
Very good catch! Although the filtering is usually for sentences that do not have any labels, it could be misused and create some divergence. I moved the filtering to be the first thing done to a batch, even before calling `_prepare_tensors`.

Sadly, I cannot change the signature of `_prepare_tensors`, as it would otherwise no longer be in line with the general `flair.nn.Model` and would make it very complicated to use JIT or ONNX exports.
@helpmefindaname thanks for fixing the rebase conflicts!
Thanks a lot for adding this to Flair!
First PR about #2640: refactoring the models such that each model has a `_prepare_tensors` and a `forward` method, where the former extracts all tensors out of the data points and the latter only does tensor computations. That way, JIT tracing and ONNX conversion of the models should be possible.

Notice: this is not expected to give a huge speedup, as most computation happens within the embeddings, which won't be affected. For that reason, the ONNX conversion is also not documented.

This PR also:

- adds unit scaling for model downloads when a non-huggingface model is downloaded.
- fixes a bug so that relation extraction models without a `weight_dict` set can be loaded.
- fixes an encoding error when GLUE-MNLI is loaded on a Windows machine.
- adds an option to not add an `unk` token to labels.
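The reason the `_prepare_tensors` / `forward` split enables JIT tracing is that `torch.jit.trace` can only record operations on tensors. A minimal, Flair-independent sketch (the module, its shapes, and its names are invented for illustration):

```python
import torch


class TensorOnlyHead(torch.nn.Module):
    """Invented module: forward() receives only tensors, mirroring the
    _prepare_tensors / forward split, so torch.jit.trace can record it."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 2)

    def forward(self, embedding_tensor):
        return self.linear(embedding_tensor)


head = TensorOnlyHead().eval()
example = torch.zeros(3, 4)  # example input used for tracing
traced = torch.jit.trace(head, example)

# the traced module agrees with the eager module on fresh inputs
x = torch.randn(5, 4)
assert torch.allclose(traced(x), head(x))
```

Anything non-tensor (string handling, building `Sentence` objects, label lookup) would be baked in as constants by the tracer, which is exactly why it has to stay in `_prepare_tensors`.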