This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Update transformers requirement from <3.6,>=3.4 to >=3.4,<4.1 #4831

Merged
6 commits merged on Dec 11, 2020
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -18,6 +18,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 - Renamed module `allennlp.data.tokenizers.token` to `allennlp.data.tokenizers.token_class` to avoid
   [this bug](https://github.com/allenai/allennlp/issues/4819).
+- `transformers` dependency updated to version 4.0.1.

### Fixed

8 changes: 4 additions & 4 deletions allennlp/nn/util.py
@@ -1767,10 +1767,10 @@ def find_embedding_layer(model: torch.nn.Module) -> torch.nn.Module:
"""
# We'll look for a few special cases in a first pass, then fall back to just finding a
# TextFieldEmbedder in a second pass if we didn't find a special case.
from transformers.modeling_gpt2 import GPT2Model
from transformers.modeling_bert import BertEmbeddings
from transformers.modeling_albert import AlbertEmbeddings
from transformers.modeling_roberta import RobertaEmbeddings
from transformers.models.gpt2.modeling_gpt2 import GPT2Model
from transformers.models.bert.modeling_bert import BertEmbeddings
from transformers.models.albert.modeling_albert import AlbertEmbeddings
from transformers.models.roberta.modeling_roberta import RobertaEmbeddings
from allennlp.modules.text_field_embedders.text_field_embedder import TextFieldEmbedder
from allennlp.modules.text_field_embedders.basic_text_field_embedder import (
BasicTextFieldEmbedder,
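
These import changes track the transformers 4.0 reorganization, which moved each model's code into its own subpackage under `transformers.models.*`. The PR pins transformers >= 4.0 and switches to the new paths outright; as a sketch only (not something this PR does), code that had to support both layouts could guard the import:

```python
# Sketch, not from this PR: tolerate both the transformers 3.x flat layout
# and the 4.x per-model subpackage layout for a single import.
try:
    # transformers >= 4.0: per-model subpackages
    from transformers.models.bert.modeling_bert import BertEmbeddings
except ImportError:
    # transformers < 4.0: flat top-level modules
    from transformers.modeling_bert import BertEmbeddings
```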
3 changes: 2 additions & 1 deletion setup.py
@@ -64,7 +64,8 @@
"scikit-learn",
"scipy",
"pytest",
"transformers>=3.4,<3.6",
"transformers>=4.0,<4.1",
"sentencepiece",
Contributor:
As per the release notes, sentencepiece is no longer a required dependency of transformers by default, but we use certain tokenizers from HF that require it.

"jsonpickle",
"dataclasses;python_version<'3.7'",
"filelock>=3.0,<3.1",
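
To illustrate the comment above: under transformers 4.x, sentencepiece became an optional dependency, and the slow tokenizers for SentencePiece-based models (ALBERT, T5, XLNet, and others) import it at load time. A minimal sketch of the failure mode this pin avoids, using `albert-base-v2` purely as an example model:

```python
from transformers import AlbertTokenizer

# The slow AlbertTokenizer is backed by a SentencePiece model file; under
# transformers 4.x, loading it errors out unless the sentencepiece package
# is installed, which is why setup.py now lists it explicitly.
tokenizer = AlbertTokenizer.from_pretrained("albert-base-v2")
print(tokenizer.tokenize("AllenNLP is great!"))
```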
@@ -99,7 +99,7 @@ def test_transformers_vocab_sizes(self, model_name):

 def test_transformers_vocabs_added_correctly(self):
     namespace, model_name = "tags", "roberta-base"
-    tokenizer = cached_transformers.get_tokenizer(model_name)
+    tokenizer = cached_transformers.get_tokenizer(model_name, use_fast=False)
Member:
Why use_fast=False?

Contributor:
RobertaTokenizerFast does not have the `encoder` attribute, which we use in this test case.

     allennlp_tokenizer = PretrainedTransformerTokenizer(model_name)
     indexer = PretrainedTransformerIndexer(model_name=model_name, namespace=namespace)
     allennlp_tokens = allennlp_tokenizer.tokenize("AllenNLP is great!")
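
For context on the thread above: the slow RobertaTokenizer is a pure-Python BPE tokenizer whose vocabulary lives in an `encoder` dict, while the Rust-backed RobertaTokenizerFast does not expose that attribute. A small sketch against the HF API directly (the test reaches it through AllenNLP's `cached_transformers` wrapper, which passes `use_fast=False` through to the underlying loader):

```python
from transformers import AutoTokenizer

# Slow (pure-Python) tokenizer: exposes the BPE token -> id dict `encoder`.
slow = AutoTokenizer.from_pretrained("roberta-base", use_fast=False)
print(hasattr(slow, "encoder"))   # True

# Fast (Rust-backed) tokenizer: no `encoder` attribute; the vocabulary
# has to be reached through get_vocab() instead.
fast = AutoTokenizer.from_pretrained("roberta-base", use_fast=True)
print(hasattr(fast, "encoder"))   # False
print(len(fast.get_vocab()))      # vocab is still accessible (~50k entries)
```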