This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

New version of transformers #4018

Merged
merged 3 commits into from
Apr 4, 2020

Conversation

@dirkgr (Member) commented Apr 4, 2020

No description provided.

@matt-gardner (Contributor) left a comment

A couple of questions, but if you think things are ok, then go for it.

@@ -258,7 +269,7 @@ def test_token_idx_sentence_pairs(self):
     ".",
     "</s>",
     "</s>",
-    "It",
+    "ĠIt",
Contributor

Do you know why it's adding spaces to the first token? Is that expected?

Member Author

I think it's a change in defaults.

At first, people thought that Ġ means that this is the beginning of the word (opposite of ## in BERT). So then people were complaining about the "bug" where the Ġ wasn't there at the beginning of the first token, and they made a fix. But now it means "space", so not having it makes sense ... I want to not get involved and just stick with the huggingface default.
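
For readers unfamiliar with the symbol, here is a minimal sketch of where `Ġ` comes from. It rebuilds a byte-to-character table in the style of GPT-2's byte-level BPE, in which the space byte is rendered as `Ġ`. The helper `mark_spaces` and its `prefix_space` flag are hypothetical names made up for this illustration; they only stand in for whatever default changed upstream, and the whitespace split is a simplification rather than the real tokenizer.

```python
import re


def bytes_to_unicode():
    """Build a byte -> printable character table in the style of GPT-2's byte-level BPE."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes (including the space, 0x20) are moved past U+00FF.
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


BYTE_TABLE = bytes_to_unicode()


def mark_spaces(text, prefix_space=False):
    """Hypothetical helper: split on spaces and show each piece with visible byte symbols."""
    if prefix_space and not text.startswith(" "):
        text = " " + text
    # Each piece keeps the single space that precedes it, if there is one.
    pieces = re.findall(r" ?[^ ]+", text)
    return ["".join(BYTE_TABLE[b] for b in piece.encode("utf-8")) for piece in pieces]


print(mark_spaces("It is"))                     # first piece has no leading space
print(mark_spaces("It is", prefix_space=True))  # first piece now starts with the space symbol
```

Under this reading, a leading `Ġ` on the first token simply means a space was prepended before tokenization, which is consistent with the changed test expectation above.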

setup.py Outdated
@@ -119,7 +119,7 @@
     "flaky",
     "responses>=0.7",
     "conllu==2.3.2",
-    "transformers>=2.4.0,<2.5.0",
+    "transformers>=2.6.0",
Contributor

Are you sure you don't want to keep an upper bound?

Member Author

Probably a good idea. Added in 3264dcc.
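
To illustrate why an upper bound matters, here is a rough sketch of how a requirement range behaves with and without one. The `satisfies` and `parse_version` helpers are hypothetical, and the `2.7.0` upper bound is an assumption chosen for the example; the bound actually added in 3264dcc is not shown in this thread. Only plain numeric versions are handled, so this is not a substitute for a real version-specifier parser.

```python
def parse_version(version):
    """Turn '2.6.0' into (2, 6, 0); handles plain numeric release versions only."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, lower, upper=None):
    """Return True if lower <= version, and version < upper when an upper bound is given."""
    v = parse_version(version)
    if v < parse_version(lower):
        return False
    return upper is None or v < parse_version(upper)


# Without an upper bound, a future major release is accepted automatically.
print(satisfies("3.0.0", lower="2.6.0"))

# With the (assumed) upper bound, only the tested range is accepted.
print(satisfies("3.0.0", lower="2.6.0", upper="2.7.0"))
print(satisfies("2.6.1", lower="2.6.0", upper="2.7.0"))
```

The trade-off is that a bounded range has to be bumped by hand for each new release, but an unbounded one can silently pull in a release that changes tokenizer output, as the test change in this PR shows.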

2 participants