
[Announcement] Changing model type of Barthez #9422

Closed
patrickvonplaten opened this issue Jan 5, 2021 · 14 comments

@patrickvonplaten
Contributor

patrickvonplaten commented Jan 5, 2021

We are currently undertaking a major refactoring of Bart-like models, as shown in #9343.

After the refactoring, the Barthez models would no longer work with the AutoModel and AutoModelForSeq2SeqLM classes: Barthez actually corresponds to the mBART model structure rather than the Bart structure (compare the PR in #9343), yet its checkpoints have bart as the model type and BartForConditionalGeneration as the default architecture.

To keep the Barthez models working after the PR is merged, the model type of those checkpoints needs to be changed on the Hub to mbart: https://huggingface.co/models?search=barthez. Since MBart is identical to Bart prior to the merge of the above PR, the change won't affect older versions.
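
For reference, a quick way to check which model type a checkpoint currently resolves to is to load its config (a minimal sketch; after the change, this would be expected to print mbart instead of bart):

from transformers import AutoConfig

# Fetch the hosted config for the Barthez checkpoint and inspect its model_type.
config = AutoConfig.from_pretrained("moussaKam/barthez")
print(config.model_type)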

I want to make the change soon and just wanted to ping you, @moussaKam. Please let me know if you are not happy with it or have any questions.

@patrickvonplaten
Contributor Author

Applied the change

@moussaKam
Contributor

Hi @patrickvonplaten. Sorry for the late reply.

Actually, I tested the model with BartForConditionalGeneration and everything was working well. However, after the modification I am getting the following error:

Unrecognized configuration class for this kind of AutoModel: AutoModelForMaskedLM. Model type should be one of LayoutLMConfig, DistilBertConfig, AlbertConfig, BartConfig, CamembertConfig, XLMRobertaConfig, LongformerConfig, RobertaConfig, SqueezeBertConfig, BertConfig, MobileBertConfig, FlaubertConfig, XLMConfig, ElectraConfig, ReformerConfig, FunnelConfig, MPNetConfig, TapasConfig.

Maybe we only need to change model_type (even if I am not sure why) and not the architecture, because mBART itself uses BartForConditionalGeneration.

We still have a problem with the tokenizer when using AutoTokenizer:

Tokenizer class BarthezTokenizer does not exist or is not currently imported.

Is it possible to force the API to import and use BarthezTokenizer instead of AutoTokenizer?
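
As a workaround, assuming the installed transformers version exposes BarthezTokenizer (it is sentencepiece-based, so sentencepiece must be installed), the class can be loaded explicitly instead of going through AutoTokenizer:

from transformers import BarthezTokenizer

# Load the tokenizer class directly rather than resolving it via AutoTokenizer.
barthez_tokenizer = BarthezTokenizer.from_pretrained("moussaKam/barthez")
print(barthez_tokenizer.mask_token)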

@patrickvonplaten
Contributor Author

Hey @moussaKam,

Thanks for your answer! Yeah, the AutoTokenizer is still a problem and actually showcases a deeper issue we're having with the AutoTokenizers in the lib. We'll need a new design, something like what is proposed in #9305, to fix this. It's on my todo list.

@patrickvonplaten
Contributor Author

Regarding the error with AutoTokenizer, I cannot reproduce it :-/ Could you maybe provide a code snippet showcasing the problem?

@moussaKam
Contributor

Hi @patrickvonplaten,

Here's a snippet:

text_sentence = "Paris est la capitale de la <mask>"
import torch

from transformers import (
    AutoTokenizer,
    BartForConditionalGeneration
)

barthez_tokenizer = AutoTokenizer.from_pretrained("moussaKam/barthez")
barthez_model = BartForConditionalGeneration.from_pretrained("moussaKam/barthez")

input_ids = torch.tensor(
    [barthez_tokenizer.encode(text_sentence, add_special_tokens=True)]
)
mask_idx = torch.where(input_ids == barthez_tokenizer.mask_token_id)[1].tolist()[0]

predict = barthez_model.forward(input_ids)[0]

barthez_tokenizer.decode(predict[:, mask_idx, :].topk(5).indices[0])
This raises:

----> 9 barthez_tokenizer = AutoTokenizer.from_pretrained("moussaKam/barthez")
     10 barthez_model = BartForConditionalGeneration.from_pretrained("moussaKam/barthez")
     11 

~/anaconda3/envs/transformers/lib/python3.8/site-packages/transformers-4.1.1-py3.8.egg/transformers/models/auto/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
    357 
    358             if tokenizer_class is None:
--> 359                 raise ValueError(
    360                     "Tokenizer class {} does not exist or is not currently imported.".format(tokenizer_class_candidate)
    361                 )

ValueError: Tokenizer class BarthezTokenizer does not exist or is not currently imported.

The expected output (if we use BarthezTokenizer instead of AutoTokenizer):

'France culture francophonie gastronomie mode'

@patrickvonplaten
Contributor Author

OK, @LysandreJik found a nice fix for the tokenizer. Regarding the model, I think from now on we should use MBart for Barthez, since after the new release Bart is no longer compatible with Barthez.

@LysandreJik
Member

However, there seems to be an issue remaining with the BarthezTokenizer, as the code shared by @moussaKam outputs the following in v4.1.0:

France culture francophonie gastronomie mode

but outputs the following on master:

ompeolin corporelleenfin1-1

It also mentions the following:

Some weights of the model checkpoint at moussaKam/barthez were not used when initializing BartForConditionalGeneration: ['encoder.layer_norm.weight', 'encoder.layer_norm.bias', 'decoder.layer_norm.weight', 'decoder.layer_norm.bias']

@LysandreJik
Member

My bad, changing from BartForConditionalGeneration to MBartForConditionalGeneration fixes the issue.
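
For clarity, here is a minimal sketch of that substitution applied to the snippet shared above (same checkpoint and prompt; not an official recipe):

import torch
from transformers import BarthezTokenizer, MBartForConditionalGeneration

tokenizer = BarthezTokenizer.from_pretrained("moussaKam/barthez")
model = MBartForConditionalGeneration.from_pretrained("moussaKam/barthez")

text = "Paris est la capitale de la <mask>"
input_ids = torch.tensor([tokenizer.encode(text, add_special_tokens=True)])
mask_idx = torch.where(input_ids == tokenizer.mask_token_id)[1].tolist()[0]

# Decode the top-5 predictions for the masked position.
logits = model(input_ids)[0]
print(tokenizer.decode(logits[:, mask_idx, :].topk(5).indices[0]))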

@patrickvonplaten
Contributor Author

Yeah, Barthez is the only model that is no longer compatible with Bart going forward; we have to stick to MBart. But the model architecture corresponds 1-to-1 to MBart, so I think it's fine. Hope that's OK with you, @moussaKam.

@moussaKam
Contributor

It's OK with me, @patrickvonplaten, as long as BARThez works well with AutoModel. Currently the shared code outputs (on master):

'France culture francophonie gastronomie mode' if we use MBartForConditionalGeneration
'édappraiav comme' if we use AutoModel
'ompeolin corporelleenfin1-1' if we use BartForConditionalGeneration

@patrickvonplaten
Contributor Author

Ah yeah, so instead of AutoModel you'll have to use AutoModelForSeq2SeqLM.
And it should no longer work on master with BartForConditionalGeneration, only with MBartForConditionalGeneration. Is the output of MBartForConditionalGeneration correct/reasonable in your opinion?

=> So the model classes to use in the future are AutoModelForSeq2SeqLM (as before) and MBartForConditionalGeneration (this worked before as well), but BartForConditionalGeneration should not work anymore.

If you could verify that this is actually the case on master now, that would be super nice.
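
A minimal check along those lines (a sketch, assuming the checkpoint's model_type has already been switched to mbart on the Hub):

from transformers import AutoModelForSeq2SeqLM

# With model_type set to "mbart", the auto class should resolve the checkpoint
# to the MBart architecture.
model = AutoModelForSeq2SeqLM.from_pretrained("moussaKam/barthez")
print(type(model).__name__)  # expected: MBartForConditionalGeneration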

@moussaKam
Contributor

moussaKam commented Jan 13, 2021

Yes, the output is reasonable with MBartForConditionalGeneration and AutoModelForSeq2SeqLM.

However, we still have one last (I hope) problem when using pipeline.
The following code returns an error:

from transformers import pipeline

pbase = pipeline(task="fill-mask", model="moussaKam/barthez")
src_text = ["Paris est la capitale de la <mask>"]
results = [x["token_str"] for x in pbase(src_text)]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-12-d7b2e5a78b7c> in <module>
      1 from transformers import pipeline
      2 
----> 3 pbase = pipeline(task="fill-mask", model="moussaKam/barthez")
      4 src_text = ["Paris est la capitale de la <mask>"]
      5 results = [x["token_str"] for x in pbase(src_text)]

/datadisks/datadisk1/transformers/src/transformers/pipelines/__init__.py in pipeline(task, model, config, tokenizer, framework, revision, use_fast, **kwargs)
    403             )
    404 
--> 405         model = model_class.from_pretrained(model, config=config, revision=revision, **model_kwargs)
    406         if task == "translation" and model.config.task_specific_params:
    407             for key in model.config.task_specific_params:

/datadisks/datadisk1/transformers/src/transformers/models/auto/modeling_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
   1123                 pretrained_model_name_or_path, *model_args, config=config, **kwargs
   1124             )
-> 1125         raise ValueError(
   1126             "Unrecognized configuration class {} for this kind of AutoModel: {}.\n"
   1127             "Model type should be one of {}.".format(

ValueError: Unrecognized configuration class <class 'transformers.models.mbart.configuration_mbart.MBartConfig'> for this kind of AutoModel: AutoModelForMaskedLM.
Model type should be one of LayoutLMConfig, DistilBertConfig, AlbertConfig, BartConfig, CamembertConfig, XLMRobertaConfig, LongformerConfig, RobertaConfig, SqueezeBertConfig, BertConfig, MobileBertConfig, FlaubertConfig, XLMConfig, ElectraConfig, ReformerConfig, FunnelConfig, MPNetConfig, TapasConfig.

We get the same error when using the Inference API.

@patrickvonplaten
Contributor Author

Ah yeah, I think that's unrelated to the Bart split PR. Do you mind opening a new issue where you can copy-paste your code example from above? Feel free to tag me on it :-)

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
