Fix dpr<>bart config for RAG #8808
Conversation
```diff
@@ -104,7 +104,8 @@ def prepare_config_and_inputs(self):
         token_labels = ids_tensor([self.batch_size, self.seq_length], self.num_labels)
         choice_labels = ids_tensor([self.batch_size], self.num_choices)

         config = BertConfig(
```
This is dangerous when new params are added to Bert; it'll lead to silent errors
Nice catch
Looks good to me, thanks for fixing it!
```diff
@@ -178,7 +178,7 @@ def __init__(self, config):

         # position_ids (1, len position emb) is contiguous in memory and exported when serialized
         self.register_buffer("position_ids", torch.arange(config.max_position_embeddings).expand((1, -1)))
-        self.position_embedding_type = config.position_embedding_type
+        self.position_embedding_type = getattr(config, "position_embedding_type", "absolute")
```
maybe add a comment that mentions backward compatibility?
Thank you for fixing this, your fix looks good to me.
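The `getattr` pattern in the diff above is also what preserves backward compatibility with configs saved before the parameter existed. Here is a minimal sketch of the idea, using a plain `SimpleNamespace` as a hypothetical stand-in for a legacy config (not a real library config):

```python
from types import SimpleNamespace

# Stand-in for a config created before "position_embedding_type" existed:
legacy_config = SimpleNamespace(max_position_embeddings=512)

# Direct attribute access would crash on such a config:
# legacy_config.position_embedding_type  # -> AttributeError

# getattr with a default falls back gracefully, matching the diff above:
position_embedding_type = getattr(legacy_config, "position_embedding_type", "absolute")
assert position_embedding_type == "absolute"
```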
* correct dpr test and bert pos fault
* fix dpr bert config problem
* fix layoutlm
* add config to dpr as well
What does this PR do?
There was a big silent bug in the interaction between DPR and BERT. BERT added a new config parameter that DPR does not have -> so a "normal" DPR config crashes when used with BERT. This is actually a very big bug and was introduced in #8276, as pointed out by @lhoestq - thanks!
Two things went wrong here:

1) We should be more careful in general when introducing new config parameters and accessing them via `config.<new_parameter>`, especially for models like BERT that can be used with other configs (a sketch of the failure mode follows this list).
2) The DPR tests should have caught that, but instead of using a normal DPR config, a BERT-like DPR config was used in the tests, which is dangerous because it is exactly the kind of setup that doesn't catch errors like these.
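To make the failure mode concrete, here is a minimal hypothetical sketch (the class and config below are illustrative stand-ins, not the actual library code): a wrapper model hands its own config to a BERT-style submodule that reads a newly added parameter directly.

```python
from types import SimpleNamespace

class BertLikeEmbeddings:
    """Stand-in for a BERT submodule that reads a new config parameter."""

    def __init__(self, config):
        # Direct access assumes every wrapping model's config defines the
        # new parameter, which is exactly the pattern that broke DPR:
        self.position_embedding_type = config.position_embedding_type

# A DPR-style config that predates the new BERT parameter:
dpr_like_config = SimpleNamespace(max_position_embeddings=512)

BertLikeEmbeddings(dpr_like_config)  # raises AttributeError at init time
```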
This PR fixes 1) by accessing the newly introduced parameter only via `getattr(config, <param_name>, <default_value>)`, and adds the config functionality to DPR as well (DPR should also have this control over BERT's positional embedding). IMO `getattr(config, <param_name>, <default_value>)` should be used for models like BERT in general because they can be used and wrapped in many different ways. It also fixes 2): the DPR test now uses a DPR config instead of a BERT config.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.