
register_model_architecture not working in version 0.10.1 #3097

Closed
voidmagic opened this issue Jan 5, 2021 · 3 comments
@voidmagic (Contributor)

🐛 Bug

Using register_model_architecture to register a new set of hyperparameters fails: the hyperparameters are overridden by the default settings.

To Reproduce

Steps to reproduce the behavior (always include the command you ran):

  1. Run cmd 'fairseq-train --task language_modeling data-bin --arch transformer_lm_big --batch-size 1 --optimizer adam'
  2. Observe that the model is built with 6 layers, whereas it should have 12 as defined in transformer_lm_big: https://github.com/pytorch/fairseq/blob/v0.10.1/fairseq/models/transformer_lm.py#L311

Code sample
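
For reference, the architecture the command selects is registered roughly like this (a paraphrased sketch of the definition linked in step 2; the real function may set additional fields):

@register_model_architecture("transformer_lm", "transformer_lm_big")
def transformer_lm_big(args):
    # each value is only applied when the attribute is not already present on args
    args.decoder_layers = getattr(args, "decoder_layers", 12)
    args.decoder_embed_dim = getattr(args, "decoder_embed_dim", 1024)
    args.decoder_ffn_embed_dim = getattr(args, "decoder_ffn_embed_dim", 4096)
    base_lm_architecture(args)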

Expected behavior

Environment

  • fairseq Version (e.g., 1.0 or master): 0.10.1
  • PyTorch Version (e.g., 1.0):
  • OS (e.g., Linux):
  • How you installed fairseq (pip, source):
  • Build command you used (if compiling from source):
  • Python version:
  • CUDA/cuDNN version:
  • GPU models and configuration:
  • Any other relevant information:

Additional context

@myleott (Contributor)

myleott commented Jan 5, 2021

Good catch. It's only broken for TransformerLanguageModel in 0.10.1, because that model was migrated to Hydra (dataclass-based config) before the migration was completed.

The fix is to patch this commit: b7d8b9d. I'll release 0.10.2 with the fix shortly.
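
To make the failure mode concrete, here is an illustrative sketch of the assumed mechanism (not the actual fairseq source or the fix): the architecture functions fill in values with getattr fallbacks, so once the dataclass defaults have been written onto args, the fallback never fires.

# The Hydra/dataclass config populates args.decoder_layers = 6 (its default)
# before the architecture function runs, so this line in transformer_lm_big
args.decoder_layers = getattr(args, "decoder_layers", 12)
# finds the existing value 6, and the intended 12 is never applied.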

@myleott myleott self-assigned this Jan 5, 2021
@myleott myleott added the 0.10.2 label Jan 5, 2021
@myleott (Contributor)

myleott commented Jan 5, 2021

Fixed in 0.10.2

@myleott myleott closed this as completed Jan 5, 2021
@xuuHuang

The hyperparameters of the transformer model are still overridden by the default settings in TransformerConfig in 0.10.2.
If I set arch=transformer_tiny, the model has 6 layers, but it should have 2.
I think that's because add_args() in TransformerModelBase sets the default hyperparameters:

@classmethod
def add_args(cls, parser):
    # gen_parser_from_dataclass injects every default from TransformerConfig
    # into the parser, so those defaults land on args before any
    # register_model_architecture function gets a chance to change them
    gen_parser_from_dataclass(
        parser, TransformerConfig(), delete_default=False, with_prefix=""
    )
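
A possible workaround until this is addressed (untested, and assuming the standard flags generated from TransformerConfig): pass the values you need explicitly on the command line, e.g. 'fairseq-train ... --arch transformer_tiny --encoder-layers 2 --decoder-layers 2 ...', since explicit arguments take precedence over the parser defaults; other fields of transformer_tiny would need the same treatment.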
