Config class(decorated by dataclass) can not receive kwargs from the config dict. #30

Closed
wuyongfa-genius opened this issue Apr 18, 2021 · 6 comments

Comments

@wuyongfa-genius

wuyongfa-genius commented Apr 18, 2021

First, thanks for your excellent work. I have run into a strange problem, though. After finishing "accelerate config", I launched my script with "accelerate launch my_script.py" and got the following error:

Traceback (most recent call last):
  File "/home/wuyongfa/anaconda3/envs/mmdet/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/wuyongfa/anaconda3/envs/mmdet/lib/python3.7/site-packages/accelerate/commands/accelerate_cli.py", line 41, in main
    args.func(args)
  File "/home/wuyongfa/anaconda3/envs/mmdet/lib/python3.7/site-packages/accelerate/commands/launch.py", line 297, in launch_command
    defaults = load_config_from_file(args.config_file)
  File "/home/wuyongfa/anaconda3/envs/mmdet/lib/python3.7/site-packages/accelerate/commands/config/config_args.py", line 61, in load_config_from_file
    return config_class.from_yaml_file(yaml_file=config_file)
  File "/home/wuyongfa/anaconda3/envs/mmdet/lib/python3.7/site-packages/accelerate/commands/config/config_args.py", line 100, in from_yaml_file
    return cls(**config_dict)
TypeError: __init__() got an unexpected keyword argument 'machine_rank'

I assume this is because the BaseConfig class (decorated with dataclass) cannot receive kwargs from the config dict. Could anyone help me find out why?

@sgugger
Collaborator

sgugger commented Apr 19, 2021

That is strange. Could you copy the content of your config file here? It should be in ~/.cache/huggingface/accelerate/default_config.yml

@wuyongfa-genius
Author

compute_environment: LOCAL_MACHINE
distributed_type: MULTI_GPU
fp16: false
machine_rank: 0
main_process_ip: null
main_process_port: null
main_training_function: main
num_machines: 1
num_processes: 1
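
For readers following along, here is a minimal sketch of how this kind of TypeError arises in general. It is not the library's actual code: `ExampleConfig` and `split_known_keys` are hypothetical names, and the dataclass is assumed, purely for illustration, to lack the multi-machine fields. A dataclass-generated `__init__` accepts only declared fields, so unpacking a dict with extra keys fails.

```python
from dataclasses import dataclass, fields


@dataclass
class ExampleConfig:
    # Hypothetical stand-in for a config class. It deliberately omits
    # machine_rank, main_process_ip, main_process_port and num_machines.
    compute_environment: str
    distributed_type: str
    fp16: bool
    num_processes: int
    main_training_function: str = "main"


def split_known_keys(cls, config_dict):
    """Separate the keys the dataclass declares from the ones it does not."""
    known_names = {f.name for f in fields(cls)}
    known = {k: v for k, v in config_dict.items() if k in known_names}
    unknown = sorted(k for k in config_dict if k not in known_names)
    return known, unknown


# Same keys and values as the config file posted above.
config_dict = {
    "compute_environment": "LOCAL_MACHINE",
    "distributed_type": "MULTI_GPU",
    "fp16": False,
    "machine_rank": 0,
    "main_process_ip": None,
    "main_process_port": None,
    "main_training_function": "main",
    "num_machines": 1,
    "num_processes": 1,
}

try:
    # The generated __init__ rejects any keyword that is not a declared field.
    ExampleConfig(**config_dict)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Checking the keys first shows which ones the class does not know about.
known, unknown = split_known_keys(ExampleConfig, config_dict)
config = ExampleConfig(**known)
print(unknown)
```

Listing the undeclared keys this way is only a diagnostic aid. Whether the right fix is to add the missing fields or to pick a different config class depends on the library, which this sketch does not attempt to show.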

@sgugger
Collaborator

sgugger commented Apr 19, 2021

I could reproduce this, and the fix has been merged in #31.
Would you mind doing a source install to check that it solves your problem? I'll make a patch release with the fix as soon as we've confirmed it works as intended.

@wuyongfa-genius
Author

Thanks for your effort. It works now. By the way, I have a question: when a new epoch starts, can Accelerate automatically reshuffle the dataset on each device, the way DistributedSampler.set_epoch(epoch) used to?

@sgugger
Collaborator

sgugger commented Apr 19, 2021

If you defined your dataloader with shuffle=True, then yes, this will be done :-)
See the blog post (section "how does it work") for more details!
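
As a rough illustration of what epoch-dependent shuffling across processes means, here is a pure-Python sketch in the style of DistributedSampler.set_epoch: every process seeds the same generator with seed + epoch and then keeps its own slice of the permutation. It is not Accelerate's implementation; `shard_indices` and its parameters are hypothetical names chosen for this example.

```python
import random


def shard_indices(dataset_len, num_replicas, rank, epoch, seed=0, shuffle=True):
    """Return the dataset indices one process would read in a given epoch."""
    indices = list(range(dataset_len))
    if shuffle:
        # Seeding with seed + epoch gives every process the same permutation
        # within one epoch, and a new permutation whenever the epoch changes.
        random.Random(seed + epoch).shuffle(indices)
    # Each process keeps every num_replicas-th index, starting at its rank.
    return indices[rank::num_replicas]


for epoch in range(2):
    shards = [shard_indices(8, num_replicas=2, rank=r, epoch=epoch) for r in range(2)]
    print(f"epoch {epoch}: {shards}")
```

Because all processes derive the permutation from the same seed, their slices never overlap and together cover the whole dataset in each epoch.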

sgugger closed this as completed Apr 19, 2021
@sgugger
Collaborator

sgugger commented Apr 19, 2021

Closing the issue since you confirmed it resolved the problem.
