Argparse usability issues #3720
Hi! Thanks for your contribution — great first issue!
I had the same problem when using an extension of argparse.

It should use the parent option if the provided parser is an `argparse.ArgumentParser`. Now, about using jsonargparse: @nateraw has been working on improving the CLI arguments API. For now, it is available in pytorch-lightning-bolts. If core chooses to support jsonargparse, I suspect it would be through bolts as well.
@nateraw has been working on a few issues with the current argparse setup.
@nateraw please look at this issue and comment. I have looked at what you have done in lightning-bolts. However, with that it still wouldn't be possible to use an extension of argparse that provides more features than the standard argparse.
@mauvilsa I'm not sure I'd be comfortable adding a 3rd party package to do this w/o full confidence in its stability, especially now that we're on >1.0.0. That being said, I'll dive into your codebase and get your perspective. 😄 I'd be much more comfortable implementing the solution directly in lightning. That way all the tests live in one place.
@nateraw nice to have you here now 👍. I am not sure why you think it is better to reimplement things directly in lightning. Modularity and separation of concerns is a basic software engineering principle. pytorch-lightning is not an argument parsing library, so it makes much more sense to reuse existing software whose sole purpose is parsing arguments. If you go on and implement an extension of argparse in lightning, then what would you do when people ask for config file support? Are you going to implement this from scratch and keep people waiting until it is done, well tested and stable? This code will end up not being used because people would simply find alternatives. Another example is that the Trainer class has some complex arguments.

Regarding stability and testing: in the company where I work we heavily use jsonargparse, and there are also many other people already using it in their projects despite jsonargparse not yet being popular. Being used in many projects means that it has been tested quite a lot, not to mention that there are many unit tests. Furthermore, jsonargparse is an extension of argparse, so a large part of its stability comes from its objective of being a mostly argparse-compatible replacement. If lightning uses jsonargparse, it should also include unit tests to verify that the correct behavior is obtained, much like it should for all the other dependencies that lightning has.

In a few minutes I will add here an example to illustrate better what the benefits of using jsonargparse would be.
In the master branch of the jsonargparse GitHub repo you can find the latest changes I have been working on; install from there, since these changes are needed to test the code below. Without integrating jsonargparse into lightning, a way to add the parameters of the `Trainer` class would be:

```python
#!/usr/bin/env python3
from jsonargparse import ArgumentParser, ActionConfigFile
from pytorch_lightning import Trainer

parser = ArgumentParser(description='My pytorch-lightning-based training tool.')
parser.add_argument('--cfg', action=ActionConfigFile, help='Config file.')
parser.add_argument('--op1', default='val1', help='Tool option 1.')
parser.add_class_arguments(Trainer, 'train', as_group=True)
cfg = parser.parse_args()
print(parser.dump(cfg, skip_none=False))
```

Some of the features that this already provides are:
The following assumes the code above has been saved to a file called `trainer.py`. Be sure to look at the help by running `trainer.py --help`.

Some of these features are work in progress, but I do plan to make them stable and have good unit tests rather soon. All of this will be included in the release of v3.0.0.
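To make the config-file feature concrete, here is a sketch of what a config file for such a script could look like. This is an assumption for illustration: the nested layout mirrors the `train` group created by `add_class_arguments`, and `max_epochs` is just one example `Trainer` option.

```yaml
# Hypothetical config.yaml for the example script.
# Keys under `train` correspond to Trainer's __init__ parameters.
op1: val2
train:
  max_epochs: 10
```

It would then be loaded with something like `./trainer.py --cfg config.yaml`.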
@nateraw regarding your proposal in issue #4119, certainly it would be convenient to have something like that to standardize how arguments are added and simplify how to do it. But as you can imagine, I would suggest using jsonargparse. It could be an extension of jsonargparse's ArgumentParser. I would only propose a few minor changes as follows:

```python
from pytorch_lightning.utils import ArgumentParser
from my_models import MyModel
from my_datamodules import MyDM

parser = ArgumentParser()
parser.add_trainer_args('train')
parser.add_datamodule_args(MyDM, 'data')
parser.add_model_args(MyModel, 'model')
cfg = parser.parse_args()
```

The differences are:
What would this integration look like?
It doesn't seem to work (using current master):
This is sweet! Including the argument descriptions is a pending TODO. Also adding to groups in a nested namespace is exactly what I wanted.
I know there has been some effort already on integrating Lightning with Hydra to configure projects, so there may be some clashing here (cc @tchaton ?).
I agree with these comments. @mauvilsa What are the disadvantages of just having a jsonargparse API for Lightning in pl-bolts without changing Lightning's code?
Would be very simple, mostly the code from the example above:

```python
from jsonargparse import ArgumentParser, ActionConfigFile
from pytorch_lightning import Trainer

class LightningArgumentParser(ArgumentParser):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.add_argument('--cfg', action=ActionConfigFile, help='Config file.')

    def add_trainer_args(self, parent='train'):
        self.add_class_arguments(Trainer, parent, as_group=True)

    def add_datamodule_args(...
```
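For readers unfamiliar with `add_class_arguments`, the core idea — deriving grouped command-line options from a class `__init__` signature — can be sketched with just the standard library. Everything below (`SimpleTrainer` and the helper) is a hypothetical stand-in, not jsonargparse's actual implementation:

```python
import argparse
import inspect

def add_class_arguments(parser, cls, prefix):
    # Hypothetical sketch: create one --<prefix>.<param> option per keyword
    # parameter of cls.__init__. jsonargparse does far more (type resolution
    # from annotations, help texts from docstrings, nested namespaces, ...).
    group = parser.add_argument_group(prefix)
    for name, param in inspect.signature(cls.__init__).parameters.items():
        if name == 'self' or param.default is inspect.Parameter.empty:
            continue
        group.add_argument(f'--{prefix}.{name}', default=param.default)

class SimpleTrainer:  # stand-in for pytorch_lightning.Trainer
    def __init__(self, max_epochs=1000, gpus=None):
        self.max_epochs = max_epochs
        self.gpus = gpus

parser = argparse.ArgumentParser()
add_class_arguments(parser, SimpleTrainer, 'train')
args = parser.parse_args(['--train.max_epochs', '10'])
print(getattr(args, 'train.max_epochs'))  # '10' (dotted dest, string value)
```

Note that plain argparse keeps the dotted name as a flat `dest`; building a real nested namespace from it is part of what jsonargparse adds on top.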
Probably I broke it with the code I wrote rather hastily this morning. If you can, try it out with Python 3.6, which is the only version I have tested up to now. I will investigate what the issue is and let you know.
I am not too familiar with Hydra. I guess the scope of that would be bigger than argument parsing and config files. Certainly there could be an overlap here.
As you can see, jsonargparse could be used without integrating it in lightning or bolts. The benefit that I would see of integrating it is that you standardize the format of config files, much like the way lightning is standardizing where you place different parts of the code.

To me this clearly fits better in lightning than in bolts, since the focus of bolts is models, callbacks, and datasets for research. I am personally not currently interested in bolts since unfortunately I don't have much time for research. Why should users be forced to install bolts just to use the argparse feature? What is the issue with changing lightning's code? It wouldn't interfere with anything core and would be a big improvement over the current argparse that is already there.
@carmocca I now tried the code using python 3.8 and I did not get any issues. What exactly did you do? Was it in a new virtual env? Which version of pytorch-lightning?
Sorry for the confusion. I was reusing a previous venv which caused the issue. Works with a clean environment!
No issue on my part, just asking to assess the different options 😄
Lightning is a collection of tools to make machine learning w/ PyTorch easier. I believe it's not out of the question for us to come up with a solution for argparsing that solves basic argparse problems related to lightning.
That being said, I did read through your code, and I really like where it's headed.

Now, as per our discussion above on the API, I wrote a draft PR that updates to the API we discussed. I'm not convinced this is right either (after playing with it)... will be discussing this with the core team on Monday so I can bring more people into this discussion. As a bonus, I used the same argparser code from the above draft PR in this repo to show a real example of how I'm using it.
The issue is closed but it doesn't seem to be resolved. I just tried to use PL with jsonargparse, and it fails at the stage of config reading. What is the officially endorsed way of using configs for argparse together with PL? I looked in the docs but they only mention argparse integration, nothing about configs, and 3rd-party modules don't seem to work together with PL...
@pzelasko The jsonargparse integration is now merged to master. You can read about it at https://pytorch-lightning.readthedocs.io/en/latest/common/lightning_cli.html Note that it is in beta. Feel free to open issues if you encounter any bugs or limitations.
Thanks! Will try it out.
🚀 Feature
Improve the usability of the argparse functionality by allowing the use of extensions of argparse.
Motivation
Currently in pytorch-lightning argparse is used as

```python
parser = ArgumentParser(parents=[parent_parser])
```

which prevents many behaviors that users might want. The most basic example: someone implements a `trainer.py` tool and thus creates their own parser with a description. If you then run `trainer.py --help`, the description of the parser is not shown. This is because the parser that `add_argparse_args` returns is not the one that the user created; it is a new one that just copied the defined arguments, the rest being lost. Another feature lost is specifying a `formatter_class`, for example doing `parser = ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)`.
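The description loss described above comes from how argparse's `parents` mechanism works: it copies the parent's arguments into a brand-new parser, but not its description. A minimal stdlib demonstration (the names here are illustrative, not lightning's actual code):

```python
import argparse

# A parent parser with a description and one option.
parent = argparse.ArgumentParser(add_help=False, description='My tool description.')
parent.add_argument('--op1', default='val1', help='Tool option 1.')

# The parents mechanism creates a NEW parser; only the arguments are copied.
child = argparse.ArgumentParser(parents=[parent])
print(child is parent)           # False: a different parser object
print(child.description)         # None: the description was not copied
print(child.parse_args([]).op1)  # val1: the argument itself did carry over
```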
.Another consequence of the way argparse is currently used is that it is not possible to use argparse-like parsers. The argparse module is rather limited, so it would be nice to be able to use extensions of argparse. More specifically, I would really like to use jsonargparse with pytorch-lightning. This would for example make it trivial to parse arguments from a json or yaml configuration file. When you have so many options that can be configured such as with the Trainer class, a config file is simply a must. Reinventing the wheel by adding config file support to pytorch-lightning would not make much sense. Much better to use an existing package that already provides this.
I tried changing the pytorch-lightning code from

```python
parser = ArgumentParser(parents=[parent_parser])
```

to

```python
parser = parent_parser
```

and using a jsonargparse parser. Unfortunately it failed with an `AttributeError: 'bool' object has no attribute 'lower'`. Without looking at the source code, I imagine that bool argument support has been added, since argparse does not support it. However, jsonargparse already supports defining arguments with `type=bool`.

Honestly I can't imagine any benefit or good reason for using argparse's parent option here. If there is, please comment.
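The reason plain argparse needs special bool handling is that `type=bool` silently misbehaves: argparse applies `bool()` to the raw command-line string, and any non-empty string is truthy. A quick stdlib illustration:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--flag', type=bool)

# bool('False') is True, because any non-empty string is truthy.
args = parser.parse_args(['--flag', 'False'])
print(args.flag)  # True, not False!
```

This is why libraries either roll their own string-to-bool conversion (as pytorch-lightning apparently did, hence the `.lower` call) or, like jsonargparse, support `type=bool` properly.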
Pitch
I propose the following:
- `add_argparse_args` would add the arguments directly to the provided parser and return it, if the provided parser is a jsonargparse parser.
- `add_argparse_args` would return the parser and use the parent option if the provided parser is not a jsonargparse parser.

To pitch further, I would be more than willing to implement this and create a pull request for it. I am the developer of jsonargparse and continuously work on machine learning. Also, I think pytorch-lightning is an awesome project; I plan to use it a lot and hopefully contribute to it.