Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add set_choices to define a list of classes to choose from #84

Closed
carmocca opened this issue Aug 19, 2021 · 9 comments
Closed

Add set_choices to define a list of classes to choose from #84

carmocca opened this issue Aug 19, 2021 · 9 comments
Labels
enhancement New feature or request

Comments

@carmocca
Copy link
Contributor

carmocca commented Aug 19, 2021

Pitch

add_subclass_arguments provides a lot of flexibility for the cases where you don't know which implementation will be used, however, this flexibility can come with an increase in boilerplate as the users need to specify:

--an_argument.class_path=the.full.class.path.to.object --an_argument.init_args.the_arg=value

Additionally, knowing the full class path can be non-trivial if the object does not come from an installed package.

For the cases where you want to provide a set of potential classes to choose from but without the requirement of supporting subclasses of those, a parser.set_choices() method could be implemented. This method would simplify the input arguments to:

--an_argument.class=object --an_argument.the_arg=value

Note: I have chosen the .class key because doing --an_argument=object --an_argument.the_arg=value wouldn't produce a valid YAML/JSON/dict. .class has an advantage over alternatives like .cls in that it is a reserved Python word, so there cannot be a collision with __init__ arguments with the same name, which wouldn't be the case with __init__(cls)

Code

from pytorch_lightning.utilities.cli import ArgumentParser

class A:
    def __init__(self, the_arg: int = 0):
        ...

class B:
    def __init__(self, the_arg: int = 0):
        ...


parser = ArgumentParser(parse_as_dict=True)

# ALREADY SUPPORTED
parser.add_subclass_arguments((A, B), nested_key="foo")
args = parser.parse_args(['--foo.class_path=__main__.A', '--foo.init_args.the_arg=5'])
assert args == {'foo': {'class_path': '__main__.A', 'init_args': {'the_arg': 5}}}

# IDEA
parser.add_subclass_arguments((A, B), nested_key="foo")
# narrows down the possible types
parser.set_choices({"foo": (A, B)})  # same API as `parser.set_defaults()`
args = parser.parse_args(['--foo.class=A', '--foo.the_arg=5'])
assert args == {'foo': {'class': 'A', 'the_arg': 5}}

Notes:

  • The previous idea of add_class_choices wasn't good enough because it meant this could only be done to arguments added with add_subclass_arguments. However, this also needs to be supported:
class Trainer:
    def __init__(self, callbacks: List[str], ...):
        ...

parser.add_class_arguments(Trainer, "trainer")
# set the choices for only a subset of the arguments
parser.set_choices({"trainer.callbacks", supported_callbacks})

Additional context:

We are implementing this in Lightning-AI/pytorch-lightning#9565

Almost all Lightning users will choose an optimizer already provided by torch. This method would enable changing them seamlessly with minimal boilerplate.

In that PR, we use a Registry object which holds the list of classes to be used as choices for add_subclass_arguments.
I don't think jsonargparse needs to expose this. We would use it to generate the list of classes passed to add_class_choices:

parser.set_choices({"optimizer": optimizer_registry.classes})
@mauvilsa
Copy link
Member

Questions:

  • How would the value of an argument added with add_class_choices be specified in a config file? Exactly the same as with add_subclass_arguments?

  • Say that for some reason two classes from different packages have exactly the same name and you want to add both of them with add_class_choices. Would this not be supported? Or how would this be done?

@carmocca
Copy link
Contributor Author

carmocca commented Aug 26, 2021

How would the value of an argument added with add_class_choices be specified in a config file? Exactly the same as with add_subclass_arguments?

# configA
foo:
    class: A
    the_arg: 5

which would be equivalent to using add_subclass_arguments with:

# configB
foo:
    class_path: something.A
    init_args:
        the_arg: 5

Perhaps both configA and configB should work when passed to add_class_choices, whereas only configB would work when passed to add_subclass_arguments.

How would the value of an argument added with add_class_choices be specified in a config file? Exactly the same as with add_subclass_arguments?

I'd say not supported. In that case, or when you want to pass a subclass, the users should use add_subclass_arguments instead

@carmocca
Copy link
Contributor Author

carmocca commented Sep 16, 2021

For future reference, this has been implemented here: Lightning-AI/pytorch-lightning#9565 by modifying the sys.argv.

The .class suffix wasn't added to the Lightning implementation so that the jsonargparse can have more freedom when it is implemented.

@carmocca carmocca changed the title Add add_class_choices as a stricter add_subclass_arguments alternative Add set_choices to define a list of classes to choose from Sep 16, 2021
@mauvilsa
Copy link
Member

Since getting all of the subclasses for a given class is trivial (see get_all_subclasses), adding a set_choices method seems unnecessary. A simpler feature is a better one and easier to implement. If a user gives a name instead of a class path, i.e. a string without dots, then this name will be searched in all known subclasses and the corresponding one used if there is a match. Subclasses are found only after they have been imported, so the registries in LightningCLI still make sense to make sure these can be found by the jsonargparse parser. But no need to explicitly add the subclasses to the parser.

Both in command line and in config, giving just the class name would work, e.g. --subclass_key=ClassName or --subclass_key.class_path=ClassName and {"subclass_key": "ClassName"} or {"subclass_key": {"class_path": "ClassName"}}. Internally these would always be expanded to the full class path. Thus, dumped configs for subclasses will always be a dict including class_path and the full path to the class. The help of the parser would show all the known subclasses.

@carmocca
Copy link
Contributor Author

set_choices was my implementation idea for the patch we have in Lightning to support this, but in the end what's important is that we keep the command line and config behaviour more than how it's actually enabled in the parser.

Both in command line and in config, giving just the class name would work

So if this is the case, that's good with me!

Thus, dumped configs for subclasses will always be a dict including class_path and the full path to the class.

What about input configs? Would they support this "shorthand" notation?

@mauvilsa
Copy link
Member

What about input configs? Would they support this "shorthand" notation?

Yes, like the json examples I wrote.

mauvilsa added a commit that referenced this issue Apr 5, 2022
…lasses #84.

- add_argument with subclass type now also adds --*.help option.
- Shorter subclass command line arguments by not requiring to have .init_args.
- class_path's on parse are now normalized to shortest form.
- Fixes for latest version of mypy.
@mauvilsa
Copy link
Member

mauvilsa commented Apr 5, 2022

I committed an initial version of this to master and created a pre-release v4.6.0.dev1.

@mauvilsa
Copy link
Member

mauvilsa commented Apr 5, 2022

One issue of all imported subclasses supported is that there could be multiple with the same class name. Still, I think is preferable to do this than having to explicitly register with set_choices(self, nested_key, classes). Only an explicit call would be needed to specify which to chose when there are multiple with the same name. But for this I would be inclined for the following:

  • No requirement to import the class. Importing the modules is a waste of resources, thus you can hard code the class path given as a string.
  • No association to a nested key (or be optional). Multiple parameters might have the same base class and it would be redundant to have to register each.
  • The name is not necessarily the same as the class name.

Thus I am thinking of something like def some_name(class_path: str, short_name: str).

@mauvilsa
Copy link
Member

mauvilsa commented Apr 8, 2022

I am considering this feature implemented and closing this issue. My previous comment would be a different feature and I don't consider it important. If multiple subclasses have the same name and you specify a class path with this name, a warning is shown telling which class was used and recommending to use the full class path. Though I wonder, instead of a warning should it be a parser error?

This feature will be part of jsonargparse v4.6.0.

@mauvilsa mauvilsa closed this as completed Apr 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants