Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support passing lists of class_path items via command line #85

Closed
carmocca opened this issue Aug 19, 2021 · 4 comments · Fixed by #138
Closed

Support passing lists of class_path items via command line #85

carmocca opened this issue Aug 19, 2021 · 4 comments · Fixed by #138
Labels
enhancement New feature or request

Comments

@carmocca
Copy link
Contributor

carmocca commented Aug 19, 2021

When one wants to pass a list of objects to a parser, this works very well using configs:

from typing import List, Optional

from jsonargparse import ActionConfigFile

from pytorch_lightning.utilities.cli import ArgumentParser


class A:
    def __init__(self, name: str):
        self.name = name

    def __repr__(self):
        return self.name

class B:
    def __init__(self, people: Optional[List[A]] = None):
        self.people = people

    def __repr__(self):
        return repr(list(self.people))


parser = ArgumentParser(parse_as_dict=True)
parser.add_argument("--config", action=ActionConfigFile)
parser.add_class_arguments(B, nested_key="foo")
args = parser.parse_args()
args = parser.instantiate_classes(args)

print(args)
foo:
  people:
    - class_path: __main__.A
      init_args:
        name: one
    - class_path: __main__.A
      init_args:
        name: two
    - class_path: __main__.A
      init_args:
        name: three

(ignore the silly names)

python script.py --config=config.yaml
{'foo': [one, two, three]}

However, for users who do not want or cannot use configs, their only option is to write the entire config through command line:

$ python script.py --foo.people='[{class_path: __main__.A, init_args: {name: one}}, {class_path: __main__.A, init_args: {name: two}}, {class_path: __main__.A, init_args: {name: three}}]'
{'foo': [one, two, three]}

This is very long to write and error prone.

Pitch

Support the previous by doing:

python script.py \
    --foo.people.class_path=__main__.A --foo.people.init_args.name=one \
    --foo.people.class_path=__main__.A --foo.people.init_args.name=two \
    --foo.people.class_path=__main__.A --foo.people.init_args.name=three

where the order is important, as the init_args apply to the previous item with the same nested key.

Note: after #84 is implemented, this would become:

python script.py \
   --foo.people=A --foo.people.name=one \
   --foo.people=A --foo.people.name=two \
   --foo.people=A --foo.people.name=three

Additional context:

We are implementing this in Lightning-AI/pytorch-lightning#8815

@mauvilsa mauvilsa added the enhancement New feature or request label Aug 26, 2021
@mauvilsa
Copy link
Member

Looks reasonable.

@mauvilsa
Copy link
Member

I have been thinking about this a bit to have a clear plan before implementing. Some of the things that I have thought are the following:

  • It does not make much sense to only have support to concatenate to a list when it is a class type. Mixing of types is common and seems simpler to think about support for any type instead of having a bunch of ifs depending on the case.
  • If it is possible to concatenate to a list in the command line, it should also be possible to do via config files.
  • If it is possible to concatenate to a list, it should be possible to replace the entire list instead of concatenating.

In order to support both concatenate to list and replace the list, there needs to be a special syntax. I have thought about if no special syntax is used, then by default it concatenates. This would make it exactly as it is now in pytorch-lightning. However, all other list types (except callbacks) replace the list. Furthermore, I haven't thought of any syntax that makes intuitive that works this way. Thus, I am thinking that there will be a special syntax to concatenate. An idea would be the plus sign + as the following:

python script.py \
   --foo.people+=A --foo.people.name=one \
   --foo.people+=A --foo.people.name=two \
   --foo.people+=A --foo.people.name=three

In a config file it would work the same way by having at the end of the key the + sign. The init args would not need to have any special symbol since this does not conflict with replace/concat operation.

Not having the + symbol would mean that the previous contents in the list would be discarded. If this is the case and the list was not empty, a warning could be shown indicating what has been discarded. This warning would be similar to the current one there is for discarding of init args when the class path changes.

@carmocca
Copy link
Contributor Author

If it is possible to concatenate to a list in the command line, it should also be possible to do via config files.

I don't think this is important, as creating lists from config files is well supported (see top post) and uses an intuitive syntax.

Adding this custom + notation will likely feel foreign to users unfamiliar with the library. It's also similar to the + used by Hydra, but one of jsonargparse's strengths is that it's very familiar to argparse users. Such notation could also break tools that use/parse jsonargparse CLIs that do not expect or support it.

My proposal is about improving the experience of creating lists from command line, not necessarily about differentiating between concatenation and replacement. If we ignore the latter requirement, it shouldn't need custom notation.

@mauvilsa
Copy link
Member

If it is possible to concatenate to a list in the command line, it should also be possible to do via config files.

I don't think this is important, as creating lists from config files is well supported (see top post) and uses an intuitive syntax.

Having the same features/behavior in both command line and config files has already been asked for. And just for consistency it makes sense to do. See for instance #89 and #113. People do use multiple config files, and currently there is no way to concatenate to a list in a second config file.

Adding this custom + notation will likely feel foreign to users unfamiliar with the library. It's also similar to the + used by Hydra, but one of jsonargparse's strengths is that it's very familiar to argparse users. Such notation could also break tools that use/parse jsonargparse CLIs that do not expect or support it.

Actually, not having a syntax difference would be a breaking change. Suppose you have an argument with type Union[A, List[A]] (see pytorch_lightning/trainer.py#L136) and current value [x,y] where x and y are subclasses of A. If you provide z, the current value will be replaced by z, not concatenated to the list.

Furthermore, suppose you have a type Union[A, int, List[A], List[int]]. Is the assumption that for subclasses only concatenate is needed? Would the same assumption hold for other types? I clearly see a need for both replace and concatenate. For subclasses of A it would concatenate but for int it wouldn't? This is inconsistent and would not make much sense to design this way. A type List[A] would not need a syntax difference. But since the previous case does need the syntax difference, for consistency this case should also use it.

Adding this syntax will not break any code that uses plain jsonargparse, since this would be a new feature. Certainly this behavior is different than what LightningCLI currently has, but that is a patch on top of jsonargparse. Having to add + is an additional effort, but I see it as necessary. The user learning can be easily guided by warnings and the syntax is mostly obvious.

My proposal is about improving the experience of creating lists from command line, not necessarily about differentiating between concatenation and replacement. If we ignore the latter requirement, it shouldn't need custom notation.

Sure, but as I explained it doesn't make sense to have concatenate without being able to replace. Just because the proposal/need is only for concatenate, it is not a good reason to compromise on the design.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants