Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: add configuration option "collect-unknown-options" #181

Merged
merged 10 commits into from
Sep 6, 2019

Conversation

henderea
Copy link
Contributor

@henderea henderea commented Jun 7, 2019

Fixes yargs/yargs#1243

All new code is covered in tests. It shouldn't make any difference, but I used the Wallaby.js VSCode extension for test running and coverage, instead of the npm scripts.

I tried to match the existing code style as well as possible, but please feel free to fix any issues I created.

@bcoe
Copy link
Member

bcoe commented Jun 8, 2019

@henderea thanks for the patch. Mind running ./node_modules/.bin/standard --fix.

@juergba bother you for a code review?

@juergba
Copy link
Contributor

juergba commented Jun 9, 2019

@bcoe I will need some days for my code review.

On my first glance the documentation is missing.

_: an array representing the positional arguments.

I'm not sure wether it is a good idea to use property _ for this purpose. Positional arguments and unkown options are completely different objects.

@henderea
Copy link
Contributor Author

henderea commented Jun 9, 2019

@juergba I can work on documentation, but the whole point of this feature is to treat unknown options like regular arguments. If it doesn't do that, then the whole change is pointless for my needs.

Perhaps the name should be updated? The idea of this mode is to treat unknown options like regular arguments, so I'm pretty sure _ is the right place for that. I'm not always the best at naming in code 😅

If you look at the issue I filed that this PR fixes, yargs/yargs#1243, you'll see that my intention is to allow unknown options to appear in a varargs of a command, so unless _ won't do that, I'm pretty sure I'm putting it in the right place.

Copy link
Contributor

@juergba juergba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's risky to test only with numbers. arrays are luckily not affected by this PR, since hasAnyFlag() should be truthy. But booleans can be very annoying ...

Did you check side-effects with Yargs#strict() and configurations like populate--, short-option-groups and halt-at-non-option?

Travis error TypeError: Object.values is not a function: maybe support of Node 6 should be dropped, otherwise search for "// XXX Switch to [].concat(...Object.values(aliases)) once node.js 6 is dropped" within index.js.

index.js Outdated
@@ -26,7 +26,8 @@ function parse (args, opts) {
'set-placeholder-key': false,
'halt-at-non-option': false,
'strip-aliased': false,
'strip-dashed': false
'strip-dashed': false,
'ignore-unknown-options': false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse-unknown-options: true instead of ignore-unknown-options': false could be an alternative naming.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm definitely open to naming suggestions. If you think parse-unknown-options is clearer, then I'd be happy to go with that. If you have other suggestions, I'm open to those too.

index.js Outdated
setArg(m[1], m[2])
} else {
argv._.push(maybeCoerceNumber('_', arg))
Copy link
Contributor

@juergba juergba Jun 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an unknown flag is parsed as positional argument, so why processing its value?
Why processing only numbers, not booleans?
I think the value should not be processed at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I'll remove the number processing.

index.js Outdated
}
}
if (unmatched.length > 0) {
argv._.push(maybeCoerceNumber('_', ['-', ...unmatched].join('')))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the values are already processed, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point. I'll remove the number parsing here too.

index.js Outdated
setArg(key, value)
} else {
unmatched.push(key)
unmatched.push('=')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why with "="?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise the = is lost, that's why.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this a bit odd too because -f a b c is a valid parse. tldr; you end up with an = that didn't exist.

@henderea
Copy link
Contributor Author

Node 6 support fixed (didn't realize Object.values was an issue, probably should have looked at the build failure 😅). I've also changed it to parse-unknown-options: false, and removed the maybeCoerceNumber from my new code. It seems the existing number coercion for unhandled args in the final else is what caused the numeric arguments in my tests, so my tests were actually not changed by the removal of coercion from my new code.

I'll plan to expand tests over the weekend if I get to it.

@bcoe
Copy link
Member

bcoe commented Jun 16, 2019

@henderea @juergba perhaps we can take a step back here, why do you want this behavior?

Tools like nyc and mocha have similar needs, in which they want to be able to run a command where some options are intended for the application, and some are intended for the program being instrumented, e.g.,

nyc --option 1 foo --option 3 --option 4

We've introduced the configuration [halt-on-non-option(https://github.com/yargs/yargs-parser#halt-at-non-option) to support this use-case, which I think is pretty close to what you're trying to do?

@bcoe
Copy link
Member

bcoe commented Jun 22, 2019

@henderea just bumping up this conversation, was wondering if --halt-on-non-option could potentially give you close to the behavior that you're looking for, for the application you're building.

The behavior of stopping parsing once you hit the first non --option argument is fairly common for tools that instrument a file (wondering if this is close to your use-case).

@henderea
Copy link
Contributor Author

@bcoe: sorry, haven't gotten around to expanding the test suite yet

As for --halt-on-non-option, my use case is actually related to passing a full command string (including flags) as a parameter to a script. Basically, i want to be able to have ./script.js command ls -lh ~ (with command defined as 'command <exe> [args...]') be able to capture -lh ~ in the args.

Does --halt-on-non-option allow flags to be captured in a varargs when appearing after a non-option? if so, then that might do what I need. But if the 'command <exe> [args...]' command definition won't allow me to capture options untouched in the args when <exe> is a non-option, then it won't work for my needs.

@henderea
Copy link
Contributor Author

Also, related to --halt-on-non-option, can that be applied only for certain commands? I would still like other commands to be able to end with a flag, placed after regular arguments.

The tool I want this for is something I'm converting from ruby, and the ruby library I use for command and option handling by default behaves like my parse-unknown-options: false code, treating unknown options as regular positional arguments.

I'll try to get some additional tests written, but I think my needs will only be fully handled by something similar to this pull request.

@henderea
Copy link
Contributor Author

For example, I want to have the following definitions:

'list <pattern>' with boolean option -l
and
'run <exe> [args...]' with no options

I want to be able to run
./script.js list -l pat
and
./script.js list pat -l
with both setting

{
  pattern: 'pat',
  l: true
}

but have
./script.js run ls -l .
set

{
  exe: 'ls',
  args: ['-l', '.']
}

As a note, the list command lists something from a specific JSON file, and run runs an arbitrary shell command. There is nothing in common between list and run ls. Also note that this is a simplified set of commands with different names from my actual intended usage, but they are representative of my desired usage and behavior.

@henderea
Copy link
Contributor Author

@bcoe: If possible, I would like for this feature or something else that will treat unknown options as regular arguments to be added to yargs. It looks like the halt-on-non-option flag would have to apply to all commands, but I want to parse options both before and after positional arguments on commands that have options. If I don't want to parse any options on a given command, I will just not define any.

When I write a script, I will define EVERY option that I want to handle, and anything I don't define shouldn't be treated as an option. The ruby library I use for argument parsing (thor) does this by default, and I personally feel that it is rather odd to set options that haven't been defined. The way I see it, my script should only support options that have been defined, and should fail on undefined options, but only if they don't fit into an argument or varargs.

@bcoe
Copy link
Member

bcoe commented Jun 27, 2019

@henderea 👋 thanks for filling me in regarding your use case, I'll try to get back to you with a response soonish 😄 when I can give this my full attention.

Copy link
Member

@bcoe bcoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommendations

Simplifying Algorithm

I think you could simplify the logic quite a bit, and make a few less code changes, if you moved the check to the very top of the parser, here:

if (arg.match(/^--.+=/) || (

Do something like:

// just below the argv definition.
if (configuration['collect-unknown-options']) {
  argv.unknownOptions = []
}

// ...

if (configuration['collect-unknown-options'] && hasUnknownOption(arg, args[i + 1])) {
   argv.unknownOptions.push(arg)
} else if (arg.match(/^--.+=/) || (
  !configuration['short-option-groups'] && arg.match(/^-.+=/)
)) {
  // do things with argument.
} else {
  // etc.
}

☝️ hasUnknownOption would just need to drop handle:

  • -xyz: ensuring that x, y, and z are all configured.
  • --foo=bar: extracting just the foo and checking that it's configured.
  • --foo: extracting just the foo and making sure it's configured.

Naming things

I agree with @juergba, I would introduce a new key on argv called unknownOptions, which for your use case could just be combined with _ in your application.

I also like the name collect-unknown-options, as the config setting.

Apologies!

Sorry that it took me so long to get back to you with review; my OSS time has been spread thin these days unfortunately, and I wanted to take the time to actually delve into this feature.

I'm feeling like, if we can keep the footprint as small as the new algorithm I described in this PR, I'm happy to land this feature (@juergba how does the algorithm I described sound to you?).

index.js Outdated
setArg(key, value)
} else {
unmatched.push(key)
unmatched.push('=')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this a bit odd too because -f a b c is a valid parse. tldr; you end up with an = that didn't exist.

index.js Show resolved Hide resolved
@juergba
Copy link
Contributor

juergba commented Jul 15, 2019

I agree with @juergba, I would introduce a new key on argv called unknownOptions, which for your use case could just be combined with _ in your application.

Meanwhile I have changed my mind on this:

  • halt-at-non-option also uses the _ property, so unparsed arguments are already treated as positional arguments.
  • let's take following example: "--ls aa bb cc". ls is an unknown option, so we don't know anything about its type and how many of the following arguments should have to be consumed.
    • should the values aa, bb, cc be handled as positional arguments ==> arg._ ?
    • should we handle them as input to the unknown option ls ==> argv.unknownOptions ?

We could end up with spreading our unknown arguments over two different properties.

☝️ hasUnknownOption would just need to drop handle:
-xyz: ensuring that x, y, and z are all configured.
--foo=bar: extracting just the foo and checking that it's configured.
--foo: extracting just the foo and making sure it's configured.

This list is not complete, the extraction of the plain key out of the raw argument is more complex than this. E.g. boolean negation, dot anotation are missing. We would have to copy the extraction logic into hasUnknownOption, I'm not convinced of this.

@bcoe
Copy link
Member

bcoe commented Jul 15, 2019

@juergba I'd like to avoid adding conditional logic throughout the parser, to handle the --unknown-options case, I'd also like to avoid logic that requires that a key be rebuilt with less than perfect information, e.g, the case of:

_.push(key)
_.push('=')

we could just continue to enumerate some of these other cases:

--no-[key]
--[key].foo

☝️ it's a finite list.

@henderea
Copy link
Contributor Author

Well, I don't necessarily have an issue with simplifying the logic if it still does the same thing, but putting the unknown options into a separate property and merging them could alter the order of arguments, which would not be desirable.

If there is a way to rewrite the logic to get the same effect with less-invasive changes, I'm all for that. I just want to be sure it still works the same way, or at least that I can get the same result through some method. Altering the order of arguments because they were split between 2 different properties is not desireable.

@bcoe
Copy link
Member

bcoe commented Jul 16, 2019

@henderea I'm open to putting the options in _ 👍 but I like the idea of encapsulating the logic for skipping a key in the first check of our if/else block, this way we don't end up breaking the feature over and over again as we make slight tweaks to different option types (I've seen this happen for other constructs like negation and dot notation).

@bcoe
Copy link
Member

bcoe commented Jul 29, 2019

@henderea just following up 👍 to summarize:

  1. I'm comfortable with the idea of us putting the unknown options in _.
  2. I'd like the check for an unknown option to ideally be its own logic before the rest of option parsing, so that we don't have to perform the check in a variety of places (even though it creates some extra parsing logic in the case of collect-unknown-options, I like that we don't need to add additional checks to ever other type of argument).
  3. let's use collect-unknown-options as the config setting.

@henderea
Copy link
Contributor Author

@bcoe Sure, I can make the changes. The thing is, I'm not 100% sure of the form those changes would take. How would I detect that a key should be skipped before parsing the actual key? If you could give me a bit more direction on that, I could make the changes.

@bcoe
Copy link
Member

bcoe commented Jul 30, 2019

@henderea I can try to pseudo code something, but my thinking was you only have a few forms of keys:

--foo
-foo
--foo=99
-foo=99

☝️ so have a little helper function that runs before the rest of our checks and extracts the foo, if we handle foo in our parser, move to the next steps, otherwise put the whole key --foo, or --foo=99, etc., into unknown arguments.

@henderea henderea changed the title feature: add configuration option "ignore-unknown-options" feature: add configuration option "collect-unknown-options" Aug 5, 2019
@henderea
Copy link
Contributor Author

henderea commented Aug 5, 2019

@bcoe I've made the requested changes. I ended up creating a function that takes arg and some regex patterns, and checks if parsing arg with any of the patterns results in a key (in match group 1) that has been configured. The name I gave it is hasFlagsMatching, but you may want to give it a better name. Code naming isn't a strong point of mine 😅

@henderea
Copy link
Contributor Author

henderea commented Aug 5, 2019

I realized after my first implementation of the updated code that it wasn't checking short flags individually, so I updated my code to handle that, and also added in the check for negative numbers that I saw in the short flags handling, along with renaming checkFlags to hasFlagsMatching so it matches more closely with my hasAnyFlag and new hasAllShortFlags functions.

@bcoe
Copy link
Member

bcoe commented Aug 5, 2019

@henderea thanks for taking this on 👍 will provide review soon.

index.js Outdated Show resolved Hide resolved
This was referenced Mar 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

An option that allows unknown flags to be included in a varargs?
4 participants