Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature discussion: unified include/exclude file #223

Closed
octo opened this issue Nov 26, 2024 · 3 comments · Fixed by #224
Closed

Feature discussion: unified include/exclude file #223

octo opened this issue Nov 26, 2024 · 3 comments · Fixed by #224

Comments

@octo
Copy link
Contributor

octo commented Nov 26, 2024

Objective

Allow users of yamlfmt to define which files to format in a file separate from the configuration, similar to gitignore_excludes. Ideally combine include and exclude into one file.

Background

I'm building a reusable CI job allowing teams to ensure that YAML files in their repository are formatted correctly. I need to give the teams the ability to exclude some files matched by the default pattern (e.g. testdata, templates) and include some additional files not matched by the default pattern (e.g. dotfiles, templates). I do not want to expose the entire yamlfmt.yaml config file to teams, because we want the formatting to be consistent across repositories.

For excludes, I am setting the following configuration in the global config file. This allows teams to define a .yamlfmtignore file in their repository to exclude certain files. This works great so far.

gitignore_path: .yamlfmtignore
gitignore_excludes: true

Allowing teams to define additional include patterns is much harder because there is no equivalent mechanism for includes. Additionally, providing any paths or patterns on the command line overwrites the include setting in the config file.

Design outline

Use one file, using gitignore syntax, to specify includes as well as excludes, using gitignore's negation syntax.

Example:

*.yaml
*.yml
/.some_tool  # File in the repository root without .yaml suffix

# Ignore testdata directories
!testdata/

There are plenty of libraries around that make implementing this easy. I've used github.com/go-git/go-git/v5/plumbing/format/gitignore in the past, but github.com/sabhiram/go-gitignore appears to also support negation.

Interaction with other config options

  • Introduce a new config option / flag that takes the path to the file. For example pattern_file: path/to/file (all names are stand-ins, of course). By and large this takes on the role of both the include and exclude config options.

  • If pattern_file and include are provided, then all include patterns are added before any patterns in pattern_file. This allows us to specify include: ['*.yaml', '*.yml'] in the config and allow exclude patterns in pattern_file to also apply to the include patterns from the config.

  • If pattern_file and exclude are provided, then all exclude patterns are added after any patterns in pattern_file. This allows excludes in exclude to also apply to patterns in pattern_file.

    A case could be made here that all global config directives should be applied before local directives. In other words, the order include, exclude, pattern_file would be a viable alternative, allowing users to overwrite a global exclude.

  • Introduce a new match_type flag, which is an enum accepting standard, d[ouble]star, and gitignore. -dstar becomes an alias for -match_type=dstar.

  • If pattern_file and gitignore_excludes are provided, patterns from pattern_file are applied first, the resulting list of files is filtered using gitignore_path.

Alternatives considered

My current approach works somewhat, but is far from perfect:

declare -a PATTERNS=('*.yaml' '*.yml')
if [[ -e .yamlfmtextra ]]; then
  IFS=' '
  while read -r pattern; do
    PATTERNS+=("${pattern}")
  done <<<$(grep -E -v '^\s*(#|$)' .yamlfmtextra)
fi
yamlfmt -conf /etc/yamlfmt.yaml "${PATTERNS[@]}"

When you start using arrays in shell scripts, you know you're on the wrong track.

@octo octo mentioned this issue Nov 26, 2024
@braydonk
Copy link
Collaborator

braydonk commented Nov 26, 2024

This is something that would be easier to implement if I had designed for the possibility of another matching syntax way back when. It might finally be time for me to implement versioned configs so I can start to deal with some of the less ergonomic choices I made back on day 1 of the project. 😅

To implement this in the current config, here's what I'm thinking.

  • Make a new match type enum with three values path, doublestar, and gitignore
  • When doublestar: true or -dstar are specified, the match type becomes dstar unless match type is specified
  • The include array becomes an array of gitignore format files
  • Implement a new PathCollector that uses the list of gitignore files from include to match. exclude will be ignored, since including and excluding are technically covered by the gitignore syntax itself.
  • Keep the gitignore_excludes functionality the way it is today. The semantics of gitignore_excludes was always that it was supplementary to the include and exclude specified anyway, as it was mainly so people didn't need to double-specify excludes for things that were already gitignored.

There is probably a more ergonomic way to configure this if we could break the config, but despite being v0 still I endeavour not to do that. I will add it to my to-do list to add versioned configs, at which point this can be redesigned to be more ergonomic to configure.

Let me know if you're willing to take this on, otherwise I can try to get to it within the next couple of weeks.

@octo
Copy link
Contributor Author

octo commented Nov 26, 2024

It might finally be time for me to implement versioned configs so I can start to deal with some of the less ergonomic choices I made back on day 1 of the project.

I can tell the project evolved over time. I wanted to clean up the config parsing and flag handling, but got into the weeds quickly.

Let me know if you're willing to take this on, otherwise I can try to get to it within the next couple of weeks.

Initial proposal in #224. I think I covered all the points you had above. I initially used a separate pattern_file option, but have changed this to use include instead.

@braydonk
Copy link
Collaborator

I wanted to clean up the config parsing and flag handling, but got into the weeds quickly.

Yeah I admit it's not great, but it's not mattered much yet; core development has largely been me alone and I perfectly understand the rat's nest of standard library flags I came up with. 😆

braydonk added a commit that referenced this issue Nov 27, 2024
* refactor: Simplify config/flag/default behavior.

The new `pickFirst()` function returns the first non-empty string, making the
code simpler more readable.

* feat(pattern_file): Add the `pattern_file` config option and flag.

Issue: #223

* test(pattern_file): Add integration test for the `pattern_file` option.

* chore(pattern_file): Remove the `pattern_file` option and use `include` instead.

* fix: Allow the `doublestar` option to overwrite the default match type.

* refactor: Rename `PatternFile` to `PatternFileCollector`.

* refactor: Use pointer receiver for `PatternFileCollector`.

* fix: Copy loop variable for Go 1.18.

* docs(match_type): Document the `match_type` option and the `gitignore` mode.

* docs: Improve markup.

Co-authored-by: Braydon Kains <[email protected]>

* chore: Regenerate integration test outputs.

```shell
make integrationtest_update
```

---------

Co-authored-by: Braydon Kains <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants