Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(modules/gitlab_runner): Use correct argument to list all runners #7790

Conversation

mikewadsten
Copy link
Contributor

SUMMARY

python-gitlab 4.0.0 removed support for the as_list=False parameter. This functionality is now available as iterator=True.

Without this change, the module actually only retrieves the first 20 results, which can lead to non-idempotent behavior, such as registering a runner again.

ISSUE TYPE
  • Bugfix Pull Request
COMPONENT NAME

gitlab_runner

ADDITIONAL INFORMATION

I am using community.general 8.1.0, python-gitlab 4.2.0 and a self-hosted instance of GitLab,
and was trying to register a new instance runner. Initial registration was successful, but if I run the playbook again,
a new runner (with the same description value) would be registered.

When I ran again using --check, I get AttributeError: 'bool' object has no attribute '_attrs'
(which is a separate bug that I will write up), but module_stderr also includes:

UserWarning: Calling a `list()` method without specifying `get_all=True` or `iterator=True` will return a maximum of 20 items. Your query returned 20 of 46 items. See https://python-gitlab.readthedocs.io/en/v4.2.0/api-usage.html#pagination for more details. If this was done intentionally, then this warning can be supressed by adding the argument `get_all=False` to the `list()` call.

The python-gitlab docs on pagination call this out:

Prior to python-gitlab 3.6.0 the argument as_list was used instead of iterator. as_list=False is the equivalent of iterator=True.

python-gitlab 4.0.0 removed support for the `as_list=False` parameter.
This functionality is now available as `iterator=True`.

Without this change, the module actually only retrieves the first
20 results, which can lead to non-idempotent behavior, such as
registering a runner again.
@ansibullbot ansibullbot added bug This issue/PR relates to a bug module module new_contributor Help guide this first time contributor plugins plugin (any type) small_patch Hopefully easy to review labels Dec 29, 2023
@ansibullbot ansibullbot removed the small_patch Hopefully easy to review label Dec 29, 2023
@mikewadsten
Copy link
Contributor Author

Hmm. It occurs to me that this change would cause issues if using python-gitlab older than 3.6. I know the new runner registration workflow stuff (#7199) requires >= 4.0.0, but the module will work on older versions as long as registration_token is provided.

Should I change this to pass as_list=False, iterator=True, so that we get the same behavior on all versions?

@felixfontein felixfontein added check-before-release PR will be looked at again shortly before release and merged if possible. backport-7 backport-8 Automatically create a backport for the stable-8 branch labels Dec 29, 2023
@felixfontein
Copy link
Collaborator

See also #7690. I don't know whether passing both as_list=False, iterator=True is a good idea; in that issue @nejch suggested as_list=False, all=True, so I guess it also makes sense.

@mikewadsten
Copy link
Contributor Author

Thanks, I tried to search for this issue but somehow missed 7690. This PR is indeed a fix for that.

The only issue I would take with using as_list=False, all=True as opposed to as_list=False, iterator=True, is that on new-ish versions of python-gitlab (which will ignore the as_list parameter entirely), that will retrieve all runners and store them into a list before returning anything (see https://github.com/python-gitlab/python-gitlab/blob/b8824a6c2f06903dc8975aaf5f8807a7b1245a6a/gitlab/client.py#L924 for example), whereas iterator=True will do lazy pagination.

But it looks like 3.6.0 through 3.15.0 don't let you pass both (see https://github.com/python-gitlab/python-gitlab/blob/2fbd1bf73c15959f4345befd03a28df882b461bd/gitlab/client.py#L900).

So I guess the options are:

  1. Use as_list=False, all=True for maximum succinctness and compatibility, at the expense of potentially-wasted time and memory with python-gitlab 3.6.0+ (due to eager pagination), depending on the number of existing runners.
  2. Look at the python-gitlab version (as the module already does to check compatibility with the new registration workflow), and use as_list=False on < 3.6.0, iterator=True on >= 3.6.0.

My preference would be for option 2. Thoughts?

@ansibullbot ansibullbot added the stale_ci CI is older than 7 days, rerun before merging label Jan 12, 2024
@ansibullbot ansibullbot added gitlab module_utils module_utils and removed stale_ci CI is older than 7 days, rerun before merging labels Jan 23, 2024
@felixfontein
Copy link
Collaborator

@nejch does the PR look good to you?

@nejch
Copy link
Contributor

nejch commented Jan 27, 2024

@nejch does the PR look good to you?

Sorry for the delay @felixfontein. LGTM, I would maybe even provide per_page in the Gitlab instantiation but that can be changed later as it's not user facing in this case. 👍

…-collections#7790)

I did not change every instance of all=True or all=False, only those
which could obviously benefit from simplifying:

  * Code using `all=True` but then searching for any items that match a
    condition (no need to collect the full list).
  * Code that basically reimplements `all=True` with manual pagination.
    (These could be changed to `all=True`, but `list_all_kwargs` also
    sets per_page to 100, to gather data faster.)
@ansibullbot
Copy link
Collaborator

@mikewadsten
Copy link
Contributor Author

@felixfontein @nejch I've implemented the requested changes (using list_all_kwargs where it would be useful).

There are a few instances that I left alone. For instance, gitlab_group.py:

if not force and len(group.projects.list(all=False)) >= 1:

This is just a quick check for whether any groups still exist. No need to ask for "all" items, when the GitLab API defaults pages to size 20, so a single call here will be sufficient.

But I did refactor a few "all=True then look for a single match" to next((...), None), which a) is semantically equivalent, b) is more efficient (not collecting everything before looking), and c) is a pattern used in other modules already.

@mikewadsten
Copy link
Contributor Author

Two things I'm not sure about:

  1. Should I add a changelog fragment of some kind, calling out that this pagination change has been applied to more than gitlab_runner?
  2. Should I rebase my branch, or is that unnecessary?

@ansibullbot
Copy link
Collaborator

cc @benibr
click here for bot help

@felixfontein
Copy link
Collaborator

1. Should I add a changelog fragment of some kind, calling out that this pagination change has been applied to more than `gitlab_runner`?

Yes, every changed module should be mentioned in a changelog fragment. (You can also mention them all in one.)

2. Should I rebase my branch, or is that unnecessary?

That's not necessary, unless there are conflicts. (Or you know that some of the GitLab modules have been modified and you need to rebase to get in the modifications because you need to change the corresponding code.)

@mikewadsten
Copy link
Contributor Author

Yes, every changed module should be mentioned in a changelog fragment. (You can also mention them all in one.)

Done. Hopefully I've formatted it properly. The Ansible development docs say a changelog fragment can have multiple sections in it, so I kept gitlab_runner under bugfixes and called out the rest of them as minor_changes.

@ansibullbot ansibullbot added needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR and removed needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR labels Jan 30, 2024
@ansibullbot ansibullbot added the stale_ci CI is older than 7 days, rerun before merging label Feb 10, 2024
@felixfontein
Copy link
Collaborator

Can someone please check all the new changes to this PR? Thanks :)

@felixfontein
Copy link
Collaborator

@markuman @nejch do you think this is OK to merge?

Copy link
Contributor

@nejch nejch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markuman @nejch do you think this is OK to merge?

@felixfontein LGTM, I left a comment about the redundant argument there but I don't want to block this any longer. It will have no effect.

Comment on lines +32 to +34
return {'as_list': False, 'get_all': True, 'per_page': 100}
else:
return {'as_list': False, 'all': True, 'per_page': 100}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quick note: as_list=False in older versions and iterator=True in newer versions takee precedence over all and get_all so IMO including the all / get_all arguments here is not necessary. But not a blocker.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I'll merge for now so I can do a release this evening, if someone wants to create a follow-up PR to clean this up, feel free :)

@felixfontein felixfontein removed the check-before-release PR will be looked at again shortly before release and merged if possible. label Feb 25, 2024
@felixfontein felixfontein merged commit 787fa46 into ansible-collections:main Feb 25, 2024
121 checks passed
Copy link

patchback bot commented Feb 25, 2024

Backport to stable-7: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply 787fa46 on top of patchback/backports/stable-7/787fa4621763879dae26c57aaf0028941dcaaefc/pr-7790

Backporting merged PR #7790 into main

  1. Ensure you have a local repo clone of your fork. Unless you cloned it
    from the upstream, this would be your origin remote.
  2. Make sure you have an upstream repo added as a remote too. In these
    instructions you'll refer to it by the name upstream. If you don't
    have it, here's how you can add it:
    $ git remote add upstream https://github.com/ansible-collections/community.general.git
  3. Ensure you have the latest copy of upstream and prepare a branch
    that will hold the backported code:
    $ git fetch upstream
    $ git checkout -b patchback/backports/stable-7/787fa4621763879dae26c57aaf0028941dcaaefc/pr-7790 upstream/stable-7
  4. Now, cherry-pick PR fix(modules/gitlab_runner): Use correct argument to list all runners #7790 contents into that branch:
    $ git cherry-pick -x 787fa4621763879dae26c57aaf0028941dcaaefc
    If it'll yell at you with something like fatal: Commit 787fa4621763879dae26c57aaf0028941dcaaefc is a merge but no -m option was given., add -m 1 as follows instead:
    $ git cherry-pick -m1 -x 787fa4621763879dae26c57aaf0028941dcaaefc
  5. At this point, you'll probably encounter some merge conflicts. You must
    resolve them in to preserve the patch from PR fix(modules/gitlab_runner): Use correct argument to list all runners #7790 as close to the
    original as possible.
  6. Push this branch to your fork on GitHub:
    $ git push origin patchback/backports/stable-7/787fa4621763879dae26c57aaf0028941dcaaefc/pr-7790
  7. Create a PR, ensure that the CI is green. If it's not — update it so that
    the tests and any other checks pass. This is it!
    Now relax and wait for the maintainers to process your pull request
    when they have some cycles to do reviews. Don't worry — they'll tell you if
    any improvements are necessary when the time comes!

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

Copy link

patchback bot commented Feb 25, 2024

Backport to stable-8: 💚 backport PR created

✅ Backport PR branch: patchback/backports/stable-8/787fa4621763879dae26c57aaf0028941dcaaefc/pr-7790

Backported as #8032

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

patchback bot pushed a commit that referenced this pull request Feb 25, 2024
…7790)

* fix(modules/gitlab_runner): Use correct argument to list all runners

python-gitlab 4.0.0 removed support for the `as_list=False` parameter.
This functionality is now available as `iterator=True`.

Without this change, the module actually only retrieves the first
20 results, which can lead to non-idempotent behavior, such as
registering a runner again.

* Add changelog entry (#7790)

* gitlab_runner: Check python-gitlab version when listing runners

* gitlab: Add list_all_kwargs variable to module_utils

* refactor(gitlab modules): use list_all_kwargs where it helps (#7790)

I did not change every instance of all=True or all=False, only those
which could obviously benefit from simplifying:

  * Code using `all=True` but then searching for any items that match a
    condition (no need to collect the full list).
  * Code that basically reimplements `all=True` with manual pagination.
    (These could be changed to `all=True`, but `list_all_kwargs` also
    sets per_page to 100, to gather data faster.)

* gitlab_instance_variable: Use list_all_kwargs

* Add new changelog entry for gitlab module changes (#7790)

(cherry picked from commit 787fa46)
@felixfontein
Copy link
Collaborator

@mikewadsten thanks for your contribution!
@nejch @markuman thanks for reviewing this!

@felixfontein
Copy link
Collaborator

If someone is interested in a 7.5.x backport, you have to cherry-pick the commit manually (git cherry-pick -x ...) and create a PR for it.

felixfontein pushed a commit that referenced this pull request Feb 25, 2024
…e correct argument to list all runners (#8032)

fix(modules/gitlab_runner): Use correct argument to list all runners (#7790)

* fix(modules/gitlab_runner): Use correct argument to list all runners

python-gitlab 4.0.0 removed support for the `as_list=False` parameter.
This functionality is now available as `iterator=True`.

Without this change, the module actually only retrieves the first
20 results, which can lead to non-idempotent behavior, such as
registering a runner again.

* Add changelog entry (#7790)

* gitlab_runner: Check python-gitlab version when listing runners

* gitlab: Add list_all_kwargs variable to module_utils

* refactor(gitlab modules): use list_all_kwargs where it helps (#7790)

I did not change every instance of all=True or all=False, only those
which could obviously benefit from simplifying:

  * Code using `all=True` but then searching for any items that match a
    condition (no need to collect the full list).
  * Code that basically reimplements `all=True` with manual pagination.
    (These could be changed to `all=True`, but `list_all_kwargs` also
    sets per_page to 100, to gather data faster.)

* gitlab_instance_variable: Use list_all_kwargs

* Add new changelog entry for gitlab module changes (#7790)

(cherry picked from commit 787fa46)

Co-authored-by: Mike Wadsten <[email protected]>
lgatellier pushed a commit to lgatellier/community.general that referenced this pull request May 5, 2024
…/gitlab_runner): Use correct argument to list all runners (ansible-collections#8032)

fix(modules/gitlab_runner): Use correct argument to list all runners (ansible-collections#7790)

* fix(modules/gitlab_runner): Use correct argument to list all runners

python-gitlab 4.0.0 removed support for the `as_list=False` parameter.
This functionality is now available as `iterator=True`.

Without this change, the module actually only retrieves the first
20 results, which can lead to non-idempotent behavior, such as
registering a runner again.

* Add changelog entry (ansible-collections#7790)

* gitlab_runner: Check python-gitlab version when listing runners

* gitlab: Add list_all_kwargs variable to module_utils

* refactor(gitlab modules): use list_all_kwargs where it helps (ansible-collections#7790)

I did not change every instance of all=True or all=False, only those
which could obviously benefit from simplifying:

  * Code using `all=True` but then searching for any items that match a
    condition (no need to collect the full list).
  * Code that basically reimplements `all=True` with manual pagination.
    (These could be changed to `all=True`, but `list_all_kwargs` also
    sets per_page to 100, to gather data faster.)

* gitlab_instance_variable: Use list_all_kwargs

* Add new changelog entry for gitlab module changes (ansible-collections#7790)

(cherry picked from commit 787fa46)

Co-authored-by: Mike Wadsten <[email protected]>
felixfontein pushed a commit that referenced this pull request May 6, 2024
…e correct argument to list all runners (#8311)

[PR #7790/787fa462 backport][stable-8] fix(modules/gitlab_runner): Use correct argument to list all runners (#8032)

fix(modules/gitlab_runner): Use correct argument to list all runners (#7790)

* fix(modules/gitlab_runner): Use correct argument to list all runners

python-gitlab 4.0.0 removed support for the `as_list=False` parameter.
This functionality is now available as `iterator=True`.

Without this change, the module actually only retrieves the first
20 results, which can lead to non-idempotent behavior, such as
registering a runner again.

* Add changelog entry (#7790)

* gitlab_runner: Check python-gitlab version when listing runners

* gitlab: Add list_all_kwargs variable to module_utils

* refactor(gitlab modules): use list_all_kwargs where it helps (#7790)

I did not change every instance of all=True or all=False, only those
which could obviously benefit from simplifying:

  * Code using `all=True` but then searching for any items that match a
    condition (no need to collect the full list).
  * Code that basically reimplements `all=True` with manual pagination.
    (These could be changed to `all=True`, but `list_all_kwargs` also
    sets per_page to 100, to gather data faster.)

* gitlab_instance_variable: Use list_all_kwargs

* Add new changelog entry for gitlab module changes (#7790)

(cherry picked from commit 787fa46)

Co-authored-by: patchback[bot] <45432694+patchback[bot]@users.noreply.github.com>
Co-authored-by: Mike Wadsten <[email protected]>
Massl123 pushed a commit to Massl123/community.general that referenced this pull request Feb 7, 2025
…nsible-collections#7790)

* fix(modules/gitlab_runner): Use correct argument to list all runners

python-gitlab 4.0.0 removed support for the `as_list=False` parameter.
This functionality is now available as `iterator=True`.

Without this change, the module actually only retrieves the first
20 results, which can lead to non-idempotent behavior, such as
registering a runner again.

* Add changelog entry (ansible-collections#7790)

* gitlab_runner: Check python-gitlab version when listing runners

* gitlab: Add list_all_kwargs variable to module_utils

* refactor(gitlab modules): use list_all_kwargs where it helps (ansible-collections#7790)

I did not change every instance of all=True or all=False, only those
which could obviously benefit from simplifying:

  * Code using `all=True` but then searching for any items that match a
    condition (no need to collect the full list).
  * Code that basically reimplements `all=True` with manual pagination.
    (These could be changed to `all=True`, but `list_all_kwargs` also
    sets per_page to 100, to gather data faster.)

* gitlab_instance_variable: Use list_all_kwargs

* Add new changelog entry for gitlab module changes (ansible-collections#7790)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-8 Automatically create a backport for the stable-8 branch bug This issue/PR relates to a bug gitlab has_issue module_utils module_utils module module new_contributor Help guide this first time contributor plugins plugin (any type) stale_ci CI is older than 7 days, rerun before merging
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants