Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workflow and testing improvements #328

Merged
merged 7 commits into from
May 19, 2023

Conversation

mprasil
Copy link
Contributor

@mprasil mprasil commented May 3, 2023

This is somewhat opinionated PR with the intention to improve the role workflow a bit, make it a bit more readable and less fragile. I'm very open to discuss individual changes. I have split it into two commits that are effectively unrelated, it's just I encountered the test bug while working on the rest.

05407c2

Adds extra steps to update hosts file in test container. Tailscale does attempt to manage resolv.conf which is somewhat in conflict with Docker attempting to provice container name resolution. This is problematic in multi-stage scenarios where tailscale is initialized multiple times and eventually ends up in a state where container name resolution no longer works and test fails.

This extra steps add static hosts override for headscale container so that Tailscale managing DNS resolution does not mess up the connectivity to Headscale container.

f716fc2

The main changes are:

  • Switch to parsing json output of tailscale status - this should hopefully lead to less fragile detection of current status of tailscale connection and should also end up with more readable conditional logic in the role.

  • Split tailscale up step into two separate steps depending on situation:

    • If state didn't change and network is down, simple tailscale up without any arguments is used as the connection should already be properly set and this is all that is needed to bring the connection up.
    • If there is state change, full tailscale up with all the arguments is executed.

    This can enable use cases where we don't necessarily want to fully manage the tailscale connection via Ansible, but the user still wants to bring the connection up. (ie. to prepare the instance for subsequent roles that rely on that connections)

    • Dropped some failed_when and changed_when conditions and instead rely on default behavior, that assumes non-zero return is failure and that any time command is executed (except tailscale status) it is performing a change. This probably reflects reality better.

tasks/install.yml Outdated Show resolved Hide resolved
Comment on lines -19 to +25
- handlers_tailscale_status.stdout | length != 0
- handlers_tailscale_status.stdout is not match('\[L\+V9o\]')
- tailscale_is_online
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually not sure what exactly this condition was trying to assert, but I think the new condition reflects better the name of the task.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tailscale's output had some mysterious behavior in the past. I think --json solves for it, though
#43 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😄 When I saw that regexp, I was sure there was some amount of debugging time spent right there.

@artis3n
Copy link
Owner

artis3n commented May 4, 2023

Love the idea of working on the JSON instead. As for

Dropped some failed_when and changed_when conditions and instead rely on default behavior, that assumes non-zero return is failure

I remember having to dig through tailscale source to find out how exit codes were produced by the binary, and the resulting reason led to this setup. It's been a couple years since then, though. The binary has changed a lot and my cursory dive back into the code couldn't find the same location I remember. Worth rethinking how that works as well, but I'll want to test this for a little while before moving forward.

tasks/install.yml Show resolved Hide resolved
tasks/install.yml Outdated Show resolved Hide resolved
Comment on lines -19 to +25
- handlers_tailscale_status.stdout | length != 0
- handlers_tailscale_status.stdout is not match('\[L\+V9o\]')
- tailscale_is_online
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tailscale's output had some mysterious behavior in the past. I think --json solves for it, though
#43 (comment)

@mprasil
Copy link
Contributor Author

mprasil commented May 4, 2023

I remember having to dig through tailscale source to find out how exit codes were produced by the binary, and the resulting reason led to this setup.

If that makes things a bit easier, the dropped failed_when is in tailscale status task and it effectively changed from:

  failed_when:
    - tailscale_status.rc != 0
    - "'Logged out.' not in tailscale_status.stdout"
    - "'not logged in' not in tailscale_status.stdout"

to default behavior:

  failed_when:
    - tailscale_status.rc != 0

In this case the other two conditions were likely to account for unexpected output. In this new implementation we would notice that when parsing json.

The changed_when is in tailscale up task. The changed_when: tailscale_start.rc != 0 setting was irrelevant anyways as it would fail the playbook. And it probably didn't reflect reality, because failure to run tailscale up might have changed configuration - for example if we attempted to change connection that was up with invalid auth key. This would effectively drop the connection, which is a change. It's kind of edge case, but IMO if we can't be sure whether any change was done or not, we should err on the side of reporting change.

@mprasil mprasil force-pushed the workflow-improvements branch from f716fc2 to 81d1755 Compare May 6, 2023 11:08
@mprasil
Copy link
Contributor Author

mprasil commented May 6, 2023

I have now dropped the separate tailscale up task with no arguments. So we're back to running full tailscale up with arguments when state changes or connection is not up. Thanks for the feedback.

molecule/default/prepare.yml Show resolved Hide resolved
tasks/install.yml Show resolved Hide resolved
tasks/install.yml Outdated Show resolved Hide resolved
tasks/install.yml Outdated Show resolved Hide resolved
tasks/install.yml Outdated Show resolved Hide resolved
@artis3n artis3n force-pushed the workflow-improvements branch from 81d1755 to 29c9119 Compare May 6, 2023 13:29
@artis3n artis3n temporarily deployed to E2E May 6, 2023 13:29 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 6, 2023 13:29 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 6, 2023 13:29 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 6, 2023 13:29 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 6, 2023 13:29 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 6, 2023 13:29 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 6, 2023 13:29 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 6, 2023 13:29 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 6, 2023 13:29 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 6, 2023 13:29 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 6, 2023 13:29 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 6, 2023 13:29 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 6, 2023 13:29 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 6, 2023 13:29 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 6, 2023 13:29 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 6, 2023 13:29 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 6, 2023 13:29 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 6, 2023 13:29 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 6, 2023 13:29 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n temporarily deployed to E2E May 19, 2023 18:46 — with GitHub Actions Inactive
@artis3n artis3n merged commit 6016882 into artis3n:main May 19, 2023
@mprasil mprasil deleted the workflow-improvements branch May 19, 2023 20:21
fredrikekre added a commit to fredrikekre/ansible-role-tailscale that referenced this pull request Mar 26, 2024
This patch redacts the authkey from `tailscale up` stderr output based
on `tailscale_authkey` directly instead of the `tskey.*` regex. The same
change was already applied for the stdout logging in artis3n#328.

diff --git a/tasks/install.yml b/tasks/install.yml
index 4719173..a9db923 100644
--- a/tasks/install.yml
+++ b/tasks/install.yml
@@ -167,6 +167,6 @@

 - name: Install | Report redacted failure from "tailscale up"  # noqa: no-handler
   ansible.builtin.fail:
-    msg: "{{ tailscale_start.stderr | default () | regex_replace('tskey.*\\s', 'REDACTED ') | regex_replace('\\t', '') | split('\n') }}"
+    msg: "{{ tailscale_start.stderr | default () | regex_replace(tailscale_authkey, 'REDACTED') | regex_replace('\\t', '') | split('\n') }}"
   when:
     - tailscale_start is failed
fredrikekre added a commit to fredrikekre/ansible-role-tailscale that referenced this pull request Mar 26, 2024
This patch redacts the authkey from `tailscale up` stderr output based
on `tailscale_authkey` directly instead of the `tskey.*` regex. The same
change was already applied for the stdout logging in artis3n#328.
artis3n pushed a commit that referenced this pull request Mar 27, 2024
This patch redacts the authkey from `tailscale up` stderr output based
on `tailscale_authkey` directly instead of the `tskey.*` regex. The same
change was already applied for the stdout logging in #328.
kieranmanning pushed a commit to kieranmanning/ansible-role-tailscale that referenced this pull request Dec 14, 2024
This patch redacts the authkey from `tailscale up` stderr output based
on `tailscale_authkey` directly instead of the `tskey.*` regex. The same
change was already applied for the stdout logging in artis3n#328.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants