Stack stalls when deleted using foreground cascading delete #753

Closed
EronWright opened this issue Nov 20, 2024 · 0 comments · Fixed by #756 or #760

EronWright commented Nov 20, 2024

What happened?

I deleted a Stack using foreground cascading delete and found that it stalled indefinitely in a deleting state. The stack has destroyOnFinalize enabled, and its status showed that the stack controller was waiting for the workspace to become ready. No workspace existed, because it had just been deleted.

To unblock myself, I killed the operator pod. When it came up, it correctly provisioned a workspace, destroyed the stack, and then removed the finalizer.

I believe the root cause is a race condition between the workspace being deleted and the stack controller creating its replacement. The watch logic in the stack controller should trigger a reconcile when the workspace is deleted.
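
A minimal sketch of the suspected stall, assuming a simplified view of what one reconcile pass decides for a Stack that is being deleted with destroyOnFinalize enabled. The names `Step`, `Workspace`, and `NextStep` are hypothetical and exist only for illustration; they are not the operator's actual code.

```go
package main

// Step is a hypothetical name for the outcome of one reconcile pass.
type Step string

const (
	CreateWorkspace Step = "create-workspace"
	WaitForReady    Step = "wait-for-ready"
	RunDestroy      Step = "run-destroy"
)

// Workspace is a stand-in for the real Workspace object (assumption):
// only its readiness matters for this sketch.
type Workspace struct {
	Ready bool
}

// NextStep models a single reconcile pass for a Stack under deletion.
// A nil workspace means that no Workspace object exists.
func NextStep(ws *Workspace) Step {
	switch {
	case ws == nil:
		// Nothing to run the destroy in yet, so provision a replacement.
		return CreateWorkspace
	case !ws.Ready:
		// The pass ends here; another pass only happens on a new event.
		return WaitForReady
	default:
		return RunDestroy
	}
}
```

If a pass observes a workspace that is still terminating, it returns `WaitForReady`. Unless the later removal of that workspace triggers another pass, the `CreateWorkspace` branch is never reached, which would match the stall described above and the recovery after the operator pod restarted.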

Example

N/A

Output of pulumi about

N/A

Additional context

No response


EronWright added the kind/bug (some behavior is incorrect or out of spec), needs-triage (needs attention from the triage team), and p1 (a bug severe enough to be the next item assigned to an engineer) labels Nov 20, 2024
EronWright added a commit that referenced this issue Nov 21, 2024

### Proposed changes

Ensures that progress is made if the Workspace object is deleted
concurrently with the Stack object, as may occur in foreground deletion.
Otherwise the operator stays idle when it should reconcile.

A partial solution is to have the Stack controller watch for Workspace
'delete' events, since the controller is otherwise stuck waiting for the
workspace to become ready.

Alternatively, the stack controller may have seen the workspace as ready
and gone on to create an update for a workspace that no longer exists.
That update will not progress, because the update controller waits for
the workspace.
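
The watch change described above can be sketched as an event filter. This is only an illustration under stated assumptions: `EventType`, `WorkspaceEvent`, `ShouldReconcileStack`, and the `watchDeletes` switch are hypothetical names, not the operator's or controller-runtime's API.

```go
package main

// EventType is a hypothetical enum for the kinds of watch events.
type EventType int

const (
	Created EventType = iota
	Updated
	Deleted
)

// WorkspaceEvent is a simplified stand-in for a watch event on a
// Workspace object (assumption, for illustration only).
type WorkspaceEvent struct {
	Type       EventType
	OwnerStack string // name of the owning Stack; empty if there is none
}

// ShouldReconcileStack reports whether a Workspace event should cause the
// owning Stack to be reconciled. watchDeletes toggles between ignoring
// delete events and handling them.
func ShouldReconcileStack(ev WorkspaceEvent, watchDeletes bool) bool {
	if ev.OwnerStack == "" {
		// No owning Stack, so there is nothing to reconcile.
		return false
	}
	if ev.Type == Deleted {
		return watchDeletes
	}
	return true
}
```

With `watchDeletes` set to false, the delete of a Workspace owned by a Stack is dropped and the Stack is not reconciled; with it set to true, the same event leads to a reconcile, which gives the controller the chance to notice that the workspace is gone.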
 
Also: fix a flaky test.

### Related issues (optional)

Closes #753
pulumi-bot added the resolution/fixed (this issue was fixed) label Nov 21, 2024
EronWright reopened this Nov 25, 2024