Skip to content
This repository has been archived by the owner on Sep 14, 2020. It is now read-only.

Skip resumes for deleted objects, unless explicitly marked for selection #233

Merged
merged 1 commit into from
Nov 14, 2019

Conversation

nolar
Copy link
Contributor

@nolar nolar commented Nov 14, 2019

Prevent resource/memory leaks in some edge cases with resuming+deletion handlers.

Issue : #112 #230

Description

In #230, the resume handlers were fixed to be executed as regular resource-changing handlers, with multiple iterations if needed. Previously, they were executed only if they were lucky to be the first and the only in the row, and if never retried due to errors.


Problem:

This led to the resuming handlers executed even if the object was marked for deletion during the operator downtime/restart. Which, from one point of view, makes sense — since the object exists out there (only "marked" for deletion, but not actually deleted), so should be "resumed" in memory (despite it will be shut down and freed in few moments).

But, from another point of view, leads to potential memory/resource leaks, if the deletion handlers are used to free the resources and stop the tasks, and the resuming handlers allocate and start them accordingly — after they were released/stopped.

Consider an example below and a case when the operator starts and notices an object marked for deletion — should it spawn a task then?

    TASKS = {}

    @kopf.on.delete('zalando.org', 'v1', 'kopfexamples')
    async def my_handler(spec, name, **_):
        if name in TASKS:
            TASKS[name].cancel()

    @kopf.on.resume('zalando.org', 'v1', 'kopfexamples')
    @kopf.on.create('zalando.org', 'v1', 'kopfexamples')
    def my_handler(spec, **_):
        if name not in TASKS:
            TASKS[name] = asyncio.create_task(some_coroutine(spec))

The order of handlers is not a solution, as it can be altered at runtime due to error retries in the real cases with more logic.


Solution:

Kopf's goal is to be intuitive and human-friendly in the first place. Based on this consideration, with this PR, the resuming handlers are NOT invoked if the object is undergoing the deletion when the operator is restarted. This also matches the previous (buggy) behaviour.

However, if that is desired, a developer can declare the resume handler as deletion-compatible with deleted=True (see the docs in this PR), and the resuming handler will be executed even on the deletions.

The main code snippet in this PR is this one:

                if handler.initial and not cause.initial:
                    pass  # ignore initial handlers in non-initial causes.
                elif handler.initial and cause.deleted and not handler.deleted:
                    pass  # ignore initial handlers on deletion, unless explicitly marked as usable.
                elif match(handler=handler, body=cause.body, changed_fields=changed_fields):
                    yield handler

Beside of the main change, the tests were added for the resuming decorators and handler selection. It seems, they were absent previously.

Types of Changes

  • Bug fix (non-breaking change which fixes an issue)

Review

List of tasks the reviewer must do to review the PR

  • Tests
  • Documentation

@nolar nolar added the bug Something isn't working label Nov 14, 2019
@nolar nolar requested a review from samurang87 as a code owner November 14, 2019 10:52
@zincr
Copy link

zincr bot commented Nov 14, 2019

🤖 zincr found 0 problems , 0 warnings

✅ Large Commits
✅ Approvals
✅ Specification
✅ Dependency Licensing

@zincr
Copy link

zincr bot commented Nov 14, 2019

🤖 zincr found 1 problem , 0 warnings

❌ Approvals
✅ Large Commits
✅ Specification
✅ Dependency Licensing

Details on how to resolve are provided below


Approvals

All proposed changes must be reviewed by project maintainers before they can be merged

Not enough people have approved this pull request - please ensure that 1 additional user, who have not contributed to this pull request approve the changes.

  • ✅ Approved by PR author @nolar
  • ❌ 1 additional approval needed
     

@nolar nolar merged commit ca7cfcb into zalando-incubator:master Nov 14, 2019
@nolar nolar deleted the 112-no-resumes-for-deleted branch November 14, 2019 11:10
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants