Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tenant shutdown reorganization #4327

Closed
Tracked by #4733
koivunej opened this issue May 24, 2023 · 2 comments
Closed
Tracked by #4733

Tenant shutdown reorganization #4327

koivunej opened this issue May 24, 2023 · 2 comments
Assignees
Labels
c/storage/pageserver Component: storage: pageserver

Comments

@koivunej
Copy link
Member

koivunej commented May 24, 2023

Currently we have different ways to shutdown a tenant, at least the following:

  • pageserver::tenant::mgr::shutdown_all_tenants
    • pageserver::shutdown_pageserver
  • pageserver::tenant::mgr::remove_tenant_from_memory
    • detach and ignore

We should consolidate these to a single async fn Tenant::shutdown.

Initial plan for this is to change the task_mgr::spawn API such that its callers become responsible for keeping track of the spawned task, e.g., by returning a wrapped tokio::task::JoinHandle.

For tenant shutdown, this means the async fn Tenant::shutdown will then join these join handles, after signalling need for cancellation to these tasks in some way.

Follow-up noticed while implementing this:

@koivunej koivunej added the c/storage/pageserver Component: storage: pageserver label May 24, 2023
@koivunej koivunej self-assigned this May 24, 2023
@koivunej
Copy link
Member Author

koivunej commented May 25, 2023

Somewhat related, noticed that the seqwait done in the repartitioning path can however block graceful shutdown.

EDIT: Now on Epic.

koivunej added a commit that referenced this issue Jun 6, 2023
We have 2 ways of tenant shutdown, we should have just one.

Changes are mostly mechanical simple refactorings.

Added `warn!` on the "shutdown all remaining tasks" should trigger test
failures in the between time of not having solved the "tenant/timeline
owns all spawned tasks" issue.

Cc: #4327.
awestover pushed a commit that referenced this issue Jun 14, 2023
We have 2 ways of tenant shutdown, we should have just one.

Changes are mostly mechanical simple refactorings.

Added `warn!` on the "shutdown all remaining tasks" should trigger test
failures in the between time of not having solved the "tenant/timeline
owns all spawned tasks" issue.

Cc: #4327.
@koivunej
Copy link
Member Author

Closed by #4407. I was initially thinking that this should grow to cover all follow-ups but created a separate epic instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c/storage/pageserver Component: storage: pageserver
Projects
None yet
Development

No branches or pull requests

1 participant