Zarf injector should filter on "running" pods when initializing #2356

dmiller-boeing · 2024-03-04T22:32:47Z

Is your feature request related to a problem? Please describe.

Recently, I had a k3s cluster get corrupted by the master of master nodes being deleted. Even the zarf-docker-registry was corrupted, and I lost it and the images it contained. I tried a zarf init, and the injector kept trying to clone pods that were in ImagePullBackOff or similar states. The timeout before moving on to another pod to clone was pretty long, so it took an enormous amount of time to get the injector pod started correctly, plus some manual finagling with taints to land it on the node that had the fewest pods in error states.

Describe the solution you'd like

Given an existing cluster with many pods running, but also many in error states (like ImagePullBackOff)
When running zarf init
Then the injector considers only healthy pods in the "Running" state as candidates to clone
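To make the requested behavior concrete, here is a minimal sketch of the filtering step in Go. It is not Zarf's actual implementation: `Pod`, `PodPhase`, and `runningPods` are hypothetical stand-ins for the real Kubernetes types, and the pod names and images are made-up sample data. With a real client, the same narrowing could probably be pushed to the API server with the `status.phase=Running` field selector instead of filtering client-side.

```go
package main

import "fmt"

// PodPhase mirrors the phase strings Kubernetes reports in a pod's status.
type PodPhase string

const (
	PodPending PodPhase = "Pending"
	PodRunning PodPhase = "Running"
	PodFailed  PodPhase = "Failed"
)

// Pod is a hypothetical, trimmed-down stand-in for a Kubernetes pod object.
type Pod struct {
	Name  string
	Phase PodPhase
	Image string
}

// runningPods returns only the pods whose phase is Running, preserving order.
func runningPods(pods []Pod) []Pod {
	var out []Pod
	for _, p := range pods {
		if p.Phase == PodRunning {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	// Sample data only; none of these names come from a real cluster.
	pods := []Pod{
		{Name: "healthy-a", Phase: PodRunning, Image: "example/a:1"},
		{Name: "broken-b", Phase: PodPending, Image: "example/b:1"},
		{Name: "healthy-c", Phase: PodRunning, Image: "example/c:1"},
	}
	for _, p := range runningPods(pods) {
		fmt.Println(p.Name, p.Image)
	}
}
```

Only the pods that pass this filter would then be tried as image sources for the injector, so the long per-pod timeout is never spent on a pod that cannot start.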

Describe alternatives you've considered

An alternative might be to specify which pod and/or node to clone from via an environment variable or --set.

Additional context

The timeout for an injector pod to reach the "Running" state could also be lowered or made overridable by an env var or --set.
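As a rough illustration of what an overridable timeout could look like, the Go sketch below reads a duration from an environment variable and falls back to a default. Everything named here is an assumption made for the example: `EXAMPLE_INJECTOR_TIMEOUT` is not an existing Zarf setting, and the 60-second default is a placeholder rather than Zarf's real value.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// timeoutEnvVar is a hypothetical variable name, not an existing Zarf setting.
const timeoutEnvVar = "EXAMPLE_INJECTOR_TIMEOUT"

// defaultInjectorTimeout is a placeholder default chosen for this sketch.
const defaultInjectorTimeout = 60 * time.Second

// injectorTimeout returns the override found through lookup, or the default
// when the variable is unset, unparsable, or not a positive duration.
func injectorTimeout(lookup func(string) (string, bool)) time.Duration {
	raw, ok := lookup(timeoutEnvVar)
	if !ok {
		return defaultInjectorTimeout
	}
	d, err := time.ParseDuration(raw)
	if err != nil || d <= 0 {
		return defaultInjectorTimeout
	}
	return d
}

func main() {
	// os.LookupEnv has the (string) (string, bool) shape that lookup expects.
	fmt.Println("injector timeout:", injectorTimeout(os.LookupEnv))
}
```

Passing the lookup function in, rather than calling `os.LookupEnv` directly, keeps the parsing logic easy to test without touching the process environment.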


…2415)

## Description

filter on running pods when finding an image for injector pod

https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-phase

Description of the `Running` pod phase:

> The Pod has been bound to a node, and all of the containers have been created. At least one container is still running, or is in the process of starting or restarting.

## Related Issue

Fixes #2356
Fixes #2410

## Type of change

- [x] Bug fix (non-breaking change which fixes an issue)

## Checklist before merging

- [x] Test, docs, adr added or updated as needed
- [x] [Contributor Guide Steps](https://github.com/defenseunicorns/zarf/blob/main/CONTRIBUTING.md#developer-workflow) followed

---------

Co-authored-by: Austin Abro <[email protected]>
Co-authored-by: razzle <[email protected]>

dmiller-boeing added the enhancement ✨ New feature or request label Mar 4, 2024

github-project-automation bot added this to Zarf Project Board Mar 4, 2024

github-project-automation bot moved this to New in Zarf Project Board Mar 4, 2024

eddiezane added this to Zarf (old) Mar 4, 2024

lucasrod16 mentioned this issue Mar 6, 2024

test: data injection flake #2361

Merged

5 tasks

lucasrod16 mentioned this issue Apr 4, 2024

fix: filter on running pods when finding an image for injector pod #2415

Merged

3 tasks

Noxsios moved this to In review in Zarf (old) Apr 9, 2024

Noxsios closed this as completed in #2415 Apr 15, 2024

github-project-automation bot moved this from New to Closed in Zarf Project Board Apr 15, 2024

github-project-automation bot moved this from In review to Done in Zarf (old) Apr 15, 2024

salaxander removed this from Zarf (old) Jun 25, 2024
