fix(tests): use instance.clean/restart instead of clean --reboot #5636

Conversation

blackboxsw
Collaborator

Proposed Commit Message

fix(tests): use instance.clean/restart instead of clean --reboot

Directly calling execute("cloud-init clean --logs --reboot") on
an integration instance also requires awaiting a new boot id upon the
next interaction with the instance to ensure a reboot has actually
taken place on the target machine.

Slow-responding test instances/platforms may not have completed the
shutdown/restart sequence yet when the test interacts with them through
an immediate blocking call to execute("cloud-init status --wait"), which
may exit early if it reaches the prior boot before the reboot occurred.

It is preferable to inspect /proc/sys/kernel/random/boot_id before
issuing a reboot request and block until a delta is seen in boot_id.
This blocking wait on reboot and new boot_id is encapsulated inside
pycloudlib.BaseInstance.restart, which inspects
/proc/sys/kernel/random/boot_id before the restart and blocks until
boot_id changes across the requested restart.

Fix test_status_block_through_all_boot_status to call instance.clean()
and restart() so that our post-boot assertions cannot run before the
instance has actually rebooted.
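
The boot_id wait described above can be sketched roughly as follows. This is an illustration of the polling idea only, not pycloudlib's actual implementation: the function name wait_for_new_boot_id, its parameters, and the read_boot_id callable are all hypothetical.

```python
import time
from typing import Callable


def wait_for_new_boot_id(
    read_boot_id: Callable[[], str],
    previous_boot_id: str,
    timeout: float = 300.0,
    interval: float = 1.0,
) -> str:
    """Block until read_boot_id() returns a value other than previous_boot_id.

    read_boot_id is a hypothetical callable that returns the contents of
    /proc/sys/kernel/random/boot_id on the target machine.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            current = read_boot_id().strip()
        except OSError:
            # The instance may be unreachable while it shuts down; keep polling.
            current = ""
        if current and current != previous_boot_id:
            return current
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"boot_id still {previous_boot_id!r} after {timeout} seconds"
            )
        time.sleep(interval)
```

In a test, read_boot_id would presumably wrap something like str(client.execute("cat /proc/sys/kernel/random/boot_id")), with previous_boot_id captured before the reboot is requested. Polling for a changed value, rather than merely waiting for the instance to answer, is what rules out talking to the pre-reboot boot.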

Additional Context

Jenkins failures:
Oracular: https://jenkins.canonical.com/server-team/view/cloud-init/job/cloud-init-integration-oracular-lxd_vm/53/testReport/junit/tests.integration_tests.cmd/test_status/test_status_block_through_all_boot_status/
Noble: https://jenkins.canonical.com/server-team/view/cloud-init/job/cloud-init-integration-noble-lxd_vm/152/testReport/junit/tests.integration_tests.cmd/test_status/test_status_block_through_all_boot_status/

OSError: Failed reading remote file via cat: /before-local.start-nostatusjson
Return code: 1
Stderr: cat: /before-local.start-nostatusjson: No such file or directory
Stdout:

Test Steps


PYCLOUDLIB_CONFIG=pycloudlib.toml CLOUD_INIT_CLOUD_INIT_SOURCE=ppa:cloud-init-dev/daily CLOUD_INIT_OS_IMAGE=oracular CLOUD_INIT_PLATFORM=lxd_vm tox -e integration-tests -- tests/integration_tests/cmd/test_status.py::test_status_block_through_all_boot_status --pdb

Merge type

  • Squash merge using "Proposed Commit Message"
  • Rebase and merge unique commits. Requires commit messages per-commit each referencing the pull request number (#<PR_NUM>)

Member

@TheRealFalcon TheRealFalcon left a comment


This does kind of highlight some API cruft we have. My original intention with the IntegrationInstance class was that we wouldn't be directly accessing methods on the pycloudlib instances. In that vein, you can just call client.restart() rather than client.instance.restart(). There isn't anything similar for clean() though (mostly because I didn't think we needed one).

It's a bit weird to have one and not the other though...but I think it probably makes more sense to just ditch the wrapper method at this point.

Nothing specific to do here...just some context for the future.
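
For context, the wrapper asymmetry mentioned above looks roughly like the sketch below. The class names FakeInstance and IntegrationInstanceSketch are hypothetical stand-ins, not the real cloud-init or pycloudlib classes, and the clean() wrapper is shown only to illustrate what a symmetric API would look like; as noted, no such wrapper exists today.

```python
class FakeInstance:
    """Hypothetical stand-in for a pycloudlib instance; records calls only."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def clean(self) -> None:
        self.calls.append("clean")

    def restart(self) -> None:
        self.calls.append("restart")


class IntegrationInstanceSketch:
    """Hypothetical wrapper that delegates to the underlying instance."""

    def __init__(self, instance: FakeInstance) -> None:
        self.instance = instance

    def restart(self) -> None:
        # Mirrors the existing wrapper: client.restart() instead of
        # client.instance.restart().
        self.instance.restart()

    def clean(self) -> None:
        # Assumed for illustration only; tests currently have to call
        # client.instance.clean() directly.
        self.instance.clean()


client = IntegrationInstanceSketch(FakeInstance())
client.clean()
client.restart()
```

Either direction removes the inconsistency: add the missing wrapper, or drop restart() and always go through client.instance.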

@TheRealFalcon TheRealFalcon merged commit 6e4343e into canonical:main Aug 26, 2024
22 checks passed