Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] guaranteed-engine-manager-cpu and guaranteed-replica-manager-cpu settings are no longer supported #5568

Closed
bk201 opened this issue Apr 12, 2024 · 11 comments
Assignees
Labels
area/installer Harvester installer backport-needed/1.2.2 backport-needed/1.3.1 kind/bug Issues that are defects reported by users or that we know have reached a real release priority/1 Highly recommended to fix in this release require/doc Improvements or additions to documentation
Milestone

Comments

@bk201
Copy link
Member

bk201 commented Apr 12, 2024

Describe the bug

Longhorn's guaranteed-engine-manager-cpu and guaranteed-replica-manager-cpu settings are no longer supported and replaced by guaranteed-instance-manager-cpu.

Note this applies to the upcoming v1.2.2 release (Longhorn v1.5.5) and v1.3.0 (Longhorn v1.6.0). v1.2.1 still uses Longhorn v1.4.3 and doesn't have this issue.

To Reproduce
Steps to reproduce the behavior:

  1. Install the harvester with the two configurations:
  1. These settings don't take effect.

Expected behavior

Support bundle

Environment

  • Harvester ISO version:
  • Underlying Infrastructure (e.g. Baremetal with Dell PowerEdge R630):

Additional context
Add any other context about the problem here.

@harvesterhci-io-github-bot
Copy link
Collaborator

added backport-needed/1.2.2 issue: #5570.

@harvesterhci-io-github-bot
Copy link
Collaborator

added backport-needed/1.2.2 issue: #5569.

@harvesterhci-io-github-bot
Copy link
Collaborator

added backport-needed/1.3.1 issue: #5571.

@harvesterhci-io-github-bot
Copy link
Collaborator

added backport-needed/1.3.1 issue: #5572.

@harvesterhci-io-github-bot
Copy link
Collaborator

harvesterhci-io-github-bot commented Apr 16, 2024

Pre Ready-For-Testing Checklist

  • If labeled: require/HEP Has the Harvester Enhancement Proposal PR submitted?
    The HEP PR is at:

  • Where is the reproduce steps/test steps documented?
    The reproduce steps/test steps are at:

  • Is there a workaround for the issue? If so, where is it documented?
    The workaround is at:

  • Have the backend code been merged (harvester, harvester-installer, etc) (including backport-needed/*)?
    The PR is at: Add installation configure of LH GuaranteedInstanceManagerCPU harvester-installer#706

    • Does the PR include the explanation for the fix or the feature?

    • Does the PR include deployment change (YAML/Chart)? If so, where are the PRs for both YAML file and Chart?
      The PR for the YAML change is at:
      The PR for the chart change is at:

  • If labeled: area/ui Has the UI issue filed or ready to be merged?
    The UI issue/PR is at:

  • If labeled: require/doc, require/knowledge-base Has the necessary document PR submitted or merged?
    The documentation/KB PR is at: Add installation configure of LH GuaranteedInstanceManagerCPU docs#549

  • If NOT labeled: not-require/test-plan Has the e2e test plan been merged? Have QAs agreed on the automation test case? If only test case skeleton w/o implementation, have you created an implementation issue?

    • The automation skeleton PR is at:
    • The automation test case PR is at:
  • If the fix introduces the code for backward compatibility Has a separate issue been filed with the label release/obsolete-compatibility?
    The compatibility issue is filed at:

@harvesterhci-io-github-bot
Copy link
Collaborator

Automation e2e test issue: harvester/tests#1224

@bk201
Copy link
Member Author

bk201 commented Aug 14, 2024

@albinsun It looks like all v1.3/v1.2 backport issues are all verified. Can you also help check this main issue?

@albinsun albinsun self-assigned this Aug 14, 2024
@albinsun
Copy link

albinsun commented Sep 19, 2024

Test Result

Test FAIL as behavior is not identical to v1.3/v1.2.

Environment

  • harvester-v1.4.0-dev-20240918
    • Profile: 2 nodes QEMU/KVM (8C/16G/500G per node)
    • ui-source: Auto

Prerequisite

  1. VM => ⋮ => Migrate
    • Check 1
      SCREENSHOT

Scenarios

w/o guaranteedInstanceManagerCPU

  1. 🟡 No any configs -> cpu should be 960m (8 x 1000m x 0.12 (default value))

    842 / 0.12 / 1000 ~ 7.02
    Behavior not the same as v1.3.1

  2. ⚪ Even with guaranteedEngineManagerCPU & guaranteedReplicaManagerCPU -> cpu should still 960m
    • Check 1
      SCREENSHOT
    • Check 2 ❌

      ⚠️: Fail due to reason
      SCREENSHOT

w/ guaranteedInstanceManagerCPU

  1. guaranteedInstanceManagerCPU: 10 -> cpu should be 800m (8 x 1000m x 0.10)
    • Check 1
      SCREENSHOT
    • Check 2 ❌

      ⚠️: Fail due to reason
      SCREENSHOT

  2. guaranteedInstanceManagerCPU: 13 -> cpu should be 960m (8 x 1000m x 0.12)
    • Check 1
      SCREENSHOT
    • Check 2 ❌

      ⚠️: Fail due to reason
      SCREENSHOT

  3. guaranteedInstanceManagerCPU: 12 -> cpu should be 960m (8 x 1000m x 12)
    • Check 1
      SCREENSHOT
    • Check 2 ❌

      ⚠️: Fail due to reason
      SCREENSHOT

  4. guaranteedInstanceManagerCPU: 0 -> Not to explicit requests CPU limit
    • Check 1
      SCREENSHOT
    • Check 2 ❌

      ⚠️: Fail due to reason
      SCREENSHOT

  5. guaranteedInstanceManagerCPU: q -> Should error due to non-integer
    • Check 1
      SCREENSHOT
    • Check 2 ❌

      ⚠️: Fail due to reason
      SCREENSHOT

Supportbundle

Note

  1. ...

QA Check List

  • Labels: not-require/test-plan, not-require/release-note, require-ui, regression...
  • Update corresponding tests ticket lables

@albinsun
Copy link

albinsun commented Sep 19, 2024

Hi @w13915984028 ,
We found the behavior is not indentical to v1.3/v1.2.
Using the same 8 cpus, I get 842m (7.02 x 1000m x 0.12) on v1.4 instead of 960m (8 x 1000m x 0.12) on v1.3/v1.2.
Is this expected?

cc. @bk201

@w13915984028
Copy link
Member

w13915984028 commented Sep 19, 2024

@albinsun The differences should be related to node usable CPU, in each version (with different rke2&k8s) the node usable CPU may vary as k8s reserves some CPU for itself at first, when LH tries to get a percentage CPU from the node, it is based on the kubeNode.Status.Allocatable.Cpu, not the physical available CPU on the node.

For the validation, we can compare: the LH CPU with [guaranteed-instance-manager-cpu] and without (use the default value) in the same version, with bigger guaranteed-instance-manager-cpu LH get more CPU.

Comparing the absolute CPU value between different releases can be an reference. LH is not requesting an absolute value, but a percentage, there could be differences on the final computed value.

thanks.

LH CPU resources calculation code, there looks no change.

allocatableMilliCPU := float64(kubeNode.Status.Allocatable.Cpu().MilliValue())

https://github.com/longhorn/longhorn-manager/blob/a3768493385042b65da343d569daa9e977ebf928/controller/controller_manager.go#L273

https://github.com/longhorn/longhorn-manager/blob/3dde300284b4a7ca8dfa1c2757ee2c2613c766be/controller/controller_manager.go#L161

@albinsun
Copy link

albinsun commented Sep 19, 2024

Test PASS, close as fixed.

Environment

  • harvester-v1.4.0-dev-20240918
    • Profile: 2 nodes QEMU/KVM (8C/16G/500G per node)
    • ui-source: Auto
    • Allocatable CPU: 7.02

Scenarios

w/o guaranteedInstanceManagerCPU

  1. No any configs -> cpu should be 842m ✔️

    image

7.02 x 1000m x 0.12 (default)

  1. Even with guaranteedEngineManagerCPU & guaranteedReplicaManagerCPU -> cpu should still 842m ✔️

    image

    7.02 x 1000m x 0.12 (default)

w/ guaranteedInstanceManagerCPU

  1. guaranteedInstanceManagerCPU: 10 -> cpu should be 702m ✔️

    image

    7.02 x 1000m x 0.10

  2. guaranteedInstanceManagerCPU: 13 -> cpu should be 842m ✔️
    • PXE Server
      image
    • Harvester
      image

    7.02 x 1000m x 0.12 (default)

  3. guaranteedInstanceManagerCPU: 0 -> Not to explicit requests CPU limit ✔️
    • PXE
      image
    • Harvester
      image
  4. guaranteedInstanceManagerCPU: q -> Should error due to non-integer ✔️

    image

Reference

  1. https://docs.harvesterhci.io/v1.4/install/harvester-configuration#installharvesterlonghorndefault_settingsguaranteedinstancemanagercpu
  2. https://longhorn.io/docs/1.7.1/references/settings/#guaranteed-instance-manager-cpu

QA Check List

  • Labels: not-require/test-plan, not-require/release-note, require-ui, regression...
  • Update corresponding tests ticket lables

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/installer Harvester installer backport-needed/1.2.2 backport-needed/1.3.1 kind/bug Issues that are defects reported by users or that we know have reached a real release priority/1 Highly recommended to fix in this release require/doc Improvements or additions to documentation
Projects
None yet
Development

No branches or pull requests

4 participants