Fix waged instance capacity npe on new resource #2969

Merged on Jan 29, 2025 (3 commits)

GrantPSpencer commented Nov 25, 2024

Issues

Description

  • Here are some details about my PR, including screenshots of any UI changes:

A full description of the NPE can be found in issue #2891. Here is a brief summary:

  1. _instanceCapacityMap is calculated while there are WAGED resources in the cluster.
  2. If you remove all WAGED resources, the previous _instanceCapacityMap remains in place, because the map is only recalculated under certain conditions.
  3. If you then add a new WAGED resource, the _instanceCapacityMap is not recalculated before it is used. This stale _instanceCapacityMap can lead to an NPE.

This PR addresses the issue by ensuring that _instanceCapacityMap is null whenever there are no WAGED resources in the cluster. As a result, the map is recalculated before it is used when a new WAGED resource is added.
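
The failure sequence and the fix can be illustrated with a minimal, self-contained sketch. This is not the Helix code: the class CapacityCacheSketch, its methods, and the clearWhenNoWaged flag are hypothetical names used only for illustration, and "recalculated under certain conditions" is simplified here to "recalculated only when the map is null".

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the controller-side cache; NOT the actual Helix class.
final class CapacityCacheSketch {
  // resource name -> (instance name -> capacity reserved for that resource).
  // null means "must be recomputed before the next use".
  private Map<String, Map<String, Integer>> instanceCapacityMap;

  // true models the behavior described in this PR, false models the old behavior.
  private final boolean clearWhenNoWaged;

  CapacityCacheSketch(boolean clearWhenNoWaged) {
    this.clearWhenNoWaged = clearWhenNoWaged;
  }

  // Called on each refresh with the WAGED resources currently in the cluster.
  void refresh(List<String> wagedResources, List<String> instances) {
    if (wagedResources.isEmpty()) {
      if (clearWhenNoWaged) {
        // The fix: drop the map so a stale copy can never be reused.
        instanceCapacityMap = null;
      }
      return;
    }
    // Simplifying assumption: the map is rebuilt only when it is missing.
    if (instanceCapacityMap == null) {
      Map<String, Map<String, Integer>> fresh = new HashMap<>();
      for (String resource : wagedResources) {
        Map<String, Integer> perInstance = new HashMap<>();
        for (String instance : instances) {
          perInstance.put(instance, 0);
        }
        fresh.put(resource, perInstance);
      }
      instanceCapacityMap = fresh;
    }
  }

  // Returns the per-instance entry for a resource, or null when the cached
  // map has no entry for it. Dereferencing that null is what would throw an NPE.
  Map<String, Integer> lookup(String resource) {
    return instanceCapacityMap == null ? null : instanceCapacityMap.get(resource);
  }

  public static void main(String[] args) {
    for (boolean fix : new boolean[] {false, true}) {
      CapacityCacheSketch cache = new CapacityCacheSketch(fix);
      cache.refresh(List.of("db1"), List.of("host1")); // step 1: map built
      cache.refresh(List.of(), List.of("host1"));      // step 2: all WAGED resources removed
      cache.refresh(List.of("db2"), List.of("host1")); // step 3: new WAGED resource added
      System.out.println("fix=" + fix + " lookup(db2)=" + cache.lookup("db2"));
    }
  }
}
```

Without the flag, the map built in step 1 survives step 2 and has no entry for the resource added in step 3. With the flag, step 2 nulls the map, so step 3 rebuilds it from the current resources.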

Tests

  • The following tests are written for this issue:
    New test class: TestWagedNPE

  • The following is the result of the "mvn test" command on the appropriate module:

$ mvn test -o -Dtest=TestWagedNPE -pl=helix-core

[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.836 s - in org.apache.helix.integration.TestWagedNPE
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  35.815 s
[INFO] Finished at: 2024-11-25T10:09:46-08:00
[INFO] ------------------------------------------------------------------------

(If CI test fails due to known issue, please specify the issue and test PR locally. Then copy & paste the result of "mvn test" to here.)

Changes that Break Backward Compatibility (Optional)

  • My PR contains changes that break backward compatibility or previous assumptions for certain methods or API. They include:
    N/A

Commits

  • My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Code Quality

  • My diff has been formatted using helix-style.xml
    (helix-style-intellij.xml if IntelliJ IDE is used)

GrantPSpencer commented

Pull request approved by: @xyuanlu
Commit message: Fix waged instance capacity npe on new resource by clearing the WAGED capacity map whenever there are no WAGED resources in the cluster. This will prevent a stale map from being used once a new resource is added.

@xyuanlu xyuanlu merged commit e0c551d into apache:master Jan 29, 2025
2 checks passed