Enable computation of evenness score based on a preferred scoring key [Testing Purposes] #2699

Conversation

csudharsanan
Contributor

@csudharsanan csudharsanan commented Nov 15, 2023

Note: This PR is used for testing purposes only and will not be merged.

Issues

Description

  • Here are some details about my PR, including screenshots of any UI changes:

Currently, WAGED considers many capacity categories/dimensions, such as CPU and disk, when computing evenness scores for placement decisions. In some use cases, only one dimension is relevant because the clusters are provisioned based on that category. If evenness scores are not based on that dimension, this introduces more shuffling/movement.

So, we would like to provide a way for users to specify a prioritized evennessScoringKey so that scores are computed based on that category only.
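
As a rough illustration of the idea described above, the sketch below computes the highest utilization ratio over a chosen set of capacity keys. The class name `EvennessScoreSketch`, the method `highestUtilization`, and the sample numbers are hypothetical and are not Helix APIs; only the notion of scoring on a single preferred key comes from this PR.

```java
import java.util.Collection;
import java.util.Map;
import java.util.Set;

class EvennessScoreSketch {
  // Returns the highest used/max ratio among the capacity keys that take part in scoring.
  // Keys with a non-positive max capacity are skipped to avoid division by zero.
  static double highestUtilization(Map<String, Integer> maxCapacity,
      Map<String, Integer> usedCapacity, Collection<String> scoringKeys) {
    double highest = 0.0;
    for (String key : scoringKeys) {
      int max = maxCapacity.getOrDefault(key, 0);
      if (max <= 0) {
        continue;
      }
      int used = usedCapacity.getOrDefault(key, 0);
      highest = Math.max(highest, (double) used / max);
    }
    return highest;
  }

  public static void main(String[] args) {
    // Hypothetical instance: CPU is nearly full, DISK is mostly free.
    Map<String, Integer> max = Map.of("CPU", 100, "DISK", 1000);
    Map<String, Integer> used = Map.of("CPU", 90, "DISK", 200);

    // Scoring on every dimension: the CPU ratio dominates.
    System.out.println(highestUtilization(max, used, max.keySet()));
    // Scoring on the preferred key only: CPU no longer influences the result.
    System.out.println(highestUtilization(max, used, Set.of("DISK")));
  }
}
```

In this example, a cluster provisioned on disk would see the instance as lightly loaded once CPU is excluded, which is the behavior the description asks for.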

Tests

  • The following tests are written for this issue:
  1. TestClusterContext
     • testEstimateMaxUtilization
  2. TestMaxCapacityUsageInstanceConstraint
     • testGetNormalizedScoreWithPreferredScoringKey
  3. TestTopStateMaxCapacityUsageInstanceConstraint
     • testGetNormalizedScoreWithPreferredScoringKey
[INFO] Reactor Summary for Apache Helix 1.3.2-SNAPSHOT:
[INFO] 
[INFO] Apache Helix ....................................... SUCCESS [  2.365 s]
[INFO] Apache Helix :: Metrics Common ..................... SUCCESS [  3.243 s]
[INFO] Apache Helix :: Metadata Store Directory Common .... SUCCESS [  2.169 s]
[INFO] Apache Helix :: ZooKeeper API ...................... SUCCESS [  6.354 s]
[INFO] Apache Helix :: Helix Common ....................... SUCCESS [  2.425 s]
[INFO] Apache Helix :: Core ............................... SUCCESS [ 41.671 s]
[INFO] Apache Helix :: Admin Webapp ....................... SUCCESS [  5.571 s]
[INFO] Apache Helix :: Restful Interface .................. SUCCESS [ 10.913 s]
[INFO] Apache Helix :: Distributed Lock ................... SUCCESS [  2.080 s]
[INFO] Apache Helix :: HelixAgent ......................... SUCCESS [  3.155 s]
[INFO] Apache Helix :: Front End .......................... SUCCESS [05:15 min]
[INFO] Apache Helix :: Recipes ............................ SUCCESS [  0.051 s]
[INFO] Apache Helix :: Recipes :: Rabbitmq Consumer Group . SUCCESS [  2.146 s]
[INFO] Apache Helix :: Recipes :: Rsync Replicated File Store SUCCESS [  2.399 s]
[INFO] Apache Helix :: Recipes :: distributed lock manager  SUCCESS [  1.894 s]
[INFO] Apache Helix :: Recipes :: distributed task execution SUCCESS [  1.934 s]
[INFO] Apache Helix :: Recipes :: service discovery ....... SUCCESS [  1.964 s]
[INFO] Apache Helix :: View Aggregator .................... SUCCESS [  4.175 s]
[INFO] Apache Helix :: Meta Client ........................ SUCCESS [  3.506 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  06:54 min
[INFO] Finished at: 2023-11-16T11:48:16-08:00
[INFO] ------------------------------------------------------------------------

mvn test -o -Dtest=TestClusterContext -pl=helix-core


[INFO] Results:
[INFO] 
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0
[INFO] 
[INFO] 
[INFO] --- jacoco:0.8.6:report (generate-code-coverage-report) @ helix-core ---
[INFO] Loading execution data file /Users/csudhars/Documents/GitHub/csudharsanan_helix/helix/helix-core/target/jacoco.exec
[INFO] Analyzed bundle 'Apache Helix :: Core' with 947 classes
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------

mvn test -o -Dtest=TestMaxCapacityUsageInstanceConstraint -pl=helix-core

[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.534 s - in org.apache.helix.controller.rebalancer.waged.constraints.TestMaxCapacityUsageInstanceConstraint
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0
[INFO] 
[INFO] 
[INFO] --- jacoco:0.8.6:report (generate-code-coverage-report) @ helix-core ---
[INFO] Loading execution data file /Users/csudhars/Documents/GitHub/csudharsanan_helix/helix/helix-core/target/jacoco.exec
[INFO] Analyzed bundle 'Apache Helix :: Core' with 947 classes
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------

mvn test -o -Dtest=TestTopStateMaxCapacityUsageInstanceConstraint -pl=helix-core

[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.218 s - in org.apache.helix.controller.rebalancer.waged.constraints.TestTopStateMaxCapacityUsageInstanceConstraint
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0
[INFO] 
[INFO] 
[INFO] --- jacoco:0.8.6:report (generate-code-coverage-report) @ helix-core ---
[INFO] Loading execution data file /Users/csudhars/Documents/GitHub/csudharsanan_helix/helix/helix-core/target/jacoco.exec
[INFO] Analyzed bundle 'Apache Helix :: Core' with 947 classes
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  11.500 s
[INFO] Finished at: 2023-11-16T12:44:49-08:00
[INFO] ------------------------------------------------------------------------

Changes that Break Backward Compatibility (Optional)

  • My PR contains changes that break backward compatibility or previous assumptions for certain methods or API. They include:

(Consider including all behavior changes for public methods or API. Also include these changes in merge description so that other developers are aware of these changes. This allows them to make relevant code changes in feature branches accounting for the new method/API behavior.)

Documentation (Optional)

  • In case of new functionality, my PR adds documentation in the following wiki page:

(Link the GitHub wiki you added)

Commits

  • My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Code Quality

  • My diff has been formatted using helix-style.xml
    (helix-style-intellij.xml if IntelliJ IDE is used)

@csudharsanan csudharsanan force-pushed the csudhars/custom-evenness-soft-constraint branch 2 times, most recently from 82c9121 to 46bea93 Compare November 16, 2023 20:52
@csudharsanan csudharsanan changed the title [WIP] Enable computation of evenness score based on a specified (prioritized) capacity key Enable computation of evenness score based on a preferred scoring key Nov 16, 2023
@csudharsanan csudharsanan force-pushed the csudhars/custom-evenness-soft-constraint branch 3 times, most recently from 8d9a2e6 to 7fee710 Compare December 1, 2023 21:48
return computeUtilizationScore(estimatedMaxUtilization, projectedHighestUtilization);
node.getGeneralProjectedHighestUtilization(replica.getCapacity(), clusterContext.getPreferredScoringKey());
double utilizationScore = computeUtilizationScore(estimatedMaxUtilization, projectedHighestUtilization);
LOG.info("[DEPEND-29018] clusterName: {}, estimatedMaxUtilization: {}, projectedHighestUtilization: {}, utilizationScore: {}, preferredScoringKey: {}",
Contributor

Please avoid including a user-specific ticket ID here. We want Helix to be a generic service.

Contributor

+1. Also, let's not log this at info level; debug level would be better. Otherwise, this line is logged per replica, per partition, per resource, which is too much.

Contributor Author

This PR was used to test these changes in ei and get some numbers; it will not be merged. I will shortly put up a PR that cleans up the logs.
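
The log-volume concern raised in this thread can be illustrated with a small sketch. It uses `java.util.logging` from the JDK purely to stay self-contained, with `Level.FINE` standing in for a debug level; the class name `ScoreLoggingSketch`, the `messagesBuilt` counter, and the message text are hypothetical and not part of the PR.

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

class ScoreLoggingSketch {
  static final Logger LOG = Logger.getLogger(ScoreLoggingSketch.class.getName());

  // Counts how often the log message is actually built (for demonstration only).
  static int messagesBuilt = 0;

  static void logScore(String clusterName, double utilizationScore, String preferredScoringKey) {
    Supplier<String> message = () -> {
      messagesBuilt++;
      return String.format("clusterName: %s, utilizationScore: %s, preferredScoringKey: %s",
          clusterName, utilizationScore, preferredScoringKey);
    };
    // The supplier is evaluated only when FINE is enabled for this logger,
    // so the per-replica call is cheap at the default INFO level.
    LOG.log(Level.FINE, message);
  }

  public static void main(String[] args) {
    LOG.setLevel(Level.INFO);
    // Simulate one scoring call per replica.
    for (int replica = 0; replica < 1000; replica++) {
      logScore("TestCluster", 0.5, "DISK");
    }
    System.out.println("messages built at INFO level: " + messagesBuilt);
  }
}
```

With a placeholder-style logger such as the one in the snippet under review, the equivalent change would be to move the call from the info level to the debug level.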

Map<String, Integer> remainingCapacity, String preferredScoringKey) {
Set<String> capacityKeySet = _maxAllowedCapacity.keySet();
if (preferredScoringKey != null && capacityKeySet.contains(preferredScoringKey)) {
capacityKeySet = ImmutableSet.of(preferredScoringKey);
Contributor

I don't get this. Does this mean the preferred scoring key will override all others? Then why not just set up one dimension instead of keeping other dimensions for computation? Wouldn't that be easier?

Contributor Author

@csudharsanan csudharsanan Jan 17, 2024

Let me know if I have misunderstood the question. I believe by dimensions you mean the capacity keys.

If present, preferredScoringKey overrides the others, but it has to be set explicitly in the cluster config. If it is not set, we will need all the dimensions for the computation.
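
The override-with-fallback behavior described in this reply can be sketched as follows. This is a minimal illustration modeled on the snippet under review, not the PR's actual code: the class `ScoringKeySelectionSketch` and the method `resolveScoringKeys` are hypothetical names, and the JDK's `Set.of` is used in place of Guava's `ImmutableSet.of` to keep the example dependency-free.

```java
import java.util.Set;

class ScoringKeySelectionSketch {
  // Picks the capacity keys used for scoring.
  // The preferred key wins only if it is set and is a known capacity key;
  // otherwise every configured capacity key is used.
  static Set<String> resolveScoringKeys(Set<String> capacityKeys, String preferredScoringKey) {
    if (preferredScoringKey != null && capacityKeys.contains(preferredScoringKey)) {
      return Set.of(preferredScoringKey);
    }
    return capacityKeys;
  }

  public static void main(String[] args) {
    Set<String> capacityKeys = Set.of("CPU", "DISK");
    // Preferred key set and known: only DISK is scored.
    System.out.println(resolveScoringKeys(capacityKeys, "DISK"));
    // Preferred key not set: all dimensions are scored.
    System.out.println(resolveScoringKeys(capacityKeys, null));
    // Preferred key not among the capacity keys: falls back to all dimensions.
    System.out.println(resolveScoringKeys(capacityKeys, "MEMORY"));
  }
}
```

Because the fallback returns the full key set, clusters that never configure a preferred key keep the existing multi-dimension scoring.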

@csudharsanan csudharsanan force-pushed the csudhars/custom-evenness-soft-constraint branch from 7fee710 to 97479b1 Compare December 14, 2023 08:37
@csudharsanan csudharsanan changed the title Enable computation of evenness score based on a preferred scoring key Enable computation of evenness score based on a preferred scoring key [Testing Purposes] Jan 16, 2024
@junkaixue
Contributor

I would suggest scheduling a follow-up session so that everybody is on the same page. @csudharsanan

@junkaixue
Contributor

Closing this PR. Feel free to open a new one, as the current version is outdated.

@junkaixue junkaixue closed this Mar 6, 2024