Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[improve][loadbalance] added loadBalancerReportUpdateMinIntervalMillis and ignores memory usage in getMaxResourceUsage() #17598

Merged

Conversation

heesung-sn
Copy link
Contributor

@heesung-sn heesung-sn commented Sep 12, 2022

Fixes #

Master Issue: #

Motivation

Currently, the load report frequency is hard-coded at 5secs, and the JVM memory usage is re-computed in the ZK report thread every 5 sec. In the high traffic environment, when GC is frequent, the memory usage could fluctuate in 5 secs, and this could result in writing load reports to ZK too frequently(every 5 sec).

Modifications

Verifying this change

  • Make sure that the change passes the CI checks.

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • Anything that affects deployment

Documentation

  • doc-required
    (Your PR needs to update docs and you will update later). It needs to update the broker config doc. https://pulsar.apache.org/docs/reference-configuration/#broker

  • [] doc-not-needed
    (Please explain why) loadBalancerReportUpdateMinIntervalMilliSeconds

  • doc
    (Your PR contains doc changes)

  • doc-complete
    (Docs have been already added)

@github-actions github-actions bot added the doc-required Your PR changes impact docs and you will update later. label Sep 12, 2022
}

public double getMaxResourceUsage(boolean checkMemoryUsage) {
if (checkMemoryUsage) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to remove this config option and instead only check the direct memory but not the heap memory usage here.

Copy link
Contributor Author

@heesung-sn heesung-sn Sep 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

I initially thought this checkMemoryUsage config is safer since it does keep the old behavior(checking memory).

I see that memory usage is always noisy, so I agree that we could remove it from getMaxResourceUsage.

Note that we keep getMaxResourceUsageWithWeightWithinLimit() as-is -- we are not removing the memory usage signal in this func, as its memory usage can be already disabled by loadBalancerMemoryResourceWeight=0.0.

@heesung-sn heesung-sn force-pushed the update-lb-update-frequency-config branch from e0d963d to 94ad0f9 Compare September 13, 2022 01:07
@heesung-sn heesung-sn changed the title [improve][loadbalance] added loadBalancerReportUpdateMemoryUsageChecked and loadBalancerReportUpdateMinIntervalMilliSeconds configs [improve][loadbalance] added loadBalancerReportUpdateMemoryUsageChecked and ignores memory usage in getMaxResourceUsage() Sep 13, 2022
@merlimat merlimat added this to the 2.12.0 milestone Sep 13, 2022
@merlimat merlimat added release/2.9.4 release/2.11.1 release/2.10.3 type/bug The PR fixed a bug or issue reported a bug labels Sep 13, 2022
@heesung-sn
Copy link
Contributor Author

/pulsarbot run-failure-checks

Copy link
Contributor

@codelipenghui codelipenghui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And could you please also add a test for the new configuration?

conf/broker.conf Outdated
@@ -1163,6 +1163,9 @@ loadBalancerEnabled=true
# Percentage of change to trigger load report update
loadBalancerReportUpdateThresholdPercentage=10

# minimum interval to update load report
loadBalancerReportUpdateMinIntervalMilliSeconds=5000
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
loadBalancerReportUpdateMinIntervalMilliSeconds=5000
loadBalancerReportUpdateMinIntervalMillis=5000

Just keep consistent with other configuration names in the broker.conf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated.

Comment on lines +2092 to +2095
@FieldContext(
category = CATEGORY_LOAD_BALANCER,
dynamic = true,
doc = "Min delay of load report to collect, in milli-seconds"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should move to line 2090?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated.

@@ -235,7 +235,8 @@ private void updateBundleData(final Map<String, NamespaceBundleStats> bundleStat
}

public double getMaxResourceUsage() {
return max(cpu.percentUsage(), memory.percentUsage(), directMemory.percentUsage(), bandwidthIn.percentUsage(),
// does not consider memory because it is noisy by gc.
return max(cpu.percentUsage(), directMemory.percentUsage(), bandwidthIn.percentUsage(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please help add a unit test?
Looks like we changed the behavior, but all the tests got passed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated.

… config and ignores memory usage in getMaxResourceUsage$()
@heesung-sn heesung-sn force-pushed the update-lb-update-frequency-config branch from 94ad0f9 to 5844002 Compare September 13, 2022 23:02
@heesung-sn heesung-sn changed the title [improve][loadbalance] added loadBalancerReportUpdateMemoryUsageChecked and ignores memory usage in getMaxResourceUsage() [improve][loadbalance] added loadBalancerReportUpdateMinIntervalMillis and ignores memory usage in getMaxResourceUsage() Sep 14, 2022
@heesung-sn
Copy link
Contributor Author

/pulsarbot run-failure-checks

1 similar comment
@heesung-sn
Copy link
Contributor Author

/pulsarbot run-failure-checks

@codelipenghui codelipenghui merged commit de7c586 into apache:master Sep 14, 2022
@codelipenghui codelipenghui added type/enhancement The enhancements for the existing features or docs. e.g. reduce memory usage of the delayed messages area/broker labels Sep 14, 2022
codelipenghui pushed a commit that referenced this pull request Sep 21, 2022
…s and ignores memory usage in getMaxResourceUsage() (#17598)

(cherry picked from commit de7c586)
codelipenghui pushed a commit that referenced this pull request Sep 21, 2022
…s and ignores memory usage in getMaxResourceUsage() (#17598)

(cherry picked from commit de7c586)
@codelipenghui codelipenghui removed this from the 2.12.0 milestone Sep 21, 2022
@codelipenghui codelipenghui added this to the 2.11.0 milestone Sep 21, 2022
codelipenghui pushed a commit that referenced this pull request Sep 21, 2022
…s and ignores memory usage in getMaxResourceUsage() (#17598)

(cherry picked from commit de7c586)
@momo-jun
Copy link
Contributor

@Anonymitaet the configuration doc update can be automated in the next version, which means we don't need to update https://pulsar.apache.org/docs/reference-configuration/#broker manually, correct?

@Anonymitaet
Copy link
Member

@Anonymitaet the configuration doc update can be automated in the next version, which means we don't need to update https://pulsar.apache.org/docs/reference-configuration/#broker manually, correct?

yes

@momo-jun momo-jun added doc-complete Your PR changes impact docs and the related docs have been already added. and removed doc-required Your PR changes impact docs and you will update later. labels Sep 23, 2022
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Sep 28, 2022
…s and ignores memory usage in getMaxResourceUsage() (apache#17598)

(cherry picked from commit de7c586)
(cherry picked from commit a6cb786)
@heesung-sn heesung-sn deleted the update-lb-update-frequency-config branch April 2, 2024 17:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/broker cherry-picked/branch-2.9 Archived: 2.9 is end of life cherry-picked/branch-2.10 cherry-picked/branch-2.11 doc-complete Your PR changes impact docs and the related docs have been already added. release/2.9.4 release/2.10.2 type/bug The PR fixed a bug or issue reported a bug type/enhancement The enhancements for the existing features or docs. e.g. reduce memory usage of the delayed messages
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants