Kafka add gauge v1 #33408

Merged
merged 4 commits into from
Jan 24, 2025

Naireen
Contributor

@Naireen Naireen commented Dec 17, 2024

Add per-worker gauge support so that backlog can be reported per partition for Kafka with the Java legacy worker for Dataflow.

codecov bot commented Dec 18, 2024

Codecov Report

Attention: Patch coverage is 50.00000% with 43 lines in your changes missing coverage. Please review.

Project coverage is 60.42%. Comparing base (edc4766) to head (949f3f5).
Report is 45 commits behind head on master.

Files with missing lines Patch % Lines
...a/org/apache/beam/sdk/metrics/DelegatingGauge.java 0.00% 19 Missing ⚠️
...dataflow/worker/StreamingStepMetricsContainer.java 23.07% 10 Missing ⚠️
...ker/MetricsToPerStepNamespaceMetricsConverter.java 86.95% 3 Missing and 3 partials ⚠️
...in/java/org/apache/beam/sdk/metrics/NoOpGauge.java 0.00% 5 Missing ⚠️
...ners/dataflow/worker/DataflowMetricsContainer.java 0.00% 2 Missing ⚠️
.../org/apache/beam/sdk/metrics/MetricsContainer.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##             master   #33408       +/-   ##
=============================================
+ Coverage     57.47%   60.42%    +2.94%     
- Complexity     1474    15172    +13698     
=============================================
  Files           985     2760     +1775     
  Lines        155802   267597   +111795     
  Branches       1076    12161    +11085     
=============================================
+ Hits          89550   161700    +72150     
- Misses        64035    99435    +35400     
- Partials       2217     6462     +4245     
Flag Coverage Δ
java 64.81% <50.00%> (-3.78%) ⬇️


@Naireen Naireen force-pushed the kafka_add_counters_V1 branch 4 times, most recently from 2791b23 to b91a77f Compare December 18, 2024 01:21
@Naireen Naireen marked this pull request as ready for review December 18, 2024 04:18
@Naireen
Contributor Author

Naireen commented Dec 18, 2024

R: @sjvanrossum for the kafka io part, thanks in advance!

Contributor

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment "assign set of reviewers".

@Naireen Naireen force-pushed the kafka_add_counters_V1 branch from b91a77f to 948dfe6 Compare December 18, 2024 07:34
@Naireen
Contributor Author

Naireen commented Dec 18, 2024

Run Java PreCommit

@Naireen
Contributor Author

Naireen commented Dec 18, 2024

R: @johnjcasey for the sdk portion of it.

@Naireen
Contributor Author

Naireen commented Dec 19, 2024

Run Java_Pulsar_IO_Direct PreCommit

@Naireen Naireen force-pushed the kafka_add_counters_V1 branch from 11bee27 to b9d2f2b Compare December 19, 2024 19:54
@Naireen
Contributor Author

Naireen commented Dec 19, 2024

Run Java PreCommit

@Naireen
Contributor Author

Naireen commented Dec 19, 2024

Run Java_GCP_IO_Direct PreCommit

@Naireen
Contributor Author

Naireen commented Dec 19, 2024

Run Java PreCommit

@Naireen
Contributor Author

Naireen commented Dec 19, 2024

Run Java PreCommit

@Naireen
Contributor Author

Naireen commented Dec 19, 2024

Run Java_GCP_IO_Direct PreCommit

@Naireen
Contributor Author

Naireen commented Dec 19, 2024

Run Java PreCommit

@Naireen Naireen mentioned this pull request Jan 6, 2025
3 tasks
Comment on lines 110 to 119
/**
* @param topicName topicName
* @param partitionId partitionId for the topic Only included in the metric key if
* 'supportsMetricsDeletion' is enabled.
* @param backlog backlog for the topic Only included in the metric key if
* 'supportsMetricsDeletion' is enabled.
*/

Contributor

" Only" -> ". Only"?

Contributor Author

Removed, thanks for catching that.

@@ -71,11 +79,17 @@ abstract class KafkaMetricsImpl implements KafkaMetrics {

abstract HashMap<String, ConcurrentLinkedQueue<Duration>> perTopicRpcLatencies();

static ConcurrentHashMap<String, Gauge> backlogGauges = new ConcurrentHashMap<String, Gauge>();

abstract HashMap<String, Long> perTopicPartitionBacklogs();

Contributor

If an instance of this class may be concurrently updated, then HashMap needs to be replaced (ditto for the existing HashMap fields). Use ConcurrentHashMap instead.
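
To illustrate the suggestion above, here is a minimal sketch of a backlog map that tolerates concurrent writers. The class name `BacklogMapSketch`, the key format, and the helper `writeConcurrently` are hypothetical and are not taken from this PR; only the use of `ConcurrentHashMap` in place of `HashMap` reflects the review comment.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a per-topic-partition backlog map; not the actual KafkaMetricsImpl code.
class BacklogMapSketch {
  // ConcurrentHashMap supports concurrent put/get without external locking,
  // whereas a plain HashMap may be corrupted by unsynchronized writers.
  private final ConcurrentHashMap<String, Long> perTopicPartitionBacklogs =
      new ConcurrentHashMap<>();

  // Assumed key format, chosen only for this example.
  static String key(String topic, int partition) {
    return topic + "-" + partition;
  }

  // Records the latest backlog for one partition (last write wins).
  void updateBacklog(String topic, int partition, long backlog) {
    perTopicPartitionBacklogs.put(key(topic, partition), backlog);
  }

  // Returns the recorded backlog, or null if none has been recorded.
  Long backlogFor(String topic, int partition) {
    return perTopicPartitionBacklogs.get(key(topic, partition));
  }

  int size() {
    return perTopicPartitionBacklogs.size();
  }

  // Updates the map from several threads at once and waits for them to finish.
  static BacklogMapSketch writeConcurrently(int threads, int partitionsPerThread) {
    BacklogMapSketch metrics = new BacklogMapSketch();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int t = 0; t < threads; t++) {
      final int thread = t;
      pool.submit(() -> {
        for (int p = 0; p < partitionsPerThread; p++) {
          metrics.updateBacklog("topic" + thread, p, (long) p);
        }
      });
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("writers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return metrics;
  }
}
```

With a plain `HashMap`, the same concurrent writes could lose entries or corrupt the table during a resize, which is the risk the comment points at.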

Contributor

Slightly unrelated, but why doesn't perTopicRpcLatencies use a gauge or sum as the value type?

Contributor Author

What would the sum represent? The sum of latencies? Each individual latency is important, and a sum would lose that information.
A gauge isn't quite clear either: if two concurrent RPCs have completed, which value do you return?

A histogram of values provides more information (and allows us to see the spread of values).
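
The argument above can be made concrete with a small sketch. It compares what a sum, a last-value gauge, and a bucketed histogram retain for two sets of latencies. The class `LatencySummarySketch`, its method names, and the bucket boundaries are hypothetical and only illustrate the reasoning; they are not the metric types used in this PR.

```java
// Hypothetical helper for comparing latency summaries; not Beam's histogram implementation.
class LatencySummarySketch {
  // A sum collapses all samples into one number.
  static long sum(long[] latenciesMs) {
    long total = 0;
    for (long latency : latenciesMs) {
      total += latency;
    }
    return total;
  }

  // A last-write-wins gauge keeps only whichever sample was reported last.
  static long lastValue(long[] latenciesMs) {
    return latenciesMs[latenciesMs.length - 1];
  }

  // Counts samples per bucket. upperBoundsMs holds inclusive upper bounds in
  // ascending order; the final extra bucket collects everything above the last bound.
  static int[] histogram(long[] latenciesMs, long[] upperBoundsMs) {
    int[] counts = new int[upperBoundsMs.length + 1];
    for (long latency : latenciesMs) {
      int bucket = 0;
      while (bucket < upperBoundsMs.length && latency > upperBoundsMs[bucket]) {
        bucket++;
      }
      counts[bucket]++;
    }
    return counts;
  }
}
```

For example, the samples `{50, 50, 50, 50}` and `{5, 5, 5, 185}` have the same sum, yet with assumed bounds of 10 ms and 100 ms the histogram places them in different buckets, so the slow outlier in the second set stays visible.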

@@ -743,6 +747,16 @@ private void reportBacklog() {
backlogElementsOfSplit.set(splitBacklogMessages);
}

private void reportBacklogMetrics() {

Contributor

Looks like this can be merged with reportBacklog (potentially rename that method to updateBacklogMetrics).

Contributor Author

No, I explicitly moved it out to be separate, since reportBacklog() is called twice, and we only need to do this once (when we advance to the next record).
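
A rough sketch of the separation described above, under the assumption that `reportBacklog()` has two call sites and the per-partition metrics only need updating on the path where a record is returned. The class `BacklogReportingSketch`, the `recordAvailable` parameter, and the counters are hypothetical; the actual reader code and its call sites may look different.

```java
// Hypothetical reader sketch; the real call sites in the Kafka reader may differ.
class BacklogReportingSketch {
  int backlogReports = 0;       // how many times reportBacklog() ran
  int backlogMetricUpdates = 0; // how many times reportBacklogMetrics() ran

  // Assumed shape of advance(): one path returns a record, the other finds none.
  boolean advance(boolean recordAvailable) {
    if (recordAvailable) {
      reportBacklog();        // first call site
      reportBacklogMetrics(); // only when we advance to the next record
      return true;
    }
    reportBacklog();          // second call site, no metrics update needed
    return false;
  }

  private void reportBacklog() {
    backlogReports++;
  }

  private void reportBacklogMetrics() {
    backlogMetricUpdates++;
  }
}
```

If the two methods were merged, the metrics update would run at both call sites, which is the extra work the author wanted to avoid.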

Contributor

SGTM

@Naireen
Contributor Author

Naireen commented Jan 17, 2025

Run Java_IOs_Direct PreCommit

@Naireen Naireen force-pushed the kafka_add_counters_V1 branch 2 times, most recently from 5baabd8 to 11137f9 Compare January 21, 2025 18:32
@Naireen
Contributor Author

Naireen commented Jan 21, 2025

Run Java PreCommit

@Naireen Naireen force-pushed the kafka_add_counters_V1 branch from 11137f9 to 099ab06 Compare January 22, 2025 07:50
@Naireen
Contributor Author

Naireen commented Jan 22, 2025

Run Java_Hadoop_IO_Direct PreCommit

@Naireen
Contributor Author

Naireen commented Jan 22, 2025

Run Java PreCommit

@Naireen Naireen force-pushed the kafka_add_counters_V1 branch 2 times, most recently from 7e1ad4c to 70a5f63 Compare January 22, 2025 19:56
@Naireen
Contributor Author

Naireen commented Jan 23, 2025

Run Java PreCommit

@Naireen Naireen force-pushed the kafka_add_counters_V1 branch from 70a5f63 to 949f3f5 Compare January 23, 2025 17:09
@Naireen
Contributor Author

Naireen commented Jan 23, 2025

Run Java_IOs_Direct PreCommit

@Naireen
Contributor Author

Naireen commented Jan 23, 2025

Run Java PreCommit

@Naireen Naireen force-pushed the kafka_add_counters_V1 branch from 949f3f5 to fc42673 Compare January 24, 2025 01:27
@johnjcasey johnjcasey merged commit 2064103 into apache:master Jan 24, 2025
28 checks passed
tomstepp pushed a commit to tomstepp/apache-beam that referenced this pull request Feb 3, 2025
* add counter stuff

* Address John's comments about separating conversion and validation checks

* address Steven's comments

* another round of comments

---------

Co-authored-by: Naireen <[email protected]>
3 participants