Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[improve][client] Refactor SchemaHash to reduce call of hashFunction in SchemaHash #17948

Merged
merged 17 commits into from
Oct 12, 2022

Conversation

AnonHxy
Copy link
Contributor

@AnonHxy AnonHxy commented Oct 6, 2022

Fixes #17931

Motivation

The PR #17049 bring a significant performance regression for publish throughput. The root cause is that HashFunction#hashBytes(byte[]) takes a lot of cpu time. So we need add a cache for SchemaHash.

For details see line97:

public static <T> MessageImpl<T> create(MessageMetadata msgMetadata, ByteBuffer payload, Schema<T> schema,
String topic) {
@SuppressWarnings("unchecked")
MessageImpl<T> msg = (MessageImpl<T>) RECYCLER.get();
msg.msgMetadata.clear();
msg.msgMetadata.copyFrom(msgMetadata);
msg.messageId = null;
msg.topic = topic;
msg.cnx = null;
msg.payload = Unpooled.wrappedBuffer(payload);
msg.properties = null;
msg.schema = schema;
msg.schemaHash = SchemaHash.of(schema);
msg.uncompressedSize = payload.remaining();
return msg;

And we can see that hashFunction takes a lot of cpu time from the flame graph below
image
image

The test result for publish throughput(msgs/s) is below:

master branch master branch revert PR#17049 master branch apply this patch
1 837,119.26 1,504,240.76 1,570,375.90
2 870,065.55 1,504,240.76 1,594,203.19
3 868,281.75 1,504,240.76 1,589,468.89
avg 858,488.85 1,528,647.63 1,584,682.66

Modifications

  • Adding a cache for SchemaHash.
  • To avoid the new byte[0] allocation and instead passing a null for SchemaHash#of

Verifying this change

  • Make sure that the change passes the CI checks.

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: AnonHxy#8

@merlimat
Copy link
Contributor

merlimat commented Oct 7, 2022

One more change could be:

 public SchemaHash getSchemaHash() {
        return schemaHash == null ? SchemaHash.of(new byte[0], null) : schemaHash;
    }

to avoid the new byte[0] allocation and instead passing a null;

@AnonHxy
Copy link
Contributor Author

AnonHxy commented Oct 7, 2022

to avoid the new byte[0] allocation and instead passing a null;

A great suggestion

@AnonHxy AnonHxy added cherry-picked/branch-2.9 Archived: 2.9 is end of life cherry-picked/branch-2.10 type/enhancement The enhancements for the existing features or docs. e.g. reduce memory usage of the delayed messages labels Oct 7, 2022
@AnonHxy AnonHxy added this to the 2.12.0 milestone Oct 7, 2022
@AnonHxy
Copy link
Contributor Author

AnonHxy commented Oct 8, 2022

@codelipenghui
Copy link
Contributor

@merlimat Please help review this PR again.

@codelipenghui codelipenghui modified the milestones: 2.12.0, 2.11.0 Oct 10, 2022
@codelipenghui codelipenghui added the release/blocker Indicate the PR or issue that should block the release until it gets resolved label Oct 10, 2022
Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approve this patch but we cannot chery pick to released branches.
We are changing the internal API of SchemaInfoImpl.
Unfortunately Pulsar users are very creative and they could have used it.

If we find a way to not break the internal builder API and make it compatible we can cherry pick

@codelipenghui
Copy link
Contributor

codelipenghui commented Oct 11, 2022

@eolivelli I think this PR will not break the SchemaInfoImpl API. It just added SchemaHash to the SchemaInfoImpl . Do you see any point that it will break the builder API?

The old API like the followings are still available for users.

SCHEMA_INFO = new SchemaInfoImpl()
        .setName("Boolean")
        .setType(SchemaType.BOOLEAN)
        .setSchema(new byte[0]);

@AnonHxy
Copy link
Contributor Author

AnonHxy commented Oct 11, 2022

If we find a way to not break the internal builder API and make it compatible we can cherry pick

If users create SchemaInfo by no-args-constructor, the SchemaHash in SchemaInfoImpl will not be initialized, and SchemaHash#of(SchemaInfo) will return null.

If the compatible issues you mentioned is that, I think we can upated SchemaInfoImpl#getSchemaHash like below: WDYT @codelipenghui @eolivelli

 public SchemaHash getSchemaHash() {
        if (schemaHash == null) {
            schemaHash = SchemaHash.of(this.schema, this.type);
        }
        return schemaHash;
    }

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

great work

@AnonHxy
Copy link
Contributor Author

AnonHxy commented Oct 11, 2022

/pulsarbot run-failure-checks

Copy link
Contributor

@nicoloboschi nicoloboschi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nicoloboschi
Copy link
Contributor

@AnonHxy please update the pr title to make the semantic check to pass. [improve][client] should be accepted
I've restarted the failed jobs

@AnonHxy AnonHxy changed the title [improve][java-client][issue-17931]Refactor SchemaHash to reduce call of hashFunction in SchemaHash [improve][client][issue-17931]Refactor SchemaHash to reduce call of hashFunction in SchemaHash Oct 11, 2022
@Jason918 Jason918 changed the title [improve][client][issue-17931]Refactor SchemaHash to reduce call of hashFunction in SchemaHash [improve][client] Refactor SchemaHash to reduce call of hashFunction in SchemaHash Oct 12, 2022
@Jason918
Copy link
Contributor

@AnonHxy please update the pr title to make the semantic check to pass. [improve][client] should be accepted
I've restarted the failed jobs

The title check seems don't accept the "[issue-xxx]" part.

@Technoboy- Technoboy- merged commit 2b5b92c into apache:master Oct 12, 2022
Technoboy- pushed a commit that referenced this pull request Oct 12, 2022
Jason918 pushed a commit that referenced this pull request Oct 12, 2022
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Oct 12, 2022
…in SchemaHash (apache#17948)

(cherry picked from commit 2b5b92c)
(cherry picked from commit 11b5df7)
@Technoboy- Technoboy- removed the release/blocker Indicate the PR or issue that should block the release until it gets resolved label Oct 17, 2022
congbobo184 pushed a commit that referenced this pull request Nov 14, 2022
@congbobo184 congbobo184 added the cherry-picked/branch-2.9 Archived: 2.9 is end of life label Nov 14, 2022
congbobo184 pushed a commit that referenced this pull request Dec 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-picked/branch-2.9 Archived: 2.9 is end of life cherry-picked/branch-2.10 cherry-picked/branch-2.11 doc-not-needed Your PR changes do not impact docs ready-to-test release/2.9.4 release/2.10.2 type/enhancement The enhancements for the existing features or docs. e.g. reduce memory usage of the delayed messages
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] A significant performance regression about producer publish throughput on master branch
8 participants