GH-2994: Optimize string to binary conversion in AvroWriteSupport #2995

Merged on Aug 28, 2024 (2 commits)

@sschepens sschepens commented Aug 19, 2024

Rationale for this change

Binary.fromCharSequence is an order of magnitude slower than Binary.fromString when the input is a String (roughly 12x in the benchmark below):

Benchmarks.fromCharSequence  thrpt   25   5885347.328 ±  186669.738  ops/s
Benchmarks.fromString        thrpt   25  71335979.492 ± 8800704.044  ops/s

Here is the code for the benchmarks:

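// Imports are assumed (commons-lang3, parquet-column, JMH); the original snippet omitted them.
import org.apache.commons.lang3.RandomStringUtils;
import org.apache.parquet.io.api.Binary;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.infra.Blackhole;

public class Benchmarks {
    // A single 100-character alphanumeric input shared by both benchmarks.
    private static final String string = RandomStringUtils.randomAlphanumeric(100);

    // Generic CharSequence path.
    @Benchmark
    @BenchmarkMode(Mode.Throughput)
    public void fromCharSequence(Blackhole blackhole) {
        blackhole.consume(Binary.fromCharSequence(string));
    }

    // String-specific path.
    @Benchmark
    @BenchmarkMode(Mode.Throughput)
    public void fromString(Blackhole blackhole) {
        blackhole.consume(Binary.fromString(string));
    }
}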

What changes are included in this PR?

Change AvroWriteSupport.fromAvroString() to use Binary.fromString when operating with string inputs.
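
The dispatch described above can be sketched with plain JDK calls. This is an illustration, not the parquet-java source: the class `Utf8Dispatch` and its methods are hypothetical stand-ins. It also assumes that the String path amounts to `String.getBytes(UTF_8)` and that the generic path goes through a `CharsetEncoder`, which may not match what `Binary` actually does internally.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the type dispatch; not the actual AvroWriteSupport code.
final class Utf8Dispatch {
    private Utf8Dispatch() {}

    // Fast path (assumed analogue of Binary.fromString): String encodes itself directly.
    static byte[] fromString(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    // Generic path (assumed analogue of Binary.fromCharSequence): any CharSequence,
    // encoded through a CharsetEncoder and copied out of the resulting buffer.
    static byte[] fromCharSequence(CharSequence value) {
        try {
            ByteBuffer encoded = StandardCharsets.UTF_8.newEncoder().encode(CharBuffer.wrap(value));
            byte[] bytes = new byte[encoded.remaining()];
            encoded.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("Input could not be encoded as UTF-8", e);
        }
    }

    // Check for String first so the common case never reaches the generic path.
    static byte[] toUtf8(Object value) {
        if (value instanceof String) {
            return fromString((String) value);
        }
        if (value instanceof CharSequence) {
            return fromCharSequence((CharSequence) value);
        }
        return fromString(value.toString());
    }
}
```

The `String` check has to come before the `CharSequence` check because every `String` is also a `CharSequence`. In the reverse order the fast path would never be taken. For well-formed input both paths produce the same bytes, so the change affects only speed.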

Are these changes tested?

Existing tests should cover the change.

Are there any user-facing changes?

No

Closes #2994

@wgtmac wgtmac changed the title GH-2994: optimize string to binary conversion in AvroWriteSupport GH-2994: Optimize string to binary conversion in AvroWriteSupport Aug 28, 2024
@wgtmac wgtmac merged commit 3ac860e into apache:master Aug 28, 2024
9 checks passed
@sschepens sschepens deleted the patch-1 branch September 19, 2024 12:09
@wgtmac wgtmac added this to the 1.15.0 milestone Sep 30, 2024