Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Save expensive interface type checks (related JDK-8180450 and https://github.com/netty/netty/issues/12708) #4629

Merged
merged 2 commits into from
Mar 1, 2023

Conversation

franz1981
Copy link
Contributor

@franz1981 franz1981 commented Feb 23, 2023

JDK-8180450 has forced Netty http encoding/decoding to order type checks in order to save a huge scalability hit.
And still this is not enough, perf-wise: vertx, as a Netty users, still consume Netty types and cannot ignore such ordering, or would have a similar performance hit.

Moreover, there are cases (outlined on TechEmpower/FrameworkBenchmarks#7942) where the existing JDK mechanism for interface type checks waste CPU resources, but the "constrained" vertx http types can save it to happen both performing exact class type checks and/or instanceof vs concrete and known types.

@franz1981
Copy link
Contributor Author

franz1981 commented Feb 23, 2023

I've run the existing JMH benchmark HttpServerHandlerBenchmark but:

  • sadly it's limited to just one side of the http handling ie VertxHttpRequestDecoder but not VertxHttpResponseDecoder
  • metrics are enabled (!) and it is badly hitting a SUPER slow path type check on DefaultHttpRequest on
    static long sizeOf(Object obj) {
    // https://github.com/netty/netty/issues/12708
    // try first Netty HTTP singleton types, without any instanceof/checkcast bytecodes
    if (obj == Unpooled.EMPTY_BUFFER || obj == LastHttpContent.EMPTY_LAST_CONTENT) {
    return 0;
    }
    // try known vertx (non-interface) types: bi-morphic
    if (obj instanceof AssembledHttpResponse) {
    return ((AssembledHttpResponse) obj).content().readableBytes();
    }
    if (obj instanceof Buffer) {
    return ((Buffer) obj).length();
    } else if (obj instanceof ByteBuf) {
    return ((ByteBuf) obj).readableBytes();
    // see Netty's HttpObjectEncoder::acceptOutboundMessage:
    // the order of checks is the same!
    } else if (obj instanceof FullHttpMessage) {
    return ((FullHttpMessage) obj).content().readableBytes();
    } else if (obj instanceof LastHttpContent) {
    return ((LastHttpContent) obj).content().readableBytes();
    } else if (obj instanceof HttpContent) {
    return ((HttpContent) obj).content().readableBytes();
    } else if (obj instanceof WebSocketFrame) {
    return sizeOf((WebSocketFrame) obj);
    } else if (obj instanceof FileRegion) {
    return ((FileRegion) obj).count();
    } else if (obj instanceof ChunkedFile) {
    ChunkedFile file = (ChunkedFile) obj;
    return file.endOffset() - file.startOffset();
    } else {
    return 0L;
    }

I'll modify the benchmark in order to make evident the benefits

@franz1981 franz1981 force-pushed the http_slow_type_checks branch from f09a21d to dff5858 Compare February 28, 2023 10:07
@franz1981
Copy link
Contributor Author

Before this PR:

Benchmark                                      Mode  Cnt    Score   Error   Units
HttpServerHandlerBenchmark.vertx              thrpt   20  614.545 ± 7.614  ops/ms
HttpServerHandlerBenchmark.vertxOpt           thrpt   20  777.142 ± 7.255  ops/ms
HttpServerHandlerBenchmark.vertxOptMetricsOn  thrpt   20  759.129 ± 7.972  ops/ms

These are the results post PR:

Benchmark                                      Mode  Cnt    Score    Error   Units
HttpServerHandlerBenchmark.vertx              thrpt   20  640.783 ±  4.690  ops/ms
HttpServerHandlerBenchmark.vertxOpt           thrpt   20  834.770 ± 3.142  ops/ms
HttpServerHandlerBenchmark.vertxOptMetricsOn  thrpt   20  832.764 ±  4.015  ops/ms

While Netty (now optimized in order to save expensive type checks as well):

Benchmark                                      Mode  Cnt    Score    Error   Units
HttpServerHandlerBenchmark.netty              thrpt   20  917.029 ±  3.416  ops/ms

In short:

  • this PR improve http performance ~7-10%
  • even with such improvement, at our best, we're ~10% slower then Netty

@vietj vietj added this to the 4.4.0 milestone Mar 1, 2023
@vietj vietj merged commit 5eac2b0 into eclipse-vertx:master Mar 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants