minor perf: reduce fine grained buffer access when encoding HTTP1 headers #19115
Conversation
…uest Signed-off-by: wbpcode <[email protected]>
Signed-off-by: wbpcode <[email protected]>
Nice! Can you add both unit tests and a microbenchmark test to this PR? You can see a nearby microbenchmark test in test/common/http/header_map_impl_speed_test.cc, bazel target //test/common/http:header_map_impl_speed_test, which you could copy into the http1 subdir to capture the performance improvement directly. Thanks!
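For readers who want to reproduce the comparison outside Envoy, here is a minimal sketch of the kind of microbenchmark being asked for. It uses plain Google Benchmark against std::string rather than Envoy's Buffer::OwnedImpl, is not wired into any bazel target, and all benchmark names are illustrative only:

#include <string>

#include "benchmark/benchmark.h"

// One append call per header fragment, the pattern this PR tries to avoid.
static void BM_PerFragmentAppend(benchmark::State& state) {
  for (auto _ : state) {
    std::string out;
    for (int i = 0; i < 100; ++i) {
      out.append("x-header-name").append(": ").append("some-value").append("\r\n");
    }
    benchmark::DoNotOptimize(out);
  }
}
BENCHMARK(BM_PerFragmentAppend);

// Reserve the full size once, then copy the same fragments: the batched pattern.
static void BM_ReserveThenAppend(benchmark::State& state) {
  for (auto _ : state) {
    std::string out;
    out.reserve(100 * 27); // 100 lines of "x-header-name: some-value\r\n", 27 bytes each
    for (int i = 0; i < 100; ++i) {
      out.append("x-header-name").append(": ").append("some-value").append("\r\n");
    }
    benchmark::DoNotOptimize(out);
  }
}
BENCHMARK(BM_ReserveThenAppend);

BENCHMARK_MAIN();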
Get it. And thanks very much for all your kind and detailed suggestions. It's very helpful. (^_^)
const uint64_t basic_header_size = method->value().size() + (host ? host->value().size() : 0) +
                                   (path ? path->value().size() : 0);

conn_buffer_helper.reserveBuffer(4096 + basic_header_size);
Is there somewhere that ensures the size of the headers is less than 4096 bytes?
We don't need to ensure it. We will always reserve enough space for encoding.
void StreamEncoderImpl::encodeHeader(absl::string_view key, absl::string_view value) {
ASSERT(!key.empty());
const uint64_t header_size = key.size() + value.size() + 4;
auto& conn_buffer_helper = connection_.bufferHelper();
conn_buffer_helper.reserveBuffer(header_size); // <-- reserve enough space for every header.
conn_buffer_helper.writeToBuffer(key);
conn_buffer_helper.writeToBuffer(':');
conn_buffer_helper.writeToBuffer(' ');
conn_buffer_helper.writeToBuffer(value);
conn_buffer_helper.writeToBuffer(CRLF);
bytes_meter_->addHeaderBytesSent(header_size);
}
Reading through the history of this PR and the one you linked to, it sounds like the biggest contributor to the performance degradation is the repeated checking of watermarks, correct? If that's the case, an alternative that would keep the code-cleanup of #9825 would be to add an API for temporarily disabling those checks (startTransaction()/stopTransaction()?). If we choose to keep going down this path, we'll need to validate that this use of the buffer reservation API isn't resulting in a worse set of Slices in the buffer, which could also negatively affect performance. The call to reserve 4096 bytes will most likely always create a new Slice, and I'm not sure what the state of the buffer likely is before that call.
I hadn't been checking out the flame graphs; I assumed it was fragmented allocations, not checking watermark bounds. That should be pretty cheap, or something we can address without adding back all the complexity. @wbpcode can you clue me in on what causes the gains?
Another thought is to add an API to add an ordered collection of string_views to a Buffer::Instance. That could iterate through the collection twice: once to get the total size so we can reserve enough contiguous space for it, the other to append all the chunks without need for intermediate checking. I would use some sort of templated method if possible (annoying because it's a virtual class...) so that the same code can work whether it is supplied a C array, std::vector, std::array, or whatever, meaning at the call site you could call buffer_->appendStrings({header, ": ", value, "\r"}); without having to allocate a temp vector. You might need a non-virtual wrapper with templated helpers.
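To make the proposed shape concrete, here is a standalone sketch against a toy std::string-backed buffer rather than the real Buffer::Instance; appendStrings and every other name here are hypothetical:

#include <cstdint>
#include <initializer_list>
#include <iostream>
#include <string>

#include "absl/strings/string_view.h"

class SketchBuffer {
public:
  // Non-virtual templated wrapper: works for C arrays, std::array, std::vector, etc.
  template <class Container> void appendStrings(const Container& fragments) {
    uint64_t total = 0;
    for (absl::string_view f : fragments) {
      total += f.size(); // pass 1: aggregate size, so one reservation suffices
    }
    data_.reserve(data_.size() + total); // single reservation / watermark-style check
    for (absl::string_view f : fragments) {
      data_.append(f.data(), f.size()); // pass 2: copy without intermediate checks
    }
  }
  // Overload so brace-enclosed lists work without allocating a temp vector.
  void appendStrings(std::initializer_list<absl::string_view> fragments) {
    appendStrings<std::initializer_list<absl::string_view>>(fragments);
  }
  const std::string& data() const { return data_; }

private:
  std::string data_;
};

int main() {
  SketchBuffer buffer;
  buffer.appendStrings({"content-length", ": ", "42", "\r\n"});
  std::cout << buffer.data();
}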
/wait
also I want to re-emphasize that it would be really helpful to get a repro of this issue using Nighthawk or even just a microbenchmark so we could make sure Envoy is operating in its sweet-spot, and that the proposed changes are making a realistic case go faster. I think this change probably is worth it but it'd be better to repro it.
@ggreenway Thanks, and here are some replies.
Nope. The repeated checking of watermarks brings some overhead, but from the callgrind data, the repeated fine-grained buffer writes are the bigger contributor.
We always maintain sequential writing. In other words, in the worst case, a slice is created at the end of the buffer and submitted. And unless the existing slice does not have enough space, we will not submit and create a new slice. Judging from the test results, this PR's effect is good, and the throughput has been increased by about 4%. (Same config and conditions as in #19103.)
@alyssawilk Thanks, and here are some replies. Yes, a single watermark check is very cheap. But for 100,000 requests, it would be 6,200,000 watermark checks. And the repeated (6,200,000) fine-grained buffer writes bring even more overhead.
I have tested the latest binary of this PR with wrk and the throughput has increased by about 4%. I can run some tests with Nighthawk this weekend. In addition, I will update this PR before the weekend.
This makes sense; there's a lot of per-string-segment gook even in simple OwnedImpl::add().
Hm, yeah. While I wouldn't block this change, I'd be more inclined to address this by improving the buffer add functionality to be more efficient for small writes than by wrapping the HTTP/1.1 code this way. I feel like if we improve add we'll improve perf for the cases discussed, but also for things like a trickle attack, where I think the current fix wouldn't improve things?
FWIW, the reason the code was originally the way it was is exactly because of this type of benchmark. Way back in the day I measured the same thing and noticed the same type of perf delta by doing this type of batching at the codec level. Whether it's actually worth it or not for real-world use cases to go back to some variant of what we had before is a different question.
It would be great work. If necessary, I can also help at that time; I am now interested in most performance optimization issues. 😄 But having read the source code of OwnedImpl, I think this job is not easy. The current OwnedImpl::addImpl is implemented very well, and it will be very difficult to obtain a larger performance improvement. This is not because OwnedImpl is not good, but because the work undertaken by OwnedImpl itself is very complicated. In addition, this PR can also alleviate the overhead introduced by the watermark check. Although it's not too big, it shouldn't be ignored given how frequently the check is executed.
I need to look at this more closely, but it seems that the story is that high-watermark checks add measurable overhead. I wonder if a mitigation would be to perform header writes to a new buffer and then move its contents to the output buffer. This way we eliminate all the high-watermark checks in the critical section. The change as is does appear very risky, as indicated by the risk assessment in the PR description. #9825 attempted to eliminate crashes due to incorrect reservation sizes and this PR re-introduces some of those human-error risks. As mentioned in earlier comments, it would be good to have a continuous loadtest that shows the performance benefits of a change like this one and allows us to compare those benefits with alternate implementations, like writing to a temporary OwnedImpl buffer and then moving its contents to the output buffer.
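To make that alternative concrete, here is a self-contained sketch with toy types standing in for Envoy's watermarked output buffer and a plain scratch buffer; every name below is illustrative, not the real API:

#include <iostream>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins; the real Envoy buffer classes are far more involved.
struct ScratchBuffer {
  std::string data;
  void add(const std::string& s) { data += s; } // no watermark bookkeeping here
};

struct WatermarkedBuffer {
  std::string data;
  void checkHighWatermark() { /* watermark callbacks would fire here */ }
  void move(ScratchBuffer& other) { // absorb the scratch contents in one step
    data += other.data;
    other.data.clear();
    checkHighWatermark(); // a single check for the whole header block
  }
};

int main() {
  const std::vector<std::pair<std::string, std::string>> headers = {
      {"host", "example.com"}, {"content-length", "0"}};
  ScratchBuffer scratch;
  for (const auto& header : headers) {
    scratch.add(header.first);
    scratch.add(": ");
    scratch.add(header.second);
    scratch.add("\r\n");
  }
  scratch.add("\r\n"); // end of the header block
  WatermarkedBuffer output;
  output.move(scratch); // watermark checked once instead of once per fragment
  std::cout << output.data;
}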
This PR tries to alleviate the fine-grained buffer write problem introduced by HTTP header encoding. I think this problem exists in all HTTP1 requests. Of course, in different scenarios, the effect of this PR may be different. For example, in some very complex scenarios, the effect of this PR will not be obvious, because other modules may become the bottleneck. But this does not mean that this PR is useless, and it will at least not make things worse.
Regarding the meaning of this sentence, I am a little confused. 🤔
@anatolebeuzon Thanks for this comment.
I think this won't help, because the frequent fine-grained buffer writes would still happen, just against the temporary buffer.
🤔 Can you give some more info about this problem? Then we can try to solve it in this PR. #9825 has no detailed description of the problem.
PR #9825 is a defense-in-depth fix for GHSA-gxvv-x4p2-rppp, which was previously fixed by b3f42a4. I can try to implement the idea that I proposed above, but I'll need some help on the profiles front. I assume we don't have a micro benchmark that covers serialization of HTTP/1 request/response to output buffers in cases with a small/medium/large number of headers. At least part of the problem is the WatermarkBuffer implementation of checkHighAndOverflowWatermarks being called on every add operation; your benchmark shows this.
Yes. But this part is not as significant as the problems caused by the frequent fine-grained buffer writes. However, if we can optimize the entire OwnedImpl, the value should be far greater than this PR, because then all modules that use OwnedImpl could benefit from it, not just the codec.
I want to create a simple test set for continuous tracking of #19103, but it needs to wait until this weekend. At that time, I can share it with you. 😄
I look forward to more data. I have some optimizations in mind, but work on them has been blocked on there being more repeatable Envoy OSS loadtests.
Signed-off-by: wbpcode <[email protected]>
Sorry for the delayed response, my holiday vacation was busier than expected.
envoy/buffer/buffer.h (outdated)

  *dst += v.size();
};

total_size_to_write = (absl::string_view(args).size() + ...);
I think that the multiple conversions of args to string_view on lines 478 and 482 may have some performance impact in the case where args contains "const char*". I think that it may be possible to address this issue by changing the addFragments signature to:
template size_t addFragments(absl::string_view&&... args) {
Variable-length template parameters do not seem to support this. Otherwise, we would need to modify our interface to accept a container, but there may be additional memory allocation and copying. 🤔
Yeah that might make sense. It would be a little less syntactically convenient, but you don't need to use dynamic arrays. You can just use a begin/end ptr/size at the interface, and write things like:
const absl::string_view args[] = {name, ": ", value, "\r\n"};
addFragments(args, args + arraysize(args));
or
addFragments(args, arraysize(args));
this would also work for std::vector if that's more convenient, at the cost of some allocs:
std::vector<absl::string_view> args{name, ": ", value, "\r\n"};
addFragments(&args[0], &args[0] + args.size());
or
addFragments(&args[0], args.size());
I can do a quick test for this, because we need to consider the cost of building the container itself. 🤔
Or we can just ensure all args passed to the original addFragments must have data() and size() methods. It would be a little less syntactically convenient because the caller needs to convert char* to absl::string_view by itself.
I don't think you'll need to allocate any memory for the first option, a C Array like that will just extend the stack.
WDYT of having the model for HTTP1 where you collect a single vector of Spans for all the fragments of all the headers, and use a single addSpans call that does the watermark check on the aggregated size? Does that simplify things?
Then we need to create a vector to store all the key/value string views. Although frequent watermark checks bring some overhead, I think copying all the string views to the heap (64 bytes for a single header) may bring more overhead.
In addition, I haven't figured out how to deal with the situation where a large number of fragments are written at the same time in a single addFragments. If the back slice overflows, then we downgrade to using addImpl. Although our performance is definitely not worse than repeated add(), there is not much performance gain when overflow occurs.
RE copying string-views; that's pretty small (16 bytes per).
RE heap: we have a limited number of headers, right? You could put a string_view[MAX_HEADERS] on the stack; no need to heap it.
RE add vs addFragments: as discussed in previous comments I'm suggesting we not have the policy of bailing into add mode when you spill out of the back slice. Instead, with the giant array of fragments, you can figure out before adding anything exactly how many bytes are needed, set up the slices as required, and do the watermarking once. I think it will be a cleaner/faster model.
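As a concrete illustration of that suggestion, here is a self-contained sketch that gathers the fragments of a whole header block into a stack array and hands them to a toy span-based buffer in one call; all names, including the kMaxFragments cap, are illustrative only:

#include <cstdint>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

#include "absl/strings/string_view.h"
#include "absl/types/span.h"

// Toy span-based buffer; only the call shape matters for this sketch.
class FakeBuffer {
public:
  void addFragments(absl::Span<const absl::string_view> fragments) {
    uint64_t total = 0;
    for (absl::string_view f : fragments) total += f.size();
    data_.reserve(data_.size() + total); // total size known up front, one reservation
    for (absl::string_view f : fragments) data_.append(f.data(), f.size());
  }
  const std::string& data() const { return data_; }

private:
  std::string data_;
};

int main() {
  const std::vector<std::pair<absl::string_view, absl::string_view>> headers = {
      {"host", "example.com"}, {"content-length", "0"}};

  // Collect the fragments of every header line on the stack (4 per header plus
  // the terminating CRLF), then hand them all to the buffer in a single call.
  constexpr size_t kMaxFragments = 64; // illustrative cap, not Envoy's real limit
  absl::string_view fragments[kMaxFragments];
  size_t n = 0;
  for (const auto& header : headers) {
    fragments[n++] = header.first;
    fragments[n++] = ": ";
    fragments[n++] = header.second;
    fragments[n++] = "\r\n";
  }
  fragments[n++] = "\r\n";

  FakeBuffer buffer;
  buffer.addFragments(absl::MakeConstSpan(fragments, n)); // one watermark-style check
  std::cout << buffer.data();
}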
For HTTP1, if we modify the codec to the mode you mentioned, the benefits may not be particularly significant. The overhead of the watermark check is not very large, and in the current implementation we have already reduced it a lot (the current overhead is about 1/4 of the original). And for HTTP1, back slice overflow does not happen frequently.
Instead, with the giant array of fragments, you can figure out before adding anything exactly how many bytes are needed, set up the slices as required, and do the watermarking once. I think it will be a cleaner/faster model.
This suggestion is valuable. When the back slice does not have enough contiguous memory, the remaining memory in the back slice can be filled first. After that, for the large amount of still-unwritten data, contiguous memory can be allocated. In this way addFragments can handle a large amount of fragmented data. 🤔
I think we can add a TODO for this, and then I will optimize it in subsequent PRs. This PR has really taken too long. orz
It's not just an optimization tweak -- it's a much simpler model, isn't it?
With the API that you have, the caller has to do addFragments calls and then end with an add(). That's potentially error-prone, and I believe @ggreenway has commented about this.
Moreover the current API you have, as @antoniovicente observed, may have excess conversions from const char* to string_view which would result in extra calls to strlen.
With the Span API you can make the interface much simpler and more performant. So although I know this PR has gone on for a while, we iterated quite a bit and learned a lot and got some really good data. We are trying to make Envoy fast and safe, and it's very useful to do this well.
@jmarantz I am a little confused. 🤣 I have updated this PR to use the new absl::Span<absl::string_view> API. Because, as you said, it's cleaner, simpler, and even faster. So the problems raised by @ggreenway and @antoniovicente have been solved. 🤣 Did you notice the latest commit?
The optimization I mentioned above refers to the special case where the back slice has no contiguous memory left. We can still do some more perf improvements in that case.
source/common/buffer/buffer_impl.h (outdated)

if (slices_.empty()) {
  slices_.emplace_back(Slice(size, account_));
}
length_ += size;
slices_.back().reservable_ += size;
The "length_ += size;" and "slices_.back().reservable_ += size;" in here are a bit odd. inlineReserve is not only reserving, but also committing bytes to the buffer before the caller of inlineReserve has written the bytes to the appropriate buffer.
I take it that this is an optimization. I think that the method name should be changed to reflect this reserve+commit semantic that your implementation depends on.
get it.
Signed-off-by: wbpcode <[email protected]>
Signed-off-by: wbpcode <[email protected]>
this looks great! Sorry I missed the last awesome commit.
Just some small nits but I think this is ready for Alyssa.
  slices_.emplace_back(Slice(total_size_to_write, account_));
}

Slice& back = slices_.back();
Add a TODO to improve performance as discussed in the comments, rather than the downgrade strategy we have here? You can also include here the detail about contiguous memory that you provided in that comment thread.
get it.
This looks absolutely fantastic! Thanks for all your patience and iteration @wbpcode and reviewers!
couple of questions from me below, and could you take a look at the PR description and see if it's still up to date?
@jmarantz : as you've done most of the reviewing I'd appreciate you weighing in on your comfort level landing this before we cut the release next week
@@ -231,7 +227,7 @@ void StreamEncoderImpl::encodeHeadersBase(const RequestOrResponseHeaderMap& head
   }
 }

-  connection_.addToBuffer(CRLF);
+  connection_.buffer().add(CRLF);
Is this newly added comment still true? Reading the code it looks like watermarks are applied now
I think this deserves an approval from @antoniovicente &/or @KBaichoo as they have worked more in the buffer code than I do. At a high level this direction makes sense to me but this code is at the heart of things and deserves another pass from them.
Thanks for the patience @wbpcode
source/common/buffer/buffer_impl.h (outdated)

class OwnedImpl;

private:
  friend OwnedImpl;
Do we need to make OwnedImpl a friend? I think we can get away with using the public interface via reservation, commit to the slice in https://github.com/envoyproxy/envoy/pull/19115/files#diff-0fdef7bd42dfa48ed5a015ddb85be158a514d61a82a5a17c1cfa34c6d744e5f8R604-R610
Can this be removed now that we use the public interface? Thanks!
source/common/buffer/buffer_impl.cc (outdated)

uint8_t* mem = back.base_ + back.reservable_;
for (const auto& fragment : fragments) {
  memcpy(mem, fragment.data(), fragment.size()); // NOLINT(safe-memcpy)
  mem += fragment.size();
}
length_ += total_size_to_write;
back.reservable_ += total_size_to_write;
Can we just use the existing API on Slice to create a reservation via reserve(), memcpy the fragments into that, and then do back.commit(reservation)?
Then there may be some more unnecessary checks. 🤔 It is a minor update, and I can try updating it and check the benchmark result.
Looks like it is OK.
@@ -231,7 +227,7 @@ void StreamEncoderImpl::encodeHeadersBase(const RequestOrResponseHeaderMap& head
   }
 }

-  connection_.addToBuffer(CRLF);
+  connection_.buffer().add(CRLF);
This is no longer true and should be reverted
Signed-off-by: wbpcode <[email protected]>
/retest
Retrying Azure Pipelines:
@KBaichoo Hi, I have updated this PR, can you take a look again when you have free time? 😄
Looks great otherwise :)
source/common/buffer/buffer_impl.h (outdated)

class OwnedImpl;

private:
  friend OwnedImpl;
Can this be removed now that we use the public interface? Thanks!
source/common/buffer/buffer_impl.cc (outdated)

  back.commit<false>(reservation);
  length_ += total_size_to_copy;
} else {
  // Downgrade to using `addImpl` for not enough memory in the back slice.
s/for/if
test/common/buffer/buffer_test.cc (outdated)

auto slice_vec = buffer.getRawSlices();

EXPECT_EQ(slice_vec.size(), 5);
likewise here as in https://github.com/envoyproxy/envoy/pull/19115/files#r780540013
get it.
source/common/buffer/buffer_impl.h (outdated)

@@ -17,6 +17,8 @@
 namespace Envoy {
 namespace Buffer {
+
+class OwnedImpl;
Can remove the forward declaration since we no longer need to friend the class.
right.
Signed-off-by: wbpcode <[email protected]>
/retest
Retrying Azure Pipelines:
/retest
Retrying Azure Pipelines:
…-update-for-perf-3
LGTM
Thanks for the optimization.
Awesome! I'm inclined to wait until after the release tomorrow to merge unless someone objects, as it's a pretty major change to some very core code.
…ders (envoyproxy#19115)
Commit Message: minor perf: reduce fine grained buffer access when encoding HTTP1 headers
Additional Description: The HTTP1 codec performs a large number of buffer API calls during the encoding of HTTP headers, which introduces an additional CPU overhead (3-4%). This PR implements a buffer helper as a cache to reduce direct buffer writes and improve the overall performance. Check envoyproxy#19103 (comment) for some more related info. This PR reverts envoyproxy#9825. At the same time, I also try my best to make the code simpler and easier to maintain.
Risk Level: High.
Testing: Waiting.
Docs Changes: N/A.
Release Notes: N/A.
Signed-off-by: wbpcode <[email protected]>
Signed-off-by: Josh Perry <[email protected]>
Commit Message: minor perf: reduce fine grained buffer access when encoding HTTP1 headers
Additional Description:
The HTTP1 codec performs a large number of buffer API calls during the encoding of HTTP headers, which introduces an additional CPU overhead (3-4%). This PR implements a buffer helper as a cache to reduce direct buffer writes and improve the overall performance.
Check this #19103 (comment) for some more related info.
This PR reverts #9825. At the same time, I also try my best to make the code simpler and easier to maintain.
Risk Level: High.
Testing: Waiting.
Docs Changes: N/A.
Release Notes: N/A.
Some benchmark results:
The total amount of data written in each test is the same, 64 MB. The figure shows the performance of using different APIs and different basic write unit sizes.
In addFrags/{n}, n represents the number of units written by the addFragments API in one batch. For example, when the write unit is 16 bytes, addFrags/3 means using addFragments to write 3 strings of 16 bytes at one time.
Y-axis: number of nanoseconds to complete the 64 MB of data writes (higher is slower).
X-axis: basic write unit size in bytes.
add can only write one basic unit at a time, while addFragments can write multiple units at a time. From the graph, addFrags/5 gives the best performance in almost all cases.