Benchmarking Envoy RPS #28318
Comments
Istio, through their benchmarking, states:
That is a far cry from what I am seeing.
cc @yanavlasov for ideas of who might be into Envoy benchmarking
First, compared to older versions of Envoy, newer versions consume more CPU and show poorer benchmark results; see #19103. Newer Envoy does more verification and has more features, which degrades its performance, but we are trying to improve it.
Performance differs depending on CPU capabilities, and it isn't certain that #5536 was run on a CPU with the same performance as yours. Also, wrk is closed-loop while k6's ramping-arrival-rate executor is open-loop (https://k6.io/docs/using-k6/scenarios/concepts/open-vs-closed/), so I'm not sure the two cases are comparable. If we really want to compare them: #5536 used 10 wrk threads against a 4-thread Envoy. On my machine Envoy hits 100% CPU in that case and gets 40k RPS across 4 threads, so roughly 10k RPS per thread (assuming those 4 threads sit on physical CPU cores without hyperthreading). In your case it is 7k RPS when the CPU hits 100%, and your case has TLS enabled while #5536 did not, so the gap doesn't seem too far off. But yes, I'm still not sure the two cases are comparable; just my two cents, and I may be wrong.
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.
Title: Benchmarking Envoy
Description:
I understand that when it comes to benchmarking Envoy there are many factors that come into play. My need to benchmark was strictly based on what I observed in our setup of Envoy (which underpins Emissary Ingress) and looking at the implementation of Istio (which uses Envoy for the data plane).
My goal was to isolate Envoy as well as possible and drive RPS as high as possible. My expectations of what should be possible were somewhat guided by a previous issue, #5536.
Before I go into my findings I will outline the setup I used when performing the tests. As mentioned above, I attempted to isolate the tests to Envoy itself, removing any other limiting factors.
I am testing against Envoy version 1.26. The upstream endpoints are all defined statically; in this case the upstream is a simple NodeJS hello world service, designed to be as dead simple as possible.
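For illustration, a minimal static configuration of the shape described above looks roughly like the sketch below (the listener/cluster names, ports, and endpoint addresses are placeholders rather than my exact values):

```yaml
# Sketch of a minimal static config matching the description above.
# Names, ports, and endpoint addresses are placeholders.
static_resources:
  listeners:
  - name: ingress
    address:
      socket_address: { address: 0.0.0.0, port_value: 8080 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: hello
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: hello_world }
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
  - name: hello_world
    type: STATIC
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: hello_world
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: 10.0.0.10, port_value: 3000 }
```

In my actual config the cluster lists all of the upstream Pod addresses as static endpoints rather than the single placeholder shown here.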
Envoy is running within Kubernetes, on a dedicated node (apart from the normal DaemonSet Pods such as CNI, kube-proxy, etc.) that has 2 vCPUs and 4 GB of memory (c5.large). Attached storage is provisioned IOPS at 5k. No limits or requests are set on the Envoy Pod.
For the upstream service, as mentioned, it is a simple NodeJS hello world service. Given that I am generating very high RPS, I wanted to ensure that the upstream service wasn't limiting throughput, so for the duration of the tests I scaled it up to 130 Pods, all on dedicated nodes.
My test suite is k6, using a simple ramping-arrival-rate executor. The script ramps up to 10k RPS and then holds that rate for an additional 5 minutes.
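A sketch of what that executor looks like is below; the target URL, ramp duration, and VU pool sizes are illustrative placeholders, while the 10k RPS target and 5-minute hold match the actual test:

```javascript
import http from 'k6/http';

export const options = {
  scenarios: {
    envoy_rps: {
      executor: 'ramping-arrival-rate',
      startRate: 0,
      timeUnit: '1s',        // targets below are requests per second
      preAllocatedVUs: 500,  // placeholder VU pool sizes
      maxVUs: 2000,
      stages: [
        { target: 10000, duration: '5m' }, // ramp up to 10k RPS (ramp duration assumed)
        { target: 10000, duration: '5m' }, // hold 10k RPS for 5 minutes
      ],
    },
  },
};

export default function () {
  // Placeholder URL for the Envoy listener fronting the hello world service.
  http.get('https://envoy.example.internal/');
}
```

Because this is an open-loop executor, k6 keeps injecting requests at the target rate even as latency climbs.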
When I execute this test I note the following results:
During peak RPS (around 17:31) the idle CPU (the green descending line) gets to about 21%.
Sorry I cut off the Y-axis, but the load peaks just above 2 and for the most part stays under 2.
Based on these metrics my conclusion would be that the underlying node is not over-saturated and can handle additional RPS.
This graph is interesting in that it shows the response latency for the upstream service (again, I apologize for cutting off the Y-axis). The latency sits around 5 ms until it hockey-sticks up to almost 300 ms. This uptick corresponds to around 7k RPS.
I have done additional testing on nodes of different sizes (CPU and memory), and based on what I see the latency increases around the 3500 RPS per CPU mark. So a 1-CPU node can push 3500 RPS before latency drastically increases, a 2-CPU node can do 7k, and so on.
The above tests were all run with access logging disabled. If I enable access logging (since that is standard in most setups) the results are even worse.
Using the same Envoy configuration but with access logging enabled, and executing the same ramping-arrival-rate k6 test, I can only obtain a maximum of 7k RPS: the node CPU hits 100%, load spikes above 3.5, and the latency to the upstream spikes to 1 second.
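For reference, enabling access logging amounts to adding an access_log entry under the HTTP connection manager in the earlier sketch; roughly the following, though the logger type and output path are placeholders and not necessarily what my config uses:

```yaml
# Added under the http_connection_manager typed_config in the earlier sketch.
# Logger type and output path are illustrative placeholders.
access_log:
- name: envoy.access_loggers.file
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
    path: /dev/stdout
```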
While I don't know for sure, these numbers seem awfully low to me, especially considering #5536 talks about 40k requests per second on a 4-CPU box.
My Envoy configuration is pretty simple so I don't think there are changes there that I could make to improve throughput. I also feel that I have eliminated any external factors.