fix sendLocalResponse in wasm #23049

johnlanni · 2022-09-09T08:09:55Z

Commit Message:
Additional Description:
Risk Level: low
Testing:
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional API Considerations:]

source/extensions/common/wasm/context.cc

mathetake · 2022-09-20T02:27:35Z

/wait

test/extensions/common/wasm/wasm_test.cc

htuch · 2022-09-30T22:08:52Z

source/extensions/common/wasm/context.cc

    });
  }
+  local_reply_hold_ = true;


Why do we need both hold and sent? Doesn't your "hold" case already cover the sent case?

envoy/source/extensions/common/wasm/context.cc

Line 1609 in f09ed36

decoder_callbacks_->sendLocalReply(Envoy::Http::Code::ServiceUnavailable, "", nullptr,

https://github.com/proxy-wasm/proxy-wasm-cpp-host/blob/4fcf895fa2433a1cdf20704926b8b7e4039a6a04/src/context.cc#L34

This is to avoid sending the local reply twice if the VM fails. I added another test for this case.

I'll defer to @mathetake @lizan here, can you folks take a pass as CODEOWNERS? Thanks.

Yeah, unfortunately I think it is inevitable to have both of these booleans.

@johnlanni this is fine I think. Can you add some documentation in the form of comments for posterity, since the reason we need this is subtle, as per your comment above? Everything else LGTM.

@htuch done
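
For readers following the "hold" versus "sent" question above, here is a minimal sketch of how two flags can play different roles. It is not the code from this PR: SketchContext, runAfterVmCallActions, dropAfterVmCallActions and the status bookkeeping are hypothetical, and the sketch assumes that deferred actions are dropped when the VM fails, which the thread does not state. Only local_reply_hold_ appears in the diff; the name local_reply_sent_ is inferred from the wording of the question.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Wasm filter context; not Envoy's actual class.
class SketchContext {
public:
  // Plugin-facing call. Returns false when a reply has already been requested.
  bool sendLocalResponse(int status) {
    assert(status >= 100 && status < 600); // precondition: a valid HTTP status
    if (local_reply_hold_) {
      return false; // a reply is already pending, ignore the repeat
    }
    // The reply itself is deferred until the current VM call returns.
    after_vm_call_actions_.push_back([this, status] {
      sent_statuses_.push_back(status);
      local_reply_sent_ = true;
    });
    local_reply_hold_ = true;
    return true;
  }

  // Runs deferred actions (assumed to happen when the VM call returns cleanly).
  void runAfterVmCallActions() {
    std::vector<std::function<void()>> actions = std::move(after_vm_call_actions_);
    after_vm_call_actions_.clear();
    for (auto& action : actions) {
      action();
    }
  }

  // Assumption: deferred actions are dropped when the VM fails mid-call.
  void dropAfterVmCallActions() { after_vm_call_actions_.clear(); }

  // VM-failure path: sends a 503 unless a reply has actually gone out.
  void failStream() {
    if (!local_reply_sent_) {
      sent_statuses_.push_back(503);
      local_reply_sent_ = true;
    }
  }

  const std::vector<int>& sentStatuses() const { return sent_statuses_; }

private:
  bool local_reply_hold_ = false; // a reply was requested by the plugin
  bool local_reply_sent_ = false; // a reply was actually handed to the stream
  std::vector<std::function<void()>> after_vm_call_actions_;
  std::vector<int> sent_statuses_;
};
```

Under these assumptions, "hold" alone is not enough: a plugin can request a reply and then crash before the deferred action runs, in which case the failure path still has to send its own reply, and "sent" keeps that path from replying twice.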


johnlanni · 2022-10-01T14:18:27Z

https://dev.azure.com/cncf/envoy/_build/results?buildId=117861&view=logs&j=b7634614-24f3-5416-e791-4f3affaaed6c&t=21e6aa7d-f369-5abd-5e4e-e888cac18e9c

envoy/source/extensions/common/wasm/context.h

Line 30 in f09ed36

using proxy_wasm::ContextBase;

There seems to be a problem with this clang-tidy check: this using-declaration is used in context.cc, like this:

envoy/source/extensions/common/wasm/context.cc

Line 152 in f09ed36

Context::Context(Wasm* wasm) : ContextBase(wasm) {}

test/extensions/common/wasm/test_data/test_context_cpp.cc

htuch · 2022-10-03T03:04:37Z

source/extensions/common/wasm/context.cc

    });
  }
+  local_reply_hold_ = true;


I'll defer to @mathetake @lizan here, can you folks take a pass as CODEOWNERS? Thanks.


source/extensions/common/wasm/context.cc

test/extensions/common/wasm/wasm_test.cc


mathetake

LGTM


htuch

LGTM, thanks!

This change fixes envoyproxy#28826. Some additional discussion for context can be found in proxy-wasm/proxy-wasm-cpp-host#423.

The issue reported in envoyproxy#28826 happens when a proxy-wasm plugin calls proxy_send_local_response during both HTTP request processing and HTTP response processing.

This happens because, in an attempt to mitigate a use-after-free issue (see envoyproxy#23049), we added logic to proxy-wasm that avoids calling sendLocalReply multiple times. So now, when a proxy-wasm plugin calls proxy_send_local_response, only the first call results in sendLocalReply, while all subsequent calls are ignored.

At the same time, when a proxy-wasm plugin calls proxy_send_local_response, proxy-wasm also stops iteration, because the call is used to report an error in the plugin.

During HTTP request processing this leads to the following chain of events:

1. During request processing, the proxy-wasm plugin calls proxy_send_local_response.
2. proxy_send_local_response calls sendLocalReply, which schedules the local reply to be processed later through the filter chain.
3. The request-processing filter chain is aborted and Envoy sends the previously created local reply through the filter chain.
4. The proxy-wasm plugin is called to process the response it generated, and it calls proxy_send_local_response again.
5. proxy_send_local_response **does not** call sendLocalReply, because proxy-wasm currently prevents multiple calls to sendLocalReply.
6. proxy-wasm stops iteration.

So in the end the filter chain iteration is stopped for the response, and because proxy_send_local_response does not actually call sendLocalReply, we don't send another locally generated response either.

I think we can do slightly better and close the connection in this case.

This change includes the following parts:

1. Partial rollback of envoyproxy#23049.
2. A change to the Envoy implementation of failStream, which proxy-wasm uses in case of critical errors.
3. Tests covering this case and some others using the actual FilterManager.

The most important question is: why is rolling back envoyproxy#23049 safe now?

It is safe because, since the introduction of prepareLocalReplyViaFilterChain in envoyproxy#24367, calling sendLocalReply multiple times is safe. That PR basically addresses the issue in a generic way for all the plugins, so a proxy-wasm specific fix is not needed anymore.

On top of being safe, there are additional benefits to making this change:

1. We don't end up with a stuck connection in case of errors, which is slightly better.
2. We remove a failure mode from proxy_send_local_response that was introduced in envoyproxy#23049, which is good because proxy-wasm plugins don't have a good fallback when proxy_send_local_response fails.

Why do we need to change the implementation of failStream?

failStream gets called when the Wasm VM crashes (e.g., a null pointer dereference or abort inside the VM, or any other unrecoverable error with the VM). The current implementation just calls sendLocalReply in this case. Let's consider what happens during HTTP request processing when the Wasm VM crashes:

1. The Wasm VM crashes.
2. proxy-wasm calls failStream, which calls sendLocalReply.
3. Envoy prepares the local reply and eventually sends it through the filter chain.
4. The proxy-wasm plugin with a crashed VM is called to process the reply.

proxy-wasm in this case can't really do anything and just stops the iteration. That is a fine way of dealing with the issue, but we can do slightly better and close the connection instead of just pausing the iteration. And we are not losing anything by replacing sendLocalReply with resetStream, because the local reply doesn't get sent anyway.

> NOTE: The same issue does not happen if the VM crashes during response
> processing, because sendLocalReply in this case will send the response
> directly, ignoring the filter chain.

Finally, why replace the current mocks with a real FilterManager?

A mock implementation of sendLocalReply works fine for tests that just need to assert that sendLocalReply gets called. However, in this case we rely on the fact that it's safe to call sendLocalReply multiple times and that it will do the right thing, and we want to assert that the connection gets closed in the end. That cannot be tested by just checking that sendLocalReply gets called or by relying on a simplistic mock implementation of sendLocalReply.

Signed-off-by: Mikhail Krinkin <[email protected]>
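
The failStream argument in the commit message above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Envoy code: ToyFilter, ToyStream, FailAction and this free-standing failStream are hypothetical names, and the model encodes just the behaviour described in the message (a local reply routed through the filter chain is stopped by a filter whose VM has crashed, while a reset closes the stream directly).

```cpp
#include <cassert>
#include <vector>

// Hypothetical filter: only records whether its Wasm VM has crashed.
struct ToyFilter {
  bool vm_crashed = false;
};

// Hypothetical stream holding the filters a local reply must pass through.
struct ToyStream {
  std::vector<ToyFilter> filters;
  bool reply_delivered = false;
  bool closed = false;

  // Models a local reply that is sent through the filter chain.
  void sendLocalReplyViaChain() {
    assert(!closed); // precondition: the stream is still open
    for (const ToyFilter& filter : filters) {
      if (filter.vm_crashed) {
        // The crashed plugin can only stop iteration: the reply never
        // reaches the client and the stream stays open.
        return;
      }
    }
    reply_delivered = true;
    closed = true;
  }

  // Models resetting the stream without involving the filter chain.
  void resetStream() { closed = true; }
};

enum class FailAction { SendLocalReply, ResetStream };

// Toy version of the failure path, parameterised by the action it takes.
void failStream(ToyStream& stream, FailAction action) {
  if (action == FailAction::SendLocalReply) {
    stream.sendLocalReplyViaChain();
  } else {
    stream.resetStream();
  }
}
```

In this model, a stream with one crashed filter ends up neither delivered nor closed under FailAction::SendLocalReply, and closed (still without a delivered reply) under FailAction::ResetStream, which matches the "we are not losing anything" point.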


…6809) Commit Message: This change fixes #28826. Some additional discussion for context can be found in proxy-wasm/proxy-wasm-cpp-host#423.

The issue reported in #28826 happens when a proxy-wasm plugin calls proxy_send_local_response during both HTTP request processing and HTTP response processing.

This happens because, in an attempt to mitigate a use-after-free issue (see #23049), we added logic to proxy-wasm that avoids calling sendLocalReply multiple times. So now, when a proxy-wasm plugin calls proxy_send_local_response, only the first call results in sendLocalReply, while all subsequent calls are ignored.

At the same time, when a proxy-wasm plugin calls proxy_send_local_response, proxy-wasm also stops iteration, because the call is used to report an error in the plugin.

During HTTP request processing this leads to the following chain of events:

1. During request processing, the proxy-wasm plugin calls proxy_send_local_response.
2. proxy_send_local_response calls sendLocalReply, which schedules the local reply to be processed later through the filter chain.
3. The request-processing filter chain is aborted and Envoy sends the previously created local reply through the filter chain.
4. The proxy-wasm plugin is called to process the response it generated, and it calls proxy_send_local_response again.
5. proxy_send_local_response **does not** call sendLocalReply, because proxy-wasm currently prevents multiple calls to sendLocalReply.
6. proxy-wasm stops iteration.

So in the end the filter chain iteration is stopped for the response, and because proxy_send_local_response does not actually call sendLocalReply, we don't send another locally generated response either.

I think we can do slightly better and close the connection in this case.

This change includes the following parts:

1. Partial rollback of #23049.
2. Tests covering this case and some others using the actual FilterManager.

The most important question is: why is rolling back #23049 safe now?

It is safe because, since the introduction of prepareLocalReplyViaFilterChain in #24367, calling sendLocalReply multiple times is safe. That PR basically addresses the issue in a generic way for all the plugins, so a proxy-wasm specific fix is not needed anymore.

On top of being safe, there are additional benefits to making this change:

1. We don't end up with a stuck connection in case of errors, which is slightly better.
2. We remove a failure mode from proxy_send_local_response that was introduced in #23049, which is good because proxy-wasm plugins don't have a good fallback when proxy_send_local_response fails.

Finally, why replace the current mocks with a real FilterManager?

A mock implementation of sendLocalReply works fine for tests that just need to assert that sendLocalReply gets called. However, in this case we rely on the fact that it's safe to call sendLocalReply multiple times and that it will do the right thing, and we want to assert that the connection gets closed in the end. That cannot be tested by just checking that sendLocalReply gets called or by relying on a simplistic mock implementation of sendLocalReply.

Additional Description:
Risk Level: low
Testing: Manually, by reproducing the case reported in #28826. I also added new unit tests and verified that they pass and aren't flaky:

```
bazel test --runs_per_test=1000 //test/extensions/common/wasm:all --config=docker-clang
```

Docs Changes: N/A
Release Notes: N/A
Platform Specific Features: N/A
Fixes #28826

---------

Signed-off-by: Mikhail Krinkin <[email protected]>
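
The six-step chain of events in the commit message above can be replayed with a small toy model. This is a hedged sketch, not the proxy-wasm or Envoy implementation: ToyWasmStream, runScenario and the log strings are hypothetical, and the "connection closed" outcome for a repeated sendLocalReply is taken from the commit message's statement that the connection gets closed in the end, not derived from FilterManager.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one HTTP stream handled by a proxy-wasm plugin.
struct ToyWasmStream {
  explicit ToyWasmStream(bool guard) : guard_repeat_calls(guard) {}

  bool guard_repeat_calls;            // true models the guard added in #23049
  bool local_reply_requested = false;
  bool iteration_stopped = false;
  bool connection_closed = false;
  std::vector<std::string> log;

  // Plugin-facing call, made on both the request and the response path.
  void proxySendLocalResponse() {
    assert(!connection_closed); // precondition: the stream is still alive
    iteration_stopped = true;   // proxy-wasm stops iteration on every call
    if (guard_repeat_calls && local_reply_requested) {
      log.push_back("ignored"); // step 5: the repeated call is dropped
      return;
    }
    sendLocalReply();
  }

  void sendLocalReply() {
    if (local_reply_requested) {
      // Assumption from the commit message: a repeated local reply is safe
      // and ends with the connection being closed.
      connection_closed = true;
      log.push_back("closed");
      return;
    }
    local_reply_requested = true;
    log.push_back("scheduled"); // step 2: reply scheduled via the filter chain
  }

  // A stream is stuck when iteration is stopped but nothing closes it.
  bool stuck() const { return iteration_stopped && !connection_closed; }
};

// Replays the request-then-response sequence from the numbered list.
ToyWasmStream runScenario(bool guard) {
  ToyWasmStream stream(guard);
  stream.proxySendLocalResponse();  // steps 1-2, request path
  stream.iteration_stopped = false; // step 3: the local reply enters the chain
  stream.proxySendLocalResponse();  // steps 4-6, response path
  return stream;
}
```

With the guard enabled, runScenario(true) ends in the stuck state reported in #28826; with the guard removed, runScenario(false) ends with the connection closed instead.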

johnlanni requested a review from lizan as a code owner September 9, 2022 08:09

htuch reviewed Sep 12, 2022


htuch assigned htuch and mathetake Sep 12, 2022

repokitteh-read-only bot added the waiting label Sep 20, 2022

repokitteh-read-only bot removed the waiting label Sep 30, 2022

johnlanni requested review from htuch and removed request for lizan September 30, 2022 02:17

htuch reviewed Sep 30, 2022

johnlanni added 7 commits October 1, 2022 17:16

fix sendLocalResponse in wasm

2b2006d

Signed-off-by: johnlanni <[email protected]>

add UT

34b16f5

Signed-off-by: johnlanni <[email protected]>

fix UT

173d72f

Signed-off-by: johnlanni <[email protected]>

remove unused for clang tidy

00c0f99

Signed-off-by: johnlanni <[email protected]>

Revert "remove unused for clang tidy"

72e2250

This reverts commit 01d9b4cbbb77f5508a285590e6f83b98c9e19496. Signed-off-by: johnlanni <[email protected]>

add more UT

af8370b

Signed-off-by: johnlanni <[email protected]>

change setup params

5984707

Signed-off-by: johnlanni <[email protected]>

johnlanni requested a review from htuch October 1, 2022 14:18

htuch reviewed Oct 3, 2022

optimize UT

b37f92d

Signed-off-by: johnlanni <[email protected]>

johnlanni requested review from htuch and mathetake and removed request for htuch and mathetake October 3, 2022 05:23

PiotrSikora reviewed Oct 3, 2022

johnlanni requested review from mathetake and removed request for htuch October 6, 2022 13:42

mathetake reviewed Oct 6, 2022

optimize UT

c94ee6c

Signed-off-by: johnlanni <[email protected]>

johnlanni requested a review from mathetake October 7, 2022 05:03

optimize UT

c1abd98

Signed-off-by: johnlanni <[email protected]>

mathetake previously approved these changes Oct 9, 2022

johnlanni requested a review from htuch October 10, 2022 02:09

add comment

c962f83

Signed-off-by: johnlanni <[email protected]>

johnlanni dismissed mathetake’s stale review via c962f83 October 10, 2022 08:03

htuch approved these changes Oct 11, 2022

htuch merged commit ff49762 into envoyproxy:main Oct 11, 2022

jcchavezs mentioned this pull request Jan 2, 2023

ci: fix ci pipeline to test all envoy versions corazawaf/coraza-proxy-wasm#124

Closed

M4tteoP mentioned this pull request Jan 2, 2023

Fix: avoids double interruption corazawaf/coraza-proxy-wasm#126

Merged

orangetangerine mentioned this pull request Aug 4, 2023

WASM Calling proxy_send_local_response twice will stuck remote http client(e.g. curl) forever until timeout or interrupted #28826

Closed

PiotrSikora mentioned this pull request Oct 18, 2024

Don't pause processing when send_local_response fails proxy-wasm/proxy-wasm-cpp-host#423

Closed

krinkinmu mentioned this pull request Oct 24, 2024

[proxy-wasm] Prevent stuck connections in case of multiple local replies #36809

Merged
