-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
delta-xds: perform resource state tracking after resource ingestion #23214
delta-xds: perform resource state tracking after resource ingestion #23214
Conversation
Signed-off-by: Adi Suissa-Peleg <[email protected]>
/assign @htuch The TTL consideration is similar about SotW, and would like your opinion before changing that as well. |
Thanks for working on this. There is still a risk tho it's much mitigated: if there is an exception fired in the middle of accepting 1000 resources, say 500 has been updated(v0 --> V1), the 501th RouteConfiguration is rejected. The root issue is that the resourceState dict is out of sync with each individual resource's "consumer" (RDSConfigProviderImpl for example). I think instead we could possibly update the per-resource-state by calling from each resource consumer's onConfigUpdateFailed/(add the onConfigUpdateSucceeded). Which requires quite a bit plumbing. |
The right way to fix this is probably to make the update “transactional” or have a per-resource ACK/NACK. Otherwise Envoy may NACK a config, making the server believe that none of the resources are used, but will partially apply the config. The current fix should address the issue of custom config validators rejecting the config, and the state tracking that is described in #20790. |
Ping @htuch for review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
This makes sense to me. The broader issue that @stevenzzzz points to is not really about TTL tracking but how xDS resource acceptance works in relation to ACK/NACK, so I think let's track that in a separate issue. |
…estion (#23214)" This reverts commit 6e6a3a1. Coverage is failing on main. Signed-off-by: Matt Klein <[email protected]>
#23335) Revert "delta-xds: perform resource state tracking after resource ingestion (#23214)" This reverts commit 6e6a3a1. Coverage is failing on main. Signed-off-by: Matt Klein <[email protected]>
Fixes delta-xDS protocol resource state tracking. Prior to this PR, the implementation updated the state of a resource (tracked version, TTL), even if the fetched config was NACKed. This PR changes it to only update the tracked state after a resource was successfully ingested, and an ACK will be sent. Note that this PR impacts the TTL of a resource, as now TTL timer starts after the resource was ingested rather than before. This PR is essentially #23214 (which was reverted in #23335) with more coverage. Risk Level: Medium for delta-xDS usage. Testing: Added unit tests. Docs Changes: N/A. Release Notes: Added release note. Platform Specific Features: N/A. Runtime guard: Can be reverted by setting envoy.reloadable_features.delta_xds_subscription_state_tracking_fix to false. Fixes #20790 Signed-off-by: Adi Suissa-Peleg <[email protected]>
Commit Message: delta-xds: perform resource state tracking after resource ingestion
Additional Description:
Fixes delta-xDS protocol resource state tracking.
Prior to this PR, the implementation updated the state of a resource (tracked version, TTL), even if the fetched config was NACKed. This PR changes it to only update the tracked state after a resource was successfully ingested, and an ACK will be sent.
Note that this PR impacts the TTL of a resource, as now TTL timer starts after the resource was ingested rather than before.
Risk Level: Medium for delta-xDS usage.
Testing: Added unit tests.
Docs Changes: N/A.
Release Notes: Added release note.
Platform Specific Features: N/A.
Runtime guard: Can be reverted by setting
envoy.reloadable_features.delta_xds_subscription_state_tracking_fix
to false.Fixes #20790
Signed-off-by: Adi Suissa-Peleg [email protected]