Trivuele batch 2: RequestStarter simplification and fixes #777
Open · ArneBab wants to merge 2 commits into next from trivuele-batch-2
Conversation
The RequestStarter logic was subtly broken in multiple ways:

- A bad interaction between the RTT and window estimates would cause your node to oscillate between starting far more requests than the network can handle and almost stalling entirely for minutes. The speed could differ by a factor of 100 from one minute to the next.
- Nodes patched with the infamous RequestStarter patches would cause your node to throttle down the number of requests it starts, to their advantage. This could potentially be exploited by an attacker as well.
- The window estimation would attempt to throttle the speed so that a certain ratio of requests succeeded without being dropped due to rejected-overload. This would avoid a meltdown of the network, but the target was not hard-coded; it depended on average network speed among other things, and in practice ended up allowing 30% of requests to fail.
- The window estimation was changed according to an AIMD schedule. This would have encouraged nodes on the network to collectively agree on a speed, but the logic deviated from AIMD in undocumented ways, for example by making the incremental step multiplicative. It is unclear whether this would actually produce any agreement across the network.
- All of the code was completely undocumented, had many seemingly arbitrarily chosen constants, and contained multiple lines of code with no effect at all.
- There are other ways to design the estimation of an ideal request-starting speed that avoid most of the above problems, but they would still require selecting some arbitrary constants, so they are not obviously better than replacing all the logic with a single constant. The whole thing can be viewed as a single network constant: the number of requests per second and peer we can start in order to cause an ideal load on the network, assuming the distribution of request types and of idle and busy nodes looks like it does today.
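The single-constant approach described above can be sketched as follows. This is a minimal illustration, not the actual patch: the class name, the constant name `REQUESTS_PER_SECOND_PER_PEER`, and its value are all hypothetical; only the idea (target rate scales linearly with peer count from one tuned constant) is taken from the description.

```java
// Sketch of the simplified request-starting logic: one tuned network
// constant, scaled by the number of connected peers. Names and the
// constant's value are illustrative, not from the actual code.
public class ConstantRateStarter {
    // Hypothetical value; the real constant was tuned empirically to
    // match today's average request-starting speed.
    static final double REQUESTS_PER_SECOND_PER_PEER = 0.5;

    /** Target request rate grows linearly with connected peer count. */
    static double targetRequestsPerSecond(int connectedPeers) {
        return REQUESTS_PER_SECOND_PER_PEER * connectedPeers;
    }

    public static void main(String[] args) {
        System.out.println(targetRequestsPerSecond(20));
    }
}
```

As the review note below the commit message points out, this linear scaling is only an approximation: at lower speeds peer count grows roughly with the logarithm of bandwidth, so slow nodes may end up starting more requests than their bandwidth warrants.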
The biggest advantage of this method is that it causes a very even starting of requests everywhere in the network, with no oscillating speed that chokes parts of the keyspace in turns. This may enable faster speeds with fewer rejected-overloads and peer backoffs.

There are other advantages to this simple method as well. If a future release does something that improves or worsens the rejected-overload ratio or the average request completion time, it will show clearly in the stats; the network will not try to conceal the change.

The constant was tuned to an empirical value matching today's average request-starting speed. It may need fine-tuning in future releases if the rejected-overload situation improves and higher speeds become possible.

Made all stats average over the whole session, but not persistent. This makes them more useful for judging the state of the network.

=== Review note ===

This is not perfect, because at lower speeds peer count is the logarithm of the speed, so with this change slow peers may send more requests than is right for their bandwidth. But the previous logic was broken, because the static increment was divided by the window size, so it was actually an increase by a fraction of the window size, which is not how AIMD works. Merging even though this might need more tuning, because it is an improvement. - Arne
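To make the review note concrete, here is a sketch contrasting textbook AIMD with the divided-increment variant it describes. All names and constants are illustrative assumptions, not taken from the old RequestStarter code; the sketch only shows why dividing the "static" increment by the window deviates from AIMD.

```java
// Textbook AIMD versus the divided-increment variant described in the
// review note. Names and constants are illustrative only.
public class AimdSketch {
    static final double INCREMENT = 1.0; // additive step on success
    static final double DECREASE = 0.5;  // multiplicative factor on overload

    /** Textbook AIMD: add a fixed step on success, scale down on
     *  rejected-overload. */
    static double textbookAimd(double window, boolean rejected) {
        return rejected ? window * DECREASE : window + INCREMENT;
    }

    /** Variant described in the review note: the "static" increment is
     *  divided by the current window before being added, so the step is
     *  no longer a fixed additive increase. */
    static double dividedIncrement(double window, boolean rejected) {
        return rejected ? window * DECREASE : window + INCREMENT / window;
    }
}
```

For a window of 10, textbook AIMD grows it to 11 on a success, while the divided-increment variant grows it only to 10.1, and the effective step keeps shrinking as the window grows.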
@toad can you have a look at this?
code sent on FMS, not written by me, reviewed by me; needs input by @toad