Correct multi-FEC block size calculations #1466

Closed · wants to merge 1 commit

Conversation

@cgutman (Collaborator) commented Jul 25, 2023

Description

The old multi-FEC block calculations were way too conservative in determining our max data size before needing to split, which resulted in many frames being split into multiple FEC blocks unnecessarily. For example, a frame with 91 data shards would be split into 3 FEC blocks even though we could easily handle up to 212 data shards per block at the default FEC percentage of 20.
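For illustration only (not code from this PR), the per-block limit follows from the Reed-Solomon constraint of 255 total shards per block; the helper below is a hypothetical sketch:

```cpp
// Reed-Solomon over GF(2^8) allows at most 255 total shards per block.
constexpr int RS_TOTAL_SHARDS_MAX = 255;

// Hypothetical helper: how many data shards fit in one FEC block when
// parity shards are fec_percentage percent of the data shards.
// data + data * fec% / 100 <= 255  =>  data <= 255 * 100 / (100 + fec%)
constexpr int max_data_shards_per_block(int fec_percentage) {
  return RS_TOTAL_SHARDS_MAX * 100 / (100 + fec_percentage);
}

static_assert(max_data_shards_per_block(20) == 212, "212 data shards at 20% FEC");
```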

This issue is compounded by the fact that we always used a fixed split of 3 FEC blocks instead of calculating how many FEC blocks are actually required. Each FEC block requires separate calls to reed_solomon_new(), reed_solomon_encode(), and WSASendMsg()/sendmsg() which is costly in CPU time. Additionally, since the protocol supports a maximum of 4 FEC blocks per frame, our hardcoded split prevented us from sending frames as large as the protocol allows.
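A minimal sketch of computing the split dynamically (hypothetical names, not the PR's actual implementation), using the per-block limit derived above:

```cpp
#include <stdexcept>

// Protocol limit mentioned above: at most 4 FEC blocks per frame.
constexpr int MAX_FEC_BLOCKS_PER_FRAME = 4;

// Hypothetical sketch: choose the smallest block count that fits,
// rather than always splitting into 3 blocks.
inline int required_fec_blocks(int data_shards, int fec_percentage) {
  // Max data shards per block under the 255-shard Reed-Solomon limit.
  const int per_block = 255 * 100 / (100 + fec_percentage);
  const int blocks = (data_shards + per_block - 1) / per_block;  // ceiling division
  if (blocks > MAX_FEC_BLOCKS_PER_FRAME) {
    throw std::runtime_error("too many data shards for 4 FEC blocks");
  }
  return blocks;
}
```

With something like this, the 91-shard frame from the example above lands in a single FEC block (91 ≤ 212), and a frame only splits when it actually exceeds the per-block limit.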

By fixing these calculations, this PR reduces CPU usage on the host, improves network utilization, and raises the per-frame limits from 636 to 848 packets at 20% FEC and from 3172 to 4096 packets at 0% FEC (0% FEC is dynamically selected if the data shard count is too high to send in 4 FEC blocks).

Screenshot

Issues Fixed or Closed

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Dependency update (updates to dependencies)
  • Documentation update (changes to documentation)
  • Repository update (changes to repository files, e.g. .github/...)

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated the in code docstring/documentation-blocks for new or existing methods/components

Branch Updates

LizardByte requires that branches be up-to-date before merging. This means that after any PR is merged, this branch must be updated before it can be merged. You must also allow edits from maintainers.

  • I want maintainers to keep my branch updated

We were too conservative in determining our max data size before needing to split, which resulted in many frames being split into multiple FEC blocks unnecessarily.

We also just used a hardcoded split into 3 blocks instead of calculating how many blocks are actually required.
@cgutman (Collaborator, Author) commented Jul 25, 2023

Heh, so now that the FEC block splitting logic is correct, it's exposing scalability issues in platf::send_batch().

Rather than sending 3 very loosely packed FEC blocks as we did before, we're now sending nearly full blocks of ~255 packets each. Because we send each FEC block as a single batch with platf::send_batch(), we're sending much larger batches than before, which is leading to packet loss. These large batches were possible before this change, but they were much less likely due to the fixed 3-way FEC block split.

So I'll have to finally write that batch throttling logic that I was putting off. For now, I can have it mimic the batching of the current code and we can optimize it using the FEC feedback messages from Moonlight later on.
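A rough sketch of what such batch throttling could look like (hypothetical types and function names; the real platf::send_batch() signature differs), capping how many packets go out per send call:

```cpp
#include <algorithm>
#include <cstddef>
#include <span>
#include <vector>

struct packet_t {
  std::vector<std::byte> payload;
};

// Stand-in for the platform-specific batched send (e.g. sendmmsg()/WSASendMsg()).
bool send_one_batch(std::span<const packet_t> batch);

// Send a full FEC block's packets in capped sub-batches instead of one
// ~255-packet burst, reducing the chance of drops at the NIC or on the path.
bool send_throttled(std::span<const packet_t> packets, std::size_t max_batch_size) {
  for (std::size_t offset = 0; offset < packets.size(); offset += max_batch_size) {
    const auto count = std::min(max_batch_size, packets.size() - offset);
    if (!send_one_batch(packets.subspan(offset, count))) {
      return false;
    }
    // Pacing between sub-batches could later be tuned using Moonlight's
    // FEC feedback messages, as suggested above.
  }
  return true;
}
```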

@cgutman marked this pull request as draft July 25, 2023 03:37
@LizardByte-bot (Member)

It looks like this PR has been idle for 90 days. If it's still something you're working on or would like to pursue, please leave a comment or update your branch. Otherwise, we'll be closing this PR in 10 days to reduce our backlog. Thanks!

@LizardByte-bot (Member)

This PR was closed because it has been stalled for 10 days with no activity.

@cgutman mentioned this pull request Jul 1, 2024
@ns6089 mentioned this pull request Jul 4, 2024