Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible current_timestamp desynchronization in PlayoutBuffer #261

Closed
tazz4843 opened this issue Oct 29, 2024 · 3 comments · Fixed by #266
Closed

Possible current_timestamp desynchronization in PlayoutBuffer #261

tazz4843 opened this issue Oct 29, 2024 · 3 comments · Fixed by #266
Labels
bug Something isn't working

Comments

@tazz4843
Copy link
Contributor

Songbird version: git serenity-next

Rust version (rustc -V): rustc 1.82.0-nightly (176e54520 2024-08-04)

Serenity/Twilight version: git next

Description:
Scripty has been having issues with VoiceTick returning empty audio ever since a few weeks before DAVE introduction (possibly related, not worth digging into yet I don't think). Digging through logs, I noticed lots of lines noting Witholding packet: ts_diff is -xxxxxx or Packet arrived beyond playout max length: wanted slot nn. I was going to open a bug report long ago but I finally got around to it.

Explore-logs-2024-10-29 05_34_55.txt

Attached is a log file from the most recent known event of a desync. The first line where audio seems to disappear is Witholding packet: ts_diff is -403200. The timestamp difference between that and the last audio packet from this speaker approximately nine seconds before is 412800, 9600 (or 10*MONO_FRAME_SIZE) off of the actual ts_diff value. Based off this very basic poking and prodding I suspect a possible race condition somewhere, but don't know enough about songbird internals to poke into it much.

There's been lots of comments from me in the serenity discord about this so I'll copy paste some of them here for a full thread: whatever follows is copied+pasted

there's at least three possible failure states i know of right now

  • VoiceTick is always silent for every user no matter what
  • VoiceTick returns audio for a handful of users for an arbitrary length of time and then goes silent forever for all
  • VoiceTick always returns data for some users but never others

and as far as i know the failure state varies from session to session
during all of this the WebSocket is unaffected and i still get speaker updates normally no matter what which triggers empty results getting sent to scripty's STT service which return unusually fast (that's the thing that originally tipped me off here)

I'm willing to put a $50 USD bounty on this issue, through GitHub Sponsors or kofi. Let me know fi any additional info or anything else is required to help debug :)

@tazz4843 tazz4843 added the bug Something isn't working label Oct 29, 2024
@tazz4843
Copy link
Contributor Author

Was hoping this would be fixed somewhat with the first implementation of Discord's new encryption modes but it doesn't appear to be so unfortunately.

@tazz4843
Copy link
Contributor Author

Since originally opening this issue, I have heard from some users who can usually get it working consistently if they always join after the bot (ie they never get hit by this issue)

@FelixMcFelix
Copy link
Member

FelixMcFelix commented Nov 13, 2024

I'm not sure about a race condition, but I'm not really able to reproduce this locally over a various voice servers (Amsterdam, US West, Russia).

Broadly, what I see locally is:

  • User joins after bot.
    • No issues, no surprising RTP behaviour. No long (1s) catchup period.
  • Bot joins after user.
    • Bot receives approx 50 packets (1 second) of audio with identical timestamp as user switches over from DAVE to bot-compatible mode. This is what Fix(receive): Handle sleds of identical RTP timestamps #249 was intended to fix.
    • I tried this a fair few times and I always seemed to have correct playout (all packets find correct slot in PlayoutBuffer, decode to valid audio in sync with my speech -- no obvious gaps etc.). Keeping the mic hot for >30s got me nothing here.

So, there's something different between my setup and yours that you're getting a different pattern of sequence numbers for one or more users. I just don't yet know what it is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants