
node servers should not be affected by latency more than chrome or firefox #2612

Open · trevj opened this issue Jul 27, 2016 · 15 comments
@trevj (Contributor) commented Jul 27, 2016

From UWNetworksLab/uproxy-docker#96:

no obfuscation

  0ms latency:    chrome 14611   firefox 5781   node 1523
  150ms latency:  chrome 604     firefox 612    node 225

caesar

  0ms latency:    chrome 1289    firefox 1772   node 872
  150ms latency:  chrome 535     firefox 532    node 192

Summary of our work to fix this in Chrome:
#1339

At the uProxy layer, this might be relevant too:
#1563

@trevj trevj changed the title latency disproportionally affects throughput on Node.js node servers should not be affected by latency more than chrome or firefox Aug 2, 2016
@agallant agallant self-assigned this Aug 2, 2016
@agallant (Collaborator) commented Aug 2, 2016

A few more notes:

The man page has a particularly salient bit under socket options:

SCTP_NODELAY
Turn on/off any Nagle-like algorithm. This means that packets are generally sent
as soon as possible and no unnecessary delays are introduced, at the cost of more
packets in the network. Expects an integer boolean flag.

So, figuring out how to set this properly at build time may fix our issue. Otherwise, porting the fix we made for Chrome into node is another path.

@agallant (Collaborator) commented Aug 3, 2016

I think Nagle is already being disabled - https://github.com/js-platform/libwebrtc/blob/7848af36d86a4bbe8569693a1169123ad67c8905/talk/media/sctp/sctpdataengine.cc#L373

So, the fix may involve tuning buffer size - when we did that with Chrome, what was the ultimate resolution? Did we push upstream, or do we maintain our own config/build somewhere?

@agallant (Collaborator) commented Aug 4, 2016

I think I found commits in the official webrtc repo corresponding to Lally's work mentioned in the prior Chrome issue - https://chromium.googlesource.com/external/webrtc/+log/master/?s=3480728

If those are the fixes, then all we need to do is get the node-webrtc build process based on an adequately new libwebrtc (right now it's ca. 2014). I think it's worth trying to get this into the official node-webrtc, but we can probably figure it out ourselves first by building the library and then editing this - https://github.com/js-platform/node-webrtc/blob/develop/scripts/download-webrtc-libraries-and-headers.js

@agallant (Collaborator) commented Aug 5, 2016

Possibly more relevant commit - https://chromium.googlesource.com/external/webrtc/+/5c6c6e026bbcc83ad546d00b41ab739dcc856c1b

Current plan of action - try to get node-webrtc building off newer webrtc.

@agallant (Collaborator) commented Aug 5, 2016

So I think they actually are already building on newer-ish webrtc (the 2014 repo is just stale legacy). https://github.com/markandrus/build-webrtc/blob/43b972651326c2ea3872896cf170320bbd89bd46/config.default.json#L7-L11

May still try doing a custom build, but I'm less confident that it will change anything.

@trevj (Contributor, Author) commented Aug 31, 2016

Right, AFAICT we're already running code that includes the fix.

Perhaps node-webrtc doesn't use that exact same code path? The figures almost exactly match what we used to see in Chrome:
https://github.com/uProxy/uproxy-lib/releases/tag/v35.0.0

Suggest we dig deeper into node-webrtc's "wrapping" C++ code.

@trevj (Contributor, Author) commented Aug 31, 2016

FYI, here is the commit that seemed to fix Chrome:
https://chromium.googlesource.com/external/webrtc.git/+/e8386d21992b6683b6fadd315ea631875b4256fb

@agallant (Collaborator)

Agreed re: the wrapping. https://github.com/js-platform/node-webrtc/blob/develop/src/peerconnection.cc is where I'm investigating from, and https://github.com/nodejs/nan may also be relevant.

@trevj (Contributor, Author) commented Aug 31, 2016

I dug into build-webrtc itself, too.

Since the patch for Chrome essentially added a call to SignalReadyToSend, this seems relevant:

Could we hack a solution in at the build-webrtc layer? We'd need to trace SignalReadyToSend a bit first.

@trevj (Contributor, Author) commented Aug 31, 2016

This also seems relevant:
https://chromium.googlesource.com/external/webrtc.git/+/master/webrtc/media/sctp/sctpdataengine.cc#368

"Fired on our I/O thread." What if there is no I/O thread on Node.js? Some printfs might help us answer that :-)

@trevj (Contributor, Author) commented Aug 31, 2016

And where that callback is registered in usrsctplib:
https://github.com/sctplab/usrsctp/blob/master/usrsctplib/user_socket.c#L1410

Again, we might even hack the callback at that level and avoid the whole I/O thread question.

@agallant (Collaborator)

If it's threading related, another possibility is to increase the uv thread pool size - http://stackoverflow.com/questions/22644328/when-is-the-thread-pool-used#22644735

Both https://github.com/js-platform/node-webrtc/blob/develop/src/datachannel.cc and https://github.com/js-platform/node-webrtc/blob/develop/src/peerconnection.cc make liberal use of libuv to deal with async work and mutex locking/unlocking. According to the above link, libuv has a default thread pool size of 4, but it can be increased, e.g. `process.env.UV_THREADPOOL_SIZE = 10;`.

Could be worth a shot, I'll read more about libuv in general in case there are other pertinent aspects.

@trevj (Contributor, Author) commented Sep 1, 2016

Nice find!

I ran this command:

UV_THREADPOOL_SIZE=25 time node examples/stress.js

I didn't see an improvement, but as discussed, let's come up with a test where we can add latency between the peers.

@agallant agallant assigned rajasagashe and unassigned agallant Oct 11, 2016
@rajasagashe (Collaborator)

@trevj How are you running the chrome and firefox throughput tests?

@rajasagashe (Collaborator)

Also, what are your units for the throughput figures? When I run the stress send/receive as documented in the wiki, it takes ~13 seconds for the program to complete, with ~18 Mbit/s throughput.
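To pin the units down when comparing runs, it may help to state the conversion explicitly. A small helper (the name is hypothetical) plus a back-of-the-envelope check against the ~13 s / ~18 Mbit/s run:

```javascript
// Bytes moved over a wall-clock interval, expressed in megabits/second.
function throughputMbps(bytes, seconds) {
  return (bytes * 8) / (seconds * 1e6);
}

// Working backwards: 18 Mbit/s sustained for 13 s implies
// 18e6 * 13 / 8 = 29.25 MB transferred in total.
console.log(throughputMbps(29.25e6, 13)); // 18
```

If the figures in the opening comment are in a different unit (e.g. KB/s), this is exactly the ambiguity worth resolving before comparing against Chrome and Firefox.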

@ghost ghost added Feature Request and removed C:Node labels Nov 8, 2016