Multistream support via Unified Plan #1459
FYI, since it's been too long since we last aligned this branch to master, and we fixed a ton of things, there's a mountain of conflicts now... rather than fixing the conflicts here, I'll probably work on a new branch (and so a new PR) applying the same set of changes this PR tried to enforce, but working on master instead. I'll probably start this after #1880 is merged, since that would likely cause several conflicts of its own. I'll keep you posted and let you know when that effort is done, and after that I'll schedule when it will be merged (because I'm not going through this again after I do that).

any timeline this gets rebased for master?

It will not be rebased, it will be done from scratch in a new PR. But no, no timeline: I want to do it, but we're awfully busy right now.

thank you for working on this!
Closing this deprecated branch, since we've revived the effort in a new one. If you still care about this, please jump on there and let's start testing this 😄 |
Thanks to a generous sponsorship from our friends at Highfive, we finally started working on something we wanted to do for a long time: multistream support in Janus!
As you may know, PeerConnections in Janus have had a limitation since the beginning: you could create as many as you wanted, each associated with a handle, but they could contain, at most, one audio stream, one video stream, and one data channel. There was no way to involve, say, three video streams in the same PeerConnection: to do that, you'd have to create more than one PeerConnection. This is why, for instance, in the VideoRoom plugin you're required to create a new handle/PeerConnection any time you want to create a new subscription. While functionally this is fine (we've all used Janus successfully even with these constraints!), there are advantages to being able to put more streams together in the same PeerConnection, e.g., in terms of network setup (fewer ports needed, and fewer ICE/DTLS interactions, so fewer chances of things breaking, etc.).
Both Chrome and Firefox have supported multistream for a long time. However, they historically did things very differently: namely, Chrome used the so-called "Plan B" approach, while Firefox used "Unified Plan". Without delving too much into the details of what those plans mean (I'll let you google that if you're curious), suffice it to say they were very different ways of negotiating multistream support in the SDP, and completely incompatible with one another. Needless to say, we didn't have any intention of implementing both, especially considering one of the two would have to be dumped sooner or later, which is why we stayed away from the whole thing for so long.
That said, eventually "Unified Plan" was chosen as the standard approach, and now that Chrome has started its transition to implement it as well, the time was ripe to start doing the same in Janus too, which is exactly what this PR is for.
What changes?
Internally, things changed A LOT. We had a lot of assumptions in the Janus core about what a PeerConnection could do and contain: specifically, we had hardcoded pieces about audio and video stuff, for instance, and this applied to both internal structures and SDP management. Of course, internal routing of the media had to change too: while plugins would previously just expect (and send) audio, video or data packets, they now need to know which audio or video stream they're handling, as there may be more than one.
Besides, we had many pieces of information scattered across different structures, when they conceptually belonged somewhere else. As such, we took advantage of this considerable refactoring to also streamline the way we organize information. Just to give you an idea, when creating a PeerConnection a Janus handle would contain a single `janus_ice_stream` structure, which in turn would contain a single `janus_ice_component` structure: this originally mapped to ICE streams and components, as initially Janus supported non-bundled communications (meaning a PeerConnection could contain, e.g., two streams for audio and video, each containing two components for RTP and RTCP). We stopped doing that a long time ago, but those structures remained and, as anticipated, the information on PeerConnections was actually scattered between the two, without a clear separation of concerns. This PR changes that: a PeerConnection is now identified by a `janus_handle_webrtc` structure, which contains all the global stuff (ICE agent, DTLS stack, identifiers, etc.). Then, for each media stream that is negotiated, a separate `janus_handle_webrtc_medium` is allocated, which only contains information related to the medium itself: whether it's audio/video/data, its mid, any SSRCs if available, the direction of the media, etc. This made it very easy to break the "one medium per type" constraint, as we can dynamically add as many as we want, depending on what the SDP negotiation dictates.

As anticipated, the SDP management had to change considerably as well to allow for all that, as there were many assumptions in there too. This translated to changes in the SDP utils, which you'll need to be aware of if you are using them in your plugin and want to use it in this new version of Janus as well.
Notice that this considerably changed the Admin API handle information.
SDP utils
The SDP utilities we added some time ago had to change quite a bit. In fact, while previously, to generate offers and answers, we could just say "I want audio, but not video", it's not so simple anymore: we may have more than one audio or video stream, so we needed more fine-grained operations.
As such, the major changes were:

- a new `JANUS_SDP_OA_MLINE` property that needs to be put before any media stream we want to add, and whose value is the type of that media.

Describing all the changes verbosely is probably going to be problematic, so a couple of examples may help understand them.
As an example, we can create a new offer with two audio streams, a video stream, and a data channel: specifically, one audio (mid `audio1`), one video (`video1`), a second audio (`audio2`) and a data channel. Another video (`video2`) is NOT added to the offer, as `JANUS_SDP_OA_ENABLED` is set to `FALSE` for it (which seems silly in such an example, but is actually quite helpful when you need to decide programmatically whether to add something or not).

You could also have created a barebone offer and added m-lines later using `janus_sdp_generate_answer_mline`: this method uses variable arguments as well, but is limited to a single m-line, so it would start with `JANUS_SDP_OA_MLINE` and end with `JANUS_SDP_OA_DONE`.

As anticipated, answering is a bit more convoluted, instead: you first create a barebone answer to the offer (which will copy the m-lines from the offer and reject them by default), and then iterate on all the m-lines to decide what to do with each of them.
You can have a look at some of the updated plugins (e.g., the AudioBridge) to see all this in action.
Media routing
As you know, plugins receive incoming packets via `incoming_rtp`, `incoming_rtcp` and `incoming_data`, while they send packets back via `relay_rtp`, `relay_rtcp` and `relay_data`. The signatures of the RTP and RTCP methods changed slightly: specifically, we added an `mindex` property to address the index of the stream within the PeerConnection (basically the index of the related m-line in the SDP), and changed the `video` property to a boolean, so the new callbacks carry both pieces of information with every packet.

If you plan to support multiple media streams, you'll have to use the new `mindex` property carefully, or media packets will end up on the wrong stream and may break media. If you don't, notice that in `relay_rtp` and `relay_rtcp` you can just set `mindex` to `-1` and rely on the boolean `video` property instead: when you do that, the core will pick the first audio/video stream it finds (depending on the boolean) and use that. This is particularly useful when your plugin still limits itself to one audio/one video per PeerConnection, as you'll be spared the need to track and map mindexes.

The `incoming_data` and `relay_data` methods, instead, remain exactly the same, as only a single datachannel m-line is allowed in a PeerConnection in Janus, meaning there will never be any ambiguity as to which index it should refer to (the core will take care of that).

Notice that the signature of the `slow_link` callback changed too, again to identify the specific stream addressed by the event via the `mindex` property.

janus.js
While we tried to limit the API changes as much as possible, there were some things we had to change.
onlocalstream/onremotestream
Two of those were the callbacks you use to be notified about your local and remote streams, namely `onlocalstream` and `onremotestream`. To be more precise, while we could have kept them as they were by jumping through some hoops, it made much more sense to rethink them, bring them closer to what the JS WebRTC APIs make available, and start notifying tracks instead. As such, `onlocalstream` became `onlocaltrack`, while `onremotestream` became `onremotetrack`. Both new callbacks notify you about tracks, but most importantly they also tell you whether the track was just added or removed: this means you have more fine-grained information to decide what to do with it. It also means you have to keep some more state in your web application, though, e.g., to update the UI depending on what's happening. To make a simple example: if you're notified about an audio track first, you may want to create a placeholder for video; in case video is added later, you can remove the placeholder and replace it with the actual video; in case another video is added, you may want to do something else; and in case some video track is removed, you may want to check whether there's any video left at all before putting the placeholder back.

In part, these are all things you already had to start doing when we allowed `onlocalstream` and `onremotestream` to be called more than once, but this new approach will require some more logic. You can have a look at the updated demos to see how we've started doing it there: in particular, you can find more details below, where I address the changes in each plugin.

getVolume
Another change that might impact your application is in the methods that return the volume, local and remote, that is `getVolume`/`getRemoteVolume` (one is an alias of the other) and `getLocalVolume`. The previous version of `janus.js` relied on legacy `getStats` properties, namely `audioInputLevel` and `audioOutputLevel`, to compute those values.

All of this changed. First of all, we now use `getStats` to retrieve the standard `audioLevel` property instead: this is currently only available in Chrome and possibly Safari, while Firefox doesn't support it yet (hopefully it will soon); since `audioLevel` is a double between `0.0` and `1.0`, it's a completely different value from the one we returned before. Besides, it's worth pointing out that at the moment this only works for the remote volume: for some reason, stats always return 0 for local tracks. That said, you really shouldn't use these methods if you need volume-related features: it's much better to rely on good libraries like hark instead. These methods now also allow you to specify a `mid` to query a specific track, which solves the "works for a single audio track" limitation. Finally, the methods now expect you to specify a callback to receive the result: in fact, `getStats` returns a promise, which means we can't return synchronously. This means that, taking the default `getVolume` on a remote track as an example, the method now takes an optional mid and a callback: you can omit the mid if you know there's only one audio track.
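A minimal, hedged sketch of what this callback-based flow looks like (the mock handle below is purely illustrative and stands in for a real plugin handle):

```javascript
// Mock of the new callback-based volume API: the real janus.js method wraps
// an asynchronous getStats() call, so the result is delivered to a callback
// instead of being returned synchronously.
function makeMockHandle(levels) {
  return {
    // levels: map of mid -> audioLevel, a double between 0.0 and 1.0
    getVolume(mid, callback) {
      // a real handle would resolve a getStats() promise here
      callback(levels[mid] !== undefined ? levels[mid] : 0.0);
    }
  };
}

const handle = makeMockHandle({ audio1: 0.42 });
handle.getVolume("audio1", function(volume) {
  // volume is the standard audioLevel, between 0.0 and 1.0
  console.log("Volume of mid audio1:", volume);
});
```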
mute/unmute
A smaller and hopefully harmless change is in the mute/unmute methods. As you may know, there are methods to check whether your local audio/video tracks are muted, and to mute/unmute them: `isAudioMuted()`, `isVideoMuted()`, `muteAudio()`, `muteVideo()`, `unmuteAudio()` and `unmuteVideo()`. In the old `janus.js`, they all assumed that a single audio/video track was available, which of course won't work if you're sending more than one video track, for instance. The way we solved it was to add an optional `mid` argument, with which you can specify the exact mid of the medium to act upon: since the argument is optional, omitting it means we'll just pick the first audio/video track, which also makes this change completely backwards compatible if you don't care about multistream. For instance, `muteVideo("v2")` only mutes the video track with mid "v2", while a plain `muteVideo()` behaves as before.

getBitrate
Pretty much the same change was done on `getBitrate()` as well, which as you know returns the current bitrate of the incoming video stream. This method now accepts a `mid` parameter too: if you specify one (e.g., `getBitrate("v2")`), we'll only compute the bitrate for that specific stream. If you omit it, this will work as expected if there's just a single video stream, but will NOT work at all if there's more than one: as such, it's up to you to ensure the method is called properly.

(re)negotiating media
What's still missing in this patch, and something we'll probably have to address, is how to add multiple streams when creating an offer or answer. At the moment we just set `audio: true` or similar properties to say "we want to capture our microphone", but what if we want to send multiple audio streams from different sources? The same applies to renegotiations: we have the `addAudio` and `addVideo` properties, for instance (and the same goes for replace/remove), but they also work on the assumption that a single audio/video stream is present. I still don't have a clear view of how to fix this, and I'm not sure it can be done easily in a backwards compatible way. One possibility is a new property (whatever we end up calling it) that's actually an array of things we want in the SDP, with the related directions: for each object there are things that can be specified (e.g., type of media, direction, optionally the device ID, etc.), and if it turns out something needs a local device, we do a getUserMedia for that. We'd then return the actual media in the SDP with the related mid properties, so that they can be referenced later on when adding/replacing/removing streams. My feeling is that this new array should be an alternative to the existing `media` object, and not a replacement, mostly because older versions of Chrome, for instance, don't support transceivers. Anyway, I can anticipate this will complicate the `janus.js` internals.

Plugins
This section tries to summarize the changes you need to be aware of when talking to plugins. In fact, while we tried to keep the API as much the same as possible, we had to make some changes, e.g., in order to allow users and/or plugins to address a specific media stream rather than another, especially when they are of the same type (and where previously a `video: true` might have sufficed).

EchoTest
At the moment, we haven't changed anything in the API you use to talk to this plugin: you establish a connection the same way, and you configure it the same way as well. That said, considering the plugin now supports multiple media streams of the same type, we may want to revisit some of the properties you can send to it: in fact, `video: false` means we shouldn't send back any video, but what if we have two video streams and only want to pause one? Besides, simulcast support in the plugin is quite sketchy at the moment, so you probably don't want to use it if you're sending more than one video stream at the same time.

As to the `echotest.js` and `devicetest.js` demos, they have both been updated to the new `janus.js` callbacks.

Streaming
Creating a mountpoint can still be done as you did so far, but we've introduced a new approach that takes advantage of the expanded libconfig semantics. Specifically, you can now create mountpoints specifying an array of streams to add, rather than just using boolean audio/video properties and hardcoded settings. Eventually, this will become the only way to create RTP mountpoints, so you should consider the other one deprecated.
This is an example of a mountpoint created the new way:
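As a hedged sketch of what the new libconfig-style layout might look like (the stream property names here are illustrative assumptions, so check the actual sample in the repo for the exact syntax):

```
multistream-test: {
        type = "rtp"
        id = 123
        description = "Multistream test (1 audio, 2 video)"
        media = (
                {
                        type = "audio"
                        mid = "a"
                        port = 5102
                        pt = 111
                        codec = "opus"
                },
                {
                        type = "video"
                        mid = "v1"
                        port = 5104
                        pt = 100
                        codec = "vp8"
                },
                {
                        type = "video"
                        mid = "v2"
                        port = 5106
                        pt = 100
                        codec = "vp8"
                }
        )
}
```

Notice how each stream is just another entry in the `media` list, identified by its own `mid`.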
This comes from a new sample I added to the Streaming plugin configuration file to test multistream, and you can test it with a new gstreamer-based script in the repo, called `test_gstreamer_multistream.sh`.

As you can see, it's much cleaner in the way you create and configure a mountpoint: there's no hardcoded audio/video prefix in the property names, you configure all media streams the same way, and you simply add them to a list. Of course, this also works with the simple one audio/one video mountpoints you've used so far. What's important to point out is that you're REQUIRED to specify a `mid` here: this is a unique string that will end up in the SDP, and that you'll also get in the `onremotetrack` callback; as such, it allows you to uniquely identify an incoming track and match it to the subscription you made.

The new approach also works when creating mountpoints dynamically via the API: in that case, of course, you pass a JSON array called `media` instead. I haven't tested that part yet, though.

This additional information will also be returned when you send a `list` or `info` request, in order to give you more information on what a mountpoint provides.

With respect to that, while subscribing to a mountpoint currently works as before (a `watch` request with the requested `id`
), you can decide to subscribe only to a subset of the available media, for whatever reason. In the past, this could be done with the `offer_audio`, `offer_video` and `offer_data` properties which, if set to `FALSE`, would exclude the related media from the SDP offer: these properties are still available for backwards compatibility, but they should be considered deprecated, since they're not fine-grained enough. The right way to do it now is adding a `media` array to your `watch` request with the list of mids you're interested in: an empty or missing `media` array means "subscribe to all streams", which makes this 100% backwards compatible. As a couple of simple examples, you could subscribe only to the "a" mid (which is audio in that mountpoint), or only to the "a" and "v2" mids (audio and second video stream, in that mountpoint).
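Expressed as the JSON payloads of a `watch` request, those two examples might look like this (mountpoint id 123 is just a placeholder):

```javascript
// Only subscribe to the "a" mid (audio) of mountpoint 123:
const watchAudioOnly = {
  request: "watch",
  id: 123,
  media: ["a"]
};

// Only subscribe to the "a" and "v2" mids (audio and second video):
const watchAudioAndSecondVideo = {
  request: "watch",
  id: 123,
  media: ["a", "v2"]
};

// No media array at all: subscribe to every stream (backwards compatible):
const watchEverything = {
  request: "watch",
  id: 123
};
```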
As to the `streamingtest.js` demo, it has already been updated to the new `janus.js` callbacks. It still does NOT present those streams in a nice way, though: if you have, e.g., two video streams, you'll see both, but they won't have any label associated with them to give more context. I plan to update that later on.

VideoCall
Pretty much as in the EchoTest plugin (the two are quite similar, after all), we haven't changed anything in the API you use to talk to this plugin: you place and accept calls the same way, and you configure/update them the same way as well. That said, considering the plugin now supports multiple media streams of the same type, we may want to revisit some of the properties you can send to it: in fact, `video: false` means we shouldn't send back any video, but what if we have two video streams and only want to pause one? Besides, simulcast support in the plugin is quite sketchy at the moment, so you probably don't want to use it if you're sending more than one video stream at the same time.

As to the `videocalltest.js` demo, it has been updated to use the new `janus.js` callbacks.

AudioBridge
I haven't updated this plugin to really support multistream yet, and I don't expect it ever will: this plugin was conceived to mix audio packets on a single PeerConnection no matter how many participants are in the room anyway, so I'm not sure it makes sense to support more streams here at all. As such, it will reject any stream apart from the first audio m-line.

As to the `audiobridgetest.js` demo, it has been updated to use the new `janus.js` callbacks.

SIP
I haven't updated this plugin to really support multistream yet and, as for the AudioBridge, I don't plan to: I'm not sure legacy SIP devices would support that anyway. It will still work with just one audio/one video.

As to the `siptest.js` demo, it has been updated to use the new `janus.js` callbacks.

SIPre
I haven't updated this plugin to really support multistream yet and, as for the SIP plugin, I don't plan to: I'm not sure legacy SIP devices would support that anyway. It will still work with just one audio/one video.

As to the `sipretest.js` demo, it has been updated to use the new `janus.js` callbacks.

NoSIP
I haven't updated this plugin to really support multistream yet and, as for the SIP and SIPre plugins, I don't plan to: I'm not sure legacy SIP devices would support that anyway. It will still work with just one audio/one video.

As to the `nosiptest.js` demo, it has been updated to use the new `janus.js` callbacks.

VideoRoom
The VideoRoom plugin fully supports multistream now: after all, the SFU is where multistream really shines! We still use separate PeerConnections for publishing and subscribing, but with the ability to receive multiple publishers (or send multiple streams) on the same PeerConnection. The first bulk of changes was reworking the internal pub/sub mechanism: before, we had a tight relationship between publishers and subscribers, given the safe assumptions we could make about their monostream nature; now, we work on streams instead. More precisely, each publisher is a collection of published streams, while each subscriber is a collection of subscription streams: as such, the relationship is between each published stream and its subscription streams, which gives us much more freedom in how to distribute them. This also makes the job of subscribing to multiple streams, whether they belong to the same publisher or not, much easier.
All of this has mostly affected the plugin internals, trying to keep it as transparent as possible to the API and applications: as such, the existing VideoRoom demo should still work, just with a different way of handling the streams internally. Anyway, the API has been revamped in the process in order to take advantage of the new features: for example, while we try to return info the same way as before, we also added the extra bits we need for multistream. When you start publishing, you'll get back the streams as we indexed them, and the same info will be returned to other participants when they join, in order to make them aware of who's publishing exactly what.
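As a hedged sketch of what that per-publisher info amounts to (the exact field names used by the plugin's events may differ, and are assumptions here):

```javascript
// Hypothetical shape of the info shared about a publisher: each published
// stream is indexed, and uniquely identified by its mid.
const publisher = {
  id: 22,
  display: "Alice",
  streams: [
    { type: "audio", mindex: 0, mid: "0" },
    { type: "video", mindex: 1, mid: "1" },
    { type: "video", mindex: 2, mid: "2" }
  ]
};
```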
In addition, even though not shown above, we also added a new optional `"description"` property to those streams, which publishers can specify: i.e., a simple verbose description saying, for instance, that video 1 is their webcam and video 2 their screen share, and which can be displayed in the UI if needed. In case publishers specify it, it will be sent to all interested parties as part of the info shared above. Publishers can do that by adding a `descriptions` array to the "publish" request, where each object maps a description to a specific `mid`. The descriptions are optional and can be updated along the way, e.g., via a `configure` request. As you can see, mids play an important role in the new API, and we'll see in the next paragraphs how we use them a lot to address specific streams in a PeerConnection, whether they're related to a publisher or a subscriber.

What is more interesting, though, is the new subscription API, which we changed from the old (still working, but deprecated) approach using the `feed` property to subscribe to a single publisher, to a new, much more flexible one using an array of `streams` to subscribe to. Each stream we want to subscribe to is made of a mandatory `feed` and an optional `mid`, both pieces of information you receive when publishers start sending stuff, as we've seen above: if you omit the `mid`, we subscribe to all the streams that publisher has, otherwise only to that one in particular. Obviously, since we're free to add as many streams as we want, we can put in streams from different publishers, and pick exactly what we want. You can even add the same stream more than once (which is what I've done in my tests so far), even though that's silly except for debugging. As anticipated, the old `feed: id` approach still works, but it's deprecated and translated internally to `streams: [ { feed: id } ]`.

Once you've created a subscription, you can update it with the `subscribe` and `unsubscribe` requests: specifically, the former allows you to add more streams to subscribe to, while the latter unsubscribes you from some of the streams you were receiving. Both will trigger a new offer to the browser. The `subscribe` request expects a `streams` array exactly as the original setup does, meaning that again you can specify one or more streams to add in terms of `feed`
and/or specificmid
, e.g.:When subscribing, we'll try to re-use empty and unused m-lines when available, and we'll add new ones otherwise; we'll also ignore multiple subscriptions to datachannels, as we only do one (this will be a problem, more on that later). The
unsubscribe
request also expects such an array, but items can refer to a subscriptionsub_mid
as well. This means that you can say, for instance, to unsubscribe to all streams of feed XYZ, just to mid Z of feed XYZ, or to whatever the subscriber mid ABC is (e.g., "get rid of the third m-line"):Notice that, while both
subscribe
andunsubscribe
typically result in a new offer, they will NOT if it turns out you didn't change anything: e.g., you subscribed to something that didn't exist, or unsubscribed to something you weren't receiving in the first place. All of this you can test in a new demo page I added, calledmvideoroomtest.html
: it's visually the same asvideoroomtest.html
(with which it is interoperable), but uses a single PeerConnection to subscribe.The same updates on subscriptions impact the way the "switching" now works in the VideoRoom, obviously. Previously, a "switch" would replace the publisher you were subscribed to with another one, replacing both audio and video: now that we can subscribe to heterogeneous sources, we needed a way to selectively "switch" a subset of the subscriptions instead. This means that the new syntax of "switch" changed as well:
In this example, we're replacing a single stream: specifically, the stream that in our subscription is identified by "1" (
sub_mid
means "m-line identified by mid in my subscription") now needs to be fed by mid "2" of the publisher with id "22". Of course, we can update more streams in the same subscription by just adding more objects with the same syntax to thestreams
array. Notice that the same limitations as the old "switch" apply: the stream you switch to must have the same characteristics (codec, mainly) as the stream you're replacing, as otherwise things will simply not work as no renegotiation takes place. For instance, if mid "2" in the above example is a subscription to a VP9 video stream, trying to switch it to an H.264 video coming from another publisher will simply not work: data will flow, but the recipient will fail to decode. Note well: the "legacy" switch (just passing the feed ID to replace all streams) is still supported, but it's explicitly marked as deprecated; besides, it's best effort (tries to find streams that match) so keeping on using it is probably now a good idea.I was pointing out that we only negotiate a single datachannel per PeerConnection. In the VideoRoom with a multistream setup this might be seen as a problem, if we want to use datachannels, because it means we're getting data channel messages from multiple publishers, and we might not be able to say which belongs to whom. Anyway, now that Janus supports multiple streams in the same datachannel, this is not an issue anymore: now the VideoRoom plugin always uses the publisher's ID as the "label" for outgoing data channel messages, which means that you just need to look at the label of incoming messages to know where it's from.
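Putting the subscription-related requests described in this section together, here's a hedged sketch of the payloads (publisher ids, room and mids are placeholders, and the exact accepted fields may differ in the branch):

```javascript
// Join as a subscriber with a streams array (new, flexible approach):
const join = {
  request: "join",
  ptype: "subscriber",
  room: 1234,
  streams: [
    { feed: 22 },            // every stream published by feed 22
    { feed: 33, mid: "1" }   // only mid "1" of feed 33
  ]
};
// The legacy form ({ request: "join", ptype: "subscriber", feed: 22 })
// still works, but is translated to streams: [ { feed: 22 } ] internally.

// Add more streams to an existing subscription:
const subscribe = {
  request: "subscribe",
  streams: [ { feed: 44, mid: "2" } ]
};

// Remove some streams: items can also reference our own subscription's sub_mid:
const unsubscribe = {
  request: "unsubscribe",
  streams: [
    { feed: 33 },        // all streams of feed 33
    { sub_mid: "2" }     // whatever m-line "2" is in our own subscription
  ]
};

// Selectively switch one m-line: our sub_mid "1" is now fed by
// mid "2" of the publisher with id 22:
const switchStreams = {
  request: "switch",
  streams: [ { sub_mid: "1", feed: 22, mid: "2" } ]
};
```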
As to the `videoroomtest.js`, `screensharingtest.js` and `vp9svctest.js` demos (all of which use the VideoRoom plugin), they have all been updated to use the new `janus.js` callbacks.

Record&Play
I haven't updated this plugin to really support multistream yet and, again, I currently don't plan to: maybe later, after the bulk of the work has been done on the more important plugins and this PR has been merged. It will still work with just one audio/one video.

As to the `recordplaytest.js` demo, it has been updated to use the new `janus.js` callbacks.

VoiceMail
I haven't updated this plugin to really support multistream, and it won't happen: I'm not even sure anyone ever used this plugin, to be honest, as it was mostly a proof of concept... anyway, just like the AudioBridge, it will reject anything apart from the first audio stream.

As to the `voicemailtest.js` demo, it has been updated to use the new `janus.js` callbacks.

TextRoom
This plugin only uses datachannels, so it's not affected by this change.
As to the `textroomtest.js` demo, it has NOT been updated to the new `janus.js` yet, but that's not an issue, as it doesn't use `onlocalstream` or `onremotestream` anyway (it doesn't do audio or video, only data).

Lua and Duktape
I haven't updated these plugins to really support multistream yet and, again, I currently don't plan to do that: it may not even be needed, if it turns out the abstractions we have in place do the job already. If not, we may look at these later, after the bulk has been done for more pressing plugins and this PR has been merged. They will still work if it's just one audio/one video.
What's missing?
At this stage and after all these changes, very little: we still need some tweaks to `janus.js` (e.g., the ability to specify more streams to send, or ways to respond to streams individually and update them accordingly when adding/removing/replacing tracks), but apart from that, we're just waiting for your feedback to see if there are things we need to fix, both in terms of regressions and of the new features.

That's all, folks!
As anticipated, all I need now is feedback, so start testing! I'm especially interested in feedback on the APIs (janus.js, plugins), in order to make sure we don't end up making stupid choices that could have been done better.
Thanks!