-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix transport-wide CC feedback when simulcast SSRCs are missing #2908
Conversation
…he first simulcast layer never gets data.
I've just signed the CLA. |
Thanks @OxleyS for the effort. Did you try what happens with Firefox when different layers SSRCs are being used for twcc media ? |
Firefox always advertises SSRCs so it's not an issue I guess. It's Chrome that doesn't when rid-based simulcast is used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a few notes inline. That said, I think the approach may need to be changed, since you're applying that logic only to a single medium instance, the one we choose initially in janus_ice_outgoing_transport_wide_cc_feedback
where we say /* Find inbound video medium */
. Under normal circumstances that's ok, but a PeerConnection may contain more than one video stream, so why should only one of them be analysed for this process, especially considering you may end up without a SSRC? (you currently have a warning for that).
You may want to move your SSRC lookup login in the previous while instead:
while (g_hash_table_iter_next(&iter, NULL, &value)) {
janus_ice_peerconnection_medium *m = value;
if(m && m->type == JANUS_MEDIA_VIDEO && m->recv) {
medium = m;
break;
}
}
so that you can always move to another medium candidate in case the previous one doesn't contain an SSRC that might be of use.
I'm also wondering if we might extend the selection to audio m-lines as well: I can't remember if we negotiate TWCC for audio streams as well, even though that may be problematic (especially if browsers do not take those into account instead).
src/ice.c
Outdated
/* Simulcast layers that have not received data will not have a recorded peer SSRC. */ | ||
/* Pick the first layer that has a peer SSRC */ | ||
guint32 ssrc_peer = 0; | ||
for(int i = 0; i < 3; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per code style, we don't define variables inside the for
loop, so please define i
outside.
src/ice.c
Outdated
break; | ||
} | ||
} | ||
if (ssrc_peer == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code style: no space between if
and bracket.
src/ice.c
Outdated
} | ||
} | ||
if (ssrc_peer == 0) { | ||
JANUS_LOG(LOG_WARN, "No peer SSRC, cannot send transport-wide CC feedback\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't display a warning here: in case it happens, this is going to spam the logs a lot, since TWCC feedback is very frequent (5 to 10 times per second by default). I'd make it a LOG_HUGE
instead.
Thanks for the review - I'll hopefully be able to work on the points you brought up later today. |
As I understand it, we have to take bundling into consideration. If audio/video are being bundled, then we can and should include audio SSRCs in the SSRC selection process. If they're not being bundled, however, then not only must we not consider audio SSRCs for video feedback, but we actually also should be sending multiple sets of feedback reports, one per transport. Does Janus force bundling? Peeking through the code indicates a strong one-to-one relationship between PeerConnections and ICE handles. One area the spec is unclear on is what happens when two m-lines are bundled, but only one of them negotiates TWCC. Perhaps we can just ignore this case, and treat it as if both specified TWCC? This seems to be current Janus behavior.
I peeked around the code for usages of VideoRoom, meanwhile, only inserts it into SDP (both offer and answer) for video m-lines. I'm wondering if this could cause issues down the line if, say, a connection first negotiates audio-only, then adds video later. |
Media is always bundled in Janus. Not all media lines may be part of the TWCC process, though, as it depends whether or not the extension (and RTCP feedback) is negotiated on all of them or not. IIRC that's only true for video streams in browsers, but I may be remembering things wrong.
Shouldn't be an issue, as we update our references to RTP extensions any time an update takes place. |
Tested this just now, it seems Chrome negotiates it for both, at least in the latest version. Firefox still does video-only. If we want to use this to gate which SSRCs we consider for the feedback report, then we might want to move the That's an increase in scope, though, so for this PR maybe we can just expand to all video SSRCs as you suggested, and take care of audio in a followup if so desired.
I was more worried about whether browsers would do the right thing, although I tested that too with Chrome/Firefox and it seems they handle it fine. |
From my point of view, even extending the search to all video m-lines is out of scope here (the issue was about missing rid layers in simulcast). The proposed change is still an improvement over the existing code and fixes the issue. |
Sure, we can merge as is (removing the LOG_WARN at least though), but this would need to be addressed in another commit nevertheless. |
Extending to all video is not a big deal, I'll just do it now (and fix the LOG_WARN thing). |
…ms, not just the first one. Changed the log level to LOG_HUGE for when no valid SSRC is found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Added a couple of editorial notes (the changes look fine to me otherwise).
src/ice.c
Outdated
ssrc_peer = m->ssrc_peer[i]; | ||
medium = m; | ||
break; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could be simplified as
int i = 0;
for(i=0; i<3; i++) {
if(m->ssrc_peer[i] > 0) {
medium = m;
ssrc_peer = m->ssrc_peer[i];
break;
}
}
if(medium && ssrc_peer)
break;
as I feel the second check is a bit more "awkward" (and possibly error prone should we decide to increase the number of substreams in the future), and the whole block could be made more compact.
src/ice.c
Outdated
JANUS_LOG(LOG_WARN, "No peer SSRC, cannot send transport-wide CC feedback\n"); | ||
return G_SOURCE_CONTINUE; | ||
} | ||
if(!medium) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just as a small nit and personal preference in terms of code style, we typically use if(medium == NULL)
for checks like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found instances of both styles in the code base and wasn't sure which to use. Is there a coding style document I could refer to in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're 100% right, and apologies for the confusion: that's mainly because of the many contributions we've received so far (and of which I'm grateful for of course), where we haven't always enforced a specific code style. Unfortunately we don't have a reference for that, and I'm not really sure we'll prepare one: even @atoppi and I write things a bit differently 😆 These are small nits that can be ascribed to my excessive pedantry on the matter, so no need to concern too much on that!
Thanks! Incidentally, have you checked if this indeed fixes the stuttering video when the first substream is not sending any data and so the first SSRC is 0? I'll make a test myself later as well. |
Yeah, I've been checking it using the project where I originally noticed the behavior. |
Just tested and I can't replicate the delayed/stuttering video anymore, so I'll merge. Thanks! |
This PR contains the following updates: | Package | Update | Change | |---|---|---| | [meetecho/janus-gateway](https://github.com/meetecho/janus-gateway) | major | `v0.11.7` -> `v1.0.0` | --- ### Release Notes <details> <summary>meetecho/janus-gateway</summary> ### [`v1.0.0`](https://github.com/meetecho/janus-gateway/blob/HEAD/CHANGELOG.md#v100---2022-03-03) [Compare Source](meetecho/janus-gateway@v0.12.0...v1.0.0) - Refactored Janus to support multistream PeerConnections \[[PR-2211](meetecho/janus-gateway#2211)] - Moved all source files under new 'src' folder to unclutter the repo \[[PR-2885](meetecho/janus-gateway#2885)] - Fixed definition of trylock wrapper when using pthreads \[[Issue-2894](meetecho/janus-gateway#2894)] - Fixed broken RTP when no extensions are negotiated - Added checks when inserting RTP extensions to avoid buffer overflows - Added missing support for disabled rid simulcast substreams in SDP \[[PR-2888](meetecho/janus-gateway#2888)] - Fixed TWCC feedback when simulcast SSRCs are missing (thanks [@​OxleyS](https://github.com/OxleyS)!) \[[PR-2908](meetecho/janus-gateway#2908)] - Added support for playout-delay RTP extension \[[PR-2895](meetecho/janus-gateway#2895)] - Fixed partially broken H.264 support when using Firefox in VideoRoom - Fixed new VideoRoom rtp_forward API ignoring some properties - Fixed deadlock and segfault when stopping Streaming mountpoint recordings \[[Issue-2902](meetecho/janus-gateway#2902)] - Fixed RTSP support in Streaming plugin for cameras that expect path-only DESCRIBE requests (thanks [@​jp-bennett](https://github.com/jp-bennett)!) \[[PR-2909](meetecho/janus-gateway#2909)] - Fixed RTP being relayed incorrectly in Lua and Duktape plugins - Added Duktape as optional dependency, instead of embedding the engine code \[[PR-2886](meetecho/janus-gateway#2886)] - Fixed crash at startup when not able to connect to RabbitMQ server - Improved fuzzing and checks on RTP extensions - Removed distinction between simulcast and simulcast2 in janus.js \[[PR-2887](meetecho/janus-gateway#2887)] - Other smaller fixes and improvements (thanks to all who contributed pull requests and reported issues!) ### [`v0.12.0`](meetecho/janus-gateway@v0.11.8...v0.12.0) [Compare Source](meetecho/janus-gateway@v0.11.8...v0.12.0) ### [`v0.11.8`](https://github.com/meetecho/janus-gateway/blob/HEAD/CHANGELOG.md#v0118---2022-02-11) [Compare Source](meetecho/janus-gateway@v0.11.7...v0.11.8) - Added initial (and limited) integration of RED audio \[[PR-2685](meetecho/janus-gateway#2685)] - Added support for Two-Byte header RTP extensions (RFC8285) and, partially, for the new Depencency Descriptor RTP extension (needed for AV1-SVC) \[[PR-2741](meetecho/janus-gateway#2741)] - Fixed rare race conditions between sending a packet and closing a connection \[[PR-2869](meetecho/janus-gateway#2869)] - Fix last stats before closing PeerConnection not being sent to handlers (thanks [@​zodiak83](https://github.com/zodiak83)!) \[[PR-2874](meetecho/janus-gateway#2874)] - Changed automatic allocation on static loops from round robin to least used \[[PR-2878](meetecho/janus-gateway#2878)] - Added new API to bulk start/stop MJR-based recordings in AudioBridge \[[PR-2862](meetecho/janus-gateway#2862)] - Fixed broken duration in spatial AudioBridge recordings - Fixed broken G.711 RTP forwarding in AudioBridge (thanks [@​AlexYaremchuk](https://github.com/AlexYaremchuk)!) \[[PR-2875](meetecho/janus-gateway#2875)] - Fixed broken recordings in NoSIP plugin - Fixed warnings when postprocessing Opus recordings with DTX packets - Other smaller fixes and improvements (thanks to all who contributed pull requests and reported issues!) </details> --- ### Configuration 📅 **Schedule**: At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, click this checkbox. --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). Reviewed-on: https://git.walbeck.it/walbeck-it/docker-janus-gateway/pulls/65 Co-authored-by: renovate-bot <[email protected]> Co-committed-by: renovate-bot <[email protected]>
When a publisher never sends video frames for a simulcast layer (whether due to its own heuristics or the layer being disabled by the user), the SSRC for that layer will remain unknown. The TWCC feedback mechanism was always picking the first layer's SSRC, causing the report to default to an SSRC of zero if the layer never got any data.
As I understand it, being transport-wide, any SSRC going over the transport will do, so this simply goes through each simulcast layer and picks the first SSRC that exists.
The bug this PR fixes turned out to be the underlying cause of the behavior described here. In short, the lack of feedback reports with known SSRCs caused the publisher to continually downgrade the BWE for the connection, until it got low enough that the VP8 encoder fell over, causing the timestamps on frames to fall out of pace.