Reference count janus_request instances #2020

Merged
lminiero merged 3 commits into master from request-refcount
Mar 26, 2020

lminiero
Member

While investigating #2005, and while @atoppi tried to replicate the issue, we found out that for some reason janus_request was one of the few structures still not using our reference counter mechanism. This janus_request structure contains a reference to the transport instance that originated it, and is actually used in two different ways:

  1. any time we receive a request from a transport, we create an instance of janus_request and we destroy it when the request has been served (or acked, if it's an asynchronous plugin message);
  2. when a session is created, we create an instance as well, in order to keep track of the transport instance we should send events to.

The latter in particular seems to be the root cause of issues like #2005: since this request object contains a reference to the transport instance, if that request is destroyed while it's used (e.g., in a send_message on the transport) it can cause a crash. The fact that this is not refcounted makes it of course much harder to track and protect accordingly.

As such, this patch adds refcount support to that struct, and adds a way to (try and) ensure it's not destroyed while it's in use. Specifically, each access to session->source (the pointer to the janus_request the session owns) is now protected by the session mutex. Rather than wrapping every use of that request and its transport (e.g., to interact with transport plugins) in the mutex, I added a new method that, while holding the mutex, copies the pointer and increases the refcount, so that we can then work on the copy for whatever we need to do. Basically, you'll find something like this:

janus_request *source = janus_session_get_request(session);
if(source && source->transport)
	source->transport->session_over(source->instance, session->session_id, FALSE, FALSE);
janus_request_unref(source);

where janus_session_get_request returns a pointer to session->source with an additional reference (hence the janus_request_unref after we've used it), obtained while holding the session mutex. This way, even if something else unrefs the session request, it will not be destroyed until we're actually done with it.
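To make the pattern above concrete, here is a minimal, self-contained sketch of how such a getter could work. It is not the code from the patch: the `demo_` types and functions are hypothetical stand-ins, and it uses plain pthread mutexes and C11 atomics rather than the Janus mutex and refcount helpers. The unref is written to be NULL-safe, on the assumption that the snippet above relies on that when it unrefs unconditionally.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real structures, for illustration only */
typedef struct demo_request {
	atomic_int refs;        /* reference counter, starts at 1 */
	int *freed_flag;        /* optional: set to 1 when the request is freed */
} demo_request;

typedef struct demo_session {
	pthread_mutex_t mutex;  /* protects 'source' */
	demo_request *source;   /* request owned by the session */
} demo_session;

/* Create a request holding a single reference (the owner's) */
demo_request *demo_request_new(int *freed_flag) {
	demo_request *r = calloc(1, sizeof(*r));
	if(r == NULL)
		return NULL;
	atomic_init(&r->refs, 1);
	r->freed_flag = freed_flag;
	return r;
}

/* Drop one reference, freeing the request when the last one goes away */
void demo_request_unref(demo_request *r) {
	if(r == NULL)
		return;
	if(atomic_fetch_sub(&r->refs, 1) == 1) {
		if(r->freed_flag != NULL)
			*r->freed_flag = 1;
		free(r);
	}
}

/* Copy the pointer and take an extra reference while holding the mutex */
demo_request *demo_session_get_request(demo_session *s) {
	demo_request *r = NULL;
	pthread_mutex_lock(&s->mutex);
	r = s->source;
	if(r != NULL)
		atomic_fetch_add(&r->refs, 1);
	pthread_mutex_unlock(&s->mutex);
	return r;
}

/* The session gives up its own reference (e.g., when it is torn down) */
void demo_session_clear_request(demo_session *s) {
	pthread_mutex_lock(&s->mutex);
	demo_request *r = s->source;
	s->source = NULL;
	pthread_mutex_unlock(&s->mutex);
	demo_request_unref(r);
}
```

In this sketch, a caller that obtained the request through demo_session_get_request keeps it alive even if demo_session_clear_request runs in the meantime: the memory is only released when that caller invokes demo_request_unref on its copy.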

This should also mitigate a risk of deadlock we had before, where we were invoking transport methods with the session mutex locked. While none of the existing transports invoke a core callback from those methods (which would risk a new lock on the session mutex), new or third-party transports might. This patch avoids that, as we now only use the mutex to get a reference to the request we need, before invoking the transport method.
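The deadlock scenario can be illustrated with a small sketch as well. Everything here is hypothetical (the `ex_` names, the callback signature, the counter) and only meant to show the ordering: the mutex is held just long enough to grab a reference, and the transport method is invoked after it has been released, so a transport that calls back into the core can lock the same mutex safely. Freeing the request when the count drops to zero is left out to keep the example short.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct ex_session ex_session;

/* Hypothetical transport: its method may call back into the core */
typedef struct ex_transport {
	void (*session_over)(ex_session *session);
} ex_transport;

typedef struct ex_request {
	atomic_int refs;          /* reference counter */
	ex_transport *transport;  /* transport that originated the request */
} ex_request;

struct ex_session {
	pthread_mutex_t mutex;    /* protects 'source' and 'core_callbacks' */
	ex_request *source;
	int core_callbacks;       /* how many times a transport called back */
};

/* A core callback that needs the session mutex */
void ex_core_callback(ex_session *s) {
	pthread_mutex_lock(&s->mutex);
	s->core_callbacks++;
	pthread_mutex_unlock(&s->mutex);
}

/* A (third-party) transport method that re-enters the core */
void ex_reentrant_session_over(ex_session *s) {
	ex_core_callback(s);
}

/* Notify the transport without holding the session mutex */
void ex_notify_session_over(ex_session *s) {
	ex_request *r = NULL;
	/* Hold the mutex only to copy the pointer and take a reference */
	pthread_mutex_lock(&s->mutex);
	r = s->source;
	if(r != NULL)
		atomic_fetch_add(&r->refs, 1);
	pthread_mutex_unlock(&s->mutex);
	/* The mutex is released here, so the transport may call back safely */
	if(r != NULL && r->transport != NULL)
		r->transport->session_over(s);
	/* Give back the reference we took (freeing on zero omitted here) */
	if(r != NULL)
		atomic_fetch_sub(&r->refs, 1);
}
```

Had ex_notify_session_over called session_over while still holding the mutex, the nested lock in ex_core_callback would block forever on a non-recursive mutex.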

I only did some brief tests here, and it doesn't seem to introduce leaks, but of course we'll need more extensive testing to be sure this does indeed fix the HTTP issue mentioned in #2005 and, if it does, that it doesn't introduce regressions in HTTP or other transports.

@lminiero lminiero requested a review from atoppi March 24, 2020 18:24
Member

@atoppi atoppi left a comment

After latest changes I'm not able to crash it anymore, so lgtm 👍

@agclark81
Contributor

We're running Lorenzo's commits here in our production environments as of about 12 hours ago, and we're seeing very positive results as well. I'll be contributing a separate PR for review in a short bit that includes some other changes we've made that are improving stability.

@lminiero
Member Author

@agclark81 you may want to start a fresh test with the latest commits, as we've fixed a few other (unrelated to the original issue) crashes that appeared. But if you can confirm the original crash isn't there anymore, that's good news indeed!

@lminiero
Member Author

@agclark81 any objections to merging this? I'd like to tag a new release soon (possibly by today), and this is one of the fixes I want to be included since it's the release providing the HTTP refactoring.

@agclark81
Contributor

Seems OK to merge, and we've been testing with the additional HTTP refactoring, too, and everything is working well.

@lminiero
Member Author

Thanks, merging then! Looking forward to the additional fixes you mentioned 👍

@lminiero lminiero merged commit a0fe9ca into master Mar 26, 2020
@lminiero lminiero deleted the request-refcount branch March 26, 2020 14:45
voicenter added a commit to voicenter/janus-gateway that referenced this pull request Apr 24, 2020
* Updated link to project in resources (docs)

* Add exception var to catch stmt to fix rollup (meetecho#1848)

* Fixed typo

* fix nullptr dereference in streaming plugin (meetecho#1855)

* VP9 SVC fixes (meetecho#1849)

* Fixed SIP hangup not sending CANCEL, when inviting (fixes meetecho#1856)

* Use strtol more, and add checks when atoi is used (meetecho#1852)

* Fixed broken code in AudioBridge

* Fixed regression when setting up DataChannels

* Fix RTP fuzzing target according to recent VP9 changes.

* Fixed rare race condition in HTTP plugin that could cause leak (fixes meetecho#1665)

* add missing closing curly bracket (meetecho#1859)

* Don't scan libnice version if it wasn't retrieved (fixes meetecho#1858)

* Fixed wrong clock rate being used for RTP header updates when using G.722

* Feature/ignore unreachable ice server (meetecho#1854)

* Keep track of clock rates associated to payload types, for RTCP

* Don't send RTCP SR if outgoing media has been disabled via SDP update

* Bumped version in postprocessing tool as well

* Fixes to RTSP latching procedure (fixes meetecho#1536, replaces meetecho#1851) (meetecho#1866)

* New functionality to add custom Contact URI params to SIP REGISTER (meetecho#1874)

* Reduced verbosity of some lines in the SIP plugin

* Reduced default twcc_period value from 1s to 200ms

* SIP plugin: custom (non-standard) headers on incoming events (requests) (meetecho#1873)

* Bumped to version 0.8.0

* Gzip compression utility in the core (and sample event handler) (meetecho#1846)

* New category of plugins for modular logging (meetecho#1814)

* Fixed linking error for post-processing tools after recent changes

* Remove option to enable rtx (now always supported, when negotiated) (meetecho#1877)

* Updated documentation to include some info on the new logger modules

* Avoid gzip functions when fuzzing in OSS and add zlib dependency when fuzzing locally.

* Fixed exception to GPL code (see meetecho#713)

* Fixed wrong default folder for loggers

* Added link to new video on Simulcast and SVC to docs

* Add CHANGELOG.md file into the project (meetecho#1885)

* Fix RTSP SETUP when url includes query string parameters (fixes meetecho#1869) (meetecho#1875)

* Added changelog (and info on tagged versions) to documentation

* [Suggestion] Started the refactoring of the janus.js (meetecho#1830)

* Make sure libcurl is available before using CURL_AT_LEAST_VERSION (fixes meetecho#1887)

* Fixed small typos in demos

* Fixed obsolete value for TWCC period default in docs/hints

* Make sure the installed libcurl knows about CURL_AT_LEAST_VERSION

* Fixed variable shadowing

* Added fwrite checks in record.c (warnings only)

* Updated changelog (v0.8.0)

* Bumped to version 0.8.1

* Remove SIPre plugin from the repo (meetecho#1894)

* Binary data support in data channels (meetecho#1878)

* Fixed typo in SIP plugin

* Allow RTCP ports to be picked randomly using 0, in Streaming plugin

* Check if rtcp port is > 0 before creating a RTCP socket.

* Revert "Check if rtcp port is > 0 before creating a RTCP socket."

This reverts commit a0b7dbf.

* Check if rtcp port is > 0 before creating a RTCP socket, in Videoroom plugin.

* Add in mountpoint/forwarder create response the allocated RTCP ports.

* The 'referred_by' field currently holds the SIP URI value copied from the (meetecho#1896)

* Fixed warnings introduced in meetecho#1896

* Fixed leak in SIP plugin (fixes meetecho#1897)

* Fixed occasional memory leak in Streaming plugin (fixes meetecho#1900)

* Fix out of bounds array access for last_spatial_layer (meetecho#1906)

* startup: only close the logger directory if it was opened (meetecho#1903)

* Only close the event handlers directory if it was opened (see meetecho#1903)

* fixed typo (meetecho#1916)

* Move loggers cleanup to end of logger thread (fixes meetecho#1904)

* Fixed late initialization of janus.js constructor callbacks (fixes meetecho#1912)

* Added reference to Snap repo in resources (docs)

* Fixed warnings when building DTLS bio code

* Don't keep TextRoom plugin loaded if data channels were not compiled

* Updated year in demos and docs

* Use sendBeacon instead of sync XHR in onbeforeunload (fixes meetecho#1902) (meetecho#1918)

* Fixed occasional buffer overflow error when post-processing H.264 recordings

* Increase buffer when post-processing VP8/VP9 recordings too (see previous commit)

* Updated Changelog

* Bumped to version 0.8.2

* Fix a possible race condition when joining as a subscriber and destroying the session. (meetecho#1911)

* More verbose output on postprocessing output error

* Fixed reference to deprecated configuration file

* Added check on AudioBridge instance in setup_media (fixes meetecho#1923)

* Added missing check on SDP attribute value existence

* Add new configuration property to add protected folders not to save to (meetecho#1919)

* Fixed undefined reference when building postprocessor utilities

* Better parsing of RTSP messages (see meetecho#1922) (meetecho#1925)

* Fixed undefined reference when building fuzzers

* Add missing mutex unlocks in videoroom message handler.

* Add math library when fuzzing locally.

* Add audio skew compensation to janus-pp-rec. (meetecho#1870)

* Updated man file for janus-pp-rec

* Remove odd respond to automatically responded OPTIONS request (meetecho#1930)

* Fix g_async_queue usage (meetecho#1929)

* typo (meetecho#1934)

AudioBridge documentation typo in request mute|unmute

* Fixed broken links in docs (plugins list)

* Removed deprecated warning in screensharing demo

* Removed deprecated text from screensharing demo

* Fixed helpers not being able to send SUBSCRIBE requests in SIP plugin

* Small tweaks after static analysis

* Added Coverity badge

* Janus Travis CI integration (meetecho#1932)

* Updated Changelog (0.8.2)

* Bumped to version 0.9.0

* Refactoring of core-plugin callbacks and RTP extensions termination (meetecho#1884)

* Support for transport-wide CC on outgoing streams (meetecho#1889)

* Dynamically update NACK queue size depending on RTT (meetecho#1867)

* Fixed broken RTP fuzzer

* Fixed typo when adding audio attribute to SDP

* Fixed RTCP parsing issue found by OSS-fuzz

* Fix volume-related functions in janus.js (meetecho#1935)

* Fixed leak when parsing broken TWCC RTCP message (Credit to OSS-Fuzz)

* Add travis_retry to git clone commands.

* Fixed occasional segfault when parsing TWCC RTCP message (Credit to OSS-Fuzz)

* Add OSS-Fuzz badge.

* Fixed regression on video bitrates when using monodirectional PeerConnections

* Update janus_audiobridge.c (meetecho#1938)

The target of participant should also acknowledge the latest mute/unmute status which has been made by administrator.

* Travis libnice clang flags (meetecho#1941)

Do not check cast-alignment errors when compiling libnice with clang.

* Fixed occasional error messages on console when trying to add RTP extensions

* Update debugging section in Janus documentation.

* Optimized parsing of TWCC RTCP message (Credit to OSS-Fuzz)

* Renamed corpora file

* Avoid RTP header memory misalignment in rtx packets (meetecho#1943)

* We should allow to have ICE-TCP enabled without ICE Lite. Recent versions of libnice allow this combination and gather tcp passive candidates etc. in this setup. (meetecho#1946)

* conf: transports: document events option (meetecho#1952)

* Updated Changelog (0.9.0)

* Bumped to version 0.9.1

* Configurable global prefix for log lines (meetecho#1940)

* add missing callbacks.error check (meetecho#1959)

* janus_sip: add missing check for NULL (meetecho#1963)

Fixes meetecho#1962

* Remove Sofia reference from the title of the SIP demo

* rtp: drop dead code in rtp_header_update callers (meetecho#1964)

* Subtype for some event, and better docs for event handlers (fixes meetecho#1953) (meetecho#1957)

* Added link to new event handlers documentation to the doc main page

* Removed unused variables

* Added license badge to the README

* Small tweaks to demo intro text

* Detect H264 key frames with smaller SPS units (meetecho#1965)

Reduces the H264 keyframe length check from 16 to 6 bytes.
6 bytes seems to be the lower bound of any possibly valid SPS NAL unit,
based on Section 7.3 of the H264 specification.

For reference, we have been observing Chrome 80 producing SPS units
of 12 bytes or less.

* Support for strings as unique IDs in AudioBridge, VideoRoom, TextRoom (meetecho#1880)

* If glib is too old, generate uuid manually when needed (see meetecho#1880)

* Fixed errors creating VideoRoom when strings are used (see meetecho#1880)

* Remove duplicated codecs when answering SIP call (meetecho#1966)

* Fixed a couple of JSON attributes in VideoRoom when strings are used (see meetecho#1880)

* Make sure a publisher exists when asking for a VideoRoom subscriber renegotiation (fixes meetecho#1970)

* Added errno info when socket operations fail in Streaming plugin

* Fixed typos in TextRoom

* Support for strings as unique mountpoint IDs in Streaming plugin (meetecho#1969)

* fix meetecho#1967 (meetecho#1968)

Fixed error callback not being invoked when an HTTP error happens trying to attach to a plugin

* Added checks on nice_address_set_from_string (fixes meetecho#1973)

* Fixed broken method signature in Streaming plugin when not using libcurl

* Remove /root from the list of protected folders. Make comment text more clear.

* Valgrind fixes for sockaddr structs (meetecho#1976)

Avoid use of uninitialized members

* Hide libcurl from pkg-config when testing travis-ci with LIBCURL = NO.

* Fixed leak when creating Streaming mountpoint dynamically

* Reduced log level to info when logger and event handlers are not found (meetecho#1980)

* Always use base SSRC when recording VideoRoom simulcast participant

* Removed wrong comment

* Fixed broken DTMF in SIP demo

* Add UI to SIP demo to remove helpers, when created

* Fixed occasional missing referred-by info in SIP demo

* Reply to incoming REFER with 202 right away, not 100, in SIP plugin

* Added more checks on nice_address_set_from_string (fixes meetecho#1973) (meetecho#1981)

* Several enhancements to SIP demo

* Fixed abort at server shutdown after using SIP transfers

* Fixed typo in SIP demo code

* Updated Changelog (0.9.1)

* Bumped to version 0.9.2

* Make prebuffering in AudioBridge configurable (meetecho#1975)

* Add G.711 support to the AudioBridge plugin (meetecho#1979)

* Added maximum value for AudioBridge prebuffering property

* Converted HTTP transport plugin to single thread (meetecho#1173)

* Added -f to rm in html Makefile.am (fixes meetecho#1985)

* Small fixes for TypeScript declaration file (meetecho#1986)

Based on the current RTCConfiguration spec (https://w3c.github.io/webrtc-pc/#dom-rtcconfiguration), iceServers does not expect an array of strings.
Updating to type provided by TypeScript's lib.dom.d.ts

* ice: ensure that stream is non-NULL (meetecho#1987)

This fixes a crash on later stream checks (e.g., transport_wide_cc et al).

* Fixed typo in querylogger_parameters (copy/paste error) (meetecho#1989)

* Fixed double unlock when listing private rooms in AudioBridge (meetecho#1988)

* Make sure the session still has a reference when cleaning up HTTP requests

* Fixes to leaks and race conditions in VoiceMail plugin (meetecho#1993)

* Several fixes to session management in VideoCall plugin (meetecho#1994)

* update dtls ciphers (meetecho#1995)

* Implement ECDSA Certificate generation (meetecho#1997)

* Small tweaks to meetecho#1997 (renamed, moved and documented RSA property in janus.jcfg)

* Fix rare race condition when claiming sessions (meetecho#1990)

* Fix occasional deadlock in VideoRoom (2) (credits to @mivuDing, fixes meetecho#1982) (meetecho#1984)

* Added option to enforce validation on DTLS certificates (meetecho#1992)

Made DTLS ciphers configurable as well

* Fixed typo when renegotiating audio in janus.js (fixes meetecho#2002)

* Added option to ignore mDNS candidates (meetecho#1998)

* Fixed deadlock when using claim on HTTP transport (fixes meetecho#2000)

* Support for RTSP 'Content-Base' header in Streaming plugin (meetecho#1999)

* Added link to FOSDEM 2020 talk on RTP forwarders to the docs

* Fixed small leak in SIP plugin when holding calls

* Added called URI to 'incomingcall' and 'missed_call' events in SIP plugin

* Add repos for openSUSE and SUSE (meetecho#2009)

* Use user_id_str for kicked, leaving, and unpublished events, if enabled. (meetecho#2010)

Co-authored-by: Michael Shiel <[email protected]>

* http_transport: add NULL checks (meetecho#2012)

Refs meetecho#2005

* Update media direction in SIP plugin if remote address is 0.0.0.0 ('hold' fix) (meetecho#2013)

* Prepare RTCP Sender Reports by considering the last RTP timestamp sent. (meetecho#2007)

* Track pending nack cleanup tasks and cancel them when freeing a stream. (meetecho#2014)

* Fixed typo in janus.js error code (fixes meetecho#2018)

* Reverted change on janus.js (see meetecho#2018)

* Resolve mDNS candidates asynchronously with GResolver (see meetecho#1998) (meetecho#2004)

* Reference count janus_request instances (meetecho#2020)

Added better management of refcount on HTTP session when using it too, and refcount support to janus_http_msg as well

* Updates to mutex unlocking in textroom and videoroom plugins (meetecho#2026)

* Updated Changelog (0.9.2)

* Bumped to version 0.9.3

* Add Python aiortc-based functional testing. (meetecho#1971)

* test_aiortc: cleanup (meetecho#2027)

* Fixed missing refcount init for Admin API (fixes meetecho#2029)

* Bumping back to 0.9.2 to re-tag

* Updated changelog for 0.9.2

* Bumped to version 0.9.3 (again)

* janus_http: return earlier if request is NULL (meetecho#2031)

* Fixed janus-pp-rec build warnings when using ffmpeg >= 4.x

* Fixed VideoRoom destroy not working when using strings

* Fixed av_register_all deprecation check in post-processor

* plugins: drop tautology (meetecho#2041)

gateway is always set before initialized, so the latter is always true.

* Don't set ICE credentials when parsing remote credentials (meetecho#2046)

* Detect libsrtp(2) using pkg-config (fixes meetecho#2019) (meetecho#2033)

* Added support for static Opus files to Streaming plugin (meetecho#2040)

* Added support for generic metadata to Streaming mountpoints

* Fixed printout of metadata in Streaming demo

* Added notes on building libsrtp (see meetecho#2024)

* Add configurable DSCP ToS for PeerConnections (meetecho#2055)

* Always add remote candidates from the libnice loop (see meetecho#2045) (meetecho#2048)

* Fixed Streaming destroy not working when using strings

* Use refcount for Streaming plugin helper threads (meetecho#2039)

* Added option to disable building AES-GCM support (see meetecho#2024 and meetecho#2054)

* Fixed typo

* Fixed outdated info in VideoRoom docs

* Fixed syntax error in sample Streaming plugin configuration file

* Support for additional constraints on screenshare media (meetecho#2043)

* refactoring-clean up (const-var, semicolons, ===, etc.) (meetecho#2044)

* Reference subscriber when handling related messages (see meetecho#2045) (meetecho#2061)

* Added option to configure time needed to detect a missing simulcast substream (meetecho#2063)

* Reverted isTrickleEnabled check in janus.js (fixes meetecho#2064)

* Don't show warnings for rtx RTCP packets

* Made libnice warning clearer, and upped suggested version (fixes meetecho#2069)

* Add missing info to videoroom "list" response (meetecho#2068)

* Use custom GSource to handle HTTP request timeouts (see meetecho#2062 and meetecho#2066) (meetecho#2075)

* Define the libnice version string as extern in version.h (fixes gcc10 error)

* Fixed AudioBridge create API not working properly when using string IDs

* Fixed a few typos in AudioBridge errors

* Fix copy-paste error in Streaming plugin docs

* Fix libasan use after free in janus_videoroom_handler when events are enabled (meetecho#2091)

* Added project to resources in the docs

* Return mountpoint IP addresses, if a bind interface/IP was provided

* Swap RR/SR Report Blocks if the first block contains rtx data. (meetecho#2089)

* Add support for playback of audio files in AudioBridge (meetecho#2088)

* Updated Changelog (0.9.3)

* Bumped to version 0.9.4

* Fixed returned address when adding multicast Streaming mountpoints

* More checks when hanging up VideoRoom subscriber (see meetecho#2087) (meetecho#2093)

* Added new docker image to the resources in the docs

* Updated AudioBridge documentation with new playback feature

* Don't wait forever for candidates when half-trickling

* Add some missing static declarations to HTTP and WS transports.

Co-authored-by: Lorenzo Miniero <[email protected]>
Co-authored-by: Agustin Polo <[email protected]>
Co-authored-by: Yongje Lee <[email protected]>
Co-authored-by: Alessandro Toppi <[email protected]>
Co-authored-by: Sebastian Schmid <[email protected]>
Co-authored-by: Imer Husejnovic <[email protected]>
Co-authored-by: Oscar <[email protected]>
Co-authored-by: Irek <[email protected]>
Co-authored-by: Tristan Matthews <[email protected]>
Co-authored-by: Jon Rafkind <[email protected]>
Co-authored-by: kuekerino <[email protected]>
Co-authored-by: Yurii Cherniavskyi <[email protected]>
Co-authored-by: Meirza Arson <[email protected]>
Co-authored-by: Groupboard <[email protected]>
Co-authored-by: Cameron Lucas <[email protected]>
Co-authored-by: hxl-dy <[email protected]>
Co-authored-by: Alessandro Amirante <[email protected]>
Co-authored-by: mp16 <[email protected]>
Co-authored-by: Paul Zhang <[email protected]>
Co-authored-by: Philipp Hancke <[email protected]>
Co-authored-by: Sean DuBois <[email protected]>
Co-authored-by: Ancor Gonzalez Sosa <[email protected]>
Co-authored-by: Michael Shiel <[email protected]>
Co-authored-by: Michael Shiel <[email protected]>
Co-authored-by: agclark81 <[email protected]>
Co-authored-by: Alex Pavlov <[email protected]>
Co-authored-by: alexamirante <[email protected]>
Co-authored-by: Federico Lorenzi <[email protected]>