-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memory loss #2624
Comments
Better to use libasan to track leaks, valgrind makes everything too slow to be meaningful. |
Asan is reporting the same: |
@mirkobrankovic can you list the steps to reproduce the leak ? |
@atoppi using Audiobridge and Videoroom plugins, 3 members in each, all 3 members subscribe to other 2 in Videoroom. |
Why are you not destroying sessions too ? |
Unfortunate/Accidental design flow I guess :D |
for example: but I do have a reclaim_session option: |
It is not clear how many sessions are you using though ( |
This use case creates 4 janus sessions over one ws connection (each janus session is responsible for one plugin, for all users) and it is destroyed in the end. |
Another test I did:
Some leaks are reported, but I think they are the data being hold by janus due to session reclaiming. Anyway this is not exactly the same signalling as yours, because the connection here is being torn down when the tab refreshes. |
@atoppi thanks, I will do some more tests and come back if I find new info |
Managed to reproduce exactly your same trace
It is the result of:
So yeah maybe we could fix something to avoid this unpleasant leak report but it's not clear if this might be an issue in the long run. For example I'd like to know what happens if the session is going to be reclaimed: will the "leak" increase? |
I don't think this needs fixing: if the session hasn't been destroyed (and the reclaim increases the time it lives), then it will leak and it's normal. We do cleanup sessions at shutdown but some may still be there. It would be an issue if during the Janus lifetime a session ended and the reclaim timed out too, but resources were not cleared, which from what I understood is not what's happening here. |
Yeah, I agree. |
No need to close, actually, I was just discussing with @atoppi that there may be a better way of cleaning up sessions at shutdown, which apparently we're not doing exactly as I remembered we did. It is lower priority (since leaks at shutdown are less of an issue), but worth keeping this issue open as a reminder. I'll let you know when there's a PR to test, thanks for bringing this to our attention! |
Besides that enhancement on the shutdown operations, it's still unclear the root of the potential leak shown in the graph.
|
If Janus was compiled with libasan support, the memory increasing may be because of that (libasan will keep info on all allocations in memory to know about leaks, and so will never stop growing). |
The signaling backed logic didn't change, since Janus 0.7.4 running on Ubuntu16. |
@mirkobrankovic so did you manage to identify a critical use case that triggers the issue ? |
no I tried different scenarios but without reclaim session and no leak so far, i'll give it a bit more try with that setting again |
@atoppi tonight I will remove session reclaim setting and see if it will make a difference in production |
@mirkobrankovic |
Yes, zabbix agent is running all the time, there is one check that it is pooling from Janus admin api (number of handles of all sessions), js script run by nodejs, ws connection to admin API. |
Are event handlers enabled too, by the way? |
@lminiero yes indeed, |
@atoppi yes will do, fixed 3/4 of them, but i'm not sure about: |
what version of janus is that ? |
latest master, |
that line does not seem to match any malloc janus-gateway/plugins/janus_videoroom.c Line 6428 in 216d70c
|
yeah, I get the same line, which confused me :/
not sure how serious those 2 indirect leaks can be ... |
@mirkobrankovic are you compiling with |
This one seems to be triggered by the notification of a started instance to the event handlers. |
I used the docs example: |
Can you try again ( |
I did and I get these 2 outputs based on how many event handlers I have compiled and enabled (I was testing this): |
@atoppi let me see, it might be that I didn't json_decref(event). :/ EDIT:
Should be all in PR #2634 |
Thanks, we'll review it asap. |
I think this might be "normal", because even though that event may be freed, when you shutdown the server there are still events in the queue that are never sent (the backend is still not reachable) and so may be leaked. One way to check if an event lost for timeout is actually leaked, is to make sure the events server becomes reachable again before closing Janus, so that events not timed out yet are delivered and freed correctly: I'll try and replicate myself to see if I can see a leak in that case. |
Ok thanks for explanation. |
No need. I actually did make some tests with a different approach, but due to the asynchronous nature of some cleanup mechanisms, leaks would happen anyway. I modified the code to tie removing a session from the hashtable to triggering the unref code, which was supposed to lead to a more rigorous cleanup on shutdown as well (since we destroy the table there, it automatically removes all items it contains); this meant that handles would be marked as destroyed as well, but since handles are only cleaned up when their loop thread completes (which may take longer), they wouldn't be removed in time. As such, considering it's neglectible, I'm not particularly interested in shutdown at the moment. |
Fair points |
* meetecho_master: (345 commits) Remove support for framemarking RTP extension (meetecho#2640) Prevent race conditions on socket close in SIP and NoSIP plugins (meetecho#2599) Fixed overflow runtime error Fix for race condition between VideoRoom publisher leaving and subscriber hanging up (fixes meetecho#2582) (meetecho#2637) Send PLI when starting a paused stream (meetecho#2645) Prevent too high shift exponent Fixed type of seq/ts in file-based Streaming mountpoint threads Fix missing g_thread_unref when a streaming helper thread quits. Added custom headers for SIP INFO request (meetecho#2644) Free participant->user_id_str in case of opus enc/decoder error. Reject flexfec when offered, as still unsupported (see meetecho#2639) Don't add rtx ssrc if m-line is recvonly/inactive (see meetecho#2639) Added NULL checks for json_dumps (see meetecho#2629) Parse custom headers, if required, in successful REGISTER response (fixes meetecho#2636) Timestamp correction for janus-pp-rec (meetecho#2573) Don't chain error handler to success handler in Janus.httpAPICall (meetecho#2569) Add missing library link for WS event handler (fixes meetecho#2628) Fixed broken switch in Streaming plugin when using helper threads Resolves meetecho#2624 jansson double referencing (meetecho#2634) Unlock mountpoints mutex after the spawning of helper threads. ... # Conflicts: # transports/janus_mqtt.c
* meetecho_master: (345 commits) Remove support for framemarking RTP extension (meetecho#2640) Prevent race conditions on socket close in SIP and NoSIP plugins (meetecho#2599) Fixed overflow runtime error Fix for race condition between VideoRoom publisher leaving and subscriber hanging up (fixes meetecho#2582) (meetecho#2637) Send PLI when starting a paused stream (meetecho#2645) Prevent too high shift exponent Fixed type of seq/ts in file-based Streaming mountpoint threads Fix missing g_thread_unref when a streaming helper thread quits. Added custom headers for SIP INFO request (meetecho#2644) Free participant->user_id_str in case of opus enc/decoder error. Reject flexfec when offered, as still unsupported (see meetecho#2639) Don't add rtx ssrc if m-line is recvonly/inactive (see meetecho#2639) Added NULL checks for json_dumps (see meetecho#2629) Parse custom headers, if required, in successful REGISTER response (fixes meetecho#2636) Timestamp correction for janus-pp-rec (meetecho#2573) Don't chain error handler to success handler in Janus.httpAPICall (meetecho#2569) Add missing library link for WS event handler (fixes meetecho#2628) Fixed broken switch in Streaming plugin when using helper threads Resolves meetecho#2624 jansson double referencing (meetecho#2634) Unlock mountpoints mutex after the spawning of helper threads. ... # Conflicts: # transports/janus_mqtt.c
* meetecho_master: (345 commits) Remove support for framemarking RTP extension (meetecho#2640) Prevent race conditions on socket close in SIP and NoSIP plugins (meetecho#2599) Fixed overflow runtime error Fix for race condition between VideoRoom publisher leaving and subscriber hanging up (fixes meetecho#2582) (meetecho#2637) Send PLI when starting a paused stream (meetecho#2645) Prevent too high shift exponent Fixed type of seq/ts in file-based Streaming mountpoint threads Fix missing g_thread_unref when a streaming helper thread quits. Added custom headers for SIP INFO request (meetecho#2644) Free participant->user_id_str in case of opus enc/decoder error. Reject flexfec when offered, as still unsupported (see meetecho#2639) Don't add rtx ssrc if m-line is recvonly/inactive (see meetecho#2639) Added NULL checks for json_dumps (see meetecho#2629) Parse custom headers, if required, in successful REGISTER response (fixes meetecho#2636) Timestamp correction for janus-pp-rec (meetecho#2573) Don't chain error handler to success handler in Janus.httpAPICall (meetecho#2569) Add missing library link for WS event handler (fixes meetecho#2628) Fixed broken switch in Streaming plugin when using helper threads Resolves meetecho#2624 jansson double referencing (meetecho#2634) Unlock mountpoints mutex after the spawning of helper threads. ... # Conflicts: # transports/janus_mqtt.c
* meetecho_master: (345 commits) Remove support for framemarking RTP extension (meetecho#2640) Prevent race conditions on socket close in SIP and NoSIP plugins (meetecho#2599) Fixed overflow runtime error Fix for race condition between VideoRoom publisher leaving and subscriber hanging up (fixes meetecho#2582) (meetecho#2637) Send PLI when starting a paused stream (meetecho#2645) Prevent too high shift exponent Fixed type of seq/ts in file-based Streaming mountpoint threads Fix missing g_thread_unref when a streaming helper thread quits. Added custom headers for SIP INFO request (meetecho#2644) Free participant->user_id_str in case of opus enc/decoder error. Reject flexfec when offered, as still unsupported (see meetecho#2639) Don't add rtx ssrc if m-line is recvonly/inactive (see meetecho#2639) Added NULL checks for json_dumps (see meetecho#2629) Parse custom headers, if required, in successful REGISTER response (fixes meetecho#2636) Timestamp correction for janus-pp-rec (meetecho#2573) Don't chain error handler to success handler in Janus.httpAPICall (meetecho#2569) Add missing library link for WS event handler (fixes meetecho#2628) Fixed broken switch in Streaming plugin when using helper threads Resolves meetecho#2624 jansson double referencing (meetecho#2634) Unlock mountpoints mutex after the spawning of helper threads. ... # Conflicts: # transports/janus_mqtt.c
* meetecho_master: (345 commits) Remove support for framemarking RTP extension (meetecho#2640) Prevent race conditions on socket close in SIP and NoSIP plugins (meetecho#2599) Fixed overflow runtime error Fix for race condition between VideoRoom publisher leaving and subscriber hanging up (fixes meetecho#2582) (meetecho#2637) Send PLI when starting a paused stream (meetecho#2645) Prevent too high shift exponent Fixed type of seq/ts in file-based Streaming mountpoint threads Fix missing g_thread_unref when a streaming helper thread quits. Added custom headers for SIP INFO request (meetecho#2644) Free participant->user_id_str in case of opus enc/decoder error. Reject flexfec when offered, as still unsupported (see meetecho#2639) Don't add rtx ssrc if m-line is recvonly/inactive (see meetecho#2639) Added NULL checks for json_dumps (see meetecho#2629) Parse custom headers, if required, in successful REGISTER response (fixes meetecho#2636) Timestamp correction for janus-pp-rec (meetecho#2573) Don't chain error handler to success handler in Janus.httpAPICall (meetecho#2569) Add missing library link for WS event handler (fixes meetecho#2628) Fixed broken switch in Streaming plugin when using helper threads Resolves meetecho#2624 jansson double referencing (meetecho#2634) Unlock mountpoints mutex after the spawning of helper threads. ... # Conflicts: # transports/janus_mqtt.c
* meetecho_master: (345 commits) Remove support for framemarking RTP extension (meetecho#2640) Prevent race conditions on socket close in SIP and NoSIP plugins (meetecho#2599) Fixed overflow runtime error Fix for race condition between VideoRoom publisher leaving and subscriber hanging up (fixes meetecho#2582) (meetecho#2637) Send PLI when starting a paused stream (meetecho#2645) Prevent too high shift exponent Fixed type of seq/ts in file-based Streaming mountpoint threads Fix missing g_thread_unref when a streaming helper thread quits. Added custom headers for SIP INFO request (meetecho#2644) Free participant->user_id_str in case of opus enc/decoder error. Reject flexfec when offered, as still unsupported (see meetecho#2639) Don't add rtx ssrc if m-line is recvonly/inactive (see meetecho#2639) Added NULL checks for json_dumps (see meetecho#2629) Parse custom headers, if required, in successful REGISTER response (fixes meetecho#2636) Timestamp correction for janus-pp-rec (meetecho#2573) Don't chain error handler to success handler in Janus.httpAPICall (meetecho#2569) Add missing library link for WS event handler (fixes meetecho#2628) Fixed broken switch in Streaming plugin when using helper threads Resolves meetecho#2624 jansson double referencing (meetecho#2634) Unlock mountpoints mutex after the spawning of helper threads. ... # Conflicts: # transports/janus_mqtt.c
* meetecho_master: (345 commits) Remove support for framemarking RTP extension (meetecho#2640) Prevent race conditions on socket close in SIP and NoSIP plugins (meetecho#2599) Fixed overflow runtime error Fix for race condition between VideoRoom publisher leaving and subscriber hanging up (fixes meetecho#2582) (meetecho#2637) Send PLI when starting a paused stream (meetecho#2645) Prevent too high shift exponent Fixed type of seq/ts in file-based Streaming mountpoint threads Fix missing g_thread_unref when a streaming helper thread quits. Added custom headers for SIP INFO request (meetecho#2644) Free participant->user_id_str in case of opus enc/decoder error. Reject flexfec when offered, as still unsupported (see meetecho#2639) Don't add rtx ssrc if m-line is recvonly/inactive (see meetecho#2639) Added NULL checks for json_dumps (see meetecho#2629) Parse custom headers, if required, in successful REGISTER response (fixes meetecho#2636) Timestamp correction for janus-pp-rec (meetecho#2573) Don't chain error handler to success handler in Janus.httpAPICall (meetecho#2569) Add missing library link for WS event handler (fixes meetecho#2628) Fixed broken switch in Streaming plugin when using helper threads Resolves meetecho#2624 jansson double referencing (meetecho#2634) Unlock mountpoints mutex after the spawning of helper threads. ... # Conflicts: # transports/janus_mqtt.c (cherry picked from commit 1827315)
* meetecho_master: (345 commits) Remove support for framemarking RTP extension (meetecho#2640) Prevent race conditions on socket close in SIP and NoSIP plugins (meetecho#2599) Fixed overflow runtime error Fix for race condition between VideoRoom publisher leaving and subscriber hanging up (fixes meetecho#2582) (meetecho#2637) Send PLI when starting a paused stream (meetecho#2645) Prevent too high shift exponent Fixed type of seq/ts in file-based Streaming mountpoint threads Fix missing g_thread_unref when a streaming helper thread quits. Added custom headers for SIP INFO request (meetecho#2644) Free participant->user_id_str in case of opus enc/decoder error. Reject flexfec when offered, as still unsupported (see meetecho#2639) Don't add rtx ssrc if m-line is recvonly/inactive (see meetecho#2639) Added NULL checks for json_dumps (see meetecho#2629) Parse custom headers, if required, in successful REGISTER response (fixes meetecho#2636) Timestamp correction for janus-pp-rec (meetecho#2573) Don't chain error handler to success handler in Janus.httpAPICall (meetecho#2569) Add missing library link for WS event handler (fixes meetecho#2628) Fixed broken switch in Streaming plugin when using helper threads Resolves meetecho#2624 jansson double referencing (meetecho#2634) Unlock mountpoints mutex after the spawning of helper threads. ... # Conflicts: # transports/janus_mqtt.c (cherry picked from commit 1827315) (cherry picked from commit b9318b7)
We have been running 0.10.10 version lately and noticed quite some mem loss over time:
Running master in Valgrind showed me two places of loss, one initiating videoroom from file which was there before if i remember correctly:
but this one is new on incoming requests:
Let me know if I need to provide more details.
Thanks up front
The text was updated successfully, but these errors were encountered: