crash in ssl3_shutdown #896

Closed
sslivins opened this issue May 19, 2017 · 5 comments

@sslivins

I see this crash fairly often. I thought it was related to how I internally call the end_session/close_pc callbacks, but I'm not really sure.

Core was generated by `/opt/janus-impstar/bin/janus -u -F /opt/janus-impstar/etc/janus/'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007f878e9f2b69 in ssl3_shutdown () from /lib64/libssl.so.10
Missing separate debuginfos, use: debuginfo-install cyrus-sasl-lib-2.1.26-20.el7_2.x86_64 glib2-2.46.2-4.el7.x86_64 glibc-2.17-157.el7_3.1.x86_64 jansson-2.4-6.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.14.1-27.el7_3.x86_64 libcom_err-1.42.9-9.el7.x86_64 libcurl-7.29.0-35.el7.centos.x86_64 libffi-3.0.13-18.el7.x86_64 libidn-1.28-4.el7.x86_64 libselinux-2.5-6.el7.x86_64 libssh2-1.4.3-10.el7_2.1.x86_64 nspr-4.13.1-1.0.el7_3.x86_64 nss-3.28.4-1.0.el7_3.x86_64 nss-softokn-3.16.2.3-14.4.el7.x86_64 nss-softokn-freebl-3.16.2.3-14.4.el7.x86_64 nss-sysinit-3.28.4-1.0.el7_3.x86_64 nss-util-3.28.4-1.0.el7_3.x86_64 openldap-2.4.40-13.el7.x86_64 openssl-libs-1.0.1e-60.el7_3.1.x86_64 pcre-8.32-15.el7_2.1.x86_64 sqlite-3.7.17-8.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt full
#0 0x00007f878e9f2b69 in ssl3_shutdown () from /lib64/libssl.so.10
No symbol table info available.
#1 0x00007f878ea0a3da in dtls1_shutdown () from /lib64/libssl.so.10
No symbol table info available.
#2 0x00000000004195a3 in janus_dtls_srtp_send_alert (dtls=0x7f86b82217b0) at dtls.c:781
No locals.
#3 0x000000000043498a in janus_ice_send_thread (data=0x7f86901a9630) at ice.c:3307
stream = 0x7f876e260000
now = 40756382234828
handle = 0x7f86901a9630
__FUNCTION__ = "janus_ice_send_thread"
pkt = 0x67fc40 <janus_ice_dtls_alert>
before = 40756381652635
audio_rtcp_last_rr = 40756380666801
audio_rtcp_last_sr = 40756180221234
video_rtcp_last_rr = 40756380666801
video_rtcp_last_sr = 40756180221234
last_nack_cleanup = 40756382152715
#4 0x00007f878eeb20f5 in g_thread_proxy () from /lib64/libglib-2.0.so.0
No symbol table info available.
#5 0x00007f878da46dc5 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#6 0x00007f878d77573d in clone () from /lib64/libc.so.6
No symbol table info available.

@lminiero
Member

Please don't paste inline, use services like gist/pastebin.

Tons of symbols are missing, apparently, so there's no easy way to know what's going on. I suspect a race condition when getting rid of the PeerConnection, e.g., the DTLS context already being shut down when the code tries to clean it up again. Have you tried going up to frame 2 to see what the content of those variables is (if gdb does have info at that level)?
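To make the suspected race concrete, here is a minimal sketch of one common way to make a teardown path safe against being entered twice: an atomic "closing" flag that only the first caller can claim. Everything in it (`demo_dtls`, `demo_dtls_send_alert`, the `shutdown_calls` counter) is hypothetical and only stands in for the real DTLS context; it is not how Janus implements this.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a DTLS context; not the real Janus struct. */
typedef struct demo_dtls {
    atomic_bool closing;   /* becomes true once a thread starts teardown */
    int shutdown_calls;    /* counts how often the real shutdown would run */
} demo_dtls;

static void demo_dtls_init(demo_dtls *d) {
    assert(d != NULL);
    atomic_init(&d->closing, false);
    d->shutdown_calls = 0;
}

/* Returns true only for the caller that actually performed the shutdown. */
static bool demo_dtls_send_alert(demo_dtls *d) {
    if (d == NULL)
        return false;
    /* atomic_exchange returns the previous value: true means another
     * caller already claimed the teardown, so we back off. */
    if (atomic_exchange(&d->closing, true))
        return false;
    d->shutdown_calls++;   /* placeholder for the actual shutdown call */
    return true;
}
```

With this guard, a second call to `demo_dtls_send_alert()` on the same context returns `false` without touching the underlying connection. Note that the flag only prevents a double shutdown; it does not by itself keep the struct alive if another thread frees it.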

Anyway, this is one of the things the reference counters branch, which we plan to merge soon, should fix. It was conceived to prevent multiple removals of that sort and, more importantly, access to broken pointers (e.g., race conditions where you try to access something that was freed in the meantime), so you may want to give that a try.

Since you've written your own plugin, in case you want to test that there are a few changes needed, but nothing major: in fact, you're not required to use reference counters for your own memory management, but you do need to increase/decrease the reference to the object that core and plugin use to talk to each other. The easiest way to see how this is done is just looking at the changes in the EchoTest plugin, which are tiny (you'll find them in the overall list of changes below):

https://github.com/meetecho/janus-gateway/pull/403/files
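As a rough illustration of the reference counting idea described above, the sketch below shows a shared object that is only freed when its last holder releases it. All names (`demo_session`, `demo_session_ref`, `demo_session_unref`, `demo_freed`) are made up for this example and are not the API of the reference counters branch; see the pull request for the real changes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical shared object; names are illustrative, not the Janus API. */
typedef struct demo_session {
    atomic_int refs;                         /* number of live holders */
    void (*on_free)(struct demo_session *);  /* runs once, on last release */
} demo_session;

static int demo_freed = 0;  /* demonstration only: counts destructions */

static void demo_session_destroy(demo_session *s) {
    demo_freed++;
    free(s);
}

static demo_session *demo_session_new(void) {
    demo_session *s = malloc(sizeof(*s));
    if (s == NULL)
        return NULL;
    atomic_init(&s->refs, 1);   /* the creator holds the first reference */
    s->on_free = demo_session_destroy;
    return s;
}

static void demo_session_ref(demo_session *s) {
    assert(s != NULL);
    atomic_fetch_add(&s->refs, 1);
}

static void demo_session_unref(demo_session *s) {
    assert(s != NULL);
    /* atomic_fetch_sub returns the old value; 1 means this call
     * dropped the last reference, so it is safe to free. */
    if (atomic_fetch_sub(&s->refs, 1) == 1)
        s->on_free(s);
}
```

In this scheme each side that keeps a pointer to the object (for example, a core and a plugin) calls `demo_session_ref()` when it stores the pointer and `demo_session_unref()` when it is done, so neither side can free the object while the other is still using it.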

@lminiero
Member

PS: as a short-term fix, if your client side already hangs up when the stream goes away, try avoiding close_pc or end_session from the plugin, to avoid the risk of causing concurrent cleanups (which, again, the reference counters branch should take care of, so I encourage you to look into it).

@sslivins
Author

I do find it happens less often if I don't call end_session(), but it doesn't go down to zero. If the ref count branch is stable enough, I'm happy to try it out.

@lminiero
Member

@sslivins can you check if this commit fixes it for you? It tries to better discipline the ways you can tear down a PeerConnection, including calls to close_pc/end_session: c63ddb3

@lminiero
Member

lminiero commented Jun 30, 2017

Closing as it fixed the issue for me. Feel free to open a new issue and provide updated details if it's still happening.
