Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix moutpoint_mutex deadlock when rtsp reconnect fail #2542

Merged
merged 10 commits into from
Apr 8, 2021

Conversation

lucylu-star
Copy link
Contributor

fix moutpoint_mutex deadlock when rtsp reconnect fail.

@januscla
Copy link

januscla commented Feb 2, 2021

Thanks for your contribution, @lucylu-star! Please make sure you sign our CLA, as it's a required step before we can merge this.

Copy link
Member

@lminiero lminiero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution, but this doesn't seem to work. Please see my notes below.

README.md Outdated Show resolved Hide resolved
plugins/janus_streaming.c Outdated Show resolved Hide resolved
plugins/janus_streaming.c Outdated Show resolved Hide resolved
Copy link
Member

@lminiero lminiero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some more notes. Thanks!

README.md Outdated Show resolved Hide resolved
plugins/janus_streaming.c Show resolved Hide resolved
plugins/janus_streaming.c Outdated Show resolved Hide resolved
@lucylu-star
Copy link
Contributor Author

del newline and brackets

@lucylu-star
Copy link
Contributor Author

hi lminiero,
I have adjust something and add comments.
the push request is ok or not now?
Should I adjust others?

Copy link
Member

@lminiero lminiero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added just one note inline. Thanks!

plugins/janus_streaming.c Show resolved Hide resolved
@lucylu-star
Copy link
Contributor Author

lucylu-star commented Feb 24, 2021 via email

@lminiero
Copy link
Member

I found out why we had made those changes. You can find the rationale in these two commits:

ced1c7a
262e997

The first commit started using mountpoints_mutex to avoid collisions on local ports, while the second limited the mutex usage only around janus_streaming_rtsp_parse_sdp, which is indeed the method that may then allocate new local ports. We use a slider for ports, and so need a global mutex to avoid different mountpoints taking the same port.

For this reason, we cannot accept changing the mutex from mountpoints_mutex to mp->mutex as you did. I think adding the checks on mp->destroyed as you did is correct, instead, so if you update your patch to only do that it can definitely be merged.

@lminiero
Copy link
Member

@lucylu-star ping 🙂
Please notice we just merged some changes to the reconnect code, which you can see in #2598

@eLvErDe
Copy link

eLvErDe commented Apr 3, 2021

Hello,

@lucylu-star I guess you just fixed a 3-years old bug occuring in production on a bi-monthly basis.
I'll provide more feedbacks tomorrow but it looks like I cannot reproduce anymore.

@lucylu-star got a paypal account or something ? Would like to send a small tip for that :)

Regards

@eLvErDe
Copy link

eLvErDe commented Apr 7, 2021

@lminiero confirmed fixing janus deadlock for us, please consider merging it or rewrite yourself if the submitter is gone ;-)

Regards

@lminiero
Copy link
Member

lminiero commented Apr 7, 2021

As I already said, it can't be merged as it is. If you can verify that limiting the patch to what I mentioned still fixes your issue, I can do a separate commit.

@eLvErDe
Copy link

eLvErDe commented Apr 7, 2021

As I already said, it can't be merged as it is. If you can verify that limiting the patch to what I mentioned still fixes your issue, I can do a separate commit.

I sure can, can you show me which line must be changed ? I'll try to get you a feedback today

@lminiero
Copy link
Member

lminiero commented Apr 7, 2021

It's explained in this comment I added a few weeks ago:

For this reason, we cannot accept changing the mutex from mountpoints_mutex to mp->mutex as you did. I think adding the checks on mp->destroyed as you did is correct, instead, so if you update your patch to only do that it can definitely be merged.

@eLvErDe
Copy link

eLvErDe commented Apr 7, 2021

Okay I guess I got it, will keep you informed asap

@lucylu-star
Copy link
Contributor Author

lucylu-star commented Apr 7, 2021 via email

@lminiero
Copy link
Member

lminiero commented Apr 7, 2021

@lucylu-star mh it looks like your merge did something weird, as I now see part of other commits (e.g., the RTSP reconnect PR) in your commits. Maybe this needs a rebase on master and a force push?

@lucylu-star
Copy link
Contributor Author

lucylu-star commented Apr 8, 2021 via email

@eLvErDe
Copy link

eLvErDe commented Apr 8, 2021

Yes you need your fix only, you need to force push your branch with --push after rebasing against current master.

And yes I'm serious about paypal, send me your email at [email protected]

@lminiero
Copy link
Member

lminiero commented Apr 8, 2021

Still the same issues with commits 🙁
Github reports conflicts too.

@eLvErDe
Copy link

eLvErDe commented Apr 8, 2021

@lucylu-star not a git expert right ? :p

If you want you can add me to your GitHub fork of janus, I'll fix it for you

@lucylu-star
Copy link
Contributor Author

lucylu-star commented Apr 8, 2021 via email

@lminiero
Copy link
Member

lminiero commented Apr 8, 2021

You need to rebase, not merge. That said, it might be easier to create a new PR from scratch.

@lucylu-star
Copy link
Contributor Author

lucylu-star commented Apr 8, 2021 via email

@lminiero
Copy link
Member

lminiero commented Apr 8, 2021

Now I see a rebase but the commits removed those new features instead 🤭
Please just create a new PR, it will be easier that way: I can close this one, no problem.

@lminiero
Copy link
Member

lminiero commented Apr 8, 2021

Thanks, looks good now! @eLvErDe could you give this patch a test?

@eLvErDe
Copy link

eLvErDe commented Apr 8, 2021

@lminiero kinda busy right now but I'll find some time

@eLvErDe
Copy link

eLvErDe commented Apr 8, 2021

@lminiero built and deployed, I asked someone else to perform the load test and will get back to you

@eLvErDe
Copy link

eLvErDe commented Apr 8, 2021

@lminiero confirmed, no crash after intensive testing

@lminiero
Copy link
Member

lminiero commented Apr 8, 2021

Thanks for the feedback, and thanks again to @lucylu-star for the contribution and patience: merging this then!

@lminiero lminiero merged commit 56d1470 into meetecho:master Apr 8, 2021
@eLvErDe
Copy link

eLvErDe commented Apr 8, 2021

well, I just got this really weird behavior:

20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1694606358, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1134624841, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1694606358, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1134624841, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1694606358, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1134624841, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1694606358, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1134624841, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1694606358, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1134624841, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1694606358, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1134624841, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1694606358, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1134624841, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1694606358, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1134624841, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1694606358, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1134624841, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1694606358, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1134624841, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1694606358, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1134624841, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1694606358, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1134624841, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1694606358, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1134624841, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1694606358, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1134624841, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1694606358, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1134624841, index 0)
20:31:36 janus janus[16443]: [100290576] New video stream! (ssrc=1694606358, index 0)

Do you think it can be related ?

@eLvErDe
Copy link

eLvErDe commented Apr 8, 2021

Just rolled back to 0.10.10 and I'm getting the same log so I guess it's related to something else

Oh god, what a day :/

@lminiero
Copy link
Member

lminiero commented Apr 8, 2021

I doubt it, this PR only interrupts any RTSP reconnect attempt if the mountpoint has been destroyed.

@eLvErDe
Copy link

eLvErDe commented Apr 8, 2021

Yes, just tested with previous version, got the same bug. Server shutdown for now others seems to be working correctly.
Any idea what it could be ?

@lminiero
Copy link
Member

lminiero commented Apr 8, 2021

No, and at any rate it shouldn't be discussed here. If you think it's a bug and not something you're doing wrong, please open a new issue.

@eLvErDe
Copy link

eLvErDe commented Apr 8, 2021

Well, I thought it could have been related. Nevermind, I have no idea why this crap happened and no way to reproduce so let's just forget about this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants