Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix EndTranscodingSession() call and potential race #2735

Merged
merged 1 commit into from
Jan 25, 2023

Conversation

cyberj0g
Copy link
Contributor

@cyberj0g cyberj0g commented Jan 25, 2023

Addresses #2732

Despite all efforts, I was yet unable to reproduce the error locally, however, I added a fix for related issue, and the fix which potentially should address this issue.

Findings

  • EndTranscodingSession() call here was failing, because of context cancelled error. It's not clear why this context, which is, presumably, goes all the way down from HandlePush(), is causing that error on gRPC communication, but it's consistently reproducible. Fixed by changing it to Context.Background().
  • EndTranscodingSession() is called again here on stream end. There might be a race condition between receiving from the stop channel and locking the mutex to cleanup the session on T. We had reports of noop transcoding jobs (empty segments?), could it be that it caused EndTranscodingSession() to be called twice in a very quick succession, if above context cancelled issue doesn't happen on all systems? Addressed by removing the session id from the map right in EndTranscodingSession() with mutex locked.

@codecov
Copy link

codecov bot commented Jan 25, 2023

Codecov Report

Merging #2735 (def4696) into master (af495ed) will increase coverage by 0.01272%.
The diff coverage is 50.00000%.

Impacted file tree graph

@@                 Coverage Diff                 @@
##              master       #2735         +/-   ##
===================================================
+ Coverage   56.33553%   56.34825%   +0.01272%     
===================================================
  Files             88          88                 
  Lines          19146       19147          +1     
===================================================
+ Hits           10786       10789          +3     
+ Misses          7768        7767          -1     
+ Partials         592         591          -1     
Impacted Files Coverage Δ
core/orchestrator.go 78.20690% <0.00000%> (+0.13793%) ⬆️
core/lb.go 80.11696% <50.00000%> (+0.70520%) ⬆️
server/rpc.go 72.02381% <100.00000%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update af495ed...def4696. Read the comment docs.

Impacted Files Coverage Δ
core/orchestrator.go 78.20690% <0.00000%> (+0.13793%) ⬆️
core/lb.go 80.11696% <50.00000%> (+0.70520%) ⬆️
server/rpc.go 72.02381% <100.00000%> (ø)

@cyberj0g cyberj0g merged commit d8ad5c7 into master Jan 25, 2023
@cyberj0g cyberj0g deleted the ip/fix-fv-errors branch January 25, 2023 15:38
eliteprox pushed a commit to eliteprox/go-livepeer that referenced this pull request Feb 3, 2023
@thomshutt thomshutt mentioned this pull request Feb 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants