Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server: Use http.Transport for HTTP client #2110

Merged
merged 3 commits into from
Nov 22, 2021
Merged

Conversation

yondonfu
Copy link
Member

@yondonfu yondonfu commented Nov 18, 2021

What does this pull request do? Explain your changes. (required)

Replace http2.Transport to have option to downgrade from http2 if needed.

We've observed what appears to be dead locks with multiple concurrent HTTP requests during segment submission from a B to O. According to golang/go#32388 there was a dead lock bug fixed in the http2 package that was backported into the latest go1.16 and go1.17. go-livepeer is currently built using go1.15. While we could upgrade to one of the go versions with this fix, we should probably do more testing when bumping to those go versions. So, this PR instead allows http2 to be disabled which should in theory bypass the dead lock http2 bug.

Specific updates (required)

See commit history.

How did you test each of these updates (required)

  • Ran a B with GODEBUG=http2debug=2 and observed http2 logs when B submits a segment to O.
  • Ran a B with GODEBUG=http2debug=2,http2client=0 and did not observe http2 logs when B submits a segment to O.

Note: We've yet to test whether running with GODEBUG=http2client=0 with these changes actually resolves the issue we observed with dead locking concurrent HTTP requests and it is difficult to repro that.

Does this pull request close any open issues?

N/A

Checklist:

},
ForceAttemptHTTP2: true,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the http.Transport docs and the http package code, ForceAttemptHTTP2: true is needed for the transport to try to upgrade to HTTP/2 if TLSClientConfig is non-nil or if custom dialers (i.e. DialTLSContext) are used. By including this, we can default to transparently supporting HTTP/2 while maintaining the flexibility to use HTTP/1 by running with GODEBUG=http2client=0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: You can add a comment into the code for the explanation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in ca94081

Copy link
Contributor

@leszko leszko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks fine. I just have one question.

Since if I understand correctly, we encountered the bug you mentioned on prod in livepeer.com. And x/net/http2 is used internally in net/http. Does it mean that we'll need to use http2client=0 in livepeer.com prod to avoid this bug?

},
ForceAttemptHTTP2: true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: You can add a comment into the code for the explanation.

@yondonfu yondonfu marked this pull request as ready for review November 19, 2021 20:41
Copy link
Member

@victorges victorges left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

cctx, cancel := context.WithTimeout(ctx, dialTimeout)
defer cancel()

tlsDialer := &tls.Dialer{Config: tlsConfig}
Copy link
Member

@victorges victorges Nov 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Should this be shared among multiple connections (declared on the same lvl as tlsConfig and httpClient or is it ok/advised to instantiate a new one for every dial?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure, but I haven't seen any information recommending re-using a single dialer or indicating that instantiating a new dialer for each dial could result in inefficiencies. One data point here is that the code for tls.Dial actually instantiates a new net.Dialer (not tls.Dialer, but I imagine both types would have the same considerations) under the hood whenever it is called so that makes me feel like instantiating a new tls.Dialer in our case is probably fine.

Copy link
Contributor

@darkdarkdragon darkdarkdragon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yondonfu
Copy link
Member Author

@leszko

Since if I understand correctly, we encountered the bug you mentioned on prod in livepeer.com. And x/net/http2 is used internally in net/http. Does it mean that we'll need to use http2client=0 in livepeer.com prod to avoid this bug?

Yeah that's the idea for now. And once we upgrade to the latest Go version (along with completing the necessary testing and review of the changelog) with the bug fix in the http2 package we could stop using GODEBUG=http2client altogether.

Replace http2.Transport to have option to downgrade from http2 if needed.
While also allowing HTTP/1 via GODEBUG runtime flags.
@yondonfu yondonfu merged commit 5a62585 into master Nov 22, 2021
@yondonfu yondonfu deleted the yf/http-transport branch November 22, 2021 14:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants