Client hang with hyper 0.14 (tokio, async-std) #2312

Open · pandaman64 opened this issue Oct 27, 2020 · 21 comments

@pandaman64

Context: we are investigating whether upgrading hyper to 0.13 fixes #2306, and it seems it does not.

Steps to reproduce

Prerequisites:

  • You need a running Docker daemon. The default settings should be fine, as the reproducer uses Unix domain sockets.
  • ulimit -n 65536 (increase the open-file limit)
  1. clone https://github.com/pandaman64/hyper-hang-tokio
  2. cargo run

Expected behavior

The program should keep running until system resources are exhausted.

Actual behavior

It hangs after an indeterminate number of iterations.

Log (last several iterations)
435
Oct 27 16:48:35.474 DEBUG hyper::client::pool: reuse idle connection for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0)
Oct 27 16:48:35.474 DEBUG hyper::proto::h1::io: flushed 76 bytes
Oct 27 16:48:35.475 DEBUG hyper::proto::h1::io: read 110 bytes
Oct 27 16:48:35.475 DEBUG hyper::proto::h1::io: parsed 3 headers
Oct 27 16:48:35.475 DEBUG hyper::proto::h1::conn: incoming body is empty
Oct 27 16:48:35.476 DEBUG hyper::client::pool: reuse idle connection for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0)
Oct 27 16:48:35.477 DEBUG hyper::proto::h1::io: flushed 75 bytes
Oct 27 16:48:35.477 DEBUG hyper::client::pool: pooling idle connection for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0)
Oct 27 16:48:35.478 DEBUG hyper::proto::h1::io: read 211 bytes
Oct 27 16:48:35.478 DEBUG hyper::proto::h1::io: parsed 7 headers
Oct 27 16:48:35.478 DEBUG hyper::proto::h1::conn: incoming body is chunked encoding
436
Oct 27 16:48:35.478 DEBUG hyper::client::pool: reuse idle connection for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0)
Oct 27 16:48:35.479 DEBUG hyper::proto::h1::io: flushed 76 bytes
Oct 27 16:48:35.479 DEBUG hyper::proto::h1::io: read 110 bytes
Oct 27 16:48:35.479 DEBUG hyper::proto::h1::io: parsed 3 headers
Oct 27 16:48:35.479 DEBUG hyper::proto::h1::conn: incoming body is empty
Oct 27 16:48:35.480 DEBUG hyper::client::pool: reuse idle connection for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0)
Oct 27 16:48:35.480 DEBUG hyper::client::pool: pooling idle connection for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0)
Oct 27 16:48:35.481 DEBUG hyper::proto::h1::io: flushed 75 bytes
Oct 27 16:48:35.481 DEBUG hyper::proto::h1::io: read 211 bytes
Oct 27 16:48:35.481 DEBUG hyper::proto::h1::io: parsed 7 headers
Oct 27 16:48:35.481 DEBUG hyper::proto::h1::conn: incoming body is chunked encoding
437
Oct 27 16:48:35.482 DEBUG hyper::client::pool: reuse idle connection for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0)
Oct 27 16:48:35.482 DEBUG hyper::proto::h1::io: flushed 76 bytes
Oct 27 16:48:35.482 DEBUG hyper::proto::h1::io: read 110 bytes
Oct 27 16:48:35.482 DEBUG hyper::proto::h1::io: parsed 3 headers
Oct 27 16:48:35.483 DEBUG hyper::proto::h1::conn: incoming body is empty
Oct 27 16:48:35.483 DEBUG hyper::client::pool: reuse idle connection for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0)
Oct 27 16:48:35.484 DEBUG hyper::proto::h1::io: flushed 75 bytes
Oct 27 16:48:35.484 DEBUG hyper::client::pool: pooling idle connection for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0)
Oct 27 16:48:35.484 DEBUG hyper::proto::h1::io: read 211 bytes
Oct 27 16:48:35.484 DEBUG hyper::proto::h1::io: parsed 7 headers
Oct 27 16:48:35.484 DEBUG hyper::proto::h1::conn: incoming body is chunked encoding
Oct 27 16:48:35.484 DEBUG hyper::client::pool: pooling idle connection for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0)
438
Oct 27 16:48:35.485 DEBUG hyper::client::pool: reuse idle connection for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0)

Reproducer

I'm pasting the reproducer here for ease of reference.

use futures::prelude::*;
fn main() {
    // env_logger::Builder::from_default_env()
    //     .target(env_logger::fmt::Target::Stdout)
    //     .init();
    tracing_subscriber::fmt::init();

    let mut runtime = tokio::runtime::Builder::new()
        .threaded_scheduler()
        .enable_all()
        .build()
        .unwrap();
    let client: hyper::Client<hyperlocal::UnixConnector> =
        hyper::Client::builder().build(hyperlocal::UnixConnector);
    for i in 0.. {
        println!("{}", i);
        let res = runtime
            .block_on(async {
                let _resp = client
                    .get(hyperlocal::Uri::new("/var/run/docker.sock", "//events?").into())
                    .await;
                client
                    .get(hyperlocal::Uri::new("/var/run/docker.sock", "/events?").into())
                    .await
            })
            .unwrap();
        runtime.spawn(res.into_body().into_future());
    }
}

Notes

  1. The double slash in the URI ("//events?") does matter; we need two requests with different URIs.
  2. I suspect this is a hyper issue because we can reproduce the same hang with both tokio and async-std.
@seanmonstar (Member)

Thanks for trying this out with the newer version. I wonder about using block_on so much, since the detection of an unusable connection happens in a spawned task that might not be getting run as much.

Have you seen the same issue if you change this to async/await?

#[tokio::main]
async fn main() {
    // env_logger::Builder::from_default_env()
    //     .target(env_logger::fmt::Target::Stdout)
    //     .init();
    tracing_subscriber::fmt::init();
    let client: hyper::Client<hyperlocal::UnixConnector> =
        hyper::Client::builder().build(hyperlocal::UnixConnector);
    for i in 0.. {
        println!("{}", i);
        let _resp = client
            .get(hyperlocal::Uri::new("/var/run/docker.sock", "//events?").into())
            .await;
        client
            .get(hyperlocal::Uri::new("/var/run/docker.sock", "/events?").into())
            .await;
    }
}

@pandaman64 (Author)

Thank you for the response!

I tried the following version without block_on, and it still hangs:

#[tokio::main]
async fn main() {
    // env_logger::Builder::from_default_env()
    //     .target(env_logger::fmt::Target::Stdout)
    //     .init();
    tracing_subscriber::fmt::init();

    let client: hyper::Client<hyperlocal::UnixConnector> =
        hyper::Client::builder().build(hyperlocal::UnixConnector);
    for i in 0.. {
        println!("{}", i);
        let _resp = client
            .get(hyperlocal::Uri::new("/var/run/docker.sock", "//events?").into())
            .await;
        let res = client
            .get(hyperlocal::Uri::new("/var/run/docker.sock", "/events?").into())
            .await
            .unwrap();

        tokio::spawn(res.into_body().into_future());
    }
}

The last tokio::spawn seems necessary for reproduction.

@pandaman64 (Author) commented Oct 27, 2020

The spawned future (res.into_body().into_future()) receives bytes when Docker operations happen (e.g. docker run --rm hello-world),
and I find that the program resumes looping after I run docker commands (and eventually hangs again).

Perhaps too many spawned tasks are blocking the .get() futures from resolving?

@sfackler (Contributor)

I ran your latest example in an Ubuntu WSL2 install and it ran just fine until I eventually killed it at iteration ~15,000. Are you sure the problem is with hyper and not your Docker driver not responding for some reason?

@pandaman64 (Author)

Hmm. My WSL2 box (Ubuntu 20.04.1 LTS (Focal Fossa)) does reproduce the hang.
In my environment, the async-std version tends to hang earlier. Did you try that?

@sfackler (Contributor)

I do see the hang with the async-std version after a couple hundred iterations, yeah.

@pandaman64 (Author)

My coworker reported that the following version randomly stops working when invoked repeatedly:

use futures::prelude::*;
#[tokio::main]
async fn main() {
    let args: Vec<String> = std::env::args().collect();
    env_logger::init();
    let client: hyper::Client<hyperlocal::UnixConnector> =
        hyper::Client::builder().build(hyperlocal::UnixConnector);
    let _resp = client
        .get(hyperlocal::Uri::new("/var/run/docker.sock", "//events").into()) // this uri can be "//"
        .await;
    let resp = client
        .get(hyperlocal::Uri::new("/var/run/docker.sock", "/events").into())
        .await
        .unwrap();
    tokio::spawn(resp.into_body().into_future());
    let _resp = client
        .get(hyperlocal::Uri::new("/var/run/docker.sock", "//events").into()) // this uri can be "//", too
        .await;
    println!("ok: {}", args[1]);
}

I couldn't reproduce it with RUST_LOG=debug, though RUST_LOG=trace or an empty RUST_LOG does reproduce it.

Trace log of the last invocation:
[2020-10-28T03:39:24Z TRACE want] signal: Want
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -> task
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -> task
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::conn] flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle } 
[2020-10-28T03:39:24Z TRACE want] poll_want: taker wants!
[2020-10-28T03:39:24Z TRACE hyper::client::pool] pool closed, canceling idle interval 
[2020-10-28T03:39:24Z TRACE tokio::util::trace] <- task
[2020-10-28T03:39:24Z TRACE hyper::client::pool] pool dropped, dropping pooled (("unix", 2f7661722f72756e2f646f636b65722e736f636b:0)) 
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -- task
[2020-10-28T03:39:24Z TRACE tokio::util::trace] <- task
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -- task
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::dispatch] client tx closed 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::conn] State::close_read() 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::conn] State::close_write() 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::conn] flushed({role=client}): State { reading: Closed, writing: Closed, keep_alive: Disabled } 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::conn] shut down IO complete 
[2020-10-28T03:39:24Z TRACE mio::poll] deregistering handle with poller
[2020-10-28T03:39:24Z TRACE want] signal: Closed
[2020-10-28T03:39:24Z TRACE tokio::util::trace] <- task
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -- task
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -- task
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -- task
[2020-10-28T03:39:24Z TRACE mio::poll] deregistering handle with poller
[2020-10-28T03:39:24Z TRACE want] signal: Closed
[2020-10-28T03:39:24Z TRACE want] signal found waiting giver, notifying
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -- task
[2020-10-28T03:39:24Z TRACE hyper::client::pool] checkout waiting for idle connection: ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0) 
[2020-10-28T03:39:24Z TRACE mio::poll] registering with poller
[2020-10-28T03:39:24Z TRACE hyper::client::conn] client handshake HTTP/1 
[2020-10-28T03:39:24Z TRACE hyper::client] handshake complete, spawning background dispatcher task 
[2020-10-28T03:39:24Z TRACE tokio::util::trace] task; kind=task future=futures_util::future::future::Map<futures_util::future::try_future::MapErr<hyper::client::conn::Connection<hyperlocal::client::UnixStream, hyper::body::body::Body>, hyper::client::Client<hyperlocal::client::UnixConnector>::connect_to::{{closure}}::{{closure}}::{{closure}}::{{closure}}>, hyper::client::Client<hyperlocal::client::UnixConnector>::connect_to::{{closure}}::{{closure}}::{{closure}}::{{closure}}>
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -> task
[2020-10-28T03:39:24Z TRACE want] signal: Want
[2020-10-28T03:39:24Z TRACE want] signal found waiting giver, notifying
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::conn] flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Busy } 
[2020-10-28T03:39:24Z TRACE want] poll_want: taker wants!
[2020-10-28T03:39:24Z TRACE tokio::util::trace] <- task
[2020-10-28T03:39:24Z TRACE hyper::client::pool] checkout dropped for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0) 
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -> task
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] encode_headers
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] -> encode_headers
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] Client::encode method=GET, body=None 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] <- encode_headers
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] -- encode_headers
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::io] detected no usage of vectored write, flattening 
[2020-10-28T03:39:24Z DEBUG hyper::proto::h1::io] flushed 69 bytes 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::conn] flushed({role=client}): State { reading: Init, writing: KeepAlive, keep_alive: Busy } 
[2020-10-28T03:39:24Z TRACE tokio::util::trace] <- task
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -> task
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::conn] Conn::read_head 
[2020-10-28T03:39:24Z DEBUG hyper::proto::h1::io] read 103 bytes 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] parse_headers
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] -> parse_headers
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] Response.parse([Header; 100], [u8; 103]) 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] Response.parse Complete(103) 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] <- parse_headers
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] -- parse_headers
[2020-10-28T03:39:24Z DEBUG hyper::proto::h1::io] parsed 3 headers 
[2020-10-28T03:39:24Z DEBUG hyper::proto::h1::conn] incoming body is empty 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::conn] maybe_notify; read_from_io blocked 
[2020-10-28T03:39:24Z TRACE tokio::util::trace] task; kind=task future=futures_util::future::future::Map<futures_util::future::poll_fn::PollFn<hyper::client::Client<hyperlocal::client::UnixConnector>::send_request::{{closure}}::{{closure}}::{{closure}}>, hyper::client::Client<hyperlocal::client::UnixConnector>::send_request::{{closure}}::{{closure}}::{{closure}}>
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -> task
[2020-10-28T03:39:24Z TRACE want] signal: Want
[2020-10-28T03:39:24Z TRACE hyper::client::pool] checkout waiting for idle connection: ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0) 
[2020-10-28T03:39:24Z TRACE want] signal found waiting giver, notifying
[2020-10-28T03:39:24Z TRACE tokio::util::trace] <- task
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -> task
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::conn] flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle } 
[2020-10-28T03:39:24Z TRACE want] poll_want: taker wants!
[2020-10-28T03:39:24Z TRACE hyper::client::pool] put; add idle connection for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0) 
[2020-10-28T03:39:24Z TRACE mio::poll] registering with poller
[2020-10-28T03:39:24Z TRACE hyper::client::pool] put; found waiter for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0) 
[2020-10-28T03:39:24Z DEBUG hyper::client::pool] reuse idle connection for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0) 
[2020-10-28T03:39:24Z TRACE tokio::util::trace] <- task
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -- task
[2020-10-28T03:39:24Z TRACE tokio::util::trace] task; kind=task future=futures_util::future::future::Map<futures_util::future::try_future::MapErr<hyper::common::lazy::Lazy<hyper::client::Client<hyperlocal::client::UnixConnector>::connect_to::{{closure}}, futures_util::future::either::Either<futures_util::future::try_future::AndThen<futures_util::future::try_future::MapErr<hyper::service::oneshot::Oneshot<hyperlocal::client::UnixConnector, http::uri::Uri>, hyper::error::Error::new_connect<std::io::error::Error>>, futures_util::future::either::Either<core::pin::Pin<alloc::boxed::Box<futures_util::future::try_future::MapOk<futures_util::future::try_future::AndThen<core::future::from_generator::GenFuture<hyper::client::conn::Builder::handshake<hyperlocal::client::UnixStream, hyper::body::body::Body>::{{closure}}>, futures_util::future::poll_fn::PollFn<hyper::client::conn::SendRequest<hyper::body::body::Body>::when_ready::{{closure}}>, hyper::client::Client<hyperlocal::client::UnixConnector>::connect_to::{{closure}}::{{closure}}::{{closure}}>, hyper::client::Client<hyperlocal::client::UnixConnector>::connect_to::{{closure}}::{{closure}}::{{closure}}>>>, futures_util::future::ready::Ready<core::result::Result<hyper::client::pool::Pooled<hyper::client::PoolClient<hyper::body::body::Body>>, hyper::error::Error>>>, hyper::client::Client<hyperlocal::client::UnixConnector>::connect_to::{{closure}}::{{closure}}>, futures_util::future::ready::Ready<core::result::Result<hyper::client::pool::Pooled<hyper::client::PoolClient<hyper::body::body::Body>>, hyper::error::Error>>>>, hyper::client::Client<hyperlocal::client::UnixConnector>::connection_for::{{closure}}::{{closure}}>, hyper::client::Client<hyperlocal::client::UnixConnector>::connection_for::{{closure}}::{{closure}}>
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -> task
[2020-10-28T03:39:24Z TRACE want] signal: Want
[2020-10-28T03:39:24Z TRACE hyper::client::conn] client handshake HTTP/1 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::conn] flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle } 
[2020-10-28T03:39:24Z TRACE tokio::util::trace] <- task
[2020-10-28T03:39:24Z TRACE hyper::client] handshake complete, spawning background dispatcher task 
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -> task
[2020-10-28T03:39:24Z TRACE tokio::util::trace] task; kind=task future=futures_util::future::future::Map<futures_util::future::try_future::MapErr<hyper::client::conn::Connection<hyperlocal::client::UnixStream, hyper::body::body::Body>, hyper::client::Client<hyperlocal::client::UnixConnector>::connect_to::{{closure}}::{{closure}}::{{closure}}::{{closure}}>, hyper::client::Client<hyperlocal::client::UnixConnector>::connect_to::{{closure}}::{{closure}}::{{closure}}::{{closure}}>
[2020-10-28T03:39:24Z TRACE tokio::util::trace] <- task
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -> task
[2020-10-28T03:39:24Z TRACE want] signal: Want
[2020-10-28T03:39:24Z TRACE want] signal found waiting giver, notifying
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] encode_headers
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::conn] flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Busy } 
[2020-10-28T03:39:24Z TRACE tokio::util::trace] <- task
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] -> encode_headers
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -> task
[2020-10-28T03:39:24Z TRACE want] poll_want: taker wants!
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] Client::encode method=GET, body=None 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] <- encode_headers
[2020-10-28T03:39:24Z TRACE hyper::client::pool] put; add idle connection for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0) 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] -- encode_headers
[2020-10-28T03:39:24Z DEBUG hyper::client::pool] pooling idle connection for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0) 
[2020-10-28T03:39:24Z DEBUG hyper::proto::h1::io] flushed 74 bytes 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::conn] flushed({role=client}): State { reading: Init, writing: KeepAlive, keep_alive: Busy } 
[2020-10-28T03:39:24Z TRACE tokio::util::trace] task; kind=task future=hyper::client::pool::IdleTask<hyper::client::PoolClient<hyper::body::body::Body>>
[2020-10-28T03:39:24Z TRACE tokio::util::trace] <- task
[2020-10-28T03:39:24Z TRACE tokio::util::trace] <- task
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -- task
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -> task
[2020-10-28T03:39:24Z TRACE tokio::util::trace] <- task
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -> task
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::conn] Conn::read_head 
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -> task
[2020-10-28T03:39:24Z TRACE hyper::client::pool] idle interval checking for expired 
[2020-10-28T03:39:24Z TRACE tokio::util::trace] <- task
[2020-10-28T03:39:24Z DEBUG hyper::proto::h1::io] read 211 bytes 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] parse_headers
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] -> parse_headers
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] Response.parse([Header; 100], [u8; 211]) 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] Response.parse Complete(211) 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] <- parse_headers
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::role] -- parse_headers
[2020-10-28T03:39:24Z DEBUG hyper::proto::h1::io] parsed 7 headers 
[2020-10-28T03:39:24Z DEBUG hyper::proto::h1::conn] incoming body is chunked encoding 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::decode] decode; state=Chunked(Size, 0) 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::decode] Read chunk hex size 
[2020-10-28T03:39:24Z TRACE hyper::client::pool] put; add idle connection for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0) 
[2020-10-28T03:39:24Z DEBUG hyper::client::pool] pooling idle connection for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0) 
[2020-10-28T03:39:24Z TRACE tokio::util::trace] task; kind=task future=futures_util::stream::stream::into_future::StreamFuture<hyper::body::body::Body>
[2020-10-28T03:39:24Z TRACE tokio::util::trace] -> task
[2020-10-28T03:39:24Z TRACE tokio::util::trace] <- task
[2020-10-28T03:39:24Z TRACE hyper::client::pool] take? ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0): expiration = Some(90s) 
[2020-10-28T03:39:24Z TRACE hyper::proto::h1::conn] flushed({role=client}): State { reading: Body(Chunked(Size, 0)), writing: KeepAlive, keep_alive: Busy } 
[2020-10-28T03:39:24Z DEBUG hyper::client::pool] reuse idle connection for ("unix", 2f7661722f72756e2f646f636b65722e736f636b:0) 
[2020-10-28T03:39:24Z TRACE tokio::util::trace] <- task

@pandaman64 (Author)

Another coworker reported that adding

    .pool_idle_timeout(std::time::Duration::from_millis(0))
    .pool_max_idle_per_host(0)

to the client builder is a workaround.
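
For reference, a minimal sketch (not from the thread; it assumes hyper 0.14 with hyperlocal, as in the reproducers above) of what the workaround looks like applied to the reproducer's client construction:

fn build_workaround_client() -> hyper::Client<hyperlocal::UnixConnector> {
    hyper::Client::builder()
        // Expire idle connections immediately...
        .pool_idle_timeout(std::time::Duration::from_millis(0))
        // ...and keep no idle connections per host, so every request opens a
        // fresh connection instead of reusing a pooled one.
        .pool_max_idle_per_host(0)
        .build(hyperlocal::UnixConnector)
}

Disabling pooling trades away connection reuse, which is why this only works around the hang rather than fixing it.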

nyuichi added a commit to nyuichi/dockworker that referenced this issue Nov 12, 2020
nyuichi added a commit to nyuichi/dockworker that referenced this issue Nov 12, 2020
lynzrand added a commit to BUAA-SE-Compiling/rurikawa that referenced this issue Nov 16, 2020
eldesh pushed a commit to Idein/dockworker that referenced this issue Nov 18, 2020
@pandaman64 (Author)

Update:

@pandaman64 pandaman64 changed the title Client hang with hyper 0.13 (tokio, async-std) Client hang with hyper 0.14 (tokio, async-std) Feb 12, 2021
@univerz commented Dec 2, 2021

  * I suspect there is something wrong with [the pool implementation](https://github.com/hyperium/hyper/blob/42587059e6175735b1a8656c5ddbff0edb19294c/src/client/pool.rs), but I couldn't find it.

There is a "TODO: unhack" inside Pool::reuse, but it's hard for me to say how serious it is.

From my debugging it looks like there is no initial Connection poll (and consequently, through ProtoClient -> Dispatcher -> Conn, the request never gets encoded), but I'm having a surprisingly hard time figuring out where that poll should be initiated so I can investigate further.

@univerz commented Dec 3, 2021

I have added an id field to Connection & SendRequest, plus a few tracing events. In the log below you can see the end of the previous successful "round" and the last, unsuccessful round, separated by *****. Between them there is some spawning, handles, threads, etc.

send_request_retryable{id=1}: hyper::client::conn: before rx is emitted after dispatch.try_send and before rx.then, and hyper::client::conn: after rx=1 is emitted inside it if successful.
poll{id=1}: hyper::client::conn: called indicates Connection::poll.
There are several more events inside PoolInner::spawn_idle_interval and Pool::reuse, but the execution paths always look the same.

When the problem arises, it is always with the last dropped/pooled Connection from the previous round, in this case id=2.

Any pointers on where to look, @seanmonstar?

2021-12-03T10:36:33.319828Z DEBUG hyper::client::pool: reuse idle connection for ("http", localhost:8545)
2021-12-03T10:36:33.319833Z  WARN hyper::client::pool: can't share
2021-12-03T10:36:33.319836Z  WARN hyper::client::pool: downgrade
2021-12-03T10:36:33.319845Z  WARN send_request_retryable{id=2}: hyper::client::conn: before rx
2021-12-03T10:36:33.319859Z DEBUG hyper::client::pool: reuse idle connection for ("http", localhost:8545)
2021-12-03T10:36:33.319863Z  WARN hyper::client::pool: can't share
2021-12-03T10:36:33.319865Z  WARN hyper::client::pool: downgrade
2021-12-03T10:36:33.319874Z  WARN send_request_retryable{id=3}: hyper::client::conn: before rx
2021-12-03T10:36:33.319883Z  WARN poll{id=2}: hyper::client::conn: called
2021-12-03T10:36:33.319893Z DEBUG hyper::client::pool: reuse idle connection for ("http", localhost:8545)
2021-12-03T10:36:33.319898Z  WARN hyper::client::pool: can't share
2021-12-03T10:36:33.319900Z  WARN hyper::client::pool: downgrade
2021-12-03T10:36:33.319904Z DEBUG poll{id=2}: hyper::proto::h1::io: flushed 183 bytes
2021-12-03T10:36:33.319909Z  WARN send_request_retryable{id=0}: hyper::client::conn: before rx
2021-12-03T10:36:33.319913Z  WARN poll{id=3}: hyper::client::conn: called
2021-12-03T10:36:33.320836Z  WARN poll{id=0}: hyper::client::conn: called
2021-12-03T10:36:33.320841Z DEBUG poll{id=3}: hyper::proto::h1::io: flushed 182 bytes
2021-12-03T10:36:33.320861Z DEBUG poll{id=0}: hyper::proto::h1::io: flushed 219 bytes
2021-12-03T10:36:33.321024Z  WARN poll{id=3}: hyper::client::conn: called
2021-12-03T10:36:33.321047Z DEBUG poll{id=3}: hyper::proto::h1::io: parsed 5 headers
2021-12-03T10:36:33.321053Z DEBUG poll{id=3}: hyper::proto::h1::conn: incoming body is content-length (66 bytes)
2021-12-03T10:36:33.321060Z DEBUG poll{id=3}: hyper::proto::h1::conn: incoming body completed
2021-12-03T10:36:33.321076Z  WARN hyper::client::conn: after rx=3
2021-12-03T10:36:33.321083Z DEBUG drop: hyper::client::pool: pooling idle connection for ("http", localhost:8545)
2021-12-03T10:36:33.321088Z  WARN drop:spawn_idle_interval: hyper::client::pool: idle_interval_ref is_some
2021-12-03T10:36:33.321094Z DEBUG reqwest::async_impl::client: response '200 OK' for http://localhost:8545/    
2021-12-03T10:36:33.321148Z  WARN poll{id=3}: hyper::client::conn: called
2021-12-03T10:36:33.322250Z  WARN poll{id=0}: hyper::client::conn: called
2021-12-03T10:36:33.322262Z DEBUG poll{id=0}: hyper::proto::h1::io: parsed 5 headers
2021-12-03T10:36:33.322266Z DEBUG poll{id=0}: hyper::proto::h1::conn: incoming body is chunked encoding
2021-12-03T10:36:33.322272Z DEBUG poll{id=0}: hyper::proto::h1::decode: incoming chunked header: 0x800 (2048 bytes)
2021-12-03T10:36:33.322281Z  WARN hyper::client::conn: after rx=0
2021-12-03T10:36:33.322289Z DEBUG reqwest::async_impl::client: response '200 OK' for http://localhost:8545/    
2021-12-03T10:36:33.322357Z  WARN poll{id=0}: hyper::client::conn: called
2021-12-03T10:36:33.322362Z DEBUG poll{id=0}: hyper::proto::h1::decode: incoming chunked header: 0x800 (2048 bytes)
2021-12-03T10:36:33.322391Z  WARN poll{id=0}: hyper::client::conn: called
2021-12-03T10:36:33.322415Z  WARN poll{id=0}: hyper::client::conn: called
2021-12-03T10:36:33.322421Z DEBUG poll{id=0}: hyper::proto::h1::decode: incoming chunked header: 0x723 (1827 bytes)
2021-12-03T10:36:33.322534Z  WARN poll{id=0}: hyper::client::conn: called
2021-12-03T10:36:33.322541Z DEBUG poll{id=0}: hyper::proto::h1::conn: incoming body completed
2021-12-03T10:36:33.322551Z DEBUG drop: hyper::client::pool: pooling idle connection for ("http", localhost:8545)
2021-12-03T10:36:33.322557Z  WARN drop:spawn_idle_interval: hyper::client::pool: idle_interval_ref is_some
2021-12-03T10:36:33.323183Z  WARN poll{id=2}: hyper::client::conn: called
2021-12-03T10:36:33.323206Z DEBUG poll{id=2}: hyper::proto::h1::io: parsed 5 headers
2021-12-03T10:36:33.323212Z DEBUG poll{id=2}: hyper::proto::h1::conn: incoming body is content-length (75 bytes)
2021-12-03T10:36:33.323218Z DEBUG poll{id=2}: hyper::proto::h1::conn: incoming body completed
2021-12-03T10:36:33.323229Z  WARN hyper::client::conn: after rx=2
2021-12-03T10:36:33.323236Z DEBUG drop: hyper::client::pool: pooling idle connection for ("http", localhost:8545)
2021-12-03T10:36:33.323241Z  WARN drop:spawn_idle_interval: hyper::client::pool: idle_interval_ref is_some
2021-12-03T10:36:33.323247Z DEBUG reqwest::async_impl::client: response '200 OK' for http://localhost:8545/    

************* setup_evm *************

2021-12-03T10:36:33.323408Z DEBUG hyper::client::pool: reuse idle connection for ("http", localhost:8545)
2021-12-03T10:36:33.323419Z  WARN hyper::client::pool: can't share
2021-12-03T10:36:33.323422Z  WARN hyper::client::pool: downgrade
2021-12-03T10:36:33.323433Z  WARN send_request_retryable{id=2}: hyper::client::conn: before rx
2021-12-03T10:36:33.323449Z DEBUG hyper::client::pool: reuse idle connection for ("http", localhost:8545)
2021-12-03T10:36:33.323454Z  WARN hyper::client::pool: can't share
2021-12-03T10:36:33.323456Z  WARN hyper::client::pool: downgrade
2021-12-03T10:36:33.323465Z  WARN send_request_retryable{id=0}: hyper::client::conn: before rx
2021-12-03T10:36:33.323481Z DEBUG hyper::client::pool: reuse idle connection for ("http", localhost:8545)
2021-12-03T10:36:33.323486Z  WARN hyper::client::pool: can't share
2021-12-03T10:36:33.323489Z  WARN hyper::client::pool: downgrade
2021-12-03T10:36:33.323495Z  WARN poll{id=0}: hyper::client::conn: called
2021-12-03T10:36:33.323509Z  WARN poll{id=3}: hyper::client::conn: called
2021-12-03T10:36:33.323521Z DEBUG poll{id=0}: hyper::proto::h1::io: flushed 262 bytes
2021-12-03T10:36:33.323499Z  WARN send_request_retryable{id=3}: hyper::client::conn: before rx
2021-12-03T10:36:33.323530Z DEBUG poll{id=3}: hyper::proto::h1::io: flushed 250 bytes
2021-12-03T10:36:33.323728Z  WARN poll{id=0}: hyper::client::conn: called
2021-12-03T10:36:33.323747Z DEBUG poll{id=0}: hyper::proto::h1::io: parsed 5 headers
2021-12-03T10:36:33.323752Z DEBUG poll{id=0}: hyper::proto::h1::conn: incoming body is content-length (66 bytes)
2021-12-03T10:36:33.323761Z DEBUG poll{id=0}: hyper::proto::h1::conn: incoming body completed
2021-12-03T10:36:33.323766Z  WARN hyper::client::conn: after rx=0
2021-12-03T10:36:33.323776Z DEBUG drop: hyper::client::pool: pooling idle connection for ("http", localhost:8545)
2021-12-03T10:36:33.323783Z  WARN drop:spawn_idle_interval: hyper::client::pool: idle_interval_ref is_some
2021-12-03T10:36:33.323789Z DEBUG reqwest::async_impl::client: response '200 OK' for http://localhost:8545/    
2021-12-03T10:36:33.323839Z  WARN poll{id=0}: hyper::client::conn: called
2021-12-03T10:36:33.323867Z  WARN poll{id=3}: hyper::client::conn: called
2021-12-03T10:36:33.323881Z DEBUG poll{id=3}: hyper::proto::h1::io: parsed 5 headers
2021-12-03T10:36:33.323885Z DEBUG poll{id=3}: hyper::proto::h1::conn: incoming body is content-length (1167 bytes)
2021-12-03T10:36:33.323893Z DEBUG poll{id=3}: hyper::proto::h1::conn: incoming body completed
2021-12-03T10:36:33.323896Z  WARN hyper::client::conn: after rx=3
2021-12-03T10:36:33.323903Z DEBUG drop: hyper::client::pool: pooling idle connection for ("http", localhost:8545)
2021-12-03T10:36:33.323908Z  WARN drop:spawn_idle_interval: hyper::client::pool: idle_interval_ref is_some
2021-12-03T10:36:33.323914Z DEBUG reqwest::async_impl::client: response '200 OK' for http://localhost:8545/    
2021-12-03T10:36:33.323964Z  WARN poll{id=3}: hyper::client::conn: called
2021-12-03T10:38:02.859635Z  WARN hyper::client::pool: IdleTask::poll
2021-12-03T10:38:02.894832Z  WARN hyper::client::pool: IdleTask::poll
2021-12-03T10:38:33.182905Z  WARN poll{id=1}: hyper::client::conn: called
2021-12-03T10:38:33.182958Z  WARN poll{id=1}: hyper::client::conn: shutdown
2021-12-03T10:38:33.189008Z  WARN poll{id=4}: hyper::client::conn: called
2021-12-03T10:38:33.189042Z  WARN poll{id=4}: hyper::client::conn: shutdown
2021-12-03T10:38:33.324401Z  WARN poll{id=3}: hyper::client::conn: called
2021-12-03T10:38:33.324443Z  WARN poll{id=0}: hyper::client::conn: called
2021-12-03T10:38:33.324463Z  WARN poll{id=3}: hyper::client::conn: shutdown
2021-12-03T10:38:33.324475Z  WARN poll{id=0}: hyper::client::conn: shutdown

@univerz commented Dec 3, 2021

Ah, there is some difference. The round before every successful one ends with a chunked-encoding response, and that accidentally prevents reusing the connection. Notice that reqwest returns before the chunked body is completed, but on the client side the body is consumed:

let res = self.client.post(self.url.as_ref()).json(&payload).send().await?;
let text = res.text().await?;

Every round is spawned inside a stream, but it was debugged with .buffer_unordered(1), so processing is serial.
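
Roughly, the calling pattern being described — a shared reqwest client driven through a stream with .buffer_unordered(1) — looks like the sketch below (my reconstruction, not univerz's actual code; the function name, URL/payload types, and the use of reqwest's json feature with serde_json are assumptions):

use futures::stream::{self, StreamExt};

async fn run_rounds(
    client: reqwest::Client,
    url: String,
    payloads: Vec<serde_json::Value>,
) -> Result<(), reqwest::Error> {
    let mut rounds = stream::iter(payloads)
        .map(|payload| {
            let client = client.clone();
            let url = url.clone();
            async move {
                // Same pattern as the snippet above: send, then read the whole body.
                let res = client.post(url.as_str()).json(&payload).send().await?;
                let _text = res.text().await?;
                Ok::<(), reqwest::Error>(())
            }
        })
        // A concurrency limit of 1 means rounds are processed one at a time,
        // matching the serial behaviour described above.
        .buffer_unordered(1);

    while let Some(result) = rounds.next().await {
        result?;
    }
    Ok(())
}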

2021-12-03T11:18:55.002783Z DEBUG poll{id=3}: hyper::proto::h1::conn: incoming body is chunked encoding
2021-12-03T11:18:55.002789Z DEBUG reqwest::async_impl::client: returning poll::ready    
2021-12-03T11:18:55.002795Z DEBUG poll{id=3}: hyper::proto::h1::decode: incoming chunked header: 0x800 (2048 bytes)
2021-12-03T11:18:55.002848Z  WARN hyper::client::conn: after rx=3
2021-12-03T11:18:55.002858Z DEBUG reqwest::async_impl::client: response '200 OK' for http://localhost:8545/
2021-12-03T11:18:55.002862Z DEBUG reqwest::async_impl::client: returning poll::ready
2021-12-03T11:18:55.002864Z  WARN poll{id=2}: hyper::client::conn: called
2021-12-03T11:18:55.002904Z  WARN poll{id=3}: hyper::client::conn: called
2021-12-03T11:18:55.002910Z DEBUG poll{id=3}: hyper::proto::h1::decode: incoming chunked header: 0x800 (2048 bytes)
2021-12-03T11:18:55.002938Z  WARN poll{id=3}: hyper::client::conn: called
2021-12-03T11:18:55.002943Z DEBUG poll{id=3}: hyper::proto::h1::decode: incoming chunked header: 0x2BC (700 bytes)

************* setup_evm *************

2021-12-03T11:18:55.003152Z  WARN reqwest::async_impl::request: returning from reqwest::send()
2021-12-03T11:18:55.003169Z DEBUG hyper::client::pool: reuse idle connection for ("http", localhost:8545)
2021-12-03T11:18:55.003173Z  WARN hyper::client::pool: can't share
2021-12-03T11:18:55.003176Z  WARN hyper::client::pool: downgrade
2021-12-03T11:18:55.003188Z  WARN send_request_retryable{id=2}: hyper::client::conn: before rx
2021-12-03T11:18:55.003209Z  WARN reqwest::async_impl::request: returning from reqwest::send()
2021-12-03T11:18:55.003216Z DEBUG hyper::client::pool: reuse idle connection for ("http", localhost:8545)
2021-12-03T11:18:55.003219Z  WARN hyper::client::pool: can't share
2021-12-03T11:18:55.003217Z  WARN poll{id=2}: hyper::client::conn: called
2021-12-03T11:18:55.003223Z  WARN hyper::client::pool: downgrade
2021-12-03T11:18:55.003241Z  WARN send_request_retryable{id=0}: hyper::client::conn: before rx
2021-12-03T11:18:55.003248Z DEBUG poll{id=2}: hyper::proto::h1::io: flushed 253 bytes
2021-12-03T11:18:55.003250Z  WARN poll{id=0}: hyper::client::conn: called
2021-12-03T11:18:55.003257Z  WARN reqwest::async_impl::request: returning from reqwest::send()
2021-12-03T11:18:55.003267Z DEBUG hyper::client::pool: reuse idle connection for ("http", localhost:8545)
2021-12-03T11:18:55.003270Z  WARN hyper::client::pool: can't share
2021-12-03T11:18:55.003273Z  WARN hyper::client::pool: downgrade
2021-12-03T11:18:55.003278Z DEBUG poll{id=0}: hyper::proto::h1::io: flushed 262 bytes
2021-12-03T11:18:55.003281Z  WARN send_request_retryable{id=1}: hyper::client::conn: before rx
...
2021-12-03T11:18:55.008947Z  WARN poll{id=3}: hyper::client::conn: called
2021-12-03T11:18:55.008952Z DEBUG poll{id=3}: hyper::proto::h1::conn: incoming body completed

@univerz commented Dec 3, 2021

Comparing the problematic content-length-body request at trace log level, the hang occurs when it is the last completed request and the round ends with:

2021-12-03T12:00:52.385383Z  WARN poll{id=1}: hyper::client::conn: called
2021-12-03T12:00:52.385391Z TRACE poll{id=1}: hyper::proto::h1::conn: Conn::read_head
2021-12-03T12:00:52.385406Z TRACE poll{id=1}: hyper::proto::h1::io: received 222 bytes
2021-12-03T12:00:52.385412Z TRACE poll{id=1}:parse_headers: hyper::proto::h1::role: Response.parse bytes=222
2021-12-03T12:00:52.385418Z TRACE poll{id=1}:parse_headers: hyper::proto::h1::role: Response.parse Complete(146)
2021-12-03T12:00:52.385427Z DEBUG poll{id=1}: hyper::proto::h1::io: parsed 5 headers
2021-12-03T12:00:52.385431Z DEBUG poll{id=1}: hyper::proto::h1::conn: incoming body is content-length (76 bytes)
2021-12-03T12:00:52.385437Z TRACE poll{id=1}: hyper::proto::h1::decode: decode; state=Length(76)
2021-12-03T12:00:52.385442Z DEBUG poll{id=1}: hyper::proto::h1::conn: incoming body completed
2021-12-03T12:00:52.385447Z TRACE poll{id=1}: hyper::proto::h1::conn: maybe_notify; read_from_io blocked
2021-12-03T12:00:52.385453Z TRACE poll{id=1}: want: signal: Want    
2021-12-03T12:00:52.385457Z TRACE poll{id=1}: hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle }
2021-12-03T12:00:52.385462Z TRACE poll{id=1}: want: signal: Want    
2021-12-03T12:00:52.385465Z TRACE poll{id=1}: hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle }
2021-12-03T12:00:52.385473Z  WARN rpc{method="eth_gasPrice" params="null"}: hyper::client::conn: after rx=1
2021-12-03T12:00:52.385480Z TRACE rpc{method="eth_gasPrice" params="null"}:drop: hyper::client::pool: put; add idle connection for ("http", localhost:8545)
2021-12-03T12:00:52.385487Z DEBUG rpc{method="eth_gasPrice" params="null"}:drop: hyper::client::pool: pooling idle connection for ("http", localhost:8545)
2021-12-03T12:00:52.385492Z  WARN rpc{method="eth_gasPrice" params="null"}:drop:spawn_idle_interval: hyper::client::pool: idle_interval_ref is_some
2021-12-03T12:00:52.385501Z DEBUG rpc{method="eth_gasPrice" params="null"}: reqwest::async_impl::client: response '200 OK' for http://localhost:8545/    
2021-12-03T12:00:52.385507Z DEBUG rpc{method="eth_gasPrice" params="null"}: reqwest::async_impl::client: returning poll::ready    
2021-12-03T12:00:52.385546Z TRACE rpc{method="eth_gasPrice" params="null"}: tokio_util::codec::framed_impl: attempting to decode a frame    
2021-12-03T12:00:52.385551Z TRACE rpc{method="eth_gasPrice" params="null"}: tokio_util::codec::framed_impl: frame decoded from buffer    
2021-12-03T12:00:52.385555Z TRACE rpc{method="eth_gasPrice" params="null"}: tokio_util::codec::framed_impl: attempting to decode a frame    
2021-12-03T12:00:52.385569Z TRACE rpc{method="eth_gasPrice" params="null"}: ethers_providers::provider: rx="\"0x183f284dd0\""

But in the successful case (when it is not the last completed request in the round), this last poll happens later, with a third flush (notice that it happens after pooling the idle connection):

2021-12-03T12:00:52.385580Z  WARN poll{id=1}: hyper::client::conn: called
2021-12-03T12:00:52.385585Z TRACE poll{id=1}: want: signal: Want    
2021-12-03T12:00:52.385589Z TRACE poll{id=1}: hyper::proto::h1::conn: flushed({role=client}): State { reading: Init, writing: Init, keep_alive: Idle }

It looks to me like this should be close to enough information for someone with a deep understanding of the internals to fix this. I'm happy to investigate further.

@threema-danilo

I'm experiencing a similar issue.

In an application I'm working on, I'm sending multiple requests (from a shared reqwest Client instance) from a future (which is launched through handle.block_on(...) within a worker thread). The first three HTTP requests reproducibly succeed, while the fourth request to the same host reproducibly fails.

According to a Wireshark trace, the fourth request is never sent to the network. I added debug logging deep down into hyper, but didn't really find the source of the issue. I suspected that it's related to connection reuse (because the first three requests work), and just as described above, setting .pool_max_idle_per_host(0) on the client makes it work again.
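
For completeness, a minimal sketch (my wording, not the poster's code; it assumes reqwest 0.11, whose ClientBuilder exposes pool_max_idle_per_host directly) of that workaround on the reqwest side:

fn build_reqwest_client() -> reqwest::Result<reqwest::Client> {
    reqwest::Client::builder()
        // Keep no idle connections per host: every request opens a fresh
        // connection instead of checking out a pooled one, sidestepping the
        // hang described in this issue.
        .pool_max_idle_per_host(0)
        .build()
}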

In case it's useful, here are a few screenshots from my debugging session. The reqwest library delegates the HTTP request to hyper:

[screenshot omitted]

When looking at the program using tokio-console, I noticed that the task 72 (with Hyper's Exec::execute method) is reported as having lost its waker. I don't know if this could be relevant.

[tokio-console screenshots omitted]

@jobot0 commented Jun 23, 2022

Hello, is there any news about this one?

@arthurprs (Contributor)

We were hit by #2312 yesterday (and I think once before) and tracked it down to this. The example program from #2312 can easily show the problem.

We're using pool_max_idle_per_host(0) while no better solution exists 😭

@kpark-hrp

I just wasted my entire day because of this. This also affects reqwest, obviously, because it is built on hyper underneath. Here is how I encountered this bug:

    std::thread::spawn(move || {
        // `.enable_all()` is needed so the runtime has the IO (and timer)
        // drivers that reqwest relies on.
        let a = tokio::runtime::Builder::new_multi_thread()
            .enable_all()
            .build()
            .unwrap();
        a.block_on(
            async {
                // reqwest call here then `.await`
            }
        );
    }).join().expect("Thread panicked")

But .pool_max_idle_per_host(0) did the trick as a workaround.

olanod pushed a commit to virto-network/conduit that referenced this issue Feb 14, 2023
This is a workaround for hyperium/hyper#2312
@numberjuani

Is there any timeline for fixing this, or have we given up?

@GunnarMorrigan

I have a two-step container build process to reduce container size. In the initial build container everything works fine. However, in my second container it breaks with the same behaviour as described here.

Could this be the same issue, or is it the kind of Docker driver issue that @sfackler described above? It behaves the same under Docker and Podman.

2023-09-20T18:00:00.716743Z TRACE hyper::client::pool: 638: checkout waiting for idle connection: ("https", _URL_HERE_)
2023-09-20T18:00:00.716810Z DEBUG reqwest::connect: 429: starting new connection: https://_URL_HERE_/    
2023-09-20T18:00:00.716878Z TRACE hyper::client::connect::http: 278: Http::connect; scheme=Some("https"), host=Some("_URL_HERE_"), port=None
2023-09-20T18:00:00.717060Z DEBUG hyper::client::connect::dns: 122: resolving host="_URL_HERE_"
2023-09-20T18:00:00.905903Z DEBUG hyper::client::connect::http: 537: connecting to IP:443
2023-09-20T18:00:00.909242Z DEBUG hyper::client::connect::http: 540: connected to IP:443
2023-09-20T18:00:00.993186Z TRACE hyper::client::pool: 680: checkout dropped for ("https", _URL_HERE_)
2023-09-20T18:00:00.993237Z DEBUG azure_core::policies::retry_policies::retry_policy: 114: io error occurred when making request which will be retried: failed to execute `reqwest` request    

demoray pushed a commit to demoray/azure-sdk-for-rust that referenced this issue Jan 5, 2024
As indicated in Azure#1549, there is an issue with hyper (the underlying
layer used by reqwest) that hangs in some cases on connection pools.
This PR uses a commonly discussed workaround of setting
`pool_max_idle_per_host` to 0.

Ref: hyperium/hyper#2312
demoray added a commit to Azure/azure-sdk-for-rust that referenced this issue Jan 5, 2024
As indicated in #1549, there is an issue with hyper (the underlying
layer used by reqwest) that hangs in some cases on connection pools.
This PR uses a commonly discussed workaround of setting
`pool_max_idle_per_host` to 0.

Ref: hyperium/hyper#2312
jevolk added a commit to jevolk/hyper that referenced this issue Mar 18, 2024
girlbossceo pushed a commit to girlbossceo/hyper that referenced this issue Mar 19, 2024
andybalaam added a commit to matrix-org/matrix-rust-sdk that referenced this issue Mar 22, 2024
@rami3l commented Jun 6, 2024

@jevolk Just being curious: I saw you pushed a commit in a fork (jevolk@56a64cf) that seems to have "[fixed] the deadlock". Would you mind shedding some light on this issue?

@jevolk commented Jun 17, 2024

> @jevolk Just being curious: I saw you pushed a commit in a fork (jevolk@56a64cf) that seems to have "[fixed] the deadlock". Would you mind shedding some light on this issue?

The patch appears to prevent the deadlock for the https://github.com/pandaman64/hyper-hang-tokio test, but it is not the best solution. I've observed some regressions when using it under normal circumstances, which is why I never opened a PR. I left it for future reference.

I should note that my application makes an arguably exotic use of reqwest and hyper (legacy) client pools, communicating with thousands of remote hosts in a distributed system, often with several connections each, in an async + multi-threaded process; we do not set pool_max_idle_per_host=0. I have never hit this deadlock and no users have ever reported it, so I have not revisited this issue in some time.
