
GetConn waiting queue as FIFO queue #212

Closed
cloneable opened this issue Nov 11, 2022 · 3 comments · Fixed by #213

Comments

@cloneable
Contributor

cloneable commented Nov 11, 2022

Hi, we noticed that, under higher load, get_conn() requests get served out of order, resulting in high tail latencies. This happens when a queued waker is called, but before its GetConn is polled, a newer GetConn grabs the available DB connection. On top of that, the previously queued GetConn gets pushed to the back of the waiting queue, adding even more latency.

Would you consider turning the waiting queue in Exchange into a FIFO queue? I have a patch that we tested in prod and it shows much lower tail latencies for us. See #213.

@cloneable
Copy link
Contributor Author

I wonder if this could be done more easily by marking connections as reserved. The main problem is that woken GetConns don't get the connections they expect. So maybe we can simply move a connection from available: Vec to a new reserved: Vec, and have woken GetConns take their connection from there.

@blackbeam
Owner

Hi. Hmm, hard to say, really. A reserved connection may end up being dead, which could add complexity.
Btw, I've updated your PR, could you please take a look? The main idea is to keep the queue position for a caller whose connection ends up being dead, and to tidy up the Future impl.

@cloneable
Contributor Author

Yes, I looked at adding a reserved: Vec to Exchange earlier, but keeping track of where things are makes this pretty complex, it's true.

I looked at your commit. Looks great! Much simpler and covers the case when a connection is bad.

What are the next steps? Do you need me to do anything?
