-
-
Notifications
You must be signed in to change notification settings - Fork 147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
client doesn't disconnect when ping task exits #127
Comments
Okay. I don't think I've ever tried putting the server machine to sleep. Just to confirm, the system works correctly if you kill the server, correct? So the problem is only when the server goes to sleep. |
Yeah, I know, it's an odd situation, but at least it resembles the problem I started out investigating :-) Yes, simply killing/terminating the server works as expected - been doing that more than enough lately ;-) Isn't the client supposed to disconnect when the ping times out? |
Yes, it should. It's a bug. |
I've been further looking into this. The "problem" seems to be related to the HTTP requests that simply wait forever. Now there are two calls into requests, a POST and a long-polling GET. The post seems easy, adding Maybe if we take the ping interval into consideration, we know that a polling GET only lives as long as the ping interval, so we could take that interval as the timeout for the polling, as it should have already returned the pong anyway. I was already using an overridden EngineIOClient to allow for unverified SSL connections (see #99 ) So, the following "patch" to this "patch" seems to do the trick for me in this specific case where I put the server process to sleep. I'll have to see if it also solves the similar problem on my production machines. class MyEngineIoClient(eio.Client):
def _send_request(self, method, url, headers=None, body=None):
if self.http is None:
self.http = requests.Session()
try:
wait_for = 10 if method == "POST" else 45
return self.http.request(method, url, timeout=wait_for, headers=headers, data=body, verify=False)
except requests.exceptions.Timeout:
self.logger.info("{0} request to {1} timed out after {2}".format(method, url, wait_for))
except requests.exceptions.ConnectionError:
pass Now, both the write loop (first when the sending of the ping fails) and (max) 30 seconds later the read loop also exits because no pong is received, and the client actually disconnects, terminates its [2019-07-12 17:47:00 +0200] [engineio.client] [9485] [WARNING] POST request to https://localhost:8000/socket.io/?transport=polling&EIO=3&sid=8fbdefc9c0804b3d9cb0b8d67635effc timed out after 10
[2019-07-12 17:47:00 +0200] [engineio.client] [9485] [WARNING] Connection refused by the server, aborting
...
[2019-07-12 17:47:10 +0200] [engineio.client] [9485] [WARNING] GET request to https://localhost:8000/socket.io/?transport=polling&EIO=3&sid=8fbdefc9c0804b3d9cb0b8d67635effc&t=1562924785.948868 timed out after 45
[2019-07-12 17:47:10 +0200] [engineio.client] [9485] [WARNING] Connection refused by the server, aborting This is a rough-cut to experiment with. I'll look further into it over the weekend, as I need a (temporary) solution quickly to avoid my hosts become non-responsive ;-) Until now, I hadn't looked into the WebSocket support. After installing it and testing this sleep-case, it detects a problem: [2019-07-12 18:06:13 +0200] [engineio.client] [9834] [ERROR] packet queue is empty, aborting ... but then also keeps waiting... So here also some additional work is needed. Adding a |
@christophevg would you like to give the master branch a try? All requests are now issued with timeouts, so I believe this problem is fully addressed, but let me know if missed anything. |
(Sorry for the late reply, laptop was with service center to fix an issue under extended warranty.) I've reviewed your commit and you applied the same fixes I applied to my custom class to introduce the timeouts. With that fix in place, I had +200 clients running for over two weeks, so it sure did the trick. In September, I continue to work on this project, so I can do more testing then. If any issue (re)occurs, I'll follow up. But for now, I would surely consider it closed :-) |
I'm experiencing this issue on AsyncClient, disconnect event not getting fired.
when await sio.disconnect() gets called:
or when I stop the server:
it's not even trying to reconnect, I think maybe it thinks it's still connected? I tried this on the synchronous client it works okay, automatically reconnects. |
I tried downgrading to 3.9.2, seem to solve the problem. EDIT: 3.10 seems to work too. not sure what changes 3.11 onward made. I'm not sure if this is a bug or I have to do things differently 3.11+? Ill use 3.10 for now |
@Dwyte can I ask you to install the master branch of python-engineio and retest? If you still get the error, I'm going to need you to provide an example that I can use to reproduce your problem. Thanks. |
@miguelgrinberg, sorry for the late reply. I just tried testing with the master branch, seems the problem is fixed now. Disconnect event gets triggered and reconnects now, cool. |
Hi,
I'm chasing some kind of deadlock that causes a client (a socketio client to be precise) not to return from its waiting state. So far I've only seen it a few times in my production environment when the wifi network was unstable.
On my development machine I've been trying to replicate the problem, but haven't succeeded so far. While trying though, I ran into the following "similar" problematic behaviour, which might point to the same root cause ... maybe :-)
So, a plain socketio client connects to a flask-socketio server and after connecting, I put the server to sleep (ctrl+z). The logs of the client then look like this:
Here I put the server to sleep...
And the client stays in this state endlessly. It seems to "think" it's still connected, but the ping task has ended, while I would suspect that it comes to the conclusion it has lost its connection to the server and better considers itself disconnected, returning from the wait and return control to the application, that now can respond by retrying the connection or whatever ;-)
Also, this little testing app sends out messages from a background task. Again, I would expect that trying to send such a message unsuccessfully would also make the client realise it is no longer actually connected. So after a long time, after the previous log excerpts I even have these:
This is almost 15 minutes after the ping task exited, and even failing to actually send this message doesn't make the client terminate its wait.
Now, somehow this will be technically correct ;-) but in a way this also seems "wrong" ?!
The text was updated successfully, but these errors were encountered: