Receiving zero byte chunks under certain conditions #3525

Closed
marconfus opened this issue Jan 11, 2019 · 11 comments

Comments

@marconfus

marconfus commented Jan 11, 2019

Long story short

When running under "Red Hat Enterprise Linux Server release 7.5" the client only gets a partial result from an https URL using chunked transfer encoding.

In another ticket I found the following test code:

import asyncio
import aiohttp
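import ssl

# Placeholder: the real URL is internal and not publicly reachable.
INTERNALURL = "https://internal.example/service"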

async def main():
    async with aiohttp.ClientSession() as session:
        async with session.get(url=INTERNALURL, timeout=None, ssl=False) as resp:
            buffer = b""
            async for raw_data, end_of_http_chunk in resp.content.iter_chunks():
                print('chunk received', len(raw_data))
                buffer += raw_data
                if not end_of_http_chunk:
                    continue
            print("len(buffer)", len(buffer))

print(ssl.OPENSSL_VERSION)
loop = asyncio.get_event_loop()
loop.run_until_complete(main())

Expected behaviour

Output of Python 3.6.6 in an up-to-date alpine docker container:

LibreSSL 2.7.4
chunk received 8184
chunk received 8184
chunk received 8184
chunk received 8184
chunk received 8184
chunk received 8184
chunk received 8184
chunk received 8184
chunk received 8184
chunk received 8184
chunk received 8184
chunk received 8184
chunk received 8184
chunk received 5491
len(buffer) 111883

Actual behaviour

Output under RHEL 7.5 with Python 3.6.3:

OpenSSL 1.0.1e-fips 11 Feb 2013
chunk received 8184
chunk received 0
chunk received 8184
chunk received 0
chunk received 8184
chunk received 0
chunk received 8184
chunk received 0
chunk received 8184
chunk received 0
chunk received 8184
chunk received 0
chunk received 8184
chunk received 0
chunk received 8184
chunk received 0
chunk received 8184
chunk received 0
chunk received 8184
chunk received 0
chunk received 8184
chunk received 0
chunk received 8184
chunk received 0
chunk received 8184
chunk received 5491
chunk received 0
len(buffer) 111883

The number and position of the zero-length chunks sometimes varies from call to call.
As a result, awaiting response.text() returns only 8184 bytes (or a multiple of that) instead of the expected 111883.
Calling the same URL over http, I get varying chunk sizes but never a zero-length one before the end, so everything works.
Using the Requests library in the same environment with the same https URL, I also get the correct result.
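
As a quick arithmetic check on the log above, the sketch below sums the reported chunk sizes. It only restates numbers already in the RHEL 7.5 output; the idea that a reader stopping at the first empty read would end up with 8184 bytes is an assumption that happens to match the response.text() symptom, not something the log proves.

```python
# Chunk sizes copied from the RHEL 7.5 log above:
# twelve (8184, 0) pairs, then 8184, 5491 and a final 0.
sizes = [8184, 0] * 12 + [8184, 5491, 0]

# Every payload byte does arrive; the empty chunks carry no data.
total = sum(sizes)

# Assumption: a consumer that treats the first empty read as the end
# would stop after this many bytes.
bytes_before_first_empty = sum(sizes[: sizes.index(0)])

print(total, bytes_before_first_empty)
```

So iter_chunks() delivers the full 111883 bytes; the zero-length entries only become a problem for a consumer that interprets an empty read as end of data.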

Steps to reproduce

Unfortunately the URL is not publicly reachable. It's a webservice with many clients on different platforms (curl, wget, Java, Firefox) and no one else has the problem.

Your environment

aiohttp 3.5.3 (client)
The only obvious difference between the working/nonworking environment is the ssl library.

Is there anything else I can contribute to find the error?

@aio-libs-bot

GitMate.io thinks the contributor most likely able to help you is @asvetlov.

Possibly related issues are #1428 (Method "read_chunk" of "BodyPartReader" returns zero bytes before eof), #1777 (parse_frame receives a 1 byte buf when using proxy and fails to parse header), #1615 (Chunk size is deprecated), #3281 (aiohttp client corrupts uploads that use chunked encoding due to race condition in socket reuse), and #1814 (Close websocket connection when pong not received).

@asvetlov
Member

@socketpair this looks related to your last fix, doesn't it?

@socketpair
Contributor

Yes, maybe. @marconfus, please tell me the exact aiohttp version. It will be very hard to debug if I can't reproduce the bug. It would be nice if you could give me a traffic dump of a failed request/response with unencrypted traffic.

@socketpair
Contributor

socketpair commented Jan 11, 2019

@mnach

@marconfus
Author

Version is 3.5.3 installed via pip. Is there a way to create a dump with aiohttp (after decryption)?
As the response from my setup is sensitive data, I have to see if I can reproduce it with a different server...

@marconfus
Author

I can reproduce it with a simple setup:
Server-side is a cgi script on an Apache httpd 2.4 on CentOS 7:
https://jira.marconfus.org/test

Client side: CentOS 7 or macOS 10.13, both with OpenSSL 1.0.2x.

As before, an up-to-date alpine system is fine.
It looks like setups with OpenSSL 1.0.? on both client and server are affected.

(venv) [marco@zbox tmp]$ python test.py 
OpenSSL 1.0.2k-fips  26 Jan 2017
https://jira.marconfus.org/test
chunk received 8000
chunk received 0
chunk received 8000
chunk received 4480
chunk received 8000
chunk received 8000
chunk received 4480
len(buffer) 40960
https://httpbin.org/stream-bytes/40960
chunk received 4338
chunk received 5902
chunk received 5786
chunk received 4454
chunk received 1442
chunk received 8798
chunk received 4338
chunk received 5902
len(buffer) 40960
(venv) [marco@zbox tmp]$ python test.py 
OpenSSL 1.0.2k-fips  26 Jan 2017
https://jira.marconfus.org/test
chunk received 8000
chunk received 0
chunk received 8000
chunk received 4480
chunk received 8000
chunk received 8000
chunk received 4480
len(buffer) 40960
https://httpbin.org/stream-bytes/40960
chunk received 7234
chunk received 3006
chunk received 2890
chunk received 7350
chunk received 4338
chunk received 5902
chunk received 5786
chunk received 4454
len(buffer) 40960

@socketpair
Contributor

socketpair commented Jan 11, 2019

Yes, I have reproduced that. Unfortunately, I can't fix it easily :(

Sequence of events:

  1. EVENT: begin chunk of length 0x8000
  2. EVENT: 0x8000 bytes of data
  3. your app receives (0x8000 bytes of data, False)
  4. EVENT: end of chunk
  5. your app receives (0 bytes, True)
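
The sequence above can be modeled with a small toy generator. This is only a sketch of the described behavior: the event names and the events_to_pairs helper are hypothetical and are not aiohttp internals.

```python
from typing import Any, Iterable, Iterator, Tuple

# Hypothetical event names, for illustration only (not aiohttp internals).
BEGIN_CHUNK, DATA, END_CHUNK = "begin_chunk", "data", "end_chunk"


def events_to_pairs(
    events: Iterable[Tuple[str, Any]]
) -> Iterator[Tuple[bytes, bool]]:
    """Map parser-style events to (data, end_of_http_chunk) pairs."""
    for kind, payload in events:
        if kind == DATA:
            # Data is handed over as soon as it arrives; at this point
            # the chunk boundary has not been seen yet.
            yield payload, False
        elif kind == END_CHUNK:
            # The boundary is reported on its own, with no data attached.
            yield b"", True
        # BEGIN_CHUNK only announces a size and yields nothing to the app.


events = [(BEGIN_CHUNK, 0x8000), (DATA, b"x" * 0x8000), (END_CHUNK, None)]
pairs = list(events_to_pairs(events))
print([(len(data), end) for data, end in pairs])
```

In this model the zero-length entry is not lost data; it is the only place where the chunk boundary is signalled.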

It might seem that we could track the size of the chunk and not trigger your app separately when the data is the last sub-chunk of an HTTP chunk. That does not work in general: for a compressed payload we cannot track the size of a chunk, because it is the size of the compressed data, while we receive uncompressed data from the nginx parser.

Possible solution: a hack that tracks the size of a chunk when no compression is involved. For compressed payloads we can do nothing, so the behavior there would stay the same.
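
A minimal sketch of that size-tracking idea, under the assumption that the announced chunk size is compared against the bytes delivered so far. The event names, the coalesce_boundaries helper and its compressed flag are all hypothetical; this is not the code that was merged into aiohttp.

```python
from typing import Any, Iterable, Iterator, Optional, Tuple


def coalesce_boundaries(
    events: Iterable[Tuple[str, Any]], compressed: bool = False
) -> Iterator[Tuple[bytes, bool]]:
    """Attach the end-of-chunk flag to the last data piece when sizes match."""
    remaining: Optional[int] = None  # bytes still expected in this chunk
    boundary_reported = False
    for kind, payload in events:
        if kind == "begin_chunk":
            # With compression the announced size counts compressed bytes,
            # so it cannot be compared against the decoded data.
            remaining = None if compressed else int(payload)
            boundary_reported = False
        elif kind == "data":
            if remaining is not None:
                remaining -= len(payload)
                if remaining == 0:
                    # Last sub-chunk: report the boundary together with data.
                    boundary_reported = True
                    yield payload, True
                    continue
            yield payload, False
        elif kind == "end_chunk":
            if not boundary_reported:
                # Fallback: the boundary still has to be signalled somehow.
                yield b"", True


events = [
    ("begin_chunk", 8),
    ("data", b"abcd"),
    ("data", b"efgh"),
    ("end_chunk", None),
]
plain = list(coalesce_boundaries(events))
packed = list(coalesce_boundaries(events, compressed=True))
print(plain)
print(packed)
```

Without compression the empty pair disappears; with compressed=True the sketch falls back to the separate zero-length boundary, which matches the limitation described above.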

You may ask: if so, why did the previous implementation not trigger all of this? The answer: it had bugs, proven by test cases. For example, in your case it did not report the "end of chunk" event, so applications that use etcd watchers could not tell where one message ends and the next begins. So, if HTTP chunk boundaries are of no interest to you, I advise you to ignore zero-length chunks, or just use iter_any():

#!/usr/bin/python3

import asyncio
import aiohttp
import ssl

INTERNALURL='https://jira.marconfus.org/test'

async def main():
    async with aiohttp.ClientSession() as session:
        async with session.get(url=INTERNALURL, timeout=None, ssl=False) as resp:
            buffer = b""
            async for raw_data in resp.content.iter_any():
                print('chunk received', len(raw_data))
                buffer += raw_data
            print("len(buffer)", len(buffer))

print(ssl.OPENSSL_VERSION)
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
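
For the other option mentioned above (ignoring zero-length chunks while still using the chunk boundaries), a sketch could look like the following. The iter_messages helper and the fake_stream generator are hypothetical; fake_stream only stands in for an async iterator of (data, end_of_http_chunk) pairs such as the one iter_chunks() yields in the examples in this thread, so the block runs without a network connection.

```python
import asyncio
from typing import AsyncIterator, List, Tuple


async def iter_messages(
    pairs: AsyncIterator[Tuple[bytes, bool]]
) -> AsyncIterator[bytes]:
    """Group (data, end_of_http_chunk) pairs into one message per HTTP chunk."""
    parts: List[bytes] = []
    async for data, end_of_http_chunk in pairs:
        if data:
            # Zero-length pieces carry no payload, so they are skipped here.
            parts.append(data)
        if end_of_http_chunk and parts:
            yield b"".join(parts)
            parts = []
    if parts:
        # Trailing bytes that arrived without a reported boundary.
        yield b"".join(parts)


async def fake_stream() -> AsyncIterator[Tuple[bytes, bool]]:
    # Stand-in data: one chunk ending with an empty boundary marker,
    # one chunk whose boundary arrives together with its last piece.
    for pair in [(b"abc", False), (b"", True), (b"de", False), (b"f", True)]:
        yield pair


async def collect() -> List[bytes]:
    return [message async for message in iter_messages(fake_stream())]


messages = asyncio.run(collect())
print(messages)
```

With a real response, the same helper would presumably be called as iter_messages(resp.content.iter_chunks()); the boundary flag is honored whether it arrives with data or as a separate empty piece.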

UPD! URGENT! IMPORTANT @asvetlov: .read(), .readany() (and iter_any()) are broken(!): silent data truncation. I have a fix. We have no tests for that case. My apologies for the bugs introduced by the patch from @mnach. I will fix that urgently. Stay tuned.

@socketpair
Contributor

socketpair commented Jan 11, 2019

@marconfus thanks for providing a way to trigger the bug.

asvetlov pushed a commit that referenced this issue Jan 11, 2019
(cherry picked from commit 5c4cb82)

Co-authored-by: Коренберг Марк <[email protected]>
@socketpair
Contributor

After my latest urgent changes are merged, the example I gave in my earlier message (iter_any) will work, @asvetlov. Without them it runs, but yields only one chunk.

asvetlov added a commit that referenced this issue Jan 12, 2019
…3528)

(cherry picked from commit 5c4cb82)

Co-authored-by: Коренберг Марк <[email protected]>
@marconfus
Author

The fix seems to work fine. Thanks for the fast response!

asvetlov pushed a commit that referenced this issue Jan 21, 2019
…) (#3560)

(cherry picked from commit c3f494f)

Co-authored-by: Коренберг Марк <[email protected]>
asvetlov added a commit that referenced this issue Jan 21, 2019
…) (#3560) (#3565)

(cherry picked from commit c3f494f)

Co-authored-by: Коренберг Марк <[email protected]>
@lock

lock bot commented Jan 14, 2020

This thread has been automatically locked since there has not been
any recent activity after it was closed. Please open a new issue for
related bugs.

If you feel like there are important points made in this discussion,
please include those excerpts in that new issue.

@lock lock bot added the outdated label Jan 14, 2020
@lock lock bot locked as resolved and limited conversation to collaborators Jan 14, 2020