
test_auto_calculate_content_length_doesnt_override_existing_value fails #635

Closed
mcepl opened this issue May 6, 2023 · 3 comments · Fixed by #636

mcepl commented May 6, 2023

Describe the bug

When running the test suite of 0.23.1 while packaging this on openSUSE/Tumbleweed (with urllib3 2.0.1 and requests 2.30.0), the test case test_auto_calculate_content_length_doesnt_override_existing_value fails:

[   25s] ______ test_auto_calculate_content_length_doesnt_override_existing_value _______
[   25s] 
[   25s] self = <urllib3.response.HTTPResponse object at 0x7f43e48b9c40>
[   25s] 
[   25s]     @contextmanager
[   25s]     def _error_catcher(self) -> typing.Generator[None, None, None]:
[   25s]         """
[   25s]         Catch low-level python exceptions, instead re-raising urllib3
[   25s]         variants, so that low-level exceptions are not leaked in the
[   25s]         high-level api.
[   25s]     
[   25s]         On exit, release the connection back to the pool.
[   25s]         """
[   25s]         clean_exit = False
[   25s]     
[   25s]         try:
[   25s]             try:
[   25s] >               yield
[   25s] 
[   25s] /usr/lib/python3.9/site-packages/urllib3/response.py:705: 
[   25s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   25s] 
[   25s] self = <urllib3.response.HTTPResponse object at 0x7f43e48b9c40>, amt = 10240
[   25s] 
[   25s]     def _raw_read(
[   25s]         self,
[   25s]         amt: int | None = None,
[   25s]     ) -> bytes:
[   25s]         """
[   25s]         Reads `amt` of bytes from the socket.
[   25s]         """
[   25s]         if self._fp is None:
[   25s]             return None  # type: ignore[return-value]
[   25s]     
[   25s]         fp_closed = getattr(self._fp, "closed", False)
[   25s]     
[   25s]         with self._error_catcher():
[   25s]             data = self._fp_read(amt) if not fp_closed else b""
[   25s]             if amt is not None and amt != 0 and not data:
[   25s]                 # Platform-specific: Buggy versions of Python.
[   25s]                 # Close the connection when no data is returned
[   25s]                 #
[   25s]                 # This is redundant to what httplib/http.client _should_
[   25s]                 # already do.  However, versions of python released before
[   25s]                 # December 15, 2012 (http://bugs.python.org/issue16298) do
[   25s]                 # not properly close the connection in all cases. There is
[   25s]                 # no harm in redundantly calling close.
[   25s]                 self._fp.close()
[   25s]                 if (
[   25s]                     self.enforce_content_length
[   25s]                     and self.length_remaining is not None
[   25s]                     and self.length_remaining != 0
[   25s]                 ):
[   25s]                     # This is an edge case that httplib failed to cover due
[   25s]                     # to concerns of backward compatibility. We're
[   25s]                     # addressing it here to make sure IncompleteRead is
[   25s]                     # raised during streaming, so all calls with incorrect
[   25s]                     # Content-Length are caught.
[   25s] >                   raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
[   25s] E                   urllib3.exceptions.IncompleteRead: IncompleteRead(4 bytes read, -2 more expected)
[   25s] 
[   25s] /usr/lib/python3.9/site-packages/urllib3/response.py:830: IncompleteRead
[   25s] 
[   25s] The above exception was the direct cause of the following exception:
[   25s] 
[   25s]     def generate():
[   25s]         # Special case for urllib3.
[   25s]         if hasattr(self.raw, "stream"):
[   25s]             try:
[   25s] >               yield from self.raw.stream(chunk_size, decode_content=True)
[   25s] 
[   25s] /usr/lib/python3.9/site-packages/requests/models.py:816: 
[   25s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   25s] 
[   25s] self = <urllib3.response.HTTPResponse object at 0x7f43e48b9c40>, amt = 10240
[   25s] decode_content = True
[   25s] 
[   25s]     def stream(
[   25s]         self, amt: int | None = 2**16, decode_content: bool | None = None
[   25s]     ) -> typing.Generator[bytes, None, None]:
[   25s]         """
[   25s]         A generator wrapper for the read() method. A call will block until
[   25s]         ``amt`` bytes have been read from the connection or until the
[   25s]         connection is closed.
[   25s]     
[   25s]         :param amt:
[   25s]             How much of the content to read. The generator will return up to
[   25s]             much data per iteration, but may return less. This is particularly
[   25s]             likely when using compressed data. However, the empty string will
[   25s]             never be returned.
[   25s]     
[   25s]         :param decode_content:
[   25s]             If True, will attempt to decode the body based on the
[   25s]             'content-encoding' header.
[   25s]         """
[   25s]         if self.chunked and self.supports_chunked_reads():
[   25s]             yield from self.read_chunked(amt, decode_content=decode_content)
[   25s]         else:
[   25s]             while not is_fp_closed(self._fp):
[   25s] >               data = self.read(amt=amt, decode_content=decode_content)
[   25s] 
[   25s] /usr/lib/python3.9/site-packages/urllib3/response.py:935: 
[   25s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   25s] 
[   25s] self = <urllib3.response.HTTPResponse object at 0x7f43e48b9c40>, amt = 10240
[   25s] decode_content = True, cache_content = False
[   25s] 
[   25s]     def read(
[   25s]         self,
[   25s]         amt: int | None = None,
[   25s]         decode_content: bool | None = None,
[   25s]         cache_content: bool = False,
[   25s]     ) -> bytes:
[   25s]         """
[   25s]         Similar to :meth:`http.client.HTTPResponse.read`, but with two additional
[   25s]         parameters: ``decode_content`` and ``cache_content``.
[   25s]     
[   25s]         :param amt:
[   25s]             How much of the content to read. If specified, caching is skipped
[   25s]             because it doesn't make sense to cache partial content as the full
[   25s]             response.
[   25s]     
[   25s]         :param decode_content:
[   25s]             If True, will attempt to decode the body based on the
[   25s]             'content-encoding' header.
[   25s]     
[   25s]         :param cache_content:
[   25s]             If True, will save the returned data such that the same result is
[   25s]             returned despite of the state of the underlying file object. This
[   25s]             is useful if you want the ``.data`` property to continue working
[   25s]             after having ``.read()`` the file object. (Overridden if ``amt`` is
[   25s]             set.)
[   25s]         """
[   25s]         self._init_decoder()
[   25s]         if decode_content is None:
[   25s]             decode_content = self.decode_content
[   25s]     
[   25s]         if amt is not None:
[   25s]             cache_content = False
[   25s]     
[   25s]             if len(self._decoded_buffer) >= amt:
[   25s]                 return self._decoded_buffer.get(amt)
[   25s]     
[   25s]         data = self._raw_read(amt)
[   25s]     
[   25s]         flush_decoder = False
[   25s]         if amt is None:
[   25s]             flush_decoder = True
[   25s]         elif amt != 0 and not data:
[   25s]             flush_decoder = True
[   25s]     
[   25s]         if not data and len(self._decoded_buffer) == 0:
[   25s]             return data
[   25s]     
[   25s]         if amt is None:
[   25s]             data = self._decode(data, decode_content, flush_decoder)
[   25s]             if cache_content:
[   25s]                 self._body = data
[   25s]         else:
[   25s]             # do not waste memory on buffer when not decoding
[   25s]             if not decode_content:
[   25s]                 if self._has_decoded_content:
[   25s]                     raise RuntimeError(
[   25s]                         "Calling read(decode_content=False) is not supported after "
[   25s]                         "read(decode_content=True) was called."
[   25s]                     )
[   25s]                 return data
[   25s]     
[   25s]             decoded_data = self._decode(data, decode_content, flush_decoder)
[   25s]             self._decoded_buffer.put(decoded_data)
[   25s]     
[   25s]             while len(self._decoded_buffer) < amt and data:
[   25s]                 # TODO make sure to initially read enough data to get past the headers
[   25s]                 # For example, the GZ file header takes 10 bytes, we don't want to read
[   25s]                 # it one byte at a time
[   25s] >               data = self._raw_read(amt)
[   25s] 
[   25s] /usr/lib/python3.9/site-packages/urllib3/response.py:906: 
[   25s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   25s] 
[   25s] self = <urllib3.response.HTTPResponse object at 0x7f43e48b9c40>, amt = 10240
[   25s] 
[   25s]     def _raw_read(
[   25s]         self,
[   25s]         amt: int | None = None,
[   25s]     ) -> bytes:
[   25s]         """
[   25s]         Reads `amt` of bytes from the socket.
[   25s]         """
[   25s]         if self._fp is None:
[   25s]             return None  # type: ignore[return-value]
[   25s]     
[   25s]         fp_closed = getattr(self._fp, "closed", False)
[   25s]     
[   25s]         with self._error_catcher():
[   25s]             data = self._fp_read(amt) if not fp_closed else b""
[   25s]             if amt is not None and amt != 0 and not data:
[   25s]                 # Platform-specific: Buggy versions of Python.
[   25s]                 # Close the connection when no data is returned
[   25s]                 #
[   25s]                 # This is redundant to what httplib/http.client _should_
[   25s]                 # already do.  However, versions of python released before
[   25s]                 # December 15, 2012 (http://bugs.python.org/issue16298) do
[   25s]                 # not properly close the connection in all cases. There is
[   25s]                 # no harm in redundantly calling close.
[   25s]                 self._fp.close()
[   25s]                 if (
[   25s]                     self.enforce_content_length
[   25s]                     and self.length_remaining is not None
[   25s]                     and self.length_remaining != 0
[   25s]                 ):
[   25s]                     # This is an edge case that httplib failed to cover due
[   25s]                     # to concerns of backward compatibility. We're
[   25s]                     # addressing it here to make sure IncompleteRead is
[   25s]                     # raised during streaming, so all calls with incorrect
[   25s]                     # Content-Length are caught.
[   25s] >                   raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
[   25s] 
[   25s] /usr/lib/python3.9/site-packages/urllib3/response.py:830: 
[   25s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   25s] 
[   25s] self = <contextlib._GeneratorContextManager object at 0x7f43e47e6880>
[   25s] typ = <class 'urllib3.exceptions.IncompleteRead'>
[   25s] value = IncompleteRead(4 bytes read, -2 more expected)
[   25s] traceback = <traceback object at 0x7f43e4e25d80>
[   25s] 
[   25s]     def __exit__(self, typ, value, traceback):
[   25s]         if typ is None:
[   25s]             try:
[   25s]                 next(self.gen)
[   25s]             except StopIteration:
[   25s]                 return False
[   25s]             else:
[   25s]                 raise RuntimeError("generator didn't stop")
[   25s]         else:
[   25s]             if value is None:
[   25s]                 # Need to force instantiation so we can reliably
[   25s]                 # tell if we get the same exception back
[   25s]                 value = typ()
[   25s]             try:
[   25s] >               self.gen.throw(typ, value, traceback)
[   25s] 
[   25s] /usr/lib64/python3.9/contextlib.py:137: 
[   25s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   25s] 
[   25s] self = <urllib3.response.HTTPResponse object at 0x7f43e48b9c40>
[   25s] 
[   25s]     @contextmanager
[   25s]     def _error_catcher(self) -> typing.Generator[None, None, None]:
[   25s]         """
[   25s]         Catch low-level python exceptions, instead re-raising urllib3
[   25s]         variants, so that low-level exceptions are not leaked in the
[   25s]         high-level api.
[   25s]     
[   25s]         On exit, release the connection back to the pool.
[   25s]         """
[   25s]         clean_exit = False
[   25s]     
[   25s]         try:
[   25s]             try:
[   25s]                 yield
[   25s]     
[   25s]             except SocketTimeout as e:
[   25s]                 # FIXME: Ideally we'd like to include the url in the ReadTimeoutError but
[   25s]                 # there is yet no clean way to get at it from this context.
[   25s]                 raise ReadTimeoutError(self._pool, None, "Read timed out.") from e  # type: ignore[arg-type]
[   25s]     
[   25s]             except BaseSSLError as e:
[   25s]                 # FIXME: Is there a better way to differentiate between SSLErrors?
[   25s]                 if "read operation timed out" not in str(e):
[   25s]                     # SSL errors related to framing/MAC get wrapped and reraised here
[   25s]                     raise SSLError(e) from e
[   25s]     
[   25s]                 raise ReadTimeoutError(self._pool, None, "Read timed out.") from e  # type: ignore[arg-type]
[   25s]     
[   25s]             except (HTTPException, OSError) as e:
[   25s]                 # This includes IncompleteRead.
[   25s] >               raise ProtocolError(f"Connection broken: {e!r}", e) from e
[   25s] E               urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(4 bytes read, -2 more expected)', IncompleteRead(4 bytes read, -2 more expected))
[   25s] 
[   25s] /usr/lib/python3.9/site-packages/urllib3/response.py:722: ProtocolError
[   25s] 
[   25s] During handling of the above exception, another exception occurred:
[   25s] 
[   25s]     def test_auto_calculate_content_length_doesnt_override_existing_value():
[   25s]         @responses.activate
[   25s]         def run():
[   25s]             url = "http://example.com/"
[   25s]             responses.add(
[   25s]                 responses.GET,
[   25s]                 url,
[   25s]                 body="test",
[   25s]                 headers={"Content-Length": "2"},
[   25s]                 auto_calculate_content_length=True,
[   25s]             )
[   25s]             resp = requests.get(url)
[   25s]             assert_response(resp, "test")
[   25s]             assert resp.headers["Content-Length"] == "2"
[   25s]     
[   25s] >       run()
[   25s] 
[   25s] responses/tests/test_responses.py:1506: 
[   25s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   25s] responses/__init__.py:218: in wrapper
[   25s]     return func(*args, **kwargs)
[   25s] responses/tests/test_responses.py:1502: in run
[   25s]     resp = requests.get(url)
[   25s] /usr/lib/python3.9/site-packages/requests/api.py:73: in get
[   25s]     return request("get", url, params=params, **kwargs)
[   25s] /usr/lib/python3.9/site-packages/requests/api.py:59: in request
[   25s]     return session.request(method=method, url=url, **kwargs)
[   25s] /usr/lib/python3.9/site-packages/requests/sessions.py:587: in request
[   25s]     resp = self.send(prep, **send_kwargs)
[   25s] /usr/lib/python3.9/site-packages/requests/sessions.py:745: in send
[   25s]     r.content
[   25s] /usr/lib/python3.9/site-packages/requests/models.py:899: in content
[   25s]     self._content = b"".join(self.iter_content(CONTENT_CHUNK_SIZE)) or b""
[   25s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   25s] 
[   25s]     def generate():
[   25s]         # Special case for urllib3.
[   25s]         if hasattr(self.raw, "stream"):
[   25s]             try:
[   25s]                 yield from self.raw.stream(chunk_size, decode_content=True)
[   25s]             except ProtocolError as e:
[   25s] >               raise ChunkedEncodingError(e)
[   25s] E               requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(4 bytes read, -2 more expected)', IncompleteRead(4 bytes read, -2 more expected))
[   25s] 
[   25s] /usr/lib/python3.9/site-packages/requests/models.py:818: ChunkedEncodingError

Additional context

Complete build log with all details about the packages used and the steps taken to reproduce.

Version of responses

0.23.1

Steps to Reproduce

Run the test suite
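
For example, the failing test can be selected directly (one possible invocation, assuming a source checkout of responses 0.23.1):

python -m pytest responses/tests/test_responses.py -k test_auto_calculate_content_length_doesnt_override_existing_value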

Expected Result

Test passes

Actual Result

It doesn’t (see above).

Actually, I was trying to make responses work with the new urllib3, so I added these two patches:

---
 responses/tests/test_responses.py |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/responses/tests/test_responses.py
+++ b/responses/tests/test_responses.py
@@ -2417,7 +2417,7 @@ class TestMaxRetry:
                 total=total,
                 backoff_factor=0.1,
                 status_forcelist=[500],
-                method_whitelist=["GET", "POST", "PATCH"],
+                allowed_methods=["GET", "POST", "PATCH"],
                 raise_on_status=raise_on_status,
             )
         )
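
This first patch adapts to an API rename in urllib3 2.0: method_whitelist was removed in favor of allowed_methods. A minimal sketch of the new spelling (which, if I recall correctly, already works on urllib3 >= 1.26, where the new name was introduced):

from urllib3.util.retry import Retry

retry = Retry(
    total=3,
    backoff_factor=0.1,
    status_forcelist=[500],
    allowed_methods=["GET", "POST", "PATCH"],  # was method_whitelist in 1.x
)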

and a somewhat larger one (I assumed it is better to deal with the upgraded urllib3 head-on instead of pretending to use the bundled one):

---
 responses/__init__.py             |    8 ++++----
 responses/matchers.py             |    2 +-
 responses/tests/test_responses.py |    5 +++--
 setup.py                          |    2 +-
 4 files changed, 9 insertions(+), 8 deletions(-)

--- a/responses/__init__.py
+++ b/responses/__init__.py
@@ -42,16 +42,16 @@ except ImportError:  # pragma: no cover
     from typing import Literal  # type: ignore  # pragma: no cover
 
 try:
-    from requests.packages.urllib3.response import HTTPResponse
+    from urllib3.response import HTTPResponse
 except ImportError:  # pragma: no cover
     from urllib3.response import HTTPResponse  # pragma: no cover
 
 try:
-    from requests.packages.urllib3.connection import HTTPHeaderDict
+    from urllib3.connection import HTTPHeaderDict
 except ImportError:  # pragma: no cover
     from urllib3.response import HTTPHeaderDict  # type: ignore[attr-defined]
 try:
-    from requests.packages.urllib3.util.url import parse_url
+    from urllib3.util.url import parse_url
 except ImportError:  # pragma: no cover
     from urllib3.util.url import parse_url  # pragma: no cover
 
@@ -1065,7 +1065,7 @@ class RequestsMock(object):
 
         retries = retries or adapter.max_retries
         # first validate that current request is eligible to be retried.
-        # See ``requests.packages.urllib3.util.retry.Retry`` documentation.
+        # See ``urllib3.util.retry.Retry`` documentation.
         if retries.is_retry(
             method=response.request.method, status_code=response.status_code  # type: ignore[misc]
         ):
--- a/responses/matchers.py
+++ b/responses/matchers.py
@@ -11,7 +11,7 @@ from urllib.parse import parse_qsl
 from urllib.parse import urlparse
 
 from requests import PreparedRequest
-from requests.packages.urllib3.util.url import parse_url
+from urllib3.util.url import parse_url
 
 
 def _create_key_val_str(input_dict: Union[Dict[Any, Any], Any]) -> str:
--- a/responses/tests/test_responses.py
+++ b/responses/tests/test_responses.py
@@ -13,6 +13,7 @@ from unittest.mock import patch
 
 import pytest
 import requests
+import urllib3
 from requests.exceptions import ChunkedEncodingError
 from requests.exceptions import ConnectionError
 from requests.exceptions import HTTPError
@@ -1324,14 +1325,14 @@ def test_content_length_error(monkeypatc
     # Type errors here and on 1250 are ignored because the stubs for requests
     # are off https://github.com/python/typeshed/blob/f8501d33c737482a829c6db557a0be26895c5941
     #   /stubs/requests/requests/packages/__init__.pyi#L1
-    original_init = getattr(requests.packages.urllib3.HTTPResponse, "__init__")  # type: ignore
+    original_init = getattr(urllib3.HTTPResponse, "__init__")  # type: ignore
 
     def patched_init(self, *args, **kwargs):
         kwargs["enforce_content_length"] = True
         original_init(self, *args, **kwargs)
 
     monkeypatch.setattr(
-        requests.packages.urllib3.HTTPResponse, "__init__", patched_init  # type: ignore
+        urllib3.HTTPResponse, "__init__", patched_init  # type: ignore
     )
 
     run()
--- a/setup.py
+++ b/setup.py
@@ -18,7 +18,7 @@ setup_requires = []
 
 install_requires = [
     "requests>=2.22.0,<3.0",
-    "urllib3>=1.25.10",
+    "urllib3>=2.0.0",
     "pyyaml",
     "types-PyYAML",
     "typing_extensions; python_version < '3.8'",

However, even the second patch didn't prevent this test failure.

Any ideas what's wrong?

beliaev-maksim (Collaborator) commented

This is caused by the new version of urllib3. I will see what we can do to fix it ASAP.

beliaev-maksim (Collaborator) commented

@markstory
I found the issue. Basically, in the new urllib3 an incorrect content length is no longer allowed: https://github.com/urllib3/urllib3/blob/b234aaf7ccbcb64012d8b33d21eb8bc9f768935d/src/urllib3/response.py#L546

You can find the rationale here: urllib3/urllib3#2514
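
For illustration, a minimal sketch of the failure outside the test suite (assuming urllib3 >= 2.0 and responses 0.23.1):

import requests
import responses

@responses.activate
def reproduce():
    # The mock advertises Content-Length: 2 but serves a 4-byte body; with
    # urllib3 >= 2.0 the mismatch raises IncompleteRead, which requests
    # surfaces as ChunkedEncodingError.
    responses.add(
        responses.GET,
        "http://example.com/",
        body="test",                      # 4 bytes of body ...
        headers={"Content-Length": "2"},  # ... but the header claims 2
        auto_calculate_content_length=True,
    )
    requests.get("http://example.com/")

reproduce()  # raises requests.exceptions.ChunkedEncodingError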

I propose to bump the urllib3 dependency and, on the responses side, always enforce auto_calculate_content_length=True.
I feel this is a backwards-incompatible change, but at the same time urllib3 2.0 is itself backwards incompatible.
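
Roughly, the idea is the following (an illustrative sketch with hypothetical names, not the actual PR):

def _sync_content_length(headers, body):
    # Hypothetical helper: always derive Content-Length from the actual body
    # bytes, so the header that urllib3 2.x enforces can never disagree with
    # what is served.
    headers = dict(headers or {})
    data = body if isinstance(body, bytes) else body.encode("utf-8")
    headers["Content-Length"] = str(len(data))
    return headers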

What do you think? If you give a +1, I have an almost-ready-to-go PR for this.

mcepl (Author) commented May 8, 2023

I admit requests has a nicer API than what's in the standard library, but otherwise, as a distro package maintainer, I dislike it (and urllib3) with a passion.

Which is to say: thank you, this is probably the best we can do.

Martchus added a commit to Martchus/qem-bot that referenced this issue Jun 12, 2023
Avoid installing urllib3>=2 for CI runs as the responses module is not
compatible with that version yet (leading to failing tests because
mocking HTTP responses is not working correctly).

Supposedly getsentry/responses#635 is the
relevant upstream issue.
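
In practice that workaround amounts to a version constraint in the CI requirements, e.g. (illustrative):

# requirements pin (illustrative): stay on the urllib3 1.x line until
# responses supports 2.x
urllib3<2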