
test_auto_calculate_content_length_doesnt_override_existing_value fails #635

Closed
mcepl opened this issue May 6, 2023 · 3 comments · Fixed by #636

mcepl commented May 6, 2023

Describe the bug

When running the test suite of 0.23.1 while packaging this on openSUSE/Tumbleweed (with urllib3 2.0.1 and requests 2.30.0), the test case test_auto_calculate_content_length_doesnt_override_existing_value fails:

[   25s] ______ test_auto_calculate_content_length_doesnt_override_existing_value _______
[   25s] 
[   25s] self = <urllib3.response.HTTPResponse object at 0x7f43e48b9c40>
[   25s] 
[   25s]     @contextmanager
[   25s]     def _error_catcher(self) -> typing.Generator[None, None, None]:
[   25s]         """
[   25s]         Catch low-level python exceptions, instead re-raising urllib3
[   25s]         variants, so that low-level exceptions are not leaked in the
[   25s]         high-level api.
[   25s]     
[   25s]         On exit, release the connection back to the pool.
[   25s]         """
[   25s]         clean_exit = False
[   25s]     
[   25s]         try:
[   25s]             try:
[   25s] >               yield
[   25s] 
[   25s] /usr/lib/python3.9/site-packages/urllib3/response.py:705: 
[   25s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   25s] 
[   25s] self = <urllib3.response.HTTPResponse object at 0x7f43e48b9c40>, amt = 10240
[   25s] 
[   25s]     def _raw_read(
[   25s]         self,
[   25s]         amt: int | None = None,
[   25s]     ) -> bytes:
[   25s]         """
[   25s]         Reads `amt` of bytes from the socket.
[   25s]         """
[   25s]         if self._fp is None:
[   25s]             return None  # type: ignore[return-value]
[   25s]     
[   25s]         fp_closed = getattr(self._fp, "closed", False)
[   25s]     
[   25s]         with self._error_catcher():
[   25s]             data = self._fp_read(amt) if not fp_closed else b""
[   25s]             if amt is not None and amt != 0 and not data:
[   25s]                 # Platform-specific: Buggy versions of Python.
[   25s]                 # Close the connection when no data is returned
[   25s]                 #
[   25s]                 # This is redundant to what httplib/http.client _should_
[   25s]                 # already do.  However, versions of python released before
[   25s]                 # December 15, 2012 (http://bugs.python.org/issue16298) do
[   25s]                 # not properly close the connection in all cases. There is
[   25s]                 # no harm in redundantly calling close.
[   25s]                 self._fp.close()
[   25s]                 if (
[   25s]                     self.enforce_content_length
[   25s]                     and self.length_remaining is not None
[   25s]                     and self.length_remaining != 0
[   25s]                 ):
[   25s]                     # This is an edge case that httplib failed to cover due
[   25s]                     # to concerns of backward compatibility. We're
[   25s]                     # addressing it here to make sure IncompleteRead is
[   25s]                     # raised during streaming, so all calls with incorrect
[   25s]                     # Content-Length are caught.
[   25s] >                   raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
[   25s] E                   urllib3.exceptions.IncompleteRead: IncompleteRead(4 bytes read, -2 more expected)
[   25s] 
[   25s] /usr/lib/python3.9/site-packages/urllib3/response.py:830: IncompleteRead
[   25s] 
[   25s] The above exception was the direct cause of the following exception:
[   25s] 
[   25s]     def generate():
[   25s]         # Special case for urllib3.
[   25s]         if hasattr(self.raw, "stream"):
[   25s]             try:
[   25s] >               yield from self.raw.stream(chunk_size, decode_content=True)
[   25s] 
[   25s] /usr/lib/python3.9/site-packages/requests/models.py:816: 
[   25s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   25s] 
[   25s] self = <urllib3.response.HTTPResponse object at 0x7f43e48b9c40>, amt = 10240
[   25s] decode_content = True
[   25s] 
[   25s]     def stream(
[   25s]         self, amt: int | None = 2**16, decode_content: bool | None = None
[   25s]     ) -> typing.Generator[bytes, None, None]:
[   25s]         """
[   25s]         A generator wrapper for the read() method. A call will block until
[   25s]         ``amt`` bytes have been read from the connection or until the
[   25s]         connection is closed.
[   25s]     
[   25s]         :param amt:
[   25s]             How much of the content to read. The generator will return up to
[   25s]             much data per iteration, but may return less. This is particularly
[   25s]             likely when using compressed data. However, the empty string will
[   25s]             never be returned.
[   25s]     
[   25s]         :param decode_content:
[   25s]             If True, will attempt to decode the body based on the
[   25s]             'content-encoding' header.
[   25s]         """
[   25s]         if self.chunked and self.supports_chunked_reads():
[   25s]             yield from self.read_chunked(amt, decode_content=decode_content)
[   25s]         else:
[   25s]             while not is_fp_closed(self._fp):
[   25s] >               data = self.read(amt=amt, decode_content=decode_content)
[   25s] 
[   25s] /usr/lib/python3.9/site-packages/urllib3/response.py:935: 
[   25s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   25s] 
[   25s] self = <urllib3.response.HTTPResponse object at 0x7f43e48b9c40>, amt = 10240
[   25s] decode_content = True, cache_content = False
[   25s] 
[   25s]     def read(
[   25s]         self,
[   25s]         amt: int | None = None,
[   25s]         decode_content: bool | None = None,
[   25s]         cache_content: bool = False,
[   25s]     ) -> bytes:
[   25s]         """
[   25s]         Similar to :meth:`http.client.HTTPResponse.read`, but with two additional
[   25s]         parameters: ``decode_content`` and ``cache_content``.
[   25s]     
[   25s]         :param amt:
[   25s]             How much of the content to read. If specified, caching is skipped
[   25s]             because it doesn't make sense to cache partial content as the full
[   25s]             response.
[   25s]     
[   25s]         :param decode_content:
[   25s]             If True, will attempt to decode the body based on the
[   25s]             'content-encoding' header.
[   25s]     
[   25s]         :param cache_content:
[   25s]             If True, will save the returned data such that the same result is
[   25s]             returned despite of the state of the underlying file object. This
[   25s]             is useful if you want the ``.data`` property to continue working
[   25s]             after having ``.read()`` the file object. (Overridden if ``amt`` is
[   25s]             set.)
[   25s]         """
[   25s]         self._init_decoder()
[   25s]         if decode_content is None:
[   25s]             decode_content = self.decode_content
[   25s]     
[   25s]         if amt is not None:
[   25s]             cache_content = False
[   25s]     
[   25s]             if len(self._decoded_buffer) >= amt:
[   25s]                 return self._decoded_buffer.get(amt)
[   25s]     
[   25s]         data = self._raw_read(amt)
[   25s]     
[   25s]         flush_decoder = False
[   25s]         if amt is None:
[   25s]             flush_decoder = True
[   25s]         elif amt != 0 and not data:
[   25s]             flush_decoder = True
[   25s]     
[   25s]         if not data and len(self._decoded_buffer) == 0:
[   25s]             return data
[   25s]     
[   25s]         if amt is None:
[   25s]             data = self._decode(data, decode_content, flush_decoder)
[   25s]             if cache_content:
[   25s]                 self._body = data
[   25s]         else:
[   25s]             # do not waste memory on buffer when not decoding
[   25s]             if not decode_content:
[   25s]                 if self._has_decoded_content:
[   25s]                     raise RuntimeError(
[   25s]                         "Calling read(decode_content=False) is not supported after "
[   25s]                         "read(decode_content=True) was called."
[   25s]                     )
[   25s]                 return data
[   25s]     
[   25s]             decoded_data = self._decode(data, decode_content, flush_decoder)
[   25s]             self._decoded_buffer.put(decoded_data)
[   25s]     
[   25s]             while len(self._decoded_buffer) < amt and data:
[   25s]                 # TODO make sure to initially read enough data to get past the headers
[   25s]                 # For example, the GZ file header takes 10 bytes, we don't want to read
[   25s]                 # it one byte at a time
[   25s] >               data = self._raw_read(amt)
[   25s] 
[   25s] /usr/lib/python3.9/site-packages/urllib3/response.py:906: 
[   25s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   25s] 
[   25s] self = <urllib3.response.HTTPResponse object at 0x7f43e48b9c40>, amt = 10240
[   25s] 
[   25s]     def _raw_read(
[   25s]         self,
[   25s]         amt: int | None = None,
[   25s]     ) -> bytes:
[   25s]         """
[   25s]         Reads `amt` of bytes from the socket.
[   25s]         """
[   25s]         if self._fp is None:
[   25s]             return None  # type: ignore[return-value]
[   25s]     
[   25s]         fp_closed = getattr(self._fp, "closed", False)
[   25s]     
[   25s]         with self._error_catcher():
[   25s]             data = self._fp_read(amt) if not fp_closed else b""
[   25s]             if amt is not None and amt != 0 and not data:
[   25s]                 # Platform-specific: Buggy versions of Python.
[   25s]                 # Close the connection when no data is returned
[   25s]                 #
[   25s]                 # This is redundant to what httplib/http.client _should_
[   25s]                 # already do.  However, versions of python released before
[   25s]                 # December 15, 2012 (http://bugs.python.org/issue16298) do
[   25s]                 # not properly close the connection in all cases. There is
[   25s]                 # no harm in redundantly calling close.
[   25s]                 self._fp.close()
[   25s]                 if (
[   25s]                     self.enforce_content_length
[   25s]                     and self.length_remaining is not None
[   25s]                     and self.length_remaining != 0
[   25s]                 ):
[   25s]                     # This is an edge case that httplib failed to cover due
[   25s]                     # to concerns of backward compatibility. We're
[   25s]                     # addressing it here to make sure IncompleteRead is
[   25s]                     # raised during streaming, so all calls with incorrect
[   25s]                     # Content-Length are caught.
[   25s] >                   raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
[   25s] 
[   25s] /usr/lib/python3.9/site-packages/urllib3/response.py:830: 
[   25s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   25s] 
[   25s] self = <contextlib._GeneratorContextManager object at 0x7f43e47e6880>
[   25s] typ = <class 'urllib3.exceptions.IncompleteRead'>
[   25s] value = IncompleteRead(4 bytes read, -2 more expected)
[   25s] traceback = <traceback object at 0x7f43e4e25d80>
[   25s] 
[   25s]     def __exit__(self, typ, value, traceback):
[   25s]         if typ is None:
[   25s]             try:
[   25s]                 next(self.gen)
[   25s]             except StopIteration:
[   25s]                 return False
[   25s]             else:
[   25s]                 raise RuntimeError("generator didn't stop")
[   25s]         else:
[   25s]             if value is None:
[   25s]                 # Need to force instantiation so we can reliably
[   25s]                 # tell if we get the same exception back
[   25s]                 value = typ()
[   25s]             try:
[   25s] >               self.gen.throw(typ, value, traceback)
[   25s] 
[   25s] /usr/lib64/python3.9/contextlib.py:137: 
[   25s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   25s] 
[   25s] self = <urllib3.response.HTTPResponse object at 0x7f43e48b9c40>
[   25s] 
[   25s]     @contextmanager
[   25s]     def _error_catcher(self) -> typing.Generator[None, None, None]:
[   25s]         """
[   25s]         Catch low-level python exceptions, instead re-raising urllib3
[   25s]         variants, so that low-level exceptions are not leaked in the
[   25s]         high-level api.
[   25s]     
[   25s]         On exit, release the connection back to the pool.
[   25s]         """
[   25s]         clean_exit = False
[   25s]     
[   25s]         try:
[   25s]             try:
[   25s]                 yield
[   25s]     
[   25s]             except SocketTimeout as e:
[   25s]                 # FIXME: Ideally we'd like to include the url in the ReadTimeoutError but
[   25s]                 # there is yet no clean way to get at it from this context.
[   25s]                 raise ReadTimeoutError(self._pool, None, "Read timed out.") from e  # type: ignore[arg-type]
[   25s]     
[   25s]             except BaseSSLError as e:
[   25s]                 # FIXME: Is there a better way to differentiate between SSLErrors?
[   25s]                 if "read operation timed out" not in str(e):
[   25s]                     # SSL errors related to framing/MAC get wrapped and reraised here
[   25s]                     raise SSLError(e) from e
[   25s]     
[   25s]                 raise ReadTimeoutError(self._pool, None, "Read timed out.") from e  # type: ignore[arg-type]
[   25s]     
[   25s]             except (HTTPException, OSError) as e:
[   25s]                 # This includes IncompleteRead.
[   25s] >               raise ProtocolError(f"Connection broken: {e!r}", e) from e
[   25s] E               urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(4 bytes read, -2 more expected)', IncompleteRead(4 bytes read, -2 more expected))
[   25s] 
[   25s] /usr/lib/python3.9/site-packages/urllib3/response.py:722: ProtocolError
[   25s] 
[   25s] During handling of the above exception, another exception occurred:
[   25s] 
[   25s]     def test_auto_calculate_content_length_doesnt_override_existing_value():
[   25s]         @responses.activate
[   25s]         def run():
[   25s]             url = "http://example.com/"
[   25s]             responses.add(
[   25s]                 responses.GET,
[   25s]                 url,
[   25s]                 body="test",
[   25s]                 headers={"Content-Length": "2"},
[   25s]                 auto_calculate_content_length=True,
[   25s]             )
[   25s]             resp = requests.get(url)
[   25s]             assert_response(resp, "test")
[   25s]             assert resp.headers["Content-Length"] == "2"
[   25s]     
[   25s] >       run()
[   25s] 
[   25s] responses/tests/test_responses.py:1506: 
[   25s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   25s] responses/__init__.py:218: in wrapper
[   25s]     return func(*args, **kwargs)
[   25s] responses/tests/test_responses.py:1502: in run
[   25s]     resp = requests.get(url)
[   25s] /usr/lib/python3.9/site-packages/requests/api.py:73: in get
[   25s]     return request("get", url, params=params, **kwargs)
[   25s] /usr/lib/python3.9/site-packages/requests/api.py:59: in request
[   25s]     return session.request(method=method, url=url, **kwargs)
[   25s] /usr/lib/python3.9/site-packages/requests/sessions.py:587: in request
[   25s]     resp = self.send(prep, **send_kwargs)
[   25s] /usr/lib/python3.9/site-packages/requests/sessions.py:745: in send
[   25s]     r.content
[   25s] /usr/lib/python3.9/site-packages/requests/models.py:899: in content
[   25s]     self._content = b"".join(self.iter_content(CONTENT_CHUNK_SIZE)) or b""
[   25s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[   25s] 
[   25s]     def generate():
[   25s]         # Special case for urllib3.
[   25s]         if hasattr(self.raw, "stream"):
[   25s]             try:
[   25s]                 yield from self.raw.stream(chunk_size, decode_content=True)
[   25s]             except ProtocolError as e:
[   25s] >               raise ChunkedEncodingError(e)
[   25s] E               requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(4 bytes read, -2 more expected)', IncompleteRead(4 bytes read, -2 more expected))
[   25s] 
[   25s] /usr/lib/python3.9/site-packages/requests/models.py:818: ChunkedEncodingError

Additional context

Complete build log with all details about the packages used and the steps taken to reproduce.

Version of responses

0.23.1

Steps to Reproduce

Run the test suite
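
For example, the failing test can be selected directly (one possible invocation, assuming a source checkout of responses 0.23.1):

python -m pytest responses/tests/test_responses.py -k test_auto_calculate_content_length_doesnt_override_existing_value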

Expected Result

Test passes

Actual Result

It doesn’t (see above).

Actually, I was trying to make responses work with the new urllib3, so I added these two patches:

---
 responses/tests/test_responses.py |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/responses/tests/test_responses.py
+++ b/responses/tests/test_responses.py
@@ -2417,7 +2417,7 @@ class TestMaxRetry:
                 total=total,
                 backoff_factor=0.1,
                 status_forcelist=[500],
-                method_whitelist=["GET", "POST", "PATCH"],
+                allowed_methods=["GET", "POST", "PATCH"],
                 raise_on_status=raise_on_status,
             )
         )
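
This first patch adapts to an API rename in urllib3 2.0: method_whitelist was removed in favor of allowed_methods. A minimal sketch of the new spelling (which, if I recall correctly, already works on urllib3 >= 1.26, where the new name was introduced):

from urllib3.util.retry import Retry

retry = Retry(
    total=3,
    backoff_factor=0.1,
    status_forcelist=[500],
    allowed_methods=["GET", "POST", "PATCH"],  # was method_whitelist in 1.x
)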

and a somewhat larger one (I assumed it is better to deal with the upgraded urllib3 head-on instead of pretending to use the bundled one):

---
 responses/__init__.py             |    8 ++++----
 responses/matchers.py             |    2 +-
 responses/tests/test_responses.py |    5 +++--
 setup.py                          |    2 +-
 4 files changed, 9 insertions(+), 8 deletions(-)

--- a/responses/__init__.py
+++ b/responses/__init__.py
@@ -42,16 +42,16 @@ except ImportError:  # pragma: no cover
     from typing import Literal  # type: ignore  # pragma: no cover
 
 try:
-    from requests.packages.urllib3.response import HTTPResponse
+    from urllib3.response import HTTPResponse
 except ImportError:  # pragma: no cover
     from urllib3.response import HTTPResponse  # pragma: no cover
 
 try:
-    from requests.packages.urllib3.connection import HTTPHeaderDict
+    from urllib3.connection import HTTPHeaderDict
 except ImportError:  # pragma: no cover
     from urllib3.response import HTTPHeaderDict  # type: ignore[attr-defined]
 try:
-    from requests.packages.urllib3.util.url import parse_url
+    from urllib3.util.url import parse_url
 except ImportError:  # pragma: no cover
     from urllib3.util.url import parse_url  # pragma: no cover
 
@@ -1065,7 +1065,7 @@ class RequestsMock(object):
 
         retries = retries or adapter.max_retries
         # first validate that current request is eligible to be retried.
-        # See ``requests.packages.urllib3.util.retry.Retry`` documentation.
+        # See ``urllib3.util.retry.Retry`` documentation.
         if retries.is_retry(
             method=response.request.method, status_code=response.status_code  # type: ignore[misc]
         ):
--- a/responses/matchers.py
+++ b/responses/matchers.py
@@ -11,7 +11,7 @@ from urllib.parse import parse_qsl
 from urllib.parse import urlparse
 
 from requests import PreparedRequest
-from requests.packages.urllib3.util.url import parse_url
+from urllib3.util.url import parse_url
 
 
 def _create_key_val_str(input_dict: Union[Dict[Any, Any], Any]) -> str:
--- a/responses/tests/test_responses.py
+++ b/responses/tests/test_responses.py
@@ -13,6 +13,7 @@ from unittest.mock import patch
 
 import pytest
 import requests
+import urllib3
 from requests.exceptions import ChunkedEncodingError
 from requests.exceptions import ConnectionError
 from requests.exceptions import HTTPError
@@ -1324,14 +1325,14 @@ def test_content_length_error(monkeypatc
     # Type errors here and on 1250 are ignored because the stubs for requests
     # are off https://github.com/python/typeshed/blob/f8501d33c737482a829c6db557a0be26895c5941
     #   /stubs/requests/requests/packages/__init__.pyi#L1
-    original_init = getattr(requests.packages.urllib3.HTTPResponse, "__init__")  # type: ignore
+    original_init = getattr(urllib3.HTTPResponse, "__init__")  # type: ignore
 
     def patched_init(self, *args, **kwargs):
         kwargs["enforce_content_length"] = True
         original_init(self, *args, **kwargs)
 
     monkeypatch.setattr(
-        requests.packages.urllib3.HTTPResponse, "__init__", patched_init  # type: ignore
+        urllib3.HTTPResponse, "__init__", patched_init  # type: ignore
     )
 
     run()
--- a/setup.py
+++ b/setup.py
@@ -18,7 +18,7 @@ setup_requires = []
 
 install_requires = [
     "requests>=2.22.0,<3.0",
-    "urllib3>=1.25.10",
+    "urllib3>=2.0.0",
     "pyyaml",
     "types-PyYAML",
     "typing_extensions; python_version < '3.8'",

However, even the second patch didn't prevent this test failure.

Any ideas what's wrong?

beliaev-maksim (Collaborator) commented

This is caused by the new version of urllib3. I will see what we can do to fix it ASAP.

beliaev-maksim (Collaborator) commented

@markstory
I found the issue. Basically, in the new urllib3 an incorrect content length is no longer allowed: https://github.com/urllib3/urllib3/blob/b234aaf7ccbcb64012d8b33d21eb8bc9f768935d/src/urllib3/response.py#L546

You can find the rationale here: urllib3/urllib3#2514
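
For illustration, a minimal sketch of the failure outside the test suite (assuming urllib3 >= 2.0 and responses 0.23.1):

import requests
import responses

@responses.activate
def reproduce():
    # The mock advertises Content-Length: 2 but serves a 4-byte body; with
    # urllib3 >= 2.0 the mismatch raises IncompleteRead, which requests
    # surfaces as ChunkedEncodingError.
    responses.add(
        responses.GET,
        "http://example.com/",
        body="test",                      # 4 bytes of body ...
        headers={"Content-Length": "2"},  # ... but the header claims 2
        auto_calculate_content_length=True,
    )
    requests.get("http://example.com/")

reproduce()  # raises requests.exceptions.ChunkedEncodingError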

I propose to bump the urllib3 dependency and, on the responses side, always enforce auto_calculate_content_length=True.
I feel this is a backwards-incompatible change, but at the same time urllib3 2.0 is itself backwards incompatible.
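
Roughly, the idea is the following (an illustrative sketch with hypothetical names, not the actual PR):

def _sync_content_length(headers, body):
    # Hypothetical helper: always derive Content-Length from the actual body
    # bytes, so the header that urllib3 2.x enforces can never disagree with
    # what is served.
    headers = dict(headers or {})
    data = body if isinstance(body, bytes) else body.encode("utf-8")
    headers["Content-Length"] = str(len(data))
    return headers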

What do you think? If you give a +1, I have an almost-ready-to-go PR for this.

mcepl (Author) commented May 8, 2023

I admit requests has a nicer API than what's in the standard library, but otherwise, as a distro package maintainer, I dislike it (and urllib3) with a passion.

Which is to say: thank you, this is probably the best we can do.

Martchus added a commit to Martchus/qem-bot that referenced this issue Jun 12, 2023
Avoid installing urllib3>=2 for CI runs as the responses module is not
compatible with that version yet (leading to failing tests because
mocking HTTP responses is not working correctly).

Supposedly getsentry/responses#635 is the
relevant upstream issue.
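
In practice that workaround amounts to a version constraint in the CI requirements, e.g. (illustrative):

# requirements pin (illustrative): stay on the urllib3 1.x line until
# responses supports 2.x
urllib3<2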