Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MultipartWriter quotes field name wrong #4012

Closed
kohtala opened this issue Aug 29, 2019 · 2 comments
Closed

MultipartWriter quotes field name wrong #4012

kohtala opened this issue Aug 29, 2019 · 2 comments

Comments

@kohtala
Copy link
Contributor

kohtala commented Aug 29, 2019

Long story short

My client needs to send multipart/form-data to an API that expects field names with [] in the name. The server does not accept the submission with default set_content_disposition parameters due to wrong quoting.

Expected behaviour

Content-Disposition: form-data; name="files[]"; filename="filename"

Actual behaviour

Content-Disposition: form-data; name="files%5B%5D"; filename="filename"; filename*=utf-8''filename

Steps to reproduce

Client code is like

        with aiohttp.MultipartWriter('form-data') as mpw:
                f = mpw.append(file)
                f.set_content_disposition("form-data", name="files[]", filename="filename")

        res = await self.session.post(url, data=mpw)

Your environment

aiohttp==3.5.4 async client, Ubuntu 18.04, python 3.6.8.

Analysis

Returning Values from Forms: multipart/form-data says

In most multipart types, the MIME header fields in each part are
restricted to US-ASCII; for compatibility with those systems, file
names normally visible to users MAY be encoded using the percent-
encoding method in Section 2, following how a "file:" URI
[URI-SCHEME] might be encoded.

NOTE: The encoding method described in [RFC5987], which would add a
"filename*" parameter to the Content-Disposition header field, MUST
NOT be used.

It would seem the current implementation misinterpreted this to mean all field values are to be percent-encoded. But the RFC7578 is clear that the encoding is only to be used on file names. Furthermore, the filename*= form from MIME Parameter Value and Encoded Word Extensions should be used only for the other fields, but as the filename is already via percent-encoding to within US-ASCII, filename*= is not to be used on the filename.

For converting from unicode string to bytes for the percent-encoding, user will need to specify charset in some cases, as in the RFC:

The encoding used for the file names is typically UTF-8, although
HTML forms will use the charset associated with the form.

Thus, in some cases, an additional charset parameter is needed in set_content_disposition. Is it needed in other functions?

The RFCs refer to RFC822 for quoted-string definition, which is currently obsoleted by Internet Message Format RFC5322.

   qtext           =   %d33 /             ; Printable US-ASCII
                       %d35-91 /          ;  characters not including
                       %d93-126 /         ;  "\" or the quote character
                       obs-qtext

   qcontent        =   qtext / quoted-pair

   quoted-string   =   [CFWS]
                       DQUOTE *([FWS] qcontent) [FWS] DQUOTE
                       [CFWS]

   quoted-pair     =   ("\" (VCHAR / WSP)) / obs-qp

And from Augmented BNF for Syntax Specifications: ABNF

  VCHAR          =  %x21-7E
                                ; visible (printing) characters

         WSP            =  SP / HTAB
                                ; white space```

The quoted-pair quoting of quoted-string is missing in the current implementation.

There is also a rather far-fetched case of extremely long values causing the line length limit of 998 characters to be exceeded https://tools.ietf.org/html/rfc5322#section-2.1.1 and requiring using the Folding White Space (FWS).

I can not tell if there would be any compatibility impact of just changing the percent-quoting to the correct quoted-pair quoting. Should the quote_fields parameter concern the percent-encoding of filename or the quoted-pair of all fields?

The current behavior seems to be result of discussion in #916 to fix #903.

@kohtala
Copy link
Contributor Author

kohtala commented Aug 31, 2019

I noticed, the filename* is specified in RFC 6266 Use of the Content-Disposition Header Field in the Hypertext Transfer Protocol (HTTP). It however states clearly

  Note: This document does not apply to Content-Disposition header
  fields appearing in payload bodies transmitted over HTTP, such as
  when using the media type "multipart/form-data" ([RFC2388]).

RFC2388 is now obsoleted by the RFC7578.

kohtala added a commit to okoko/aiohttp that referenced this issue Aug 31, 2019
Posting form data with field names containing "[]" were not recognized
by other multipart/form-data parsers. The percent-encoding is only to
be used on the filename parameter that follows how of a "file:" URI
may be encoded according to RFC7578.
kohtala added a commit to okoko/aiohttp that referenced this issue Aug 31, 2019
Posting form data with field names containing "[]" were not recognized
by other multipart/form-data parsers. The percent-encoding is only to
be used on the filename parameter that follows how of a "file:" URI
may be encoded according to RFC7578.
kohtala added a commit to okoko/aiohttp that referenced this issue Aug 31, 2019
Posting form data with field names containing "[]" were not recognized
by other multipart/form-data parsers. The percent-encoding is only to
be used on the filename parameter that follows how of a "file:" URI
may be encoded according to RFC7578.
kohtala added a commit to okoko/aiohttp that referenced this issue Sep 2, 2019
Posting form data with field names containing "[]" were not recognized
by other multipart/form-data parsers. The percent-encoding is only to
be used on the filename parameter that follows how of a "file:" URI
may be encoded according to RFC7578.
kohtala added a commit to okoko/aiohttp that referenced this issue Oct 10, 2019
Posting form data with field names containing "[]" were not recognized
by other multipart/form-data parsers. The percent-encoding is only to
be used on the filename parameter that follows how of a "file:" URI
may be encoded according to RFC7578.
@kohtala
Copy link
Contributor Author

kohtala commented Oct 10, 2019

I checked with Firefox, and any form fields names containing "[]" are sent without being enoded. Also filenames are sent in 8-bit utf-8 without any encoding. Rereading #916, it would seem the quote_fields=False option has been added to get correct behavior. Any API expecting the default quote_fields=True behavior can not work with submissions from html forms. With my patch and quote_fields=False, it is further fixed to quote any quotes in the file names.

kohtala added a commit to okoko/aiohttp that referenced this issue Nov 20, 2019
Posting form data with field names containing "[]" were not recognized
by other multipart/form-data parsers. The percent-encoding is only to
be used on the filename parameter that follows how of a "file:" URI
may be encoded according to RFC7578.
kohtala added a commit to okoko/aiohttp that referenced this issue Oct 16, 2020
Posting form data with field names containing "[]" were not recognized
by other multipart/form-data parsers. The percent-encoding is only to
be used on the filename parameter that follows how of a "file:" URI
may be encoded according to RFC7578.
kohtala added a commit to okoko/aiohttp that referenced this issue Oct 16, 2020
Posting form data with field names containing "[]" were not recognized
by other multipart/form-data parsers. The percent-encoding is only to
be used on the filename parameter that follows how of a "file:" URI
may be encoded according to RFC7578.
kohtala added a commit to okoko/aiohttp that referenced this issue Nov 23, 2020
Posting form data with field names containing "[]" were not recognized
by other multipart/form-data parsers. The percent-encoding is only to
be used on the filename parameter that follows how of a "file:" URI
may be encoded according to RFC7578.
kohtala added a commit to okoko/aiohttp that referenced this issue Nov 24, 2020
Posting form data with field names containing "[]" were not recognized
by other multipart/form-data parsers. The percent-encoding is only to
be used on the filename parameter that follows how of a "file:" URI
may be encoded according to RFC7578.
commonism pushed a commit to commonism/aiohttp that referenced this issue Apr 27, 2021
…s#4031)

Posting form data with field names containing "[]" were not recognized
by other multipart/form-data parsers. The percent-encoding is only to
be used on the filename parameter that follows how of a "file:" URI
may be encoded according to RFC7578.
commonism pushed a commit to commonism/aiohttp that referenced this issue Apr 27, 2021
…s#4031)

Posting form data with field names containing "[]" were not recognized
by other multipart/form-data parsers. The percent-encoding is only to
be used on the filename parameter that follows how of a "file:" URI
may be encoded according to RFC7578.
asvetlov pushed a commit that referenced this issue Oct 24, 2021
Posting form data with field names containing "[]" were not recognized
by other multipart/form-data parsers. The percent-encoding is only to
be used on the filename parameter that follows how of a "file:" URI
may be encoded according to RFC7578.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant