Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PLR6201 should only hit for membership test on constant collections #8322

Open
LefterisJP opened this issue Oct 29, 2023 · 1 comment
Open
Labels
rule Implementing or modifying a lint rule

Comments

@LefterisJP
Copy link

LefterisJP commented Oct 29, 2023

Ruff version

ruff 0.1.3

Ruff settings

Just make sure PLR6201 is enabled

Problem

a = 1
b = 2
x = 5

test1 = 2 in [1, 2, 3]
test2 = 'c' in ('a', 'b', 'c')
test3 = x in (a, b)

Run ruff on this one.

It will hit you with something like:

test4.py:5:14: PLR6201 Use a `set` literal when testing for membership
test4.py:6:16: PLR6201 Use a `set` literal when testing for membership
test4.py:7:14: PLR6201 Use a `set` literal when testing for membership
Found 3 errors.
No fixes available (3 hidden fixes can be enabled with the `--unsafe-fixes` option).

As far as I understand this rule should only hit us for constant collections as the optimization in question mentions: https://docs.python.org/3/whatsnew/3.2.html#optimizations . So I would not expect to see it in test3.

For collections that have variables inside the optimizer does not, or more like can not do anything so initializing a set does not help in any way and incurs the set initialization "penalty" which the optimizer fixes for constant values.

Does this make sense?

@LefterisJP
Copy link
Author

LefterisJP commented Oct 29, 2023

I also just interalized after writing this that in Python constants for the interpreter is just literals. Anything else, even if say you use the typing Final attribute won't be counted as constant in any way. Even Enum values are not considered constant. So this optimization happens in a much smaller amount of cases than I originally thought. Which is sad.

And there is a difference between using set and tuple and list for performance. OR more like between set and tuple/list. The latter are almost identical in some quick tests.

import dis
import timeit
from enum import Enum, auto
from typing import Final


class Color(Enum):
    RED = auto()
    BLUE = auto()
    GREEN = auto()
    WHITE = auto()
    YELLOW = auto()
    BB = auto()
    CC = auto()
    DD = auto()
    FF = auto()
    EE = auto()

a: Final = 1
b: Final = 2

def foo():
    test1 = 2 in [1, 2, 3]
    test2 = 'c' in {'a', 'b', 'c'}
    test3 = b in {Color.RED, Color.GREEN}
    return None


dis.dis(foo)

def with_set():
    return b in {Color.RED, Color.BLUE, Color.GREEN, Color.WHITE, Color.YELLOW, Color.BB, Color.CC, Color.DD, Color.FF, Color.EE}

def with_tuple():
    return b in (Color.RED, Color.BLUE, Color.GREEN, Color.WHITE, Color.YELLOW, Color.BB, Color.CC, Color.DD, Color.FF, Color.EE)

def with_list():
    return b in [Color.RED, Color.BLUE, Color.GREEN, Color.WHITE, Color.YELLOW, Color.BB, Color.CC, Color.DD, Color.FF, Color.EE]


print(f'With set: {timeit.timeit(with_set, number=10000)}')
print(f'With tuple: {timeit.timeit(with_tuple, number=10000)}')
print(f'With list: {timeit.timeit(with_list, number=10000)}')

Output:

 23           0 LOAD_CONST               1 (2)
              2 LOAD_CONST               2 ((1, 2, 3))
              4 CONTAINS_OP              0
              6 STORE_FAST               0 (test1)

 24           8 LOAD_CONST               3 ('c')
             10 LOAD_CONST               4 (frozenset({'a', 'c', 'b'}))
             12 CONTAINS_OP              0
             14 STORE_FAST               1 (test2)

 25          16 LOAD_GLOBAL              0 (b)
             18 LOAD_GLOBAL              1 (Color)
             20 LOAD_ATTR                2 (RED)
             22 LOAD_GLOBAL              1 (Color)
             24 LOAD_ATTR                3 (GREEN)
             26 BUILD_SET                2
             28 CONTAINS_OP              0
             30 STORE_FAST               2 (test3)

 26          32 LOAD_CONST               0 (None)
             34 RETURN_VALUE
With set: 0.011320679999698768
With tuple: 0.005961606000710162
With list: 0.0059832759998244

LefterisJP added a commit to LefterisJP/rotkehlchen that referenced this issue Oct 29, 2023
According to https://docs.python.org/3/whatsnew/3.2.html#optimizations
since Python 3.2 constant literal collections are automatically loaded
as a frozen set at compile time which make their loading time
insignificant.

So at that point it's suggested to replace constant collections of
list/tuples with sets as set IN comparison is faster.

Used ruff's https://docs.astral.sh/ruff/rules/literal-membership/ for
this but had to be very careful and only accept constant literal
suggestions. Since it suggests all IN collection checks to have this
rule, I also disabled it from ruff's pyproject toml for now and opened
an issue: astral-sh/ruff#8322
LefterisJP added a commit to LefterisJP/rotkehlchen that referenced this issue Oct 29, 2023
According to https://docs.python.org/3/whatsnew/3.2.html#optimizations
since Python 3.2 constant literal collections are automatically loaded
as a frozen set at compile time which make their loading time
insignificant.

So at that point it's suggested to replace constant collections of
list/tuples with sets as set IN comparison is faster.

Used ruff's https://docs.astral.sh/ruff/rules/literal-membership/ for
this but had to be very careful and only accept constant literal
suggestions. Since it suggests all IN collection checks to have this
rule, I also disabled it from ruff's pyproject toml for now and opened
an issue: astral-sh/ruff#8322
LefterisJP added a commit to rotki/rotki that referenced this issue Oct 29, 2023
According to https://docs.python.org/3/whatsnew/3.2.html#optimizations
since Python 3.2 constant literal collections are automatically loaded
as a frozen set at compile time which make their loading time
insignificant.

So at that point it's suggested to replace constant collections of
list/tuples with sets as set IN comparison is faster.

Used ruff's https://docs.astral.sh/ruff/rules/literal-membership/ for
this but had to be very careful and only accept constant literal
suggestions. Since it suggests all IN collection checks to have this
rule, I also disabled it from ruff's pyproject toml for now and opened
an issue: astral-sh/ruff#8322
@charliermarsh charliermarsh added the rule Implementing or modifying a lint rule label Oct 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rule Implementing or modifying a lint rule
Projects
None yet
Development

No branches or pull requests

2 participants