Splits invalid when collection order not deterministic #25

Open
mbkroese opened this issue Jun 17, 2021 · 5 comments

@mbkroese

mbkroese commented Jun 17, 2021

The big assumption underlying the two splitting algorithms is that the order of collected items is constant.
However, I've come across a case where this assumption was violated.
In my case I had a test parametrised with pytest.mark.parametrize, but the items to parametrize with would sometimes change order.

Take this example:

import pytest

@pytest.mark.parametrize('name', set(['henk', 'ingrid']))
def test_hello(name):
    pass

If you run this often enough you'll see that the order changes:

[2021-06-17 22:47:10] test_temp.py::test_hello[henk] PASSED [ 50%]
[2021-06-17 22:47:10] test_temp.py::test_hello[ingrid] PASSED [100%]

and

[2021-06-17 22:47:10] test_temp.py::test_hello[ingrid] PASSED [ 50%]
[2021-06-17 22:47:10] test_temp.py::test_hello[henk] PASSED [100%]
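
For context, the flipping order in this example most likely comes from Python's string hash randomization: unless PYTHONHASHSEED is fixed, each interpreter process hashes strings differently, so a set of strings can iterate in a different order on every run. The sketch below is only an illustration of that effect, not part of the plugin; the helper name set_order is made up for this example. It starts fresh interpreters with different hash seeds and records the order in which the set from the example is iterated.

```python
import os
import subprocess
import sys

# Same expression as in the parametrize call above, printed in iteration order.
SNIPPET = "print(','.join(set(['henk', 'ingrid'])))"


def set_order(seed: int) -> str:
    """Return the set's iteration order in a fresh interpreter with the given hash seed.

    Hypothetical helper for illustration only.
    """
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Different seeds may give different orders; the same seed always gives the same order.
orders = {seed: set_order(seed) for seed in range(1, 5)}
print(orders)
```

For a test like this one, passing sorted(...) instead of set(...) to pytest.mark.parametrize makes the collection order stable, but that only helps when the test author controls the parameter source.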

I'm not sure how to address this, but I think there are a few options:

  1. Don't split across different parametrize values of the same test. In other words, make sure that a single group runs all the test_hello cases.
  2. Try to create a deterministic order out of the test cases by sorting. I'm not sure this will work in all cases though (for example, it might not work for objects).
  3. Do the splitting on one machine, save the splits, and just call pytest with those pre-calculated groups (so not really using this plugin as a plugin :p).
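
Option 2 can be sketched roughly as follows. This is a minimal illustration, assuming the split works on pytest node IDs (strings such as test_temp.py::test_hello[henk]) and cuts the list into contiguous chunks; the function name split_sorted is hypothetical and not taken from the plugin. It also shows why objects are a problem: as far as I know, pytest derives IDs for non-primitive parameters from the argument name plus the position (e.g. name0, name1), so sorting the IDs does not pin down which object ends up behind which ID.

```python
def split_sorted(node_ids, n_groups):
    """Sort node IDs, then cut them into n_groups contiguous chunks.

    Hypothetical helper: sorting removes the dependency on collection order,
    as long as each ID always refers to the same test case.
    """
    ordered = sorted(node_ids)
    size, extra = divmod(len(ordered), n_groups)
    groups = []
    start = 0
    for index in range(n_groups):
        # The first `extra` groups take one additional item each.
        end = start + size + (1 if index < extra else 0)
        groups.append(ordered[start:end])
        start = end
    return groups


# Two collection runs of the same suite with a different order.
run_a = ["test_temp.py::test_hello[henk]", "test_temp.py::test_hello[ingrid]"]
run_b = ["test_temp.py::test_hello[ingrid]", "test_temp.py::test_hello[henk]"]

# Both runs produce identical groups, so no test is skipped or duplicated.
print(split_sorted(run_a, 2) == split_sorted(run_b, 2))
```
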
@jerry-git
Owner

I assume it's not deterministic here because of set, or can you repro it also with list or tuple?

@mbkroese
Author

No, this problem occurs when either the data structure has non-deterministic order or the code generating the parametrised test cases is for some reason not deterministic.

@jerry-git
Owner

I think we could go with 1., i.e. make sure the tests inside the same parametrize are run in the same group. However, the downside is ofc that if one parametrised test is very time consuming vs the rest of the suite, the splits would not be great.

OTOH, maybe it's better to make sure that we don't accidentally skip tests (or run some test in multiple groups) 🤔

With these thoughts, I'd go with 1. 🙂
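
A rough sketch of what option 1 could look like, under stated assumptions: node IDs are plain strings, everything before the first "[" identifies the test function, and groups are filled round-robin rather than by test duration. The names base_id and split_by_function are hypothetical and do not come from the plugin.

```python
from collections import defaultdict


def base_id(node_id):
    """Strip the parametrize suffix: 'a.py::test_x[henk]' -> 'a.py::test_x'.

    Assumes the first '[' in the node ID starts the parameter part.
    """
    return node_id.split("[", 1)[0]


def split_by_function(node_ids, n_groups):
    """Assign whole test functions to groups so parametrized cases stay together."""
    buckets = defaultdict(list)
    for node_id in node_ids:
        buckets[base_id(node_id)].append(node_id)

    groups = [[] for _ in range(n_groups)]
    # Sorting the function-level keys makes the assignment independent of
    # the order in which the parametrized cases were collected.
    for index, key in enumerate(sorted(buckets)):
        groups[index % n_groups].extend(sorted(buckets[key]))
    return groups


collected = [
    "test_temp.py::test_hello[ingrid]",
    "test_temp.py::test_other",
    "test_temp.py::test_hello[henk]",
]
print(split_by_function(collected, 2))
```

The trade-off mentioned above is visible here: a function with many slow parametrized cases lands in one group as a single block, so the groups can become unbalanced.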

@mbkroese
Author

> downside is ofc that if one parametrised test is very time consuming vs the rest of the suite, the splits would not be great.

Yes, and I wonder if we should perhaps be safe by default (i.e. option 1) and allow users to opt in to the unsafe thing (the existing behaviour). If we make clear what the tradeoffs are, users can then decide for themselves. In other words, add a parameter that defaults to --split-level=func but can be set to --split-level=parametrized or --split-level=file?
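
To make the proposal concrete, here is a minimal sketch of how the three levels could map a node ID to the unit that must stay in one group. The --split-level option is only a proposal in this thread and is not an existing flag; the function split_key and the exact string handling are assumptions for illustration.

```python
def split_key(node_id, level="func"):
    """Return the unit of splitting for a node ID at the given level.

    Hypothetical mapping for the proposed --split-level values:
      file         -> everything in one file stays together
      func         -> all parametrized cases of one function stay together
      parametrized -> every case may be split independently (existing behaviour)
    """
    if level == "file":
        return node_id.split("::", 1)[0]
    if level == "func":
        return node_id.split("[", 1)[0]
    if level == "parametrized":
        return node_id
    raise ValueError(f"unknown split level: {level!r}")


node_id = "test_temp.py::test_hello[henk]"
for level in ("file", "func", "parametrized"):
    print(level, "->", split_key(node_id, level))
```

Only the parametrized level depends on the order and identity of individual parametrized cases, which is why it would be the opt-in, unsafe choice.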

@dpanici

dpanici commented Aug 8, 2024

Has this been implemented yet? If not, I think it would be a great feature to have, as I have some tests under the same parametrize decorator that I want run in the same group.
