Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add splitting algorithm #16

Merged
merged 10 commits into from
Jun 17, 2021
Merged

Add splitting algorithm #16

merged 10 commits into from
Jun 17, 2021

Conversation

mbkroese
Copy link

@mbkroese mbkroese commented Jun 11, 2021

This makes two updates:

  1. adds an argument to use a different splitting algorithm
  2. we print the estimated duration of each group.
  3. adds algorithm used to split tests

I'll share some anecdotal information about how the new algorithm performs vs the old.

@mbkroese
Copy link
Author

For my testing suite:

1790 tests, 3 splits

new:
947.01s, 948.57s, 948.51s

old:
961s, 955s, 914s

@mbkroese
Copy link
Author

So the new grouping is more balanced, although the difference is not massive.

@mbkroese
Copy link
Author

@jerry-git @sondrelg please review

@mbkroese
Copy link
Author

Links back to this discussion: #14 (comment)

Copy link
Contributor

@sondrelg sondrelg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes a lot of sense to me @mbkroese 👍 Never seen heapq before - looks pretty neat 🙂

@jerry-git jerry-git self-requested a review June 11, 2021 10:39
tests/test_plugin.py Outdated Show resolved Hide resolved
@mbkroese mbkroese changed the title Mkroese/update splitting algorithm Add splitting algorithm Jun 12, 2021
* Absolute Order: whether each group contains all tests between first and last element in the same order as the original list of tests
* Relative Order: whether each test in each group has the same relative order to its neighbours in the group as in the original list of tests

The `duration_based_chunks` algorithm aims to find optimal boundaries for the list of tests and every test group contains all tests between the start and end bounary.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it could be valuable to mention also in the usage section that one can specify the splitting algorithm via command line arg and also mention what is the default behaviour

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a small note about it.

from _pytest import nodes


ALGORITHMS = ["duration_based_chunks", "least_duration"]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think enum could make more sense here

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to enum. Please let me know if I made the change you intended.


class TestAlgorithms:
@pytest.mark.parametrize("algo_name", algorithms.ALGORITHMS)
def test__split_test(self, algo_name):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep consistency, other test modules seem to use single _ after test 🙂

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is consistent in the sense that I use the format: test_{func}_{does_something}_{when}.
And in this case the func is called _split_test.

Please let me know if you still prefer the change.

tests/test_plugin.py Outdated Show resolved Hide resolved
@mbkroese
Copy link
Author

@jerry-git I don't understand why the CI fails. It seems like the module is not actually installed?

@jerry-git
Copy link
Owner

Hmm, it's interesting that it works for pull_request_target but not for pull_request 🤔 @sondrelg Any knowledge on this?
Screenshot 2021-06-16 at 10 32 51

@sondrelg
Copy link
Contributor

I had the same problem in the Poetry setup PR but I can't remember how I fixed it. I wonder if you should just remove the pull_request_target trigger in the test workflow - as long as it passes in this repo, it must work right? 😛

@jerry-git
Copy link
Owner

Yeah I was considering removing pull_request_target but in this case it's the pull_request which is failing, not pull_request_target

@sondrelg
Copy link
Contributor

Which ever it is, the tests running in this repo seem to pass and the tests running in the fork fail, right
image

I can't think of a reason why this would happen, but there might be a subtle difference between poetry install --no-interaction and pip install -e . that's relevant 🤔

@jerry-git
Copy link
Owner

@mbkroese please rebase from master, I think I got it fixed in #21

mbk added 9 commits June 16, 2021 20:30
The new splitting algorithm goes over the items, determines a best
estimate for the duration and then adds it to the smallest group.

The are edge-cases that this algorithm can't handle properly. For
example, when the last test item has a large test duration, the
current solution won't create the optimal solution. This can be
improved by sorting the test durations first, and starting with assignment
of tests that have the largest test duration.
The old algorithm was restored, and the new one was added as well.
The user can now choose which algorithm is most suitable for their
use case.
@mbkroese
Copy link
Author

thanks @jerry-git , the tests succeed now.

@jerry-git jerry-git merged commit 627c6ee into jerry-git:master Jun 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants