Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Derandomize std.algorithm #5202

Merged
merged 1 commit into from
Mar 1, 2017
Merged

Conversation

wilzbach
Copy link
Member

@wilzbach wilzbach commented Feb 26, 2017

So it turns out that using a random seed leads to "interesting" effects. One example is that our coverage report jumps randomly up & down.

Don't believe me? Check it out for yourself:

https://codecov.io/gh/dlang/phobos/compare/77f1a9067b59395f8444cb2d01047680aebcfe5f...f2b58341725a6d195b388c1b6f31dcdaa9f000f7/changes
https://codecov.io/gh/dlang/phobos/compare/22331db571d874ba44e412777ed8310536a18dff...89e4014e3d89c7de4d2e2c50eca7924e465ff062/changes
https://codecov.io/gh/dlang/phobos/compare/86050a1c59eda327fa263549f0b53e56a4cfa9ed...2749175ceb24e4b780725677b7e233f65e4f791c/changes
https://codecov.io/gh/dlang/phobos/compare/77f1a9067b59395f8444cb2d01047680aebcfe5f...2e05f29d3c4cc7690198c3ad0d06eb23293da48a/changes
...
(endless list)

I think it's quite reasonable to fix the random test to a constant value, s.t. we know what parts are actually covered and if needed add tests for the uncovered bits.

Note that there's also floppy line in std.parallelism, but that one might be harder to get.
edit: for std.parallelism we should probably have a look at this #4399

@andralex
Copy link
Member

Hmmm... there's good advantage to randomized unittesting, we don't want to give that up. How about non-random tests that ensure coverage and then a randomized unittest? Would that be difficult?

@wilzbach wilzbach force-pushed the derandomize-std-algorithm branch 2 times, most recently from db6fb73 to a670227 Compare February 26, 2017 21:37
@wilzbach
Copy link
Member Author

How about non-random tests that ensure coverage and then a randomized unittest? Would that be difficult?

So here's what I did:

  • run coverage and save as before.lst
  • comment all random tests in sorting.d
  • run coverage and save as after.lst
  • LOOP:
  • diff both .lst files and search for uncovered bits
  • add assert(0) in the uncovered bit to find the regarding test
  • print out the seed and add to the list of pre-defined seeds
  • continue LOOP until no coverage difference is found

88.629% (+0.007%) compared to 0fef09a

Seems like it worked :)

@schveiguy
Copy link
Member

From a practical position, I don't like random unit tests, because there can be weird interactions (e.g. some PR fails for totally unrelated random unit test failure, cannot repeat it).

However, I agree random testing has good benefits. I would propose that we have true random testing, but not part of unit tests. Make it a separate testing system, such that random failures generate alerts (probably posts to mailing list), with enough info to add the failures as non-random tests. But they aren't creating spurious failures for PR testing. Something to think about, probably not a quick solution.

In a previous life I wrote problem sets for topcoder and also tested other's problem sets. One thing I would do frequently is generate tons of random data to see if I could cause my solution or the other solutions to fail. If I found a test case that caused failure, then I would add it as a non-random test case for the testing phase of the competition (the hypothesis being that if one of our solutions got caught by some corner case exposed by the random data, someone else might too). During the actual competition, all solutions were run against the exact same test cases to judge whether they were correct or not. So the results are objective, even if they might have been incomplete. Such a stance is well founded.

@dnadlinger
Copy link
Member

While we are at it, could you please change them to print the seed when the tests fail for reproducibility? IIRC we had already settled on that last time the discussion on randomised testing came up, so perhaps I'm just missing the respective bits of code.

@andralex
Copy link
Member

Yah, printing the seed along with indications on how to set it to repro is essential. BTW at best we should have a systematic approach (one global seed, one easy way to set etc) instead of each unittest doing its own.

@wilzbach
Copy link
Member Author

wilzbach commented Feb 28, 2017

From a practical position, I don't like random unit tests, because there can be weird interactions (e.g. some PR fails for totally unrelated random unit test failure, cannot repeat it).

You can scratch the "can be weird interactions" - just have a look at the CodeCov CI project status of a recent PR - it's moving "randomly" ;-)
edit: for reference, e.g:

image

While we are at it, could you please change them to print the seed when the tests fail for reproducibility?
Yah, printing the seed along with indications on how to set it to repro is essential. BTW at best we should have a systematic approach (one global seed, one easy way to set etc) instead of each unittest doing its own.

Could we please go step by step?
While having a common seeding mechanism in Phobos would be nice, (1) I would prefer not to have them in the first place, (2) the other randomized tests seem to test not such complicated algorithms, (3) it's a design decision decision (e.g. mixin failsafeSeed, runInSeededEnv, shared seed, ...) that blocks this PR unnecessarily, (4) my interest is atm primarily in fixing the random "red cross" for PRs rather soon than later and this PR is hopefully a first step towards it.

auto pieces = partition3(a, 25);
assert(pieces[0].length + pieces[1].length + pieces[2].length == a.length);
foreach (e; pieces[0])
uint[] seeds = [3923355730, 1927035882, unpredictableSeed];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stylistic suggestion: immutable seeds = …

assert(left <= a[k]);
}
if (k + 1 < a.length)
uint[] seeds = [90027751, 2709791795, 1374631933, 995751648, 3541495258, 984840953, unpredictableSeed];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here: const/immutable

auto r = Random(s);

int[] a = new int[uniform(1, 10000, r)];
foreach (ref e; a) e = uniform(-1000, 1000, r);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonky indentation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's from the existing source ;-)

@dnadlinger
Copy link
Member

Could we please go step by step?

Fair enough – lgtm, feel free to merge after fixing the broken indentation.

@dnadlinger
Copy link
Member

BTW at best we should have a systematic approach (one global seed, one easy way to set etc) instead of each unittest doing its own.

This would prohibit us from using pure on unit tests to verify the inferred attributes, though.

(Hmm, so does printing the seed, though, unless we use exception/assertion messages for that. Maybe the randomised tests should be a separate thing, although this doesn't seem like an elegant solution.)

@schveiguy
Copy link
Member

BTW at best we should have a systematic approach (one global seed, one easy way to set etc) instead of each unittest doing its own.

Careful here, I think if you are having a random test, you need the seed printed before each test, not one global seed. Otherwise, other factors (which modules are included, which ones test random data, etc) can affect whether the seed is set correctly at the time of the test. If you just mean there needs to be a standardized way to set the seed for an individual test, yes, that is ideal.

Yes, fix the issue that is causing random PR failures first, and then maybe we can have a bigger discussion on random testing elsewhere.

@wilzbach
Copy link
Member Author

Fair enough – lgtm, feel free to merge after fixing the broken indentation.
..
Yes, fix the issue that is causing random PR failures first,

Done, but a PR needs to be approved via an GH review otherwise merging is blocked.
If you want, you could give the new "auto-merge-squash" label a try ;-)

@dnadlinger
Copy link
Member

Done, but a PR needs to be approved via an GH review otherwise merging is blocked.

I didn't realise one couldn't green-light their own PR's.

@wilzbach
Copy link
Member Author

I didn't realise one couldn't green-light their own PR's.

This is a rather new change due to us trying to catching up with the improvements GH makes ;-)
If case you haven't seen this summary, it might be interesting as well:

http://forum.dlang.org/post/[email protected]
(or summarized as Wiki entry: https://wiki.dlang.org/Guidelines_for_maintainers)

@dnadlinger
Copy link
Member

If case you haven't seen this summary, it might be interesting as well

I have seen (and read) your summaries, but the last comment still stands. ;)

@andralex
Copy link
Member

Thanks @wilzbach this is terrific work.

@andralex
Copy link
Member

@schveiguy well unless there's randomization in the order of running unittests, we can set things up so one seed sets up an entire test battery. At any rate please let's not print random numbers with every successful run :).

Agreed on taking the discussion to a higher level, maybe you could lead it. Thanks!

@schveiguy
Copy link
Member

At any rate please let's not print random numbers with every successful run

Right, I was thinking only printing after a failure, but I meant you need to look at the seed for the specific test (and store it before running the test), not once for the beginning of all unit tests.

@wilzbach
Copy link
Member Author

wilzbach commented Mar 1, 2017

Huh?

image

@andralex could you please have a look? (and maybe for the other repos as well)

@wilzbach
Copy link
Member Author

wilzbach commented Mar 1, 2017

Auto-merge toggled on

@wilzbach
Copy link
Member Author

wilzbach commented Mar 1, 2017

Auto-merge toggled on

The labelled PRs still don't have priority on the auto-tester. Hence I went for the faster option and squashed manually.

@wilzbach wilzbach merged commit 6153116 into dlang:master Mar 1, 2017
@andralex
Copy link
Member

andralex commented Mar 1, 2017

Alright, squash merges should be on now. Thanks for your diligence!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants