Derandomize std.algorithm #5202

wilzbach · 2017-02-26T19:21:06Z

So it turns out that using a random seed leads to "interesting" effects. One example is that our coverage report jumps randomly up & down.

Don't believe me? Check it out for yourself:

https://codecov.io/gh/dlang/phobos/compare/77f1a9067b59395f8444cb2d01047680aebcfe5f...f2b58341725a6d195b388c1b6f31dcdaa9f000f7/changes
https://codecov.io/gh/dlang/phobos/compare/22331db571d874ba44e412777ed8310536a18dff...89e4014e3d89c7de4d2e2c50eca7924e465ff062/changes
https://codecov.io/gh/dlang/phobos/compare/86050a1c59eda327fa263549f0b53e56a4cfa9ed...2749175ceb24e4b780725677b7e233f65e4f791c/changes
https://codecov.io/gh/dlang/phobos/compare/77f1a9067b59395f8444cb2d01047680aebcfe5f...2e05f29d3c4cc7690198c3ad0d06eb23293da48a/changes
...
(endless list)

I think it's quite reasonable to fix the random tests to constant seeds, so that we know which parts are actually covered and can, if needed, add tests for the uncovered bits.

Note that there's also a flaky line in std.parallelism, but that one might be harder to fix.
edit: for std.parallelism we should probably have a look at #4399

andralex · 2017-02-26T19:24:59Z

Hmmm... there's good advantage to randomized unittesting, we don't want to give that up. How about non-random tests that ensure coverage and then a randomized unittest? Would that be difficult?

wilzbach · 2017-02-26T21:47:24Z

How about non-random tests that ensure coverage and then a randomized unittest? Would that be difficult?

So here's what I did:

run coverage and save as before.lst
comment out all random tests in sorting.d
run coverage and save as after.lst
LOOP:
diff both .lst files and search for uncovered bits
add assert(0) in the uncovered bit to find the test that reaches it
print out the seed of that test and add it to the list of pre-defined seeds
continue LOOP until no coverage difference is found

88.629% (+0.007%) compared to 0fef09a

Seems like it worked :)
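
The "find a seed that reaches the uncovered bit" step can also be approximated programmatically. The sketch below is not the procedure used in this PR (which relied on assert(0) and the coverage listings); it is a minimal illustration, and `hitsRareBranch` and `findSeed` are hypothetical names standing in for a real test and its rarely taken branch.

```d
import std.random : Random, uniform, unpredictableSeed;

/// Hypothetical stand-in for "the uncovered bit": true when the value drawn
/// from the seeded generator is 1, which only a few seeds produce.
bool hitsRareBranch(uint seed)
{
    auto r = Random(seed);
    return uniform(1, 100, r) == 1;
}

/// Tries fresh seeds until one reaches the rare branch, then returns it
/// so it can be added to the list of pre-defined seeds.
uint findSeed()
{
    for (;;)
    {
        immutable seed = unpredictableSeed;
        if (hitsRareBranch(seed))
            return seed;
    }
}
```

Printing the result of `findSeed()` once (for example with `writeln`) gives a constant that can be hard-coded, so the branch stays covered on every later run.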

schveiguy · 2017-02-27T04:27:51Z

From a practical position, I don't like random unit tests, because there can be weird interactions (e.g. some PR fails for totally unrelated random unit test failure, cannot repeat it).

However, I agree random testing has good benefits. I would propose that we have true random testing, but not part of unit tests. Make it a separate testing system, such that random failures generate alerts (probably posts to mailing list), with enough info to add the failures as non-random tests. But they aren't creating spurious failures for PR testing. Something to think about, probably not a quick solution.

In a previous life I wrote problem sets for topcoder and also tested others' problem sets. One thing I would do frequently is generate tons of random data to see if I could cause my solution or the other solutions to fail. If I found a test case that caused failure, then I would add it as a non-random test case for the testing phase of the competition (the hypothesis being that if one of our solutions got caught by some corner case exposed by the random data, someone else might too). During the actual competition, all solutions were run against the exact same test cases to judge whether they were correct or not. So the results are objective, even if they might have been incomplete. Such a stance is well founded.

dnadlinger · 2017-02-27T14:24:31Z

While we are at it, could you please change them to print the seed when the tests fail for reproducibility? IIRC we had already settled on that last time the discussion on randomised testing came up, so perhaps I'm just missing the respective bits of code.

andralex · 2017-02-27T15:10:02Z

Yah, printing the seed along with indications on how to set it to repro is essential. BTW at best we should have a systematic approach (one global seed, one easy way to set etc) instead of each unittest doing its own.

wilzbach · 2017-02-28T05:07:00Z

From a practical position, I don't like random unit tests, because there can be weird interactions (e.g. some PR fails for totally unrelated random unit test failure, cannot repeat it).

You can scratch the "can be weird interactions" - just have a look at the CodeCov CI project status of a recent PR - it's moving "randomly" ;-)
edit: for reference, e.g:

While we are at it, could you please change them to print the seed when the tests fail for reproducibility?
Yah, printing the seed along with indications on how to set it to repro is essential. BTW at best we should have a systematic approach (one global seed, one easy way to set etc) instead of each unittest doing its own.

Could we please go step by step?
While having a common seeding mechanism in Phobos would be nice, (1) I would prefer not to have random tests in the first place, (2) the other randomized tests don't seem to test such complicated algorithms, (3) it's a design decision (e.g. mixin failsafeSeed, runInSeededEnv, shared seed, ...) that blocks this PR unnecessarily, (4) my interest is atm primarily in fixing the random "red cross" for PRs sooner rather than later, and this PR is hopefully a first step towards it.

dnadlinger · 2017-02-28T17:02:04Z

std/algorithm/sorting.d

-    auto pieces = partition3(a, 25);
-    assert(pieces[0].length + pieces[1].length + pieces[2].length == a.length);
-    foreach (e; pieces[0])
+    uint[] seeds = [3923355730, 1927035882, unpredictableSeed];


Stylistic suggestion: immutable seeds = …
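
For readers following along, here is a minimal sketch of how the seeded loop could look with the suggested `immutable` declaration. The two fixed seeds and the `partition3` assertions are taken from the diff and the surrounding test; the helper name `checkPartition3` is hypothetical and only there to keep the example self-contained.

```d
import std.algorithm.sorting : partition3;
import std.random : Random, uniform, unpredictableSeed;

/// Hypothetical helper: runs the partition3 checks on data generated from `seed`.
void checkPartition3(uint seed)
{
    auto r = Random(seed);
    auto a = new int[uniform(1, 10_000, r)];
    foreach (ref e; a) e = uniform(-1000, 1000, r);

    auto pieces = partition3(a, 25);
    assert(pieces[0].length + pieces[1].length + pieces[2].length == a.length);
    foreach (e; pieces[0]) assert(e < 25);
    foreach (e; pieces[1]) assert(e == 25);
    foreach (e; pieces[2]) assert(e > 25);
}

unittest
{
    // Fixed seeds keep coverage stable; unpredictableSeed keeps one random run.
    immutable uint[] seeds = [3923355730, 1927035882, unpredictableSeed];
    foreach (s; seeds) checkPartition3(s);
}
```

The fixed seeds make the covered lines reproducible from run to run, while the trailing `unpredictableSeed` preserves some of the randomized testing andralex asked to keep.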

dnadlinger · 2017-02-28T17:02:33Z

std/algorithm/sorting.d

-        assert(left <= a[k]);
-    }
-    if (k + 1 < a.length)
+    uint[] seeds = [90027751, 2709791795, 1374631933, 995751648, 3541495258, 984840953, unpredictableSeed];


Same here: const/immutable

dnadlinger · 2017-02-28T17:02:52Z

std/algorithm/sorting.d

+        auto r = Random(s);
+
+        int[] a = new int[uniform(1, 10000, r)];
+            foreach (ref e; a) e = uniform(-1000, 1000, r);


Wonky indentation.

That's from the existing source ;-)

dnadlinger · 2017-02-28T17:03:58Z

Could we please go step by step?

Fair enough – lgtm, feel free to merge after fixing the broken indentation.

dnadlinger · 2017-02-28T17:08:22Z

BTW at best we should have a systematic approach (one global seed, one easy way to set etc) instead of each unittest doing its own.

This would prohibit us from using pure on unit tests to verify the inferred attributes, though.

(Hmm, so does printing the seed, though, unless we use exception/assertion messages for that. Maybe the randomised tests should be a separate thing, although this doesn't seem like an elegant solution.)

schveiguy · 2017-02-28T17:08:50Z

BTW at best we should have a systematic approach (one global seed, one easy way to set etc) instead of each unittest doing its own.

Careful here, I think if you are having a random test, you need the seed printed before each test, not one global seed. Otherwise, other factors (which modules are included, which ones test random data, etc) can affect whether the seed is set correctly at the time of the test. If you just mean there needs to be a standardized way to set the seed for an individual test, yes, that is ideal.

Yes, fix the issue that is causing random PR failures first, and then maybe we can have a bigger discussion on random testing elsewhere.

wilzbach · 2017-02-28T17:26:18Z

Fair enough – lgtm, feel free to merge after fixing the broken indentation.
..
Yes, fix the issue that is causing random PR failures first,

Done, but a PR needs to be approved via a GH review, otherwise merging is blocked.
If you want, you could give the new "auto-merge-squash" label a try ;-)

dnadlinger · 2017-02-28T17:27:57Z

Done, but a PR needs to be approved via a GH review, otherwise merging is blocked.

I didn't realise one couldn't green-light one's own PRs.

wilzbach · 2017-02-28T17:31:20Z

I didn't realise one couldn't green-light one's own PRs.

This is a rather new change due to us trying to catch up with the improvements GH makes ;-)
In case you haven't seen this summary, it might be interesting as well:

http://forum.dlang.org/post/[email protected]
(or summarized as a wiki entry: https://wiki.dlang.org/Guidelines_for_maintainers)

dnadlinger · 2017-02-28T17:39:16Z

In case you haven't seen this summary, it might be interesting as well

I have seen (and read) your summaries, but the last comment still stands. ;)

andralex · 2017-02-28T17:52:03Z

Thanks @wilzbach this is terrific work.

andralex · 2017-02-28T17:55:03Z

@schveiguy well unless there's randomization in the order of running unittests, we can set things up so one seed sets up an entire test battery. At any rate please let's not print random numbers with every successful run :).

Agreed on taking the discussion to a higher level, maybe you could lead it. Thanks!

schveiguy · 2017-02-28T18:02:05Z

At any rate please let's not print random numbers with every successful run

Right, I was thinking only printing after a failure, but I meant you need to look at the seed for the specific test (and store it before running the test), not once for the beginning of all unit tests.
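
One way to get this behaviour without printing anything on success is to carry the seed in the assertion message, as hinted at earlier in the thread. The sketch below is only an illustration under that assumption: `checkSort` is a hypothetical helper, and sorting is used merely as an example of a property to check.

```d
import std.algorithm.sorting : isSorted, sort;
import std.conv : text;
import std.random : Random, uniform, unpredictableSeed;

/// Hypothetical helper: sorts random data and names the seed in any failure message.
void checkSort(uint seed)
{
    auto r = Random(seed);
    auto a = new int[uniform(1, 1000, r)];
    foreach (ref e; a) e = uniform(-1000, 1000, r);

    sort(a);
    // Nothing is printed on success; the seed only shows up if this assertion fails.
    assert(isSorted(a), text("sort failed; rerun with seed ", seed));
}

unittest
{
    // The seed is stored for this specific test before the test body runs.
    immutable seed = unpredictableSeed;
    checkSort(seed);
}
```

Because each test stores its own seed, a failure can be reproduced by calling the helper with the reported value, regardless of which other modules ran their random tests first.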

wilzbach · 2017-03-01T02:01:25Z

Huh?

@andralex could you please have a look? (and maybe for the other repos as well)

wilzbach · 2017-03-01T02:04:27Z

Auto-merge toggled on

wilzbach · 2017-03-01T02:06:21Z

Auto-merge toggled on

The labelled PRs still don't have priority on the auto-tester. Hence I went for the faster option and squashed manually.

andralex · 2017-03-01T14:41:36Z

Alright, squash merges should be on now. Thanks for your diligence!

wilzbach added the automation label Feb 26, 2017

wilzbach assigned andralex Feb 26, 2017

This was referenced Feb 26, 2017

Temporarily disable project coverage #5197

Closed

Don't merge with upstream/master on CircleCi due to CodeCov issues #5198

Closed

wilzbach force-pushed the derandomize-std-algorithm branch 2 times, most recently from db6fb73 to a670227 on February 26, 2017 21:37

dnadlinger reviewed Feb 28, 2017


dnadlinger approved these changes Feb 28, 2017


dnadlinger added the auto-merge-squash label Feb 28, 2017

Derandomize std.algorithm.sorting

b46bd29

wilzbach force-pushed the derandomize-std-algorithm branch from c26b85b to b46bd29 on March 1, 2017 02:04

dlang-bot removed the auto-merge-squash label Mar 1, 2017

wilzbach merged commit 6153116 into dlang:master Mar 1, 2017

wilzbach mentioned this pull request Jun 13, 2017

Q: Should we add code coverage to all repos? dlang-community/discussions#15

Open

wilzbach deleted the derandomize-std-algorithm branch December 11, 2017 02:11

wilzbach mentioned this pull request Apr 3, 2018

Issue 18715 - Non-documented unittests should not use unpredictableSeed or default Random alias #6414

Merged
