Testament: test process batching and internals rework #135

saem · 2021-12-27T21:57:05Z

Started as introducing knownIssue and disabled, but grew into a heavy rework of the internals, with minor consequences for the interface.

Refactor Consequences:

introduces a more data oriented design
simplifies categories
- remove dead/unnecessary ones (IC/Navigator)
- remove the megatest category hack
- likely other changes before finishing
makes js a first class target
- now a default execution target
- it should be explicitly excluded (likely need a feature)

Breaking changes forecast:

this will necessitate some test/spec clean-up
might run a fair bit slower without megatest
although in some areas it's more parallelizable (always runs in parallel)
removed ic and navigator categories, need to fix spec parsing

reviewable things:

ExecutionState is getting pretty stable
The batching is somewhat final, but making it a bit lazier can avoid IO
- can't quite add that to the stack of things to change yet
all, category, and single test all behave the same now
should we bother with pattern?

Known issues:

need to get an end to end execution going
runTests is a beast
lol docs
tests for testament

Lessons learned:
Many; the biggest one so far is that orchestration of dependencies and testing
are separate things, and the existing testament conflates them.

saem · 2021-12-27T21:59:09Z

testament/testament.nim

+    of tfkAll:
+      discard
+    of tfkCategories:
+      # xxx: currently multiple categories are unsupported
+      cats: Categories
+    of tfkGlob:
+      pattern: GlobPattern
+    of tfkSingle:
+      test: string
+
+  TestId = int         # xxx: make this a distinct
+  RunId = int          ## test run's id/index # xxx: make this a distinct
+  EntryId = int        ## matrix entry index # xxx: make this a distinct
+  ActionId = int       ## a test action's id # xxx: make this a distinct
+  TestTarget = TTarget # xxx: renamed because I dislike the TXxx convention
+  TestFile = string    # xxx: make this a distinct
+
+  RetryInfo = object
+    test: TestId       ## which test failed
+    target: TestTarget ## the specific target
+
+  ExecutionFlag = enum
+    outputColour,      ## colour the output
+    outputResults,     ## print results to the console
+    outputFailureOnly, ## only output failures
+    outputVerbose,     ## increase output verbosity
+    logBackend         ## enable backend logging
+    dryRun,            ## do not run the tests, only indicate which would run
+    rerunFailed,       ## only run tests failed in the previous run
+    runKnownIssues     ## also execute tests marked as known issues
+
+  ExecutionFlags = set[ExecutionFlag]
+    ## track the option flags that got set for an execution
+
+  RetryList = OrderedTable[TestId, RetryInfo]
+      ## record failures in here so the user can choose to retry them
+
+  TestTargets = set[TestTarget]
+
+  DebugInfo = OrderedTable[ActionId, string]
+
+  RunTime = object
+    compileStart: float      ## when the compile process start
+    compileEnd: float        ## when the compile process ends
+    checkStart: float        ## for run or compiles, check output start
+    checkEnd: float          ## for run or compiles, check output end
+    runStart: float          ## for run, start of execution
+    runEnd: float            ## for run, end of execution
+
+  TestRun = object
+    testId: TestId           ## test id for which this belongs
+    target: TestTarget       ## which target to run for
+    matrixEntry: EntryId     ## which item from the matrix was used
+    runtime: RunTime         ## time tracking for test activities
+
+  TestAction = object
+    runId: RunId
+    case kind: TTestAction
+    of actionReject, actionRun:
+      discard
+    of actionCompile:
+      partOfRun: bool
+
+  Execution = object
+    # user and execution inputs
+    filter: TestFilter       ## filter that was configured
+    flags: ExecutionFlags    ## various options set by the user
+    targets: TestTargets     ## specified targets or `noTargetsSpecified`
+    workingDir: string       ## working directory to begin execution in
+    nimSpecified: bool       ## whether the user specified the nim
+    testArgs: string         ## arguments passed to tests by the user
+
+    # environment input / setup
+    compilerPath: string     ## compiler command to use
+    testsDir: string         ## where to look for tests
+
+    # test discovery data
+    testCats:  Categories    ## categories discovered, for this execution
+    testFiles: seq[TestFile] ## files for this execution
+    testSpecs: seq[TSpec]    ## spec for each file
+
+    # test execution data
+    testRuns: seq[TestRun]   ## a test run: reject, compile, or compile + run
+    actions: seq[TestAction] ## test actions for each run, phases of a run
+    debugInfo: DebugInfo     ## debug info related to actions run for tests
+
+    # test execution related data
+    retryList: RetryList     ## list of failures to potentially retry later
+
+  ParseCliResult = enum
+    parseSuccess       ## successfully parsed cli params
+    parseQuitWithUsage ## parsing failed, quit with usage message
+
+const
+  testResultsDir = "testresults"
+  cacheResultsDir = testResultsDir / "cacheresults"
+  noTargetsSpecified: TestTargets = {}
+  defaultExecFlags = {outputColour}
+  defaultBatchSize = 10
+  noMatrixEntry: EntryId = -1


@alaviss as you can see I learned similar lessons with DOD. 😆

alaviss · 2021-12-27

You really should turn those ints into int32; no one runs more than 2M tests yet.

saem · 2021-12-27

Done locally.
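
For readers following the thread, here is a minimal sketch of what combining the two `xxx` notes (make the ids `distinct`, and narrow them to `int32`) might look like. It is not the code from the PR: `Execution` and `TestRun` are reduced stand-ins for the types in the diff, and the accessor procs `run` and `file` are hypothetical names introduced only for this illustration.

```nim
type
  TestId = distinct int32   # hypothetical: the diff declares these as plain `int`
  RunId = distinct int32

  TestRun = object          # reduced stand-in for `TestRun` in the diff
    testId: TestId          # index into `testFiles`

  Execution = object        # reduced stand-in for `Execution` in the diff
    testFiles: seq[string]
    testRuns: seq[TestRun]

proc run(e: Execution, id: RunId): TestRun =
  ## Look up a run by id; the distinct -> int conversion is explicit.
  e.testRuns[int(int32(id))]

proc file(e: Execution, id: TestId): string =
  ## Look up the file a test id refers to.
  e.testFiles[int(int32(id))]

let exec = Execution(
  testFiles: @["tests/arc/tcontrolflow.nim"],
  testRuns: @[TestRun(testId: TestId(0'i32))])

let r = exec.run(RunId(0'i32))
echo exec.file(r.testId)
# `exec.file(RunId(0'i32))` would be rejected at compile time,
# because `RunId` and `TestId` are different distinct types.
```

The point of the sketch is only that distinct ids turn "indexed the wrong sequence" mistakes into compile errors, at the cost of explicit conversions wherever an id is used as an index.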

Presently testament will parse all modules under `lib/pure` and `lib/packages/docutils`, then:

1. it would read the contents, looking for a string match of `when isMainModule`
   1. resulting in a compile action instead of run
2. then include these modules as "tests" within its run

There are only two modules that make use of this:

* `lib/packages/docutils/highlite.nim`: which acted as a build check
* `lib/pure/xmlparser.nim`: had a basic test case

For the former, the build check was integrated into `tools/koch/kochDocs`, and for the latter a quick test was created within `tests/stdlib`. After that, `lib/packages/docutils` was removed from testament's stdlib category.

Additionally, content parsing for `when isMainModule` has been removed, as it is no longer necessary. This should speed up execution on slower disks. These modules are now also used as simple compile tests instead of attempting to run them as before, which at least ensures that pure modules compile on all platforms.

The background motivation is that this simplifies the testament rework currently happening in: nim-works#135


- they're dead code, most of them are simply compiles
- many are covered far more comprehensively in the spec
- the async ones are gone with async removal
- the error message related ones will change dramatically
- threading is so broken I don't see how these help


haxscramper · 2022-11-25T11:12:15Z

@saem aside from xxx todos, maybe failing tests and rebase, is there anything else that needs to be done for this PR? I want to continue this after #476 is merged and initial test directory cleanup is finished.

haxscramper · 2022-11-25T18:27:05Z

Many changes from this PR are already implemented so this can be closed and PRed in a simplified manner to avoid dealing with all merge conflicts.


saem force-pushed the testament-61-knownIssue-and-disabled branch from d5093ff to 3e5012f (December 28, 2021 05:49)

saem force-pushed the testament-61-knownIssue-and-disabled branch 2 times, most recently from 34601e3 to 388d062 (February 5, 2022 22:51)

saem force-pushed the testament-61-knownIssue-and-disabled branch from 15f10f4 to 0d7a032 (February 12, 2022 20:14)

saem added 22 commits March 13, 2022 19:17

formatting - testament/specs.parseTargets

e2dbb41

testament - formatting & remove dead code

1397a53

sketch knownIssue spec

4393fc4

partially working known issue retesting

299114d

removed examples category

dc1539b

Also outputs the tests it'll run, this demonstrated issues with tests that specify cmd, such as arc tests.

testament: remove nim in action remenant

6cf2642

got test compilation working

bd11f35

next major todos:
- execution
- skips/known issues
- reporting

added special handling for gc tests

d8dcbd3

need to add in the test matrix override handling; miles to go.

just need to rough in stdlib

459e174

then can work through creating test batches

compiling all the tests

26f2958

next up is reporting and all that.

part way through wiring up results

53d5361

further along creating the run command

65d4281

started executing compiled executables

c7e740d

set reminders before rebase

b38f128

test file name handling is awful, still

ff72a23

lift out the ridiculous template

6c094b8

part way through error reporting

1828ba9

- starting to emit program run success/errors
- need to handle compile and reject

right before I try another big refactor

dca9dbb

end to end runs starting to work

e32b4c5

next up is bugs and lots of code clean-up

remove some cruft

69f894a

immediate next step is to handle skips and known issues

saem force-pushed the testament-61-knownIssue-and-disabled branch from 96db68a to 69f894a (March 14, 2022 03:17)

remove io test special category

a81bb55

the io category had one test that was disabled long ago; removed it along with the older helper code `readall_echo.nim`

saem added 16 commits March 19, 2022 13:04

add skip file handling

7d5ca44

remove commented out code

097a251

setup simulate/dry run

520642c

skipped tests are reported

c5b0f2e

skipping is either because they're disabled or known issues. there seems to be an issue with test discovery or execution; not all tests seem to be run.
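
A rough sketch of the skip decision described in this commit message, under stated assumptions: `Spec` is a hypothetical, reduced stand-in for `TSpec`, the `SkipReason` enum and the `skipReason` proc are invented for illustration, and only the `runKnownIssues` flag is taken from the `ExecutionFlag` enum in the diff.

```nim
type
  ExecutionFlag = enum
    dryRun,            # from the diff; unused in this sketch
    runKnownIssues     # from the diff: also execute known-issue tests

  SkipReason = enum    # hypothetical, for illustration only
    notSkipped, skipDisabled, skipKnownIssue

  Spec = object        # hypothetical, reduced stand-in for `TSpec`
    disabled: bool
    knownIssues: seq[string]

proc skipReason(spec: Spec, flags: set[ExecutionFlag]): SkipReason =
  ## Disabled tests are always skipped; known issues are skipped
  ## unless the user asked to run them as well.
  if spec.disabled:
    skipDisabled
  elif spec.knownIssues.len > 0 and runKnownIssues notin flags:
    skipKnownIssue
  else:
    notSkipped

let flaky = Spec(knownIssues: @["only applies to js"])
echo skipReason(flaky, {})                 # skipKnownIssue
echo skipReason(flaky, {runKnownIssues})   # notSkipped
```

Returning a reason rather than a bare `bool` is one way to make "skipped tests are reported" straightforward, since the reporter can print why each test was skipped.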

lang spec: don't test union pragma in js

8aeb8f1

separate test result output reporting

667a641

skip run actions if compile action fails

9a4a2f9

lang spec: don't test ptr boundchecks pragma in js

665d074

fix checking of next batch size

2c46fb4

check if the next batch is full after failed action pruning
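
One way to read this fix, sketched under assumptions rather than taken from the PR: run actions whose compile step failed are pruned while the batch is being filled, and the "is it full?" check looks at what survived pruning instead of how many actions were consumed. The `Action` type, the `nextBatch` proc, and the small `batchSize` are hypothetical (the diff uses `defaultBatchSize = 10`).

```nim
import std/sets

type
  ActionKind = enum
    actionCompile, actionRun
  Action = object        # hypothetical, reduced stand-in for `TestAction`
    runId: int           # which test run this action belongs to
    kind: ActionKind

const batchSize = 2      # hypothetical; kept small for the example

proc nextBatch(pending: var seq[Action], failedRuns: HashSet[int]): seq[Action] =
  ## Consume actions from the front of `pending`, dropping run actions
  ## whose compile failed, and stop only once the pruned batch is full.
  var taken = 0
  for a in pending:
    if result.len == batchSize:
      break
    inc taken
    if a.kind == actionRun and a.runId in failedRuns:
      continue           # pruned: does not count towards the batch size
    result.add a
  pending = pending[taken .. ^1]

var pending = @[
  Action(runId: 0, kind: actionRun),      # compile for run 0 failed
  Action(runId: 1, kind: actionRun),
  Action(runId: 2, kind: actionCompile),
  Action(runId: 3, kind: actionCompile)]

let batch = nextBatch(pending, toHashSet([0]))
echo batch.len, " actions in batch, ", pending.len, " still pending"
```

Counting consumed actions instead of surviving ones would have produced a batch with a single action here, which is the kind of under-filled batch the commit title suggests was being fixed.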

fix tests so they work on more targets

4ba5acd

mark t03_inheritance with a known issue

d631043

but it only applies to js... hmm

fix tests so they pass on js

91f02b1

codecheck and nimcache dir fixes

36146f1

fix ambsym test so it works on the js target

6f1e22c

arc test clean-up; 3 failures

9b10682

failure info:
- arc/tcontrolflow fails in cpp, not sure why
- arc/thavlak_orc_stress fails due to valgrind issues in c and cpp
- arc/torc_selfcycles fails similar to thavlak_orc_stress in c and cpp

have categories specify test target defaults

7aaccce

point here is that category commonalities should be in category handling

haxscramper added the test (Add or improve tests), old-poc (Old Proof-of-Concept implementation. Wasn't intended to be merged anyway, closed to reduce clutter), and tool (Improvements to non-compiler tooling) labels Nov 20, 2022

haxscramper added this to the Test suite reorganization and cleanup milestone Nov 21, 2022

haxscramper self-assigned this Nov 25, 2022

haxscramper mentioned this pull request Nov 25, 2022

Add language specification for finally clause #256

Draft

haxscramper closed this Nov 25, 2022

This was referenced Nov 25, 2022

Rebase https://github.com/nim-works/nimskull/pull/135 #483

Open

Testament reimplementation and test directory cleanup #486

Closed
