cannot determine order of test execution in a parallel worker
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Testrepository | Fix Released | High | Benji York |
testtools | Fix Released | Medium | Jonathan Lange |
Bug Description
While making the Launchpad tests work with the testr run --parallel feature, we have encountered many tests that fail intermittently. Many of these turned out to be tests that fail or succeed depending on the order in which they run. For each such test, some previous test is not isolated: it leaves the environment in an unclean state. In some cases a previous test must run before another test can succeed. More often a previous test causes a following test to fail.
While the tests run, the order in which they run is available from one file per process in a temporary directory. When testr runs successfully to completion, these files are deleted.
In the same way that one can interrogate testr for the tests that failed, I would like to be able to interrogate it for the ordered collection of tests that included the failing test. Given that, we should be able to run the tests again in that order. This gives an easy way to verify whether test ordering affects the outcome, and it gives us a reduced set of tests to start from when determining which ones are affected.
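The order dependence described above can be illustrated with a minimal sketch. It is not taken from Launchpad or testr: the shared `_shared` dictionary, the two test functions, and the `run_in_order` helper are all hypothetical names invented for the example. It only uses the standard library `unittest` module to run the same two tests in a chosen order and report which ones fail.

```python
import unittest

# Hypothetical stand-in for environment state shared by tests in one worker.
_shared = {"configured": False}


def leaky_sets_state():
    # Not isolated: mutates shared state and never restores it.
    _shared["configured"] = True


def victim_expects_clean_state():
    # Passes only if no earlier test left the environment dirty.
    assert not _shared["configured"], "environment was left dirty"


def run_in_order(funcs):
    """Run the given test functions in exactly this order.

    Returns the ids of the tests that failed or errored.
    """
    _shared["configured"] = False  # simulate a freshly started worker
    suite = unittest.TestSuite(unittest.FunctionTestCase(f) for f in funcs)
    result = unittest.TestResult()
    suite.run(result)
    return [test.id() for test, _ in result.failures + result.errors]
```

Calling `run_in_order([victim_expects_clean_state, leaky_sets_state])` reports no failures, while reversing the list makes the victim test fail. Replaying a recorded per-worker order in this way is what would let one confirm that ordering, rather than something else, explains an intermittent failure.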
Related branches
- Jonathan Lange: Approve
- Diff: 72 lines (+25/-2), 2 files modified
  - testrepository/commands/load.py (+9/-2)
  - testrepository/tests/commands/test_load.py (+16/-0)
summary: "--parallel should provide easy access to the order in which tests were run" → "cannot determine order of test execution in a parallel worker"
Changed in testrepository:
status: New → Triaged
importance: Undecided → High
Changed in testrepository:
milestone: none → next
Changed in testtools:
milestone: none → next
Changed in testrepository:
assignee: nobody → Benji York (benji)
Changed in testtools:
status: Fix Committed → Fix Released
Changed in testrepository:
status: Fix Committed → Fix Released
I'm not sure that the order is available during execution, as the workers stream directly into the multiplexer.
I'd solve this by tagging incoming tests with a worker id; the sequence is still implicit within a worker.
One special case would be to identify tests already tagged this way and use a nested tag (this will handle nested parallelisation, where each parallel worker is a separate machine which internally parallelises further).
e.g. the logic would be something like:
- when starting a parallel worker: add tag worker-%d
- if an incoming tag of worker-%d is seen, translate that to worker-%d-%d, where the leftmost %d is the id for this worker and the right-hand %d is that from the incoming stream. (And make this generic, so worker-%d-%d -> worker-%d-%d-%d too.)
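The tag translation described above can be sketched as follows. This is only an illustration of the proposed logic, not the code that landed in testrepository or testtools; the function names `nest_worker_tag` and `tag_incoming` and the representation of tags as a plain set of strings are assumptions made for the example.

```python
import re

# Matches worker-N as well as already nested forms such as worker-N-M.
_WORKER_TAG = re.compile(r"^worker-\d+(?:-\d+)*$")


def nest_worker_tag(tag, worker_id):
    """Prefix an incoming worker tag with this worker's id.

    Tags that are not worker tags are returned unchanged.
    """
    if _WORKER_TAG.match(tag):
        return "worker-%d-%s" % (worker_id, tag[len("worker-"):])
    return tag


def tag_incoming(tags, worker_id):
    """Return the tags for a test arriving from worker `worker_id`."""
    incoming = set(tags)
    nested = {nest_worker_tag(t, worker_id) for t in incoming}
    # No worker tag in the stream yet: this is the innermost worker.
    if not any(_WORKER_TAG.match(t) for t in incoming):
        nested.add("worker-%d" % worker_id)
    return nested
```

Because the pattern accepts any depth of nesting, the same function covers the generic case: a test already tagged worker-1-3 that arrives through worker 2 ends up tagged worker-2-1-3, with the leftmost number identifying the outermost worker.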