Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

address issue #41, replace process.nextTick(check) because that causes a... #42

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

seanjsong
Copy link

Hi there,

When I reported issue #41 this morning, I promised to provide a fix. Here it is.

In order to check synchronously I have to move the whole bunch of local vars from next.group to outer scope. Hence the readability is slightly compromised.

Anyway, this fix passed my counter-example and all of the original tests. And I tried not to change the semantics of this lib.

@raffecat
Copy link

Hi seansoong,

Your fix will not work if a step creates more than one group(), because all such groups will share the same return value array and the group items will write over eachothers' results.

It is sufficient to add the following boolean to fix the race condition in group completion:

    var group_done = false;

    function check() {
      // BUGFIX: check group_done to avoid calling localCallback twice;
      // still trying to determine how this comes about, but it does.
      if (pending === 0 && !group_done) {
        group_done = true;
        // When group is done, call the callback
        localCallback(error, result);
      }
    }

I'm having trouble coming up with a test that reproduces the race, but I have real code that triggers the problem - on node v0.8.12 but not in v0.6.15 (presumably the exact timing of the fs callbacks differs between versions.)

The race only seems to show up in non-trivial examples, perhaps when you have more than one group() or parallel() in the same step, but basically what happens is: if all the callbacks created by a group are called back before process.nextTick runs, check() will find pending === 0 both times, and it calls localCallback twice. Since group() is built on parallel() this will decrement the step's pending count an extra time and cause the next step to be run too early with some arguments missing (the missing arguments are the other groups or parallels that have not completed yet.)

I created a branch and added lots of debugging checks for accidental misuse of step callbacks, which is how I found this problem.

@raffecat
Copy link

Here is a test case that reproduces the issue on node v0.8.12:

// Test group completion with multiple parallel groups.
expect("test6: 123");
expect("test6: 5");
Step(
  function makeGroup() {
    var group1 = this.group();
    var group2 = this.group();
    var group3 = this.group();
    var p1 = group1(), p2 = group2(), p3 = group3();
    // fs callbacks can sneak in before process.nextTick runs...
    require('fs').readFile('', function() { p1(null, 1); });
    setTimeout(function() { p2(null, 2); }, 5);
    setTimeout(function() { p3(null, 3); }, 10);
  },
  function groupResults(err, g1, g2, g3) {
      if(err) throw err;
      console.log("test6: " + g1 + g2 + g3);
      fulfill("test6: " + g1 + g2 + g3);
      return 5;
  },
  function terminate(err, num) {
      if(err) throw err;
      fulfill("test6: " + num);
  }
);

The expectation fails to be met because one of the groups yields undefined instead of the expected result [3]. With defensive checks added to the code in my fork, and the group_done fix reverted, I see this output (I also commented out the throw on my checks so it would run to completion):

Step: group was completed twice
Step: parallel cb was called twice
Step: group was completed twice
Step: parallel cb was called twice
test6: 12undefined

/home/mario/code/step/test/helper.js:15
    throw expectations[message];
                      ^
Error: Missing expectation: test6: 123
    at expect (/home/mario/code/step/test/helper.js:8:27)
    at Object.<anonymous> (/home/mario/code/step/test/groupTest.js:138:1)
    at Module._compile (module.js:449:26)
    at Object.Module._extensions..js (module.js:467:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Module.runMain (module.js:492:10)
    at process.startup.processNextTick.process._tickCallback (node.js:244:9)

It seems that Step assumes the process.nextTick callback will be run before any callbacks that are scheduled "later" in program order, but node doesn't make this promise; indeed this issue is topical ;)

@seanjsong
Copy link
Author

Hi raffecat,

You are right. I didn't use multiple group within one step myself so I overlooked this possibility. I believe this flag thing works. The reason it didn't come to my mind, I suppose, is the fear of introducing yet another flag that is hard to explain to others.

Anyway, I was persuaded by the author to try the upgrade version of step. I recommend you review and test it :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants