Serially Iterating An Array, Asynchronously

Meet Derick Bailey. He runs Watch Me Code, has written a few programming books and blogs a fair bit, so he’s always on my radar when I need to learn a little bit more about programming.

A few months ago, he wrote a post, “Serially Iterating An Array, Asynchronously”, about a very specific programming issue he had to solve:

“I recently found myself looking at an array of functions in JavaScript. I needed to iterate through this list, process each function in an asynchronous manner (providing a ‘done’ callback method), and ensure that I did not move on to the next item in the list until the current one was completed.”

Wait a minute … I faced that same problem a while ago!

My Node.js Express server routes have a list of steps they need to complete before sending out a response to the client.

The /signup route verifies the email address through an external service, bcrypts the password, stores the information in the database, retrieves the insert id and sends the information over to the client.

None of those operations are synchronous. I can’t run them in parallel either, since some operations depend on the result of a previous one. It’s pointless to continue if any step fails.

A serial list of functions that need to be processed asynchronously.

Derick posted his solution to the problem and where he felt it could be improved. My solution addresses quite a few of those, so I thought I’d share.

CJS-TASK

Available on GitHub and npm.

  // msg, logger and db_helper come from the surrounding application code (not shown)
  var task = require('cjs-task')(msg.notice.callback);

  task.set('email', msg.notice.email);
  task.set('password', msg.notice.password);

  task.step('validate-email', function(){

    var valid_email = require('email-validator').validate( task.get('email') );

    if(!valid_email){ 
      task.end({message: 'doesn\'t look like that email address is valid'});
    }

    else{ task.next(); }
  });

  task.step('create-passhash', function(){

    var bcrypt_helper = require('./utils/bcrypt-helper-functions.js');

    bcrypt_helper.hash(task.get('password'), function(err, hash){

      if(err){ 
  
        logger.log(err);
        task.end({message: 'could not create user. so sorry. please try again.'});
      }

      else {

        task.set('passhash', hash);
        task.next();
      }
    });
  });

  task.step('create-user', function(){

    var email, password_hash;

    email = task.get('email');
    password_hash = task.get('passhash');

    // Caution: the values are concatenated straight into the SQL string.
    // If db_helper supports placeholders or escaping, prefer that.
    db_helper.query('INSERT INTO `user` (email, passhash) VALUES ("' + email +'", "'+ password_hash +'")', function(error, result){

      var response = {};

      if(error){

        response.message = 'couldn\'t create an account for ' + email;
        logger.log('[ERROR] CREATE ACCOUNT FAILED:\n' + email + '\n' + error);
        task.end( response );
      }

      else {

        task.end(null, result.insertId);
      }
    });
  });

  // START TASK
  task.start();
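
The callback handed to require('cjs-task') is what task.end ultimately calls. Judging from the steps above, it receives an error object first and the insert id second. Here is a minimal sketch of such a callback, assuming that argument order. The names makeSignupCallback and send are hypothetical and not part of cjs-task.

```javascript
// Hypothetical helper: builds an error-first callback for the signup task.
// `send` stands in for whatever delivers the response to the client.
function makeSignupCallback(send) {
  return function (err, insertId) {
    if (err) {
      // A step called task.end({ message: ... }), so report the failure.
      send({ ok: false, message: err.message });
      return;
    }
    // The last step called task.end(null, result.insertId).
    send({ ok: true, id: insertId });
  };
}
```

In an Express route, send could be something like a function that wraps res.json, but that wiring depends on the application.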

A couple of points to highlight here.

1. task.end can be used to end a task prematurely. If any of your steps fails, use it to bring the task to a halt instead of wasting more time and resources.

2. task.next is the step control mechanism. Call it when a step has finished to move on to the next step in the list.

3. task.set / task.get can be used to store and return data relevant to the task at any step. Use them to store the initial data set, keep API responses, set flags … anything really.

4. Under the hood, task.end nulls the data store, the task list and the callback list after the final callback has been triggered, in an attempt to prevent memory leaks.

5. Derick uses a destructive process on the task steps list. I simply keep track of the current index and increment it after each task.next.

6. Under the hood I’m backing cjs-task with a pubsub, so I can trigger events for updates to the data store as well as when steps are triggered. Those events are not implemented yet and are probably not necessary, but I feel they’d add tremendous value and make it easy to monitor or modify your task instance. Most likely it’s just an excuse to justify using the pubsub instead of something dead simple like a hashmap.
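
To make points 4 and 5 concrete, here is a rough sketch of a serial step runner that tracks the current step by index and releases its references once the final callback has fired. It is an illustration written for this post under those assumptions, not the cjs-task source, and createTask is a hypothetical name. It also leaves out the pubsub entirely.

```javascript
// Hypothetical sketch, not the cjs-task implementation.
function createTask(callback) {
  var store = {};    // data shared between steps
  var steps = [];    // ordered list of { name, fn }
  var index = -1;    // position of the step currently running
  var ended = false; // guards against next/end being called after the end

  var task = {
    set: function (key, value) { if (store) { store[key] = value; } },
    get: function (key) { return store ? store[key] : undefined; },
    step: function (name, fn) { steps.push({ name: name, fn: fn }); },
    start: function () { task.next(); },

    // Non-destructive iteration: the list is left intact,
    // only the index moves forward.
    next: function () {
      if (ended) { return; }
      index += 1;
      if (index >= steps.length) { task.end(null); return; }
      steps[index].fn();
    },

    // Fire the final callback once, then drop every reference.
    end: function (err, result) {
      if (ended) { return; }
      ended = true;
      if (typeof callback === 'function') { callback(err, result); }
      store = null;
      steps = null;
      callback = null;
    }
  };

  return task;
}
```

A destructive variant would call steps.shift() inside next instead of incrementing index. Keeping the index means the step list stays available until the task ends, for example for logging which step failed.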

The longer I look at the two solutions, the more different they seem, despite the similar API, the same job to be done and an identical operation loop. Interesting.

Postscript.
Looks like someone just released queuer.js, which appears to be a hybrid approach. It combines the kind of event hooks I was looking to build into mine with Derick’s approach to handling data.