Friday, October 18, 2013

Node.js express middleware: beware of reading data before the bodyParser()!

Found that yesterday, as a part of implementation of HMAC-like functionality. So, what we did:
/*
THIS IS A CODE WHICH IS ____NOT____ WORKING! DON'T USE IT PLEASE!
*/
app.use(function (req, res, next) {
    var headerValue = req.header(secureAPIHeader);
    req.hasher = crypto.createHmac("sha256", config.secretAdminAPIKey);
    req.setEncoding('utf8');
    req.on('data', function (chunk) {
        req.hasher.update(chunk);
    });
    req.on('end', function () {
        var hash = req.hasher.digest('hex');
        if (hash != headerValue) {
            log.err(util.format("Received wrong Admin request, hashes do not match. Received hash %s want hash %s", headerValue, hash));
            res.json(403, {message: "Not authorized"});
        } else {
            log.warn("Received admin request for url " + req.url);
            req.isSuperSecure = true;
            next();
        }
    });
});
It exposed quite weird (not really) behavior. When the request with the wrong HMAC was coming in, the server gave away the correct 403 error. However, if the HMAC was correct, the server would never return from call chain.
Further investigation showed that it was quite simple. Since we have finished reading all the data from stream (and successfully received ‘end’ event), the underlying bodyParser was forever blocked on ‘end’ event which would never be issued! Actually, bodyParser was only subscribing to this event when the whole data bunch was read.
We didn’t really want to divert from streaming nature of the node.js crypto hasher and read all data into a huge string. So this middleware came into play:
/*
A correct version. The underlying data sourcers are not blocked.
*/
app.use(function (req, res, next) {
    var headerValue = req.header(secureAPIHeader);
    req.hasher = crypto.createHmac("sha256", config.secretAdminAPIKey);
    req.setEncoding('utf8');
    req.on('data', function (chunk) {
        req.hasher.update(chunk);
    });
    req.on('end', function () {
        var hash = req.hasher.digest('hex');
        if (hash != headerValue) {
            log.err(util.format("Received wrong Admin request, hashes do not match. Received hash %s want hash %s", headerValue, hash));
            res.json(403, {message: "Not authorized"});
        } else {
            log.warn("Received admin request for url " + req.url);
            req.isSuperSecure = true;
        }
    });
    // This is the only difference - we execute middlewares in parallel! 
    // => 
    next();
});
Trick here is that we pass execution to the underlying bodyParser straight away. And, when bodyParser receives data so do we - there are 2 listeners for data and end events here. There is an issue with this solution, though. We are not strongly coupling the moments when hmac becomes available in request, and the processing is done. bodyParser does this for us.
Moreover, got an extra-quick reaction by the express team, and now there’s a verify method (in pull request as of 2013-10-18), which does exactly what we do, but w/out streaming (for JSON only), which should be good for 90% users. Here’s discussion link: https://github.com/visionmedia/express/issues/897#issuecomment-26575496
Thanks!

No comments: