Using NodeJS and FTP with Promises
I’ve played with node in the past but as of the new year I decided to try and make a more concerted effort to get stuck into node properly. I decided to go back to the beginning to try and get a better appreciation for the language so read “JavaScript: The Good Parts by Douglas Crockford”. I found that exercise fulfilling and resulted in a few light bulb moments that made some dots join up so I’d recommend reading it if you haven’t already.
Real World App
As I stated earlier I have already played with node in the past using Express and have read quite a bit on node and read many examples but I wanted to write a non-web app as I felt this would give me a better opportunity to get to grips with the language and Node. Using Express allows you to get up and running very quickly without to much head scratching so I felt a standalone script would give me more exposure to things.
During the previous couple of weeks at work I wrote a console app that downloaded zip file from a FTP server, extract the contents, read data in a XML file that was in the zip, do some string matching and upload the zip to another FTP server. I figured this would be a good app to replicate in node so off I went.
After a bit of npm research I found the modules I needed and managed to get to the point of downloading files pretty easily with the below code:
var path = require('path');
var fs = require('fs');
var Promise = require('bluebird');
var Client = require('ftp');
var c = new Client();
var connectionProperties = {
host: "myhost",
user: "myuser",
password: "mypwd"
};
c.on('ready', function () {
console.log('ready');
c.list(function (err, list) {
if (err) throw err;
list.forEach(function (element, index, array) {
//Ignore directories
if (element.type === 'd') {
console.log('ignoring directory ' + element.name);
return;
}
//Ignore non zips
if (path.extname(element.name) !== '.zip') {
console.log('ignoring file ' + element.name);
return;
}
//Download files
c.get(element.name, function (err, stream) {
if (err) throw err;
stream.once('close', function () {
c.end();
});
stream.pipe(fs.createWriteStream(element.name));
});
});
});
});
c.connect(connectionProperties);
However, I originally had that code in a function and wanted to call it and then call another function to read the files that I had downloaded but what I found was callback hell.
Enter Promises
I needed to know that all the files had downloaded and then I could read the files in a directory ready for zip extraction but I couldn’t work out how. I discovered promises and probably didn’t read enough about all the ins and outs of them but I remember Glenn Block giving a talk about async programming in node so I pestered him on Twitter and he kindly helped and me out and also pointed me towards his code and slides where I decided to use Bluebird, the promise library. Unfortunately I just couldn’t get the files downloaded. It would download one file but not the other and closed the streams.
Here is a snippet of what I had (brace yourself)
var processListing = function (directoryItems) {
var itemsToDownload = [];
directoryItems.forEach(function (element, index, array) {
//Ignore directories
if (element.type === 'd') {
console.log('directory ' + element.name);
return;
}
//Ignore non zips
if (path.extname(element.name) !== '.zip') {
console.log('ignoring ' + element.name);
return;
}
//Download zip
itemsToDownload.push({
source: element.name,
destination: element.name
});
});
return itemsToDownload;
};
var processItem = function (object) {
return aFtpClient.getAsync(object.source);
};
var downloadFiles = function () {
console.log('downloading files');
aFtpClient.
listAsync().
then(processListing).
map(function (object) {
return processItem(object).then(function (processResult) {
return {
input: object,
result: processResult
};
});
}).
map(function (downloadItem) {
downloadItem.result.pipe(fs.createWriteStream(process.cwd() + "/zips/" + downloadItem.input.destination));
return new Promise(function (resolve, reject) {
downloadItem.result.once("close", function () {
console.log('closed');
resolve();
});
});
}).done()
};
Not only is that a tad complicated but I could not for the life of me understand what the hell was happening and why it wasn’t downloading all the files. I reached out to @PrabirShrestha who agreed it was a tad over complicated and tried to help but recommended I take a look at Reactive Extensions, maybe I will in the future but at this point my frustration had kicked in and I wanted to give up. I went through a mixture of emotions from frustration, which led to anger, fuming anger, denial, then apathy. Although these emotions went by and after a couple of questions on stackoverflow that helped but didn’t give the solution I explained the issue to a colleague and we both took a look. I went through a few iterations with no luck and after a bit more reading I think we were closing in on it individually but I was beaten to it. All hail @iamnerdfury who produced this:
var connect = function() {
c.connect(connectionProperties);
return c.onAsync('ready');
};
var getList = function() {
return c.listAsync();
};
var zipFiles = function(element) {
return element.type !== 'd' && path.extname(element.name) === '.zip';
};
var current = Promise.resolve();
var downloadFiles = function(file) {
current = current.then(function() {
return c.getAsync(file.name)
}).then(function(stream) {
stream.pipe(fs.createWriteStream(file.name));
console.log(file.name + ' downloaded..');
});
return current;
};
connect().then(getList).filter(zipFiles).map(downloadFiles).done();
I think the previous issues I had was I was returning resolve()
after the first file downloaded which is not what you want to do when multiple calls to it are executed as a promise can only resolve once. I needed to find some way of concatenating a promise somehow for each file that is downloaded. I looked at the all()
command but I couldn’t get it to fit but @iamnerdfury found that you could do this via creating an instance of a promise by calling resolve and then assign to it on each file that needed to be downloaded.
Now I know the files are downloaded I can chain more functions to read the file system, extract the zip for each one, read the XML and upload to a new server.
I hope this helps someone else because it wound me up something chronic and whilst I get pissed off with JavaScript when things like this happen I will keep at it because I think node is now becoming a serious contender and us developers need to keep a finger in many pies.
(If you think there is a way to improve the solution above I’d love to hear it)