Shawn Dahlen

Periodic updates on my software startup endeavor

Set Up Node.js Servers Within a Load-Balanced Configuration


With an excellent foundation in place to automate provisioning of secure servers with Chef, I spent week two establishing an application project structure using the Node.js platform. Additionally, I set up a load balancer configuration with HAProxy and tested it using a local multi-machine Vagrant environment. Read on for the details of this setup.

Value & Exit Criteria

As I mentioned in my previous post, I start each week by defining the business value for the task at hand, coupled with concise exit criteria so I know when I’m done. This week’s task ensures that the conversion rate for subscribers is not impacted as product usage grows. I consider the first impression of the product’s experience extremely important, and I don’t want to see it disrupted as I scale a system in production.

To complete the task, four criteria had to be met:

  • Set up a project structure for Node.js supporting a rapid development workflow and optimized production deployment.
  • Implement a Chef recipe to provision the Node.js application.
  • Implement a Chef recipe to provision HAProxy configured to load balance multiple Node.js application servers.
  • Test the Chef recipes using a local multi-machine Vagrant environment.

Set Up a Node.js Project

Given my experience with Node.js from my previous job at Lockheed Martin, I selected it as the platform to build my product. I intend to implement a thin server architecture supporting a single-page application. Essentially, all the user interface logic will reside on the client using JavaScript libraries such as jQuery, require.js, and Backbone. The user interface will query an API layer built on the Express web application framework for Node.js.

Below is a step-by-step guide to set up a basic Node.js project using these libraries. It assumes that Node.js and npm are installed.

  • Set up Express. I began by creating a new directory, marinara, and initializing it as a git repository. Next, I created the package.json file, which defines metadata for the project including its dependencies. Within that file I specified the Express version I required and then installed it using npm.
package.json
{
    "name": "marinara",
    "version": "0.1.0",
    "engines": {
        "node": "0.10.x"
    },
    "dependencies": {
        "express": "~3.1.0",
        "log": "~1.3.1",
        "underscore": "~1.4.4",
        "consolidate": "~0.8.0"
    },
    "devDependencies": {
    }
}
$ npm install

With Express installed, I created a server.js file at the root of the project directory and implemented a basic application with a simple logging and configuration strategy. Specifically, I defined configuration through environment variables (with suitable defaults) that will later be set with an Upstart script. I leveraged the log.js module to write messages using the log levels specified by syslog that will later be consolidated with an rsyslog server.

server.js
'use strict';

var express = require('express'),
    Log = require('log'),
    api = require('./lib/api'),
    pkg = require('./package.json'),
    app = express();

// defines app settings with default values
app.set('log level', process.env.MARINARA_LOG_LEVEL || Log.DEBUG);
app.set('session secret', process.env.MARINARA_SESSION_SECRET || 'secret');
app.set('session age', process.env.MARINARA_SESSION_AGE || 3600);
app.set('port', process.env.MARINARA_PORT || 8000);

// configures default logger available for middleware and requests
app.use(function (req, res, next) {
    req.log = new Log(app.get('log level'));
    next();
});

// logs all requests if log level is INFO or higher using log module format
if (app.get('log level') >= Log.INFO) {
    var format = '[:date] INFO :remote-addr - :method :url ' +
                 ':status :res[content-length] - :response-time ms';
    express.logger.token('date', function () { return new Date(); });
    app.use(express.logger(format));
}

// TODO: setup cluster support
app.listen(app.get('port'));
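
The configuration strategy above (environment variables with defaults) can be sketched in isolation. This is a minimal sketch rather than the project's code: readSettings, numberFrom, and logsRequests are hypothetical helpers. It assumes numeric settings should be converted explicitly, since environment variables always arrive as strings, and it uses the standard syslog severity numbers (6 for info, 7 for debug).

```javascript
'use strict';

// Syslog severities used as log levels (lower numbers are more severe).
var LEVELS = { ERROR: 3, WARNING: 4, NOTICE: 5, INFO: 6, DEBUG: 7 };

// Hypothetical helper: returns the numeric value of an environment
// variable, or the fallback when it is unset or not a number.
function numberFrom(env, name, fallback) {
    var value = parseInt(env[name], 10);
    return isNaN(value) ? fallback : value;
}

// Hypothetical helper: builds the settings object from an env-like object.
function readSettings(env) {
    return {
        logLevel: numberFrom(env, 'MARINARA_LOG_LEVEL', LEVELS.DEBUG),
        sessionSecret: env.MARINARA_SESSION_SECRET || 'secret',
        sessionAge: numberFrom(env, 'MARINARA_SESSION_AGE', 3600),
        port: numberFrom(env, 'MARINARA_PORT', 8000)
    };
}

// Request logging is enabled when the level is INFO or more verbose.
function logsRequests(settings) {
    return settings.logLevel >= LEVELS.INFO;
}

var settings = readSettings(process.env);
console.log(settings.port, logsRequests(settings));
```

Passing the environment in as an argument, instead of reading process.env inside the helper, keeps the defaults easy to check without touching the real environment.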
  • Serve static assets and API requests. With the basic application structure complete, I implemented support to serve static assets both in development and production. During development, static assets are served from the app/ directory using require.js to keep code modular. For production, assets are optimized (concatenated and minified) and placed in the public/ directory. The single HTML page, index.html, is an underscore.js micro-template that replaces script and stylesheet references depending on the mode.
server.js
// configures underscore view engine
// TODO: set caching for production
app.engine('html', require('consolidate').underscore);
app.set('view engine', 'html');
app.set('views', __dirname + '/templates');

// enables compression for all requests
app.use(express.compress());

// serves index and static assets from app/ directory in development
app.configure('development', function () {
    app.get('/', function (req, res) {
        res.render('index', {
            jsFile: 'public/components/requirejs/require.js',
            cssFile: 'public/styles/main.css'
        });
    });
    app.use('/public', express.static(__dirname + '/app'));
});

// serves index and static assets from optimized public/ directory in prod
// TODO: replace staticCache middleware with varnish
app.configure('production', function () {
    var oneYear = 60*60*24*365,
        baseFile = 'public/' + pkg.name + '-' + pkg.version;

    app.get('/', function (req, res) {
        res.render('index', {
            jsFile: baseFile + '.min.js',
            cssFile: baseFile + '.min.css'
        });
    });
    app.use('/public', express.staticCache());
    app.use('/public', express.static(__dirname + '/public', { maxAge: oneYear }));
});
templates/index.html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Marinara</title>
    <link type="text/css" rel="stylesheet" href="<%= cssFile %>">
</head>
<body>
    <script data-main="public/scripts/main" src="<%= jsFile %>"></script>
</body>
</html>
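
To make the mode switch concrete, here is a small sketch of how the two template variables are chosen. The assetRefs function is a hypothetical helper that mirrors the two app.configure() blocks above; it is not part of the project.

```javascript
'use strict';

// Hypothetical helper mirroring the two app.configure() blocks: returns
// the script and stylesheet references passed to the index template.
function assetRefs(mode, pkg) {
    if (mode === 'production') {
        // Optimized assets are named after the package name and version.
        var baseFile = 'public/' + pkg.name + '-' + pkg.version;
        return { jsFile: baseFile + '.min.js', cssFile: baseFile + '.min.css' };
    }
    // Development serves the unoptimized sources through require.js.
    return {
        jsFile: 'public/components/requirejs/require.js',
        cssFile: 'public/styles/main.css'
    };
}

var pkg = { name: 'marinara', version: '0.1.0' };
console.log(assetRefs('production', pkg).jsFile);
// prints "public/marinara-0.1.0.min.js"
```

Because the version is part of the production file name, a new release presumably changes the URL, which is what makes the one-year maxAge on the static middleware safe to use.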

While I have not fleshed out the API for the product yet, I went ahead and stubbed out a basic implementation. I created a sub-application in lib/api.js where I configured middleware to handle secure cookie sessions, cross-site request forgery protection, and body parsing. This sub-application is exported and mounted at the /api path on the main Express application in server.js.

lib/api.js
'use strict';

var express = require('express'),
    app = express();

module.exports = function (options) {
    app.use(express.cookieParser());
    app.use(express.cookieSession({
        secret: options.sessionSecret,
        // session age is assumed to be in seconds; cookie maxAge expects milliseconds
        cookie: { maxAge: options.sessionAge * 1000 }
    }));
    app.use(express.bodyParser());
    app.use(express.csrf());
    app.use(app.router);

    app.get('/hello', function (req, res) {
        req.log.debug('serving /api/hello request');
        res.send('Hello Shawn');
    });

    return app;
};
server.js
// mounts and configures rest api
app.use('/api', api({
    sessionSecret: app.get('session secret'),
    sessionAge: app.get('session age')
}));
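
Mounting explains why the route in lib/api.js is declared as /hello but answers at /api/hello: the mounted sub-application sees the request path with the mount prefix removed. The toy sketch below illustrates only that idea. The mount function is a hypothetical stand-in, not Express's implementation, and handlers here take a plain URL string instead of request and response objects.

```javascript
'use strict';

// Toy stand-in for mounting: forwards matching URLs to the handler
// with the prefix stripped, and ignores everything else.
function mount(prefix, handler) {
    return function (url) {
        var matches = url === prefix || url.indexOf(prefix + '/') === 0;
        if (!matches) { return null; }
        return handler(url.slice(prefix.length) || '/');
    };
}

// The sub-application only knows about its own relative paths.
function api(url) {
    return url === '/hello' ? 'Hello Shawn' : null;
}

var app = mount('/api', api);
console.log(app('/api/hello')); // prints "Hello Shawn"
console.log(app('/hello'));     // prints "null"
```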
  • Set up Bower and install client dependencies. With the server ready to handle requests, I shifted to the client side by defining libraries in component.json that Bower, a browser package manager, would install. (I will discuss the installation of Bower shortly.) I configured Bower (in the .bowerrc file) to install libraries to app/components so they could be served by Express during development.
component.json
{
    "name": "marinara",
    "version": "0.1.0",
    "dependencies": {
        "requirejs": "~2.1.5",
        "jquery": "~1.9.1",
        "backbone": "~0.9.10",
        "underscore": "~1.4.4",
        "almond": "~0.2.5"
    }
}
.bowerrc
{
    "directory": "app/components"
}
$ bower install
  • Set up require.js configuration and main entry point. As I mentioned earlier, require.js supports modular development of both CSS and JavaScript. I defined a main entry point for require.js, main.js, which includes configuration to use the libraries installed by Bower. I also stubbed out a simple Backbone view to test asset optimization later on.
app/scripts/main.js
requirejs.config({
    paths: {
        jquery: '../components/jquery/jquery',
        backbone: '../components/backbone/backbone',
        underscore: '../components/underscore/underscore'
    },
    shim: {
        backbone: {
            deps: ['underscore', 'jquery'],
            exports: 'Backbone'
        },
        underscore: {
            exports: '_'
        }
    }
});

require(['jquery', 'marinara'], function ($, marinara) {
    'use strict';

    var view = new marinara.SampleView();
    $('body').html(view.render().el);
});
app/scripts/marinara.js
define(['jquery', 'backbone', 'underscore'], function ($, Backbone, _) {
    'use strict';

    var exports = {};

    var SampleView = exports.SampleView = Backbone.View.extend({
        id: 'sample',
        template: _.template('<h1><%= msg %></h1>'),
        events: {
            'click': function () { alert('Hello Shawn.'); }
        },
        render: function () {
            this.$el.html(this.template({ msg: 'Click me please.' }));
            return this;
        }
    });

    return exports;
});
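
For readers new to the define/require pattern used in main.js and marinara.js, the toy registry below shows the core idea: a module declares its dependencies by name, and its factory function receives the resolved modules as arguments. This is a simplified sketch and not how require.js is implemented. It takes explicit module names, loads no scripts, ignores paths and shim configuration, and does not handle circular dependencies. The names resolve and requireModules are hypothetical.

```javascript
'use strict';

// Toy module registry illustrating the define/require pattern.
var factories = {};   // module name -> { deps, factory }
var cache = {};       // module name -> resolved module value

// Registers a module under an explicit name.
function define(name, deps, factory) {
    factories[name] = { deps: deps, factory: factory };
}

// Resolves a module by name, running its factory at most once.
function resolve(name) {
    if (!(name in cache)) {
        var entry = factories[name];
        if (!entry) { throw new Error('Unknown module: ' + name); }
        // Resolve dependencies first, then call the factory with them.
        cache[name] = entry.factory.apply(null, entry.deps.map(resolve));
    }
    return cache[name];
}

// Resolves the listed modules and hands them to the callback.
function requireModules(deps, callback) {
    return callback.apply(null, deps.map(resolve));
}

define('greeting', [], function () {
    return { text: 'Click me please.' };
});
define('view', ['greeting'], function (greeting) {
    return { render: function () { return '<h1>' + greeting.text + '</h1>'; } };
});

var html = requireModules(['view'], function (view) { return view.render(); });
console.log(html); // prints "<h1>Click me please.</h1>"
```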
  • Set up Grunt.js build system. To test changes during development, I set up an automated watch and reload workflow using Grunt. When files change in the app/ directory, I automatically reload the browser window with the LiveReload Chrome extension. When JavaScript files change in the root or lib/ directories, I restart the Node.js server using Upstart. I also lint both CSS and JavaScript files when changes occur. To kick off this workflow, I run grunt in the root directory.

    I also used Grunt to optimize static assets. The grunt-contrib-requirejs plugin uses require.js’s optimizer to concatenate and minify JavaScript. It also concatenates CSS by scanning for @import statements. I paired this with the grunt-contrib-cssmin plugin to minify CSS assets. To optimize assets and move them into the public/ directory to be served by Express, I run:

$ grunt optimize

Check out this gist for the Gruntfile.js that supports this setup. With the build system complete, the first exit criterion was met.

Implement Chef Application Recipe

With the Node.js project structure complete, I shifted back to the marinara-kitchen repository I worked on in the last post. I created a new recipe, application.rb, that would provision Node.js and related dependencies and deploy the marinara git repository. Below is the step-by-step guide to create the recipe:

  • Provision Node.js and npm packages. I leveraged the nodejs cookbook to install Node.js and npm from source, defining the default versions in the default.rb attributes file. Additionally, I defined npm packages to be installed globally on the server (Grunt, Bower) using the LWRP provided by the npm cookbook.
site-cookbooks/marinara/attributes/default.rb
# includes nodejs default attributes first to override them
include_attribute 'nodejs'

# defines node.js and npm version configuration
default.nodejs.version = '0.10.0'
default.nodejs.npm = '1.2.14'

# defines npm packages to install globally
default.marinara.application.npm_packages = {
  'grunt-cli' => '0.1.6',
  'bower'     => '0.8.5'
}
site-cookbooks/marinara/recipes/application.rb
# provisions node.js and npm
include_recipe 'nodejs::install_from_source'
include_recipe 'nodejs::npm'

# provisions global npm packages
node.marinara.application.npm_packages.each_pair do |pkg, ver|
  npm_package pkg do
    version ver
  end
end
  • Provision Upstart script. Since I am using Ubuntu, I implemented an Upstart script to manage the Node.js application as a service. As I mentioned earlier, the application is configured through environment variables, which a Chef template injects into the Upstart script. The template also checks whether the application is being provisioned in production mode and, if so, includes statements to start the service on boot and run it under an application-specific user account.
site-cookbooks/marinara/templates/default/marinara.conf.erb
description 'marinara upstart script'
author 'Shawn Dahlen <shawn@dahlen.me>'

<% if not node.marinara.application.development -%>
start on runlevel [2345]
setuid <%= node.marinara.application.user %>
setgid <%= node.marinara.application.user %>
env NODE_ENV=production
<% end -%>

env MARINARA_LOG_LEVEL=<%= node.marinara.application.log_level %>
env MARINARA_PORT=<%= node.marinara.application.port %>

stop on runlevel [06]

respawn

exec /usr/local/bin/node <%= node.marinara.application.deploy_path %>/server.js > <%= node.marinara.application.log_path %>/marinara.log 2>&1
site-cookbooks/marinara/recipes/application.rb
# provisions upstart script
template '/etc/init/marinara.conf' do
  source 'marinara.conf.erb'
  mode 0440
end
  • Deploy application. If the application is provisioned in production mode (as opposed to development mode, where Vagrant mounts the project directory), the recipe provisions the application user, creates the deployment and log directories, copies over an SSH deploy key, clones the marinara git repository, installs dependencies, and optimizes assets. A number of default attributes drive the deployment; the most important is the reference attribute, which defines the git tag or branch to deploy.
site-cookbooks/marinara/attributes/default.rb
# defines marinara application deployment configuration
default.marinara.application.development = false
default.marinara.application.user = 'marinara'
default.marinara.application.repository = 'git@github.com:smdahlen/marinara.git'
default.marinara.application.reference = 'master'
default.marinara.application.deploy_path = '/opt/marinara'
default.marinara.application.log_path = '/var/log/marinara'
default.marinara.application.log_level = 6
default.marinara.application.port = 8000
default.marinara.application.servers = {
  'app0' => '127.0.0.1'
}
site-cookbooks/marinara/recipes/application.rb
if not node.marinara.application.development
  # stops marinara service if running
  service 'marinara' do
    action :stop
    provider Chef::Provider::Service::Upstart
  end

  # provisions system user to run application
  user node.marinara.application.user do
    system true
    shell '/bin/false'
    home '/home/marinara'
    supports manage_home: true
  end

  # provisions deploy key and ssh wrapper
  cookbook_file '/tmp/deploy_id_rsa' do
    source 'deploy_id_rsa'
    mode 0400
    owner node.marinara.application.user
  end
  cookbook_file '/tmp/deploy_ssh_wrapper' do
    source 'deploy_ssh_wrapper'
    mode 0500
    owner node.marinara.application.user
  end

  # provisions deploy directory with app user permissions
  directory node.marinara.application.deploy_path  do
    owner node.marinara.application.user
    group node.marinara.application.user
  end

  # provisions log directory with app user permissions
  directory node.marinara.application.log_path  do
    owner node.marinara.application.user
    group node.marinara.application.user
  end

  # provisions git repository
  git node.marinara.application.deploy_path  do
    repository node.marinara.application.repository
    reference node.marinara.application.reference
    user node.marinara.application.user
    group node.marinara.application.user
    ssh_wrapper '/tmp/deploy_ssh_wrapper'
  end

  # provisions npm application dependencies
  execute 'npm install' do
    cwd node.marinara.application.deploy_path
    command '/usr/local/bin/npm install'
    user node.marinara.application.user
    group node.marinara.application.user
    env 'HOME' => "/home/#{node.marinara.application.user}"
  end

  # provisions bower application dependencies
  execute 'bower install' do
    cwd node.marinara.application.deploy_path
    command 'bower install'
    user node.marinara.application.user
    group node.marinara.application.user
    env 'HOME' => "/home/#{node.marinara.application.user}"
  end

  # optimizes browser assets
  execute 'grunt optimize' do
    cwd node.marinara.application.deploy_path
    command 'grunt optimize'
    user node.marinara.application.user
    group node.marinara.application.user
  end

  # starts marinara service
  service 'marinara' do
    action :start
    provider Chef::Provider::Service::Upstart
  end
end
  • Provision application firewall rule. To complete the second exit criterion, I defined a firewall rule in the application recipe to allow traffic on the specified application port with the upstream HAProxy server as the source.
site-cookbooks/marinara/recipes/application.rb
# provisions firewall rule to support incoming application traffic
include_recipe 'simple_iptables'
server = node.marinara.proxy.server
port = node.marinara.application.port
simple_iptables_rule 'application' do
  rule "-p tcp -s #{server} --dport #{port}"
  jump 'ACCEPT'
end

Implement Chef Proxy Recipe

To scale the application servers horizontally with the growth of product usage, I implemented a Chef recipe to install and configure HAProxy. This recipe leveraged the haproxy cookbook for most of the heavy lifting. However, I had to reopen the template resource defined in that cookbook to use my configuration template instead. The code snippet below demonstrates this. In addition to providing custom configuration that references deployed application servers, I implemented a firewall rule to accept traffic on port 80 (the default value for the incoming_port attribute). In a future week, I intend to tune HAProxy’s configuration based on performance testing.

site-cookbooks/marinara/attributes/default.rb
# includes default haproxy cookbook attributes first to override
include_attribute 'haproxy'

# defines haproxy configuration
default.haproxy.install_method = 'source'
default.haproxy.source.version = '1.5-dev17'
default.haproxy.source.url = 'http://haproxy.1wt.eu/download/1.5/src/devel/haproxy-1.5-dev17.tar.gz'
default.haproxy.source.checksum = 'b8deab9989e6b9925410b0bc44dd4353'

default.marinara.proxy.server = '127.0.0.1'
site-cookbooks/marinara/recipes/proxy.rb
# provisions haproxy
include_recipe 'haproxy'

# overrides haproxy cookbook default configuration
template = resources('template[/etc/haproxy/haproxy.cfg]')
template.source 'haproxy.cfg.erb'
template.cookbook 'marinara'

# provisions firewall rule to support incoming http traffic
include_recipe 'simple_iptables'
simple_iptables_rule 'proxy' do
  rule "-p tcp --dport #{node.haproxy.incoming_port}"
  jump 'ACCEPT'
end
site-cookbooks/marinara/templates/default/haproxy.cfg.erb
global
  log 127.0.0.1 local0
  maxconn 4096
  user haproxy
  group haproxy

defaults
  log global
  mode http
  option httplog
  option dontlognull
  option redispatch
  option forwardfor
  timeout connect 5s
  timeout client 50s
  timeout server 50s
  balance <%= node.haproxy.balance_algorithm %>

frontend http
  bind 0.0.0.0:<%= node.haproxy.incoming_port %>
  default_backend application

backend application
  <% node.marinara.application.servers.each_pair do |name, addr| -%>
  server <%= name %> <%= addr %>:<%= node.marinara.application.port %> check
  <% end -%>
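
To show what the loop in the backend section expands to, here is a small sketch that produces the same lines in JavaScript. The renderBackend function is a hypothetical helper written only for illustration (Chef renders the real file from the ERB template). The example input uses the two application server addresses from the Vagrant test environment described in the next section and the default application port.

```javascript
'use strict';

// Hypothetical helper mirroring the ERB loop in the backend section:
// one "server" line per application server, all on the same port.
function renderBackend(servers, port) {
    var lines = ['backend application'];
    Object.keys(servers).forEach(function (name) {
        lines.push('  server ' + name + ' ' + servers[name] + ':' + port + ' check');
    });
    return lines.join('\n');
}

console.log(renderBackend({ app0: '10.0.5.3', app1: '10.0.5.4' }, 8000));
// prints:
// backend application
//   server app0 10.0.5.3:8000 check
//   server app1 10.0.5.4:8000 check
```

Scaling out then amounts to adding an entry to the servers attribute and reprovisioning the proxy.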

Test with Multi-Machine Vagrant Environment

To finish off the week, I created a Vagrantfile in the marinara-kitchen repository that creates three virtual machines using VirtualBox. Specifically, I created a definition for the HAProxy server and a definition for the Node.js application servers, all running on a private network so they could communicate with one another. Here is the configuration, using the new Vagrant 1.1 syntax:

Vagrantfile
app_servers = {
  app0: '10.0.5.3',
  app1: '10.0.5.4'
}

Vagrant.configure('2') do |config|
  config.vm.box = 'precise64-chef11.2'
  config.vm.box_url = 'https://opscode-vm.s3.amazonaws.com/vagrant/opscode_ubuntu-12.04_chef-11.2.0.box'

  config.vm.provider :virtualbox do |vb|
    vb.customize ['setextradata', :id, 'VBoxInternal2/SharedFoldersEnableSymlinksCreate/v-root', '1']
  end

  config.ssh.forward_agent = true

  config.vm.define :proxy do |proxy|
    proxy.vm.network :private_network, ip: '10.0.5.2'
    proxy.vm.provision :chef_solo do |chef|
      chef.cookbooks_path = ['cookbooks', 'site-cookbooks']
      chef.data_bags_path = 'data_bags'
      chef.add_recipe 'marinara::default'
      chef.add_recipe 'marinara::security'
      chef.add_recipe 'marinara::proxy'
      chef.json = {
        marinara: {
          application: {
            servers: app_servers
          }
        }
      }
    end
  end

  app_servers.each_pair do |name, ip|
    config.vm.define :"#{name}" do |app|
      app.vm.network :private_network, ip: ip
      app.vm.provision :chef_solo do |chef|
        chef.cookbooks_path = ['cookbooks', 'site-cookbooks']
        chef.data_bags_path = 'data_bags'
        chef.add_recipe 'marinara::default'
        chef.add_recipe 'marinara::security'
        chef.add_recipe 'marinara::application'
        chef.json = {
          marinara: {
            application: {
              reference: 'develop'
            },
            proxy: {
              server: '10.0.5.2'
            }
          }
        }
      end
    end
  end
end

With the Vagrantfile implemented, I brought the servers up and executed a quick test with curl against my endpoints. With a successful response, the fourth criterion had been met and the week’s task was complete.

$ vagrant up
$ curl http://10.0.5.2/
$ curl http://10.0.5.2/api/hello
