Multi node testing with Test-Kitchen and Docker containers / by Matt Wrock

Two docker containers created and tested with Kitchen-Docker

My last post provided a walkthrough of some of the new Windows functionality available in the latest Test-Kitchen RC and demonstrated those features by creating and testing a Windows Active Directory domain controller pair. This post also looks at testing multiple nodes, but instead of Windows, I'll be spinning up multiple docker containers. I'm going to use a Couchbase cluster as my example. Note that while I am using docker containers, nothing here prevents you from running the same tests on multiple Linux or Windows VMs using the Kitchen-Vagrant driver. Couchbase runs on Windows too.

Why run tests with containers when my production nodes are VMs?

There are some really interesting things being done with containers in production environments, but even if you are not using containers in production, there are some clear benefits to using them for testing infrastructure development. The biggest value is faster provisioning. Using the kitchen-docker driver instead of vagrant or a cloud based driver can potentially save several minutes per test. You might wonder "what's a couple minutes?" However, when you are iterating over a problem and need to reprovision several times, a couple minutes or more can add up quickly.

You still want to test provisioning to VMs if that is what your production infrastructure runs, but that can sit later in your testing pipeline. You will save a lot of time, money and tears (you'll need those later) by keeping your feedback cycles short early in your development process.

Setting things up

To get started you will need to have the docker engine installed and the latest RC of test-kitchen.

Docker Install

There are a few approaches one can take to installing docker. Some are more complicated than others and really depend on your host operating system. I'm using an Ubuntu 14.04 desktop os on my laptop. Ubuntu 14.04 has no prerequisites and you simply run:

wget -qO- https://get.docker.com/ | sh

Ubuntu 12.04 requires a kernel upgrade and several packages before the above install will work. The docker installation documentation provides instructions for most operating systems. If you are running Windows or a Mac, you will want to run the docker engine from inside a Linux VM. You can either set up a VM of your favorite Linux distro and then install docker following the instructions on the docker site, or you can install Boot2Docker, which will install a local docker CLI, VirtualBox, and a stripped down Tiny Core Linux image.

This post does not aim to explore the different ways of installing docker. If you do not already have docker, or a VM set up where you can install it friction-free, take a look at my chef_workstation repo. It includes a Vagrantfile that will provision a workable chef enabled workstation environment with docker installed. It should work with VirtualBox, Hyper-V or Parallels on a Mac. I believe it also works for VMWare Fusion users but I have not validated that for a while.

A multi-node enabled cookbook to test

To demonstrate multi node testing with test-kitchen, I have forked the community couchbase cookbook. I'll be sending a PR with these changes:

  • Compatibility with docker (the current version uses netstat to validate a listening port and that's not installed on the default ubuntu container)
  • Extends the couchbase-cluster resource to allow other nodes to be joined to a cluster
  • Fixes the cookbook on Windows, which is unrelated to this post but aligns well with one of my personal missions in life

Clone my fork and checkout the multi-node branch:

git clone -b multi-node https://github.com/mwrock/couchbase

If you are using the vagrant box in my chef_workstation repo, cd to the cookbooks directory just below the directory you land in from vagrant ssh and clone from there.

Using the right gems

To help facilitate testing multiple nodes, this cookbook uses a custom test-kitchen provisioner plugin that utilizes functionality exposed in the latest test-kitchen RC. So the cookbook includes a Gemfile that references both of these gems and other important dependencies. To ensure that you are testing with all of the correct gems, cd into the root of the couchbase cookbook and run:

bundle install

Converge and test the first node

We are now ready to create, converge and test the first node of our couchbase cluster. Make sure to run with bundle exec so that we use all of the correct gem versions:

bundle exec kitchen verify server-community-ubuntu

This will start a new container running ubuntu 12.04, install Couchbase and initialize a new cluster. Then a serverspec test will ensure that the service is running and configured the way we want it.

Joining an additional node to the cluster

To get the full multi-node effect, let's now ask test-kitchen to run our second-node suite:

bundle exec kitchen verify second-node-ubuntu

This brings up a new container that will post to the couchbase rest endpoint of our first node asking to join the cluster. Then its serverspec test will pull the list of nodes in the cluster exposed from the original node and check if our second node is included in the list.

Discovering the original node

One possible strategy is to set an attribute specifying the IP or host name of the initiating couchbase node. However, this assumes it is a known and constant value. You may prefer your infrastructure to dynamically query for an existing couchbase node. In our test scenario, we really can't predict the IP or host name since we are getting IPs from DHCP and docker is handing out a unique hash for a host name.

Note that we could tweak the driver configuration in our .kitchen.yml to expose predictable hostnames that can link to other containers. Here is an example of a possible config for our node suites:

suites:
- name: server-community
  driver:
    publish_all: true
    instance_name: first_cluster
  run_list:
  - recipe[couchbase::server]
  attributes:
    couchbase:
      server:
        password: "whatever"

- name: second-node
  driver:
    links: "first_cluster:first_cluster"
  run_list:
  - recipe[couchbase-tests::default]
  attributes:
    couchbase:
      server:
        password: "whatever"
        cluster_to_join: first_cluster

Here the first node uses the kitchen-docker configuration to ask the docker engine to expose its container with a specific name, "first_cluster." The second node is asked to link the name "first_cluster" with the "first_cluster" instance. This way any requests from the second container to the DNS name first_cluster will resolve to our first container. Finally we would create a node attribute named cluster_to_join that names the cluster our second node would ask to join.

This may work for your scenario and that's great. However, it may break down for others. First, it's not very portable. This cookbook supports Windows, and locking in docker specific options will run into problems for Windows tests that leverage vagrant here:

- name: windows-2012R2
  driver:
    name: vagrant
    network:
      - ["private_network", { type: "dhcp" }]
  transport:
    name: winrm
  driver_config:
    gui: true
    box: mwrock/Windows2012R2Full
    customize:
      memory: 1024

Furthermore, our test logic needs to match production logic. If production nodes will be querying the chef server for a node to send cluster join requests to, our tests must validate that this strategy works.

The kitchen-nodes provisioner plugin

In my last post I demonstrated a strategy that uses chef search to find a chef node based on a run list recipe. It used my kitchen-nodes provisioner plugin to create mock chef nodes for each kitchen suite so that a chef search can find other suite test instances during convergence. Since that example created a Windows Active Directory controller pair, the plugin was Windows specific in places. I have since extended it to support most *Nix scenarios, including docker.

First we tell test-kitchen to use the kitchen-nodes plugin as a provisioner for the suites that test our couchbase servers:

suites:
- name: server-community
  provisioner:
    name: nodes
  run_list:
  - recipe[couchbase-tests::ipaddress]
  - recipe[couchbase::server]
  - recipe[export-node]
  attributes:
    couchbase:
      server:
        password: "whatever"

- name: second-node
  provisioner:
    name: nodes
  run_list:
  - recipe[couchbase-tests::ipaddress]
  - recipe[couchbase-tests::default]
  - recipe[export-node]
  attributes:
    couchbase:
      server:
        password: "whatever"

The default recipe of the couchbase-tests cookbook used by our second node can now find the first node using chef search:

primary = search_for_nodes("run_list:*couchbase??server* AND platform:#{node['platform']}")
node.normal["couchbase-tests"]["primary_ip"] = primary[0]['ipaddress']

The search_for_nodes method is defined in our couchbase-tests library:

require 'timeout'

def search_for_nodes(query, timeout = 120)
  nodes = []
  Timeout::timeout(timeout) do
    nodes = search(:node, query)
    until  nodes.count > 0 && nodes[0].has_key?('ipaddress')
      sleep 5
      nodes = search(:node, query)
    end
  end

  if nodes.count == 0 || !nodes[0].has_key?('ipaddress')
    raise "Unable to find nodes!"
  end

  nodes
end

Here we are using a chef search to find a node that includes the couchbase server recipe and has the same OS platform as the current node. Matching on platform is important if your .kitchen.yml is designed to test more than one platform, like ours.
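To see why the pattern *couchbase??server* selects the first node and not the second, it can help to try it against the run list entries of each suite. The sketch below approximates the search wildcards with Ruby's File.fnmatch, where * matches any run of characters and ? matches exactly one, so the ?? stands in for the :: separator without putting a colon in the query. This is only an approximation of how the search index evaluates wildcards, and the helper name matches_run_list? is made up for illustration.

```ruby
# Approximate the chef search wildcards with shell-style globbing.
# Assumption: File.fnmatch is only a stand-in for the real search index.
QUERY_PATTERN = '*couchbase??server*'

# Hypothetical helper: true if any run list entry matches the pattern.
def matches_run_list?(run_list, pattern = QUERY_PATTERN)
  run_list.any? { |entry| File.fnmatch(pattern, entry) }
end

first_node = [
  'recipe[couchbase-tests::ipaddress]',
  'recipe[couchbase::server]',
  'recipe[export-node]'
]
second_node = [
  'recipe[couchbase-tests::ipaddress]',
  'recipe[couchbase-tests::default]',
  'recipe[export-node]'
]

puts matches_run_list?(first_node)   # true: it runs couchbase::server
puts matches_run_list?(second_node)  # false: no entry has couchbase??server
```

Note that couchbase-tests::default does not match because "couchbase" must be followed by exactly two characters and then "server".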

Chef-zero and chef search

The kitchen-nodes plugin derives from the chef-zero test-kitchen provisioner. Using chef-zero we can issue a chef search for nodes without being hooked up to a real chef server. Chef-zero accomplishes this by storing information on each node in a json file stored in its nodes folder. The test-kitchen chef-zero provisioner wires all of this up by copying all files under test/integration/nodes to {test-kitchen temp folder on test instance}/nodes. So you can create a json file for each test suite in your local nodes folder and then chef search calls will effectively treat the node files as the master chef server database.
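To make the idea of node files acting as the search database concrete, here is a minimal sketch that writes two node files to a temp folder and then filters them by recipe and platform. It is not how chef-zero implements search; the find_nodes helper, the node IDs and the IP addresses are all invented for the demo.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical stand-in for a search: load every node file in nodes_dir and
# keep the nodes whose run list has the recipe and whose platform matches.
def find_nodes(nodes_dir, recipe, platform)
  Dir.glob(File.join(nodes_dir, '*.json'))
     .sort
     .map { |path| JSON.parse(File.read(path)) }
     .select do |n|
       n['run_list'].include?("recipe[#{recipe}]") &&
         n['automatic']['platform'] == platform
     end
end

Dir.mktmpdir do |dir|
  # Sample node data; the IDs and IPs are made up.
  nodes = [
    { 'id' => 'server-community-ubuntu',
      'automatic' => { 'ipaddress' => '10.0.0.2', 'platform' => 'ubuntu' },
      'run_list' => ['recipe[couchbase::server]'] },
    { 'id' => 'second-node-ubuntu',
      'automatic' => { 'ipaddress' => '10.0.0.3', 'platform' => 'ubuntu' },
      'run_list' => ['recipe[couchbase-tests::default]'] }
  ]
  nodes.each do |n|
    File.write(File.join(dir, "#{n['id']}.json"), JSON.pretty_generate(n))
  end

  found = find_nodes(dir, 'couchbase::server', 'ubuntu')
  puts found.map { |n| n['id'] }.inspect  # ["server-community-ubuntu"]
end
```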

The kitchen-nodes plugin automatically generates a node file when a test instance is provisioned by test-kitchen. Provisioning occurs at the very beginning of the converge operation. kitchen-nodes populates the node's json file with ip address, platform, and run list. Here are the two nodes' json files generated in my tests:

{
  "id": "server-community-ubuntu-1204",
  "automatic": {
    "ipaddress": "172.28.128.3",
    "platform": "ubuntu"
  },
  "run_list": [
    "recipe[couchbase-tests::ipaddress]",
    "recipe[couchbase::server]",
    "recipe[export-node]"
  ]
}

{
  "id": "second-node-ubuntu-1204",
  "automatic": {
    "ipaddress": "172.17.128.4",
    "platform": "ubuntu"
  },
  "run_list": [
    "recipe[apt]",
    "recipe[couchbase-tests::ipaddress]",
    "recipe[couchbase-tests::default]",
    "recipe[export-node]"
  ]
}

During provisioning, kitchen-nodes will use either SSH or WinRM, depending on the test instance platform, to interrogate its interfaces for an IP that is accessible to the host. On Windows, this information is retrieved using a few powershell cmdlets, and on *Nix instances either ifconfig or ip addr show is used depending on what is available on that distro. There may be several interfaces, but kitchen-nodes will only choose an IPv4 address that can be pinged from the host.
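The selection logic described above can be sketched roughly as follows. This is not the plugin's actual code: the sample ip addr show output is invented, the helper names are hypothetical, and the reachability check is passed in as a block so the sketch stays runnable without pinging anything.

```ruby
# Invented sample of `ip addr show` output from a container.
SAMPLE_OUTPUT = <<~OUT
  1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
      inet 127.0.0.1/8 scope host lo
      inet6 ::1/128 scope host
  2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP
      inet 172.17.0.5/16 scope global eth0
      inet6 fe80::42:acff:fe11:5/64 scope link
OUT

# Hypothetical helper: pull out the IPv4 addresses, skipping loopback.
# "inet6" lines are ignored because the pattern requires "inet" plus a space.
def ipv4_candidates(output)
  output.scan(/^\s*inet (\d+\.\d+\.\d+\.\d+)/)
        .flatten
        .reject { |ip| ip.start_with?('127.') }
end

# Hypothetical helper: return the first candidate the block reports as
# reachable. A real check would ping the address from the host.
def pick_ip(output)
  ipv4_candidates(output).find { |ip| yield(ip) }
end

# Pretend only the docker bridge network can be reached from the host.
puts pick_ip(SAMPLE_OUTPUT) { |ip| ip.start_with?('172.') }  # 172.17.0.5
```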

Testing that we joined the correct cluster

So how do we test that we actually found the correct node? We can't write a serverspec test using a hard coded IP. Instead, we use a testing recipe, export-node, that dumps the entire node object to a json file. The test recipe run by the second node stores the primary node's IP in a node attribute, as we saw further above.

Here is an instant replay:

node.normal["couchbase-tests"]["primary_ip"] = primary[0]['ipaddress']

So when the export-node cookbook dumps the node data, that IP address will be included. Here is the test that validates the node join:

require 'json'
require 'net/http'

describe "cluster" do
  # Node data dumped by the export-node recipe during the converge
  let(:node) { JSON.parse(IO.read(File.join(ENV["TEMP"] || "/tmp", "kitchen/chef_node.json"))) }
  # Ask the primary node for the details of its cluster
  let(:response) do
    resp = Net::HTTP.start node["normal"]["couchbase-tests"]["primary_ip"], 8091 do |http|
      request = Net::HTTP::Get.new "/pools/default"
      request.basic_auth "Administrator", "whatever"
      http.request request
    end
    JSON.parse(resp.body)
  end

  it "has found the primary node and it is not itself" do
    expect(node["normal"]["couchbase-tests"]["primary_ip"]).not_to eq(node['automatic']['ipaddress'])
  end

  it "has joined the primary cluster" do
    joined = false
    response['nodes'].each do |cluster_node|
      if cluster_node['hostname'] == "#{node['automatic']['ipaddress']}:8091"
        joined = true
      end
    end

    expect(joined).to be true
  end
end

The export-node cookbook dumps the node json to a file named chef_node.json in the kitchen temp folder. So our test pulls the IP that was returned by the chef search from here. It makes sure that this IP differs from the node's own IP and then issues a couchbase API request to that node to return all nodes in its cluster. Our test passes as long as the second node is included in the returned node list.
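The membership check at the heart of that test can be tried on its own, without a running cluster. The sketch below runs the same hostname comparison against a hand-written, heavily trimmed response; the sample body, the IPs and the joined? helper are assumptions made for illustration, not output captured from couchbase.

```ruby
require 'json'

# Hypothetical, trimmed-down body of a GET /pools/default response.
SAMPLE_RESPONSE = JSON.parse(<<~BODY)
  {
    "nodes": [
      { "hostname": "10.0.0.2:8091" },
      { "hostname": "10.0.0.3:8091" }
    ]
  }
BODY

# Hypothetical helper: true if ip:port appears among the cluster's nodes.
def joined?(response, ip, port = 8091)
  response['nodes'].any? { |n| n['hostname'] == "#{ip}:#{port}" }
end

puts joined?(SAMPLE_RESPONSE, '10.0.0.3')  # true
puts joined?(SAMPLE_RESPONSE, '10.0.0.9')  # false
```

Using any? gives the same result as the joined flag in the spec above, just in a single expression.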

Testing all the things

I find it helpful and reassuring that I can include node interactions in my tests. Test-Kitchen's coverage can indeed extend well beyond the boundaries of a single node.