How to tell if you are running in a remote process by Matt Wrock

A yak running remotely

I've encountered a few scenarios now where I need to perform different logic based on whether I am running locally or remotely. One scenario may be that I have just provisioned a new machine from a base image that has a generic bootstrap password and I need to change the root/admin password. If I do this in an SSH or WinRM/PowerShell session, it will likely kill the process I am running in (of course I could schedule it to happen later). Another scenario may be that I'm about to install some Windows updates. I've blogged before about how this does not work over WinRM, so I know I will need to perform that action in a scheduled task.

Depending on your OS and the type of process you are living in (ruby, powershell, ssh, winrm, etc.), there are different techniques to detect whether you are remote or local, and some are friendlier than others.

Detecting SSH

There are a couple of ways I have done this. One is language independent. It's probably OS independent too, but I have only done this on Linux. There are a couple of environment variables set in most SSH sessions, so you may be able to simply check whether one of them exists:

if ENV['SSH_CLIENT']
  # your crazy remote code here
end

I've run into problems with this though. Let's say you start a service in an SSH session. Now the process in that service has inherited your environment variables, so even if you log out and the process continues to run outside of the initial SSH context, the above code will still trigger the remote logic.

Is the console associated with a terminal (tty)?

Most languages have a friendly way to determine whether the console has a TTY. Check out this wiki that provides snippets for several different programming languages. I've done this in ruby using:

if STDOUT.isatty
  puts 'I am remote...yo'
end
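
Putting the two signals together can guard against the inherited environment pitfall described above: a detached service may still carry SSH_CLIENT, but it will no longer have a tty, while an interactive SSH session has both. A minimal ruby sketch (the helper name is mine; SSH_TTY is another variable most sshd implementations set):

def remote_ssh_session?
  # a live SSH session carries the environment variable AND a tty;
  # a service that merely inherited the variable fails the tty check
  !!(ENV['SSH_CLIENT'] || ENV['SSH_TTY']) && $stdout.isatty
end

puts 'I am remote...yo' if remote_ssh_session?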

Detecting a powershell remoting session

Note that neither WinRM nor powershell remoting associates a TTY with the console like SSH does, so you need to take a different approach here. If you are lucky enough to be running in a powershell remoting session and NOT a vanilla WinRM session (these are similar but different), there is an easy way to tell whether you are local or being invoked from a remote powershell session:

if($PSSenderInfo -ne $null) { Write-Output 'I am remote' }

$PSSenderInfo contains metadata about one's remote session, like the winrm connection string.

Detecting a non-powershell winrm session

There are many reasons to dislike winrm and I regret to give you one more. I googled the "heck" out of this when I was trying to figure it out, and I assure you words much harsher than "heck" flowed freely from my dry, parched lips. I found nothing. The solution I came up with was to search the process tree for an instance of winrshost.exe. All remote winrm sessions run as a child process of winrshost (god bless it).

Here is how you might do this in powershell:

function Test-ChildOfWinrs($ID = $PID, $depth = 0) {
  # cap the recursion depth so a circular parent chain cannot loop forever
  if($depth -gt 20) { return $false }
  $parent = (Get-WmiObject -Class Win32_Process -Filter "ProcessID=$ID").ParentProcessID
  if($parent -eq $null) {
    return $false
  }
  else {
    try {
      $parentProc = Get-Process -ID $parent -ErrorAction Stop
    }
    catch {
      return $false
    }
    if($parentProc.Name -eq "winrshost") { return $true }
    else {
      return Test-ChildOfWinrs $parent ($depth + 1)
    }
  }
}

This function takes a process id but defaults to the currently running process's id if none is provided. It traverses the ancestors of the process looking for winrshost.exe. One thing I found I had to do to make this "safe" is cap the recursion depth so the traversal cannot become infinite. Your CPU will thank you. I wrote this a year ago and am trying to remember the exact situation, but I do know that I somehow hit a case where I got caught in a circular traversal of the tree.

Here is a ruby function I have used in chef to do roughly the same thing:

require 'win32ole'

def check_process_tree(parent, attr, match)
  wmi = ::WIN32OLE.connect("winmgmts://")
  check_process_tree_int(wmi, parent, attr, match)
end

def check_process_tree_int(wmi, parent, attr, match)
  if parent.nil?
    return nil 
  end
  query = "Select * from Win32_Process where ProcessID=#{parent}"
  parent_proc = wmi.ExecQuery(query)
  if parent_proc.each.count == 0
    return nil
  end
  proc = parent_proc.each.next
  result = proc.send(attr)
  if !result.nil? && result.downcase.include?(match)
    return proc 
  end
  if proc.Name == 'services.exe'
    return nil
  end
  return check_process_tree_int(wmi, proc.ParentProcessID, attr, match)
end

This will climb the process tree looking for any process you tell it to. So you would call it using:

is_remote = !check_process_tree(Process.ppid, :Name, 'winrshost.exe').nil?

Be careful with wmi queries and recursion. Every instance of the wmi search root creates a process. So I typically make sure to create one and reuse it.
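
If you call into this logic repeatedly, a sketch of that reuse might look like the following (the wmi and remote_winrm_session? helper names are mine, not from the cookbook):

require 'win32ole'

# memoize the WMI connection so repeated checks don't pay the cost of
# standing up a new search root (and its process) on every query
def wmi
  @wmi ||= ::WIN32OLE.connect("winmgmts://")
end

def remote_winrm_session?
  !check_process_tree_int(wmi, Process.ppid, :Name, 'winrshost.exe').nil?
end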

I hope you have found this post helpful.

A destructible and repeatable chef workstation environment by Matt Wrock

Several months ago I posted on the CenturyLink Cloud blog about how we reproduce our chef development environment with Vagrant, using the chef solo provisioner to consume the same cookbook as our build agents and create an environment for testing cookbooks. This post dives into the technical details of such a Vagrantfile/cookbook and kicks it up a notch by publicly sharing a repo that does just that, stripped of any CenturyLink specific references like our Nexus server, Berkshelf API client setup or internal gem installs. The result is a chef_workstation repo that anyone can use to quickly spin up an environment that's awesome for developing and testing cookbooks; you can find it in my github chef_workstation repo.

Why a vagrant box for chef development?

Many use vagrant for creating test nodes that converge cookbooks, but it is not as common to hear about using vagrant for creating chef workstations/development environments. The fact is that ChefDK goes a long way toward solving the problem of an isolated, pre-built chef environment. However, every environment is different even if many use the ChefDK as a starting point. It's not uncommon to be changing or creating gems, adding tools, and tweaking rake tasks and such before you want to either "capture" these changes to be easily shared or blow them away and start fresh from a known, working base.

We have found that being able to easily reproduce shared dev environments has added a ton of value to our workflow. So let's dive in and explore what exactly this repo provides.

What's in the box?

I've tried to make sure the current readme describes the details of this box, what it contains, and how to use it. So please feel free to review that. Here is a high level list of what the chef_workstation repo provides:

  • chefdk
  • docker
  • nano text editor
  • git
  • squid proxy cache
  • SSH agent forwarding
  • generated knife.rb
  • Rakefile with tasks for cookbook testing
  • Support for VirtualBox, Parallels, Hyper-V and VMWare
  • Secret energizing ingredient guaranteed to infuse your soul with the spirit of automation!!

The repo contains a Vagrantfile at the root that builds an Ubuntu 12.04 image and sets up all of the above. I've been updating this repo in preparation for this post, and I can say there is a fair amount that goes into getting this setup just right. Some of it seems trivial to me today, but when I was just getting used to chef and linux a year ago, much of this seemed very mysterious.

We use vagrant to develop but not for testing

Let's get this out of the way right now. In the automation team at CenturyLink Cloud, we use docker to test almost all linux based cookbooks and a custom vsphere kitchen driver for everything else. We don't use kitchen-vagrant. I do use kitchen-vagrant often on personal projects, but it just does not make sense at work. We ARE a cloud, so spinning up our own cloud resources needs to be central to our testing. It's also simpler for us to manage test instances in our own large scale capacity cloud than on a much thinner build agent's hard drive.

So if you primarily use kitchen-vagrant for testing, you may not benefit from all the features here, but you may want to stick around before you run off. There is more here than just a test-kitchen environment, and if you work mostly with linux infrastructure, I can't stress enough the benefits of containers for testing productivity and shorter feedback cycles. This workstation aims to reduce the friction of setting this up repeatably.

Cache all the things!

One feature of this environment that is very popular is its use of squid and the vagrant-cachier plugin. The first time one runs vagrant up with this environment, it may take ten minutes or so to run. This setup caches most internet downloaded artifacts on the host so that all subsequent vagrant ups build much more quickly.

This feature is not available to windows hosts, but if you run a mac or a linux desktop, it works wonders to bring down box build times.
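
For the curious, wiring the plugin up in a Vagrantfile takes only a couple of lines. A minimal sketch (the :box scope, which shares the cache across machines built from the same box, is my assumption; the repo's Vagrantfile may scope it differently):

Vagrant.configure("2") do |config|
  # only configure caching when the plugin is actually installed
  if Vagrant.has_plugin?("vagrant-cachier")
    config.cache.scope = :box
  end
end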

Standardize on a base tools install while maintaining freedom of dev tools

I've historically preferred not to work in VMs as a dev environment. I either need to make sure all my "stuff" is installed there or cope with an environment that has minimal tooling. For those who use vagrant, you know this is where it shines. It syncs folders seamlessly between guest and host regardless of differing operating systems, so I can edit the same files in my favorite text editor or IDE on the host and then run those bits immediately in the shared guest environment.

This environment standardizes on Ubuntu 12.04 as a guest for a few reasons. It's used on many of our production servers, and the popular hashicorp/precise64 vagrant box image has providers for VirtualBox, Hyper-V, VMWare Fusion and Parallels. This makes it easy to share among any member of our org: I don't think there is anyone who cannot accommodate one of these hypervisor configurations. We don't have to tell anyone that their OS is not welcome.

Decoupling the "shell" from the "core"

This repository is intended to serve as one's "master" chef cookbook repository but it allows one to keep individual cookbooks in separate repos or in a single repo that is separate from this one. This repository contains only one cookbook which actually builds the environment. Any additional cookbook added to the cookbooks directory will be gitignored and should not be maintained in this repo. Simply clone your cookbooks into the cookbooks directory to be maintained separately. As long as they are in the cookbooks directory, they will be included in the vagrant folder syncing and immediately available in your vagrant guest.

Common rake tasks

The readme goes into the specifics of how to call all of the available tasks. I removed quite a lot from this Rakefile that was in our own CenturyLink Rakefile and tied into our internal CI/CD pipeline, with tasks for syncing dependencies, bumping versions, promoting cookbooks to different environments, etc. However, what remains are likely tasks that most would expect to be present in any basic chef repo.

The tasks can work on a single cookbook or all cookbooks in the cookbooks directory in a single task invocation (see the sketch below). The kitchen task also allows you to provide your own cookbook level Rakefile if you need to "orchestrate" the test suites for more complex test behavior.
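
As a rough illustration, invocations might look like the following. The chefspec[couchbase] form appears later in this post; the other task names are assumptions, so check the readme for the real ones:

rake chefspec              # run chefspec for every cookbook in cookbooks/
rake chefspec[couchbase]   # run chefspec for a single cookbook
rake kitchen[couchbase]    # run a single cookbook's test-kitchen suites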

Chef server friendly

The idea here is that if you work with a chef server, upon bringing up this environment, knife commands should "just work" with your account with no gymnastics or secret winking patterns (not that I think secret winking patterns are an anti-pattern -- I'm a huge fan!).

The chef_workstation cookbook generates a knife.rb in the .chef directory of the repo, but only if no other knife.rb file already exists there. By default, the username of the logged in user on the host is used, and the chef server url of the publicly hosted chef server is included but is missing the org in the path. Both the user and server url are configurable in the Vagrantfile. It is then up to the user to add their .pem file to the top level .chef directory with the same name as the user. The entire .chef directory is gitignored, so you can add keys without fear of them being committed to source control.
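
To make that concrete, here is a hedged sketch of the kind of knife.rb that ends up in .chef (the user name and the org placeholder are illustrative, not the cookbook's literal output):

current_dir = File.dirname(__FILE__)

node_name       "mwrock"                      # defaults to the host's logged in user
client_key      "#{current_dir}/mwrock.pem"   # the key you drop into .chef yourself
chef_server_url "https://api.opscode.com/organizations/YOUR_ORG"
cookbook_path   ["#{current_dir}/../cookbooks"]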

A brief walk through of building the box and testing with docker

Getting setup

See the readme for the exact prerequisites and install instructions. Chances are high you already have them. At the very least you want vagrant and one of the supported hypervisors.

Unless you are on windows, I strongly suggest installing vagrant-cachier, but it's completely optional. Finally, clone the chef_workstation repo.

Make it your own

This wisdom applies equally to American Idol performances and customized chef workstation vagrant boxes. Edit the Vagrantfile's chef.json property, which injects the node attributes into the chef_workstation cookbook that provisions the box. These can include the chef server user name and url, gems you want added that are not in the default installed gems, or additional deb packages you want included. All the possible attributes are mentioned here.
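
A sketch of what that edit might look like, assuming a chef solo provisioner block and illustrative attribute names (consult the readme for the attributes the cookbook actually reads):

config.vm.provision "chef_solo" do |chef|
  chef.add_recipe "chef_workstation"
  chef.json = {
    "chef_workstation" => {
      "user" => "mwrock",                # chef server user name
      "chef_server_url" => "https://api.opscode.com/organizations/my_org",
      "gems" => ["knife-windows"],       # gems beyond the default set
      "deb_pkgs" => ["tree"]             # extra deb packages
    }
  }
end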

Also add your own cookbooks that you intend to work with to the cookbooks directory. So far, everything we have mentioned in this walk through only has to be done once.

For this walk through, I will add the same cookbook from my post a couple posts back on multi node testing with test-kitchen and docker. It's a fork of the couchbase cookbook that can add nodes to a cluster.

cd cookbooks
git clone https://github.com/mwrock/couchbase -b multi-node


Build the box

Now run:

vagrant up

If you are using a hypervisor other than virtualbox, make sure to either:

  • Set the VAGRANT_DEFAULT_PROVIDER environment variable to the provider you want to use
  • Specify it in the --provider argument to vagrant up

In a few minutes the box should be ready.

Add your server key if using a server

If you use a chef server, copy your private key used to authenticate to the .chef directory in the root of the repo. The box provisioning should have created this directory. This file should be named {user_name}.pem. You may of course edit the knife.rb file if you prefer to use a different name. Once generated, the knife.rb file will never be overwritten when provisioning the box, and as mentioned previously, it is in the .gitignore file and thus excluded from source control.

Log on to the box and navigate to the couchbase cookbook

vagrant ssh
cd cookbooks/couchbase

Test a kitchen suite

We won't go through the whole multi node test cycle; refer to my post on that topic if you are interested in running through it, because in this environment you can without a problem.

First let's run our chefspecs just to make sure unit tests are happy:

vagrant /chef-repo $ rake chefspec[couchbase]

...
2 deprecation warnings total

Finished in 0.90515 seconds (files took 2.34 seconds to load)
243 examples, 0 failures

vagrant /chef-repo $

Looks good. Now let's see what kitchen suites we have available. There are quite a few. Here is a portion of the suites at the top of the list:

vagrant /chef-repo/cookbooks/couchbase $ kitchen list
Instance                          Driver   Provisioner  Verifier  Transport  Last Action
server-community-debian-76        Docker   Nodes        Busser    Ssh        <Not Created>
server-community-ubuntu-1204      Docker   Nodes        Busser    Ssh        <Not Created>
server-community-centos-65        Docker   Nodes        Busser    Ssh        <Not Created>
server-community-windows-2012R2   Vagrant  Nodes        Busser    Winrm      <Not Created>
second-node-debian-76             Docker   Nodes        Busser    Ssh        <Not Created>
second-node-ubuntu-1204           Docker   Nodes        Busser    Ssh        <Not Created>

We will run the server-community-ubuntu-1204 suite:

kitchen verify server-community-ubuntu-1204

You will likely see a lot of text, and this may take several minutes to complete, at least the first time, while the base ubuntu docker image downloads. If you are using vagrant-cachier, the next run will be much faster. Hopefully this should end in a successful converge and serverspec tests...yup:

       Finished in 0.13151 seconds (files took 0.54705 seconds to load)
       13 examples, 0 failures

       Finished verifying <server-community-ubuntu-1204> (0m5.89s).
-----> Kitchen is finished. (5m1.32s)

Add the cookbook to our server

First we'll do a berks vendor to grab all dependencies:

vagrant /chef-repo/cookbooks/couchbase $ berks vendor /tmp/couchbase-vendor
Resolving cookbook dependencies...
Fetching 'couchbase' from source at .
Fetching 'couchbase-tests' from source at test/integration/cookbooks/couchbase-tests
Using apt (2.7.0)
Using chef-sugar (3.1.0)
Using chef_handler (1.1.6)
Using couchbase (1.3.0) from source at .
Using couchbase-tests (0.1.0) from source at test/integration/cookbooks/couchbase-tests
Using export-node (1.0.1)
Using minitest-handler (1.3.2)
Using openssl (4.0.0)
Using windows (1.36.6)
Using yum (3.6.0)
Vendoring apt (2.7.0) to /tmp/couchbase-vendor/apt
Vendoring chef-sugar (3.1.0) to /tmp/couchbase-vendor/chef-sugar
Vendoring chef_handler (1.1.6) to /tmp/couchbase-vendor/chef_handler
Vendoring couchbase (1.3.0) to /tmp/couchbase-vendor/couchbase
Vendoring couchbase-tests (0.1.0) to /tmp/couchbase-vendor/couchbase-tests
Vendoring export-node (1.0.1) to /tmp/couchbase-vendor/export-node
Vendoring minitest-handler (1.3.2) to /tmp/couchbase-vendor/minitest-handler
Vendoring openssl (4.0.0) to /tmp/couchbase-vendor/openssl
Vendoring windows (1.36.6) to /tmp/couchbase-vendor/windows
Vendoring yum (3.6.0) to /tmp/couchbase-vendor/yum

Now we will upload to our server. Mine is my personal, free hosted chef server:

vagrant /chef-repo/cookbooks/couchbase $ knife cookbook upload -a -o /tmp/couchbase-vendor
Uploading apt            [2.7.0]
Uploading chef-sugar     [3.1.0]
Uploading chef_handler   [1.1.6]
Uploading couchbase      [1.3.0]
Uploading couchbase-tests [0.1.0]
Uploading export-node    [1.0.1]
Uploading minitest-handler [1.3.2]
Uploading openssl        [4.0.0]
Uploading windows        [1.36.6]
Uploading yum            [3.6.0]
Uploaded all cookbooks.

Rinse and repeat

Now I am free to destroy this box, and my synced repo folder will remain intact. We do this all the time. Boxes get stale as other team members are revving shared internal gems, or you are experimenting with alternate configurations and want to just start over.

There is something to be said for the freedom that comes with knowing you can walk on a tight rope since there is a net just beneath you to catch your fall (except when there's not).

Multi node Test-Kitchen tests and working with Vagrant NAT addressing with VirtualBox by Matt Wrock

This is a gnat. Let the observer note that it is different from NAT

This is now the third post I have written on creating multi node test-kitchen tests. The first covered a windows scenario building an active directory controller pair, and the last one covered a multi docker container couchbase cluster using the kitchen-docker driver. This will likely be the last post in this series and will cover perhaps the most commonly used test-kitchen setup: the kitchen-vagrant driver with virtualbox.

In one sense there is nothing particularly special about this setup, and all of the same tests demonstrated in the first two posts can be run through vagrant on virtualbox. In fact, that is exactly what the first post used for the active directory controllers, although it also supports Hyper-V. However, there is an interesting problem with this setup that the first post was lucky enough to avoid. If you were to switch from the docker driver to the vagrant driver in the second post that built a couchbase cluster in docker containers, you may have noticed a recipe at the top of the run list in each suite: couchbase-tests::ipaddress.

Getting nodes to talk to one another

We will soon get to the purpose behind that recipe, but first I'll lay out the problem. When you use the kitchen-vagrant driver to build test instances with the virtualbox provider without configuring any networking properties on the driver, your instance will have a single interface with an ip address of 127.0.0.1. It's going to be difficult to do any kind of multi node testing with this. For one thing, if you have two nodes, they will not be able to talk to each other over these interfaces. From the outside, the host can talk to these nodes using its own localhost address and the forwarded ports. But to another node? They are dead to each other.

The trick to get them to be accessible to one another is to add an additional network interface to the nodes by adding a network in the kitchen.yml:

    driver:
      network:
        - ["private_network", { type: "dhcp" }]

Now the nodes will have a dhcp assigned ip address that both the host and each node can use to access the other.

One could then use my kitchen-nodes provisioner that derives from the chef-zero provisioner so that normal chef searches can find the other nodes and access their ip addresses.

Just pull down the gem:

gem install kitchen-nodes

Add it to your .kitchen.yml:

provisioner:
  name: nodes

Now chef searches from inside your nodes can find one another as long as both nodes have been created:

other_ip = search(:node, "run_list:*couchbase??master*")[0]['ipaddress']

Missing the externally reachable ip address in ohai

While this allows the nodes to see one another, one may be surprised when inspecting a node's own ip address from inside that node.

ip = node['ipaddress'] # will be 127.0.0.1

The ip will be the localhost ip and not the same ip address that the other node will see. This may be fine in many scenarios, but for others you may need to know the externally reachable ip. You would like the ohai attributes to expose the second NIC's address and not the one belonging to the localhost interface.

This was the wall I hit in my kitchen-docker post, because I was registering couchbase nodes in a couchbase cluster: the cluster could not add multiple nodes with the same ip (127.0.0.1), and each node needed to register an ip that the master node could use to reach it.

I googled for a while to see how others dealt with this scenario. I did not find much, but what I did find were posts explaining how to create a custom ohai plugin to expose the right ip. Most posts really contained just a fraction of the information needed to create the plugin, and once I did manage to assemble everything needed, it honestly felt like quite a bit of ceremony for such a simple assignment.

Overwriting the 'automatic' attribute

So I thought that instead of using an ohai plugin I'd find the second interface's address in the ohai attributes and then set the ['automatic']['ipaddress'] attribute to that reachable ip. This seemed to work just fine, and as long as it's done at the front of the chef run, any subsequent call to node['ipaddress'] in the run returns the desired address.

Here is the full recipe that sets the correct ipaddress attribute:

kernel = node['kernel']
auto_node = node.automatic

# linux
if node["virtualization"] && node["virtualization"]["system"] == "vbox"
  interface = node["network"]["interfaces"].select do |key|
    key == "eth1"
  end
  unless interface.empty?
    interface["eth1"]["addresses"].each do |ip, params|
      if params['family'] == 'inet'
        auto_node["ipaddress"] = ip
      end
    end
  end
  
# windows

elsif kernel && kernel["cs_info"] && kernel["cs_info"]["model"] == "VirtualBox"
  interfaces = node["network"]["interfaces"]
  interface_key = interfaces.keys.last
  auto_node["ipaddress"] = interfaces[interface_key]["configuration"]["ip_address"][0]
end

First, this is not a lot of code, and it's confined to a single file. It seems a lot simpler than the wire-up required for an ohai plugin.

This is designed to work on both windows and linux. I have only used it on windows 2012R2 and ubuntu, but it's likely fairly universal. The code only sets the ip if virtualbox is the host hypervisor, so you can safely include the recipe in multi-driver suites, and other non-virtualbox environments will simply ignore it.

Get it in the hurry-up-and-test cookbook

In case others would like to use this, I have included this recipe in a new cookbook I am using to collect recipes helpful in testing cookbooks. This cookbook is called hurry-up-and-test and is available on the chef supermarket. It also includes the export-node recipe I have shown in a couple posts that allows one to access a node's attributes from inside test code.
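
Consuming it from a .kitchen.yml run list might look like the snippet below. I'm assuming the recipe keeps the ipaddress name it had in the couchbase-tests cookbook, so verify the recipe names on the supermarket page:

suites:
- name: default
  run_list:
  - recipe[hurry-up-and-test::ipaddress]
  - recipe[my_cookbook::default]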

I hope others find this useful and I'd love to hear if anyone thinks there is a reason to package this as a full blown ohai plugin instead.

Now what are you waiting for? Hurry up and test!!

Lamentations of the OSS consumer: I'd like to read the $@#%ing manual but no one has written it by Matt Wrock

I've been an active open source consumer and contributor for the past few years, and overall, being involved in both roles has been one of the peak experiences of my career; I only wish I had discovered open source much, much sooner. However, it's not all roses, and things can be rough on both sides of the pull request, especially for those new to these ecosystems, and even more so if you come from a heritage of tools and culture not originally friendly to open source.

Yesterday I received a great email from someone asking how to navigate a very popular infrastructure testing project where the documentation can be sparse in some respects. The project is ServerSpec - a really fantastic tool that many gain value from every day. The question came from someone new to the chef community, and ServerSpec is a key tool used in the chef ecosystem. The questioner immediately won my respect: they were curious but not bitter (at least they did not admit to being so) and wanted to know how to learn more, start to contribute, and get to a point of writing more creative tests.

This inspired me because I love interacting with people who are passionate about this craft and who, like myself, want to learn and improve. It also struck a nerve, since I have a lot of opinions about approaching OSS projects and empathy for those new to the playing field and perhaps feeling a bit awkward. This individual, like myself, comes from a windows background, so I think I have some insight into where he is coming from.

I thought it might be interesting to transform my responses into a blog post. Here are some modified excerpts from my replies.

When Windows is the edge case

I think one issue that windows suffers from in this ecosystem is that it is the "edge case". The vast majority of those using, testing and contributing to this project are linux users. So when a minor version bump occurs and the PR notes explicitly call out that it won't break current tests, clearly that's evidence that windows was not tested. Although one can argue that windows is just not much of a player (but that's of course changing).

I'd look at this differently if this were code in a chef owned project, where I would expect them to be paying more attention to windows. Regarding ServerSpec, a wholly open source project with no funding, shepherded by a community member whose full time job is not ServerSpec, I tend to be more forgiving, but it can definitely make for a frustrating development experience at times.

I'm really hoping that more windows folks get involved and contribute more in this ecosystem, both with code and documentation and also just filing issues, and I hope that their employers support them in these efforts. They stand to gain a lot in doing so.

There may be no manual but there are always the codes

One thing I have found in the ruby world, and much of the OSS world outside of ruby, is that sometimes the best way to figure something out is to read the source code. The obvious downside is that it's hard to read a language we may not be familiar with when we just want to write our test and move on with our lives.

So I find myself going back and forth. I may just do some quick "code spelunking", not find anything that clearly points out how to do what I want, and take an uglier "brute force" approach. On other days, depending on mood and the barometric pressure in the office, I might be inclined to spend the time and dig deeper. It would be awesome if the authors took the time to spell out how to write custom matchers and resource types, but many OSS projects lack this level of detailed documentation. I'm guessing it's because no one is paying them to write it, and in the end they, like us, have a problem that needs solving and lack time to document.

One consolation is that these ruby libraries tend to be relatively small. Compare the ServerSpec code base, including its sister project Specinfra, to something like XUnit in C#. It's a lot less code. Of course, it may take 3x longer to grok if you are a ruby beginner. What I often find is that, given the motivation to learn and be more proficient, you eventually reach a point of minimal comfort with the codebase where you get it enough to see what needs to be added to make the thing do what you want, and that's when you start making contributions.

Heh. I totally have a weird love/hate relationship with this stuff. There are days when I curse these libraries because I just want to do something that seems so simple and I really have no desire or time to make an investment, and then there are other times when I am totally into the code, loving the sense that I am gaining an understanding of new patterns and coding constructs, and realizing I can make the code better not only for me but for others as well.

In the end it's all just a constant slog through the marshes of learning, and as software engineers, that's our sweet spot: the ability to live in a state of learning, and not so much bask in what we have learned.

Multi node testing with Test-Kitchen and Docker containers by Matt Wrock

Two docker containers created and tested with Kitchen-Docker

My last post provided a walk through of some of the new Windows functionality available in the latest Test-Kitchen RC and demonstrated those features by creating and testing a Windows Active Directory domain controller pair. This post will also look at testing multiple nodes, but instead of windows, I'll be spinning up multiple docker containers. I'm going to use a Couchbase cluster as my example. Note that while I am using docker containers, there is nothing special happening here that prevents one from running the same tests on multiple linux or windows VMs using the Kitchen-Vagrant driver. Couchbase runs on windows too.

Why run tests with containers when my production nodes are VMs?

There are some really interesting things being done with containers in production environments, but even if you are not using containers in production, there are clear benefits to using them for testing infrastructure development. The biggest is faster provisioning. Using the kitchen-docker driver over vagrant or another cloud based driver can potentially save several minutes per test. You might wonder, "what's a couple minutes?" However, when you are iterating over a problem and need to reprovision several times, a couple minutes or more add up quickly.

You still want to test provisioning to VMs if that is what your production infrastructure runs on, but that can sit later in your testing pipeline. You will save a lot of time, money and tears (you'll need those later) by keeping your feedback cycles short early in your development process.

Setting things up

To get started you will need to have the docker engine installed and the latest RC of test-kitchen.

Docker Install

There are a few approaches one can take to installing docker. Some are more complicated than others and really depend on your host operating system. I'm using an Ubuntu 14.04 desktop os on my laptop. Ubuntu 14.04 has no prerequisites, and you simply run:

wget -qO- https://get.docker.com/ | sh

Ubuntu 12.04 requires a kernel upgrade and several packages before the above install will work. The docker installation documentation provides instructions for most operating systems. If you are running windows or a mac, you will want to run the docker engine from inside a linux vm. You can either set up a vm of your favorite linux distro and then install docker following the instructions on the docker site, or you can install Boot2Docker, which will install a local docker CLI, VirtualBox, and a stripped down, tiny core linux image.

This post does not aim to explore the different ways of installing docker. If you do not already have docker or a vm setup from which you can install it friction-free, take a look at my chef_workstation repo, which includes a Vagrantfile that will provision a workable chef enabled workstation environment with docker installed. It should work with VirtualBox, Hyper-V or Parallels on a mac. I believe it also works for VMWare Fusion users, but I have not validated that for a while.

A multi-node enabled cookbook to test

To demonstrate multi node testing with test-kitchen, I have forked the community couchbase cookbook. I'll be sending a PR with these changes:

  • Compatibility with docker (the current version uses netstat to validate a listening port, and that's not installed on the default ubuntu container)
  • Extends the couchbase-cluster resource to allow other nodes to be joined to a cluster
  • Fixes the cookbook on windows, which is unrelated to this post but aligns well with one of my personal missions in life

Clone my fork and checkout the multi-node branch:

git clone -b multi-node https://github.com/mwrock/couchbase

If you are using the vagrant box in my chef_workstation repo, cd to the cookbooks directory just below the directory you land in from vagrant ssh and clone from there.

Using the right gems

To help facilitate testing multiple nodes, this cookbook uses a custom test-kitchen provisioner plugin that utilizes functionality exposed in the latest test-kitchen RC. The cookbook includes a Gemfile that references both of these gems and other important dependencies. To ensure that you are testing with all of the correct gems, cd into the root of the couchbase cookbook and run:

bundle install
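
If you are curious what such a Gemfile boils down to, here is a rough sketch with the gems this post relies on (version pins omitted; the fork's actual Gemfile may differ):

source 'https://rubygems.org'

gem 'test-kitchen'    # the RC with the multi node friendly hooks
gem 'kitchen-docker'  # docker driver
gem 'kitchen-nodes'   # the custom provisioner plugin used below
gem 'berkshelf'       # cookbook dependency resolution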

Converge and test the first node

We are now ready to create, converge and test the first node of our couchbase cluster. Make sure to run with bundle exec so that we use all of the correct gem versions:

bundle exec kitchen verify server-community-ubuntu

This will start a new container running ubuntu 12.04, install Couchbase and initialize a new cluster. Then a serverspec test will ensure that the service is running and configured the way we want it.

Joining an additional node to the cluster

To get the full multi-node effect, let's now ask test-kitchen to run our second-node suite:

bundle exec kitchen converge second-node-ubuntu

This brings up a new container that will post to the couchbase rest endpoint of our first node, asking to join the cluster. Then its serverspec test will pull the list of nodes in the cluster exposed from the original node and check that our second node is included in the list.
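
Under the hood, that join request is just an HTTP POST against the first node's REST API on port 8091. A hedged ruby sketch of roughly what it amounts to (this is not the cookbook's actual code; primary_ip and my_ip are assumed variables, and the credentials match the "whatever" password used in the suites below):

require 'net/http'

# ask the primary node to add this node to its cluster
uri = URI("http://#{primary_ip}:8091/controller/addNode")
Net::HTTP.post_form(uri, 'hostname' => my_ip,
                         'user'     => 'Administrator',
                         'password' => 'whatever')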

Discovering the original node

One possible strategy could be to set an attribute specifying the IP or host name of the initiating couchbase node. However, this assumes it is a known and constant value. You may prefer your infrastructure to dynamically query for an existing couchbase node. In our test scenario, we really can't predict the ip or host name, since we are getting IPs from DHCP and docker is handing out a unique hash for a host name.

Note that we could tweak the driver configuration in our .kitchen.yml to expose predictable hostnames that can link to other containers. Here is an example of a possible config for our node suites:

suites:
- name: server-community
  driver:
    publish_all: true
    instance_name: first_cluster
  run_list:
  - recipe[couchbase::server]
  attributes:
    couchbase:
      server:
        password: "whatever"

- name: second-node
  driver:
    links: "first_cluster:first_cluster"
  run_list:
  - recipe[couchbase-tests::default]
  attributes:
    couchbase:
      server:
        password: "whatever"
        cluster_to_join: first_cluster

Here the first node uses the kitchen-docker configuration to ask the docker engine to expose its container with a specific name, "first_cluster." The second node is asked to link the name "first_cluster" to the "first_cluster" instance. This way, any requests from the second container to the DNS name first_cluster will resolve to our first container. Finally, we create a node attribute named cluster_to_join that tells our second node which cluster to ask to join.

This may work for your scenario, and that's great. However, it may break down for others. First, it's not very portable. This cookbook supports windows, and locking in docker specific options will run into problems for windows tests that leverage vagrant here:

- name: windows-2012R2
  driver:
    name: vagrant
    network:
      - ["private_network", { type: "dhcp" }]
  transport:
    name: winrm
  driver_config:
    gui: true
    box: mwrock/Windows2012R2Full
    customize:
      memory: 1024

Furthermore, our test logic needs to match production logic. If production nodes will be querying the chef server for a node to send cluster join requests to, our tests must validate that this strategy works.

The kitchen-nodes provisioner plugin

In my last post I demonstrated a strategy that uses chef search to find a chef node based on a run list recipe. It used my kitchen-nodes provisioner plugin to create mock chef nodes of each kitchen suite so that a chef search can find other suite test instances during convergences. Since that example was creating a windows active directory controller pair, it included some windows specific functionality. I have since extended the plugin to support most *nix scenarios, including docker.

First we tell test-kitchen to use the kitchen-nodes plugin as a provisioner for the suites that test our couchbase servers:

suites:
- name: server-community
  provisioner:
    name: nodes
  run_list:
  - recipe[couchbase-tests::ipaddress]
  - recipe[couchbase::server]
  - recipe[export-node]
  attributes:
    couchbase:
      server:
        password: "whatever"

- name: second-node
  provisioner:
    name: nodes
  run_list:
  - recipe[couchbase-tests::ipaddress]
  - recipe[couchbase-tests::default]
  - recipe[export-node]
  attributes:
    couchbase:
      server:
        password: "whatever"

The default recipe of the couchbase-tests cookbook used by our second node can now find the first node using chef search:

primary = search_for_nodes("run_list:*couchbase??server* AND platform:#{node['platform']}")
node.normal["couchbase-tests"]["primary_ip"] = primary[0]['ipaddress']

The search_for_nodes method is defined in our couchbase-tests library:

require 'timeout'

def search_for_nodes(query, timeout = 120)
  nodes = []
  Timeout::timeout(timeout) do
    nodes = search(:node, query)
    until nodes.count > 0 && nodes[0].has_key?('ipaddress')
      sleep 5
      nodes = search(:node, query)
    end
  end

  if nodes.count == 0 || !nodes[0].has_key?('ipaddress')
    raise "Unable to find nodes!"
  end

  nodes
end

Here we are using a chef search to find a node that includes the couchbase server recipe and has the same os platform as the current node. Matching on platform is important if your .kitchen.yml is designed to test more than one platform, like ours.

Chef-zero and chef search

The kitchen-nodes plugin derives from the chef-zero test-kitchen provisioner. Using chef-zero, we can issue a chef search for nodes without being hooked up to a real chef server. Chef-zero accomplishes this by storing information on each node in a json file stored in its nodes folder. The test-kitchen chef-zero provisioner wires all of this up by copying all files under tests/integration/nodes to {test-kitchen temp folder on test instance}/nodes. So you can create a json file for each test suite in your local nodes folder, and chef search calls will effectively treat the node files as the master chef server database.

The kitchen-nodes plugin automatically generates a node file when a test instance is provisioned by test-kitchen. Provisioning occurs at the very beginning of the converge operation. The plugin populates the node's json file with the instance's ip address, platform, and run list. Here are the two nodes' json files generated in my tests:

{
  "id": "server-community-ubuntu-1204",
  "automatic": {
    "ipaddress": "172.28.128.3",
    "platform": "ubuntu"
  },
  "run_list": [
    "recipe[couchbase-tests::ipaddress]",
    "recipe[couchbase::server]",
    "recipe[export-node]"
  ]
}

{
  "id": "second-node-ubuntu-1204",
  "automatic": {
    "ipaddress": "172.17.128.4",
    "platform": "ubuntu"
  },
  "run_list": [
    "recipe[apt]",
    "recipe[couchbase-tests::ipaddress]",
    "recipe[couchbase-tests::default]",
    "recipe[export-node]"
  ]
}

During provisioning, kitchen-nodes will use either SSH or WinRM, depending on the test instance platform, to interrogate its interfaces for an IP that is accessible to the host. On windows, this information is retrieved using a few powershell cmdlets; on *nix instances, either ifconfig or ip addr show is used, depending on what is available on that distro. There may be several interfaces, but kitchen-nodes will only choose an ipv4 ip that can be pinged from the host.
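
For illustration only (this is not the plugin's actual code), the selection boils down to something like the sketch below, assuming a host where ping -c is available:

require 'resolv'

# keep the first candidate that is an ipv4 address the host can ping
def reachable_ip(candidates)
  candidates.find do |ip|
    ip =~ Resolv::IPv4::Regex &&
      system("ping -c 1 -W 1 #{ip} > /dev/null 2>&1")
  end
end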

Testing that we joined the correct cluster

So how do we test that we actually found the correct node? We can't write a serverspec test using a hard coded IP. Instead we use a testing recipe, export-node, that dumps the entire node object to a json file. Our test recipe run by the second node stores the primary node's IP in a node attribute, as we saw further above.

Here is an instant replay:

node.normal["couchbase-tests"]["primary_ip"] = primary[0]['ipaddress']

So when the export-node cookbook dumps the node data, that IP address will be included. Here is the test that validates the node join:

describe "cluster" do
  let(:node) { JSON.parse(IO.read(File.join(ENV["TEMP"] || "/tmp", "kitchen/chef_node.json"))) }
  let(:response) do
    resp = Net::HTTP.start node["normal"]["couchbase-tests"]["primary_ip"], 8091 do |http|
      request = Net::HTTP::Get.new "/pools/default"
      request.basic_auth "Administrator", "whatever"
      http.request request
    end
    JSON.parse(resp.body)
  end

  it "has found the priary node and it is not itself" do
    expect(node["normal"]["couchbase-tests"]["primary_ip"]).not_to eq(node['automatic']['ipaddress'])
  end

  it "has joined the primary cluster" do
    joined = false
    response['nodes'].each do |cluster_node|
      if cluster_node['hostname'] == "#{node['automatic']['ipaddress']}:8091"
        joined = true
      end
    end

    expect(joined).to be true
  end
end

The export-node cookbook dumps the node json to a file named chef_node.json in the kitchen temp folder, so our test pulls the ip that was returned by the chef search from there. It makes sure that it is in fact a different node from its own IP and then issues a couchbase API request to that node to return all nodes in its cluster. Our test passes as long as the second node is included in the returned node list.

Testing all the things

I find it helpful and reassuring that I can include node interactions in my tests. Test-Kitchen's coverage can indeed extend well beyond the boundaries of a single node.