I was recently pinged by a #cheffriend of mine, Alex Vinyar. A few months back, I sent him a bit of code that I wrote at CenturyLink Cloud that runs on our build agents to determine which cookbooks to test after a cookbook commit is pushed to source control. He just mentioned to me that he has gotten some positive feedback on that code so I thought I'd blog on the details of that for others who may be interested. This can also be considered a follow up to my post from almost a year ago on cookbook dependency management where I shared some theory I was pondering at the time on this subject. This post covers how I followed up on that theory with working code.
I'm going to cover two key strategies:
- Testing all cookbooks that depend on a changed cookbook
- Ensuring that once a cookbook is tested and succeeds, it is never tested again as long as it and its dependencies do not change
I'll share the same code I shared with Alex (and more) in a bit.
Not enough to test just the cookbook changed
Lets consider this scenario: You manage infrastructure for a complex distributed application. You maintain about a baker's dozen cookbooks - see what I did there?...baker...chef...look at me go girl! You have just committed and pushed a change to one of your lower level library cookbooks that exposes functions used by several of your other cookbooks. We'd expect your pipeline to run tests on that library cookbook but is that enough? What if your upstream cookbooks that depend on the library cookbook "blow up" when they converge with the new changes. This is absolutely possible and not just for library cookbooks.
I wanted a way for a cookbook commit to trigger not only its own chefspecs and test-kitchen tests but also the tests of other cookbooks that depend on it.
Kitchen tests provide slow but valuable feedback, don't run more than needed
Here is another scenario: You are now testing the entire dependency tree touched by a change...yay! But failure happens and your build fails. However the failure may not have occurred at the root of your tree (the cookbook that actually changed). Here are some possibilities:
- Maybe there was some transient network glitch that caused a upstream cookbook to fail
- Gem dependency changes outside of your own managed cookbook collection (you know - the dark of semver - minor build bumps that are really "major")
- Your build has batched more than one root cookbook change. One breaks but the others pass
So an hour's worth (or more) of testing 10 cookbooks results in a couple broken kitchen tests. You make some fixes, push again and have to wait for the entire set to build/test all over again.
I have done this before. Just look at my domain name!
However many of these tests may be unnecessary now. Some of the tests that has succeeded in the filed batch may be unaffected by the fix.
Finding dependencies
So here is the class we use at CenturyLink Cloud to determine when one cookbook as been affected by a recent change. This is what I sent to Alex. I'll walk you through some of it later.
require 'berkshelf' require 'chef' require 'digest/sha1' require 'json' module Promote class Cookbook def initialize(cookbook_name, config) @name = cookbook_name @config = config @metadata_path = File.join(path, 'metadata.rb') @berks_path = File.join(path, "Berksfile") end # List of all dependent cookbooks including transitive dependencies # # @return [Hash] a hash keyed on cookbook name containing versions def dependencies @dependencies ||= get_synced_dependencies_from_lockfile end # List of cookbooks dependencies explicitly specified in metadata.rb # # @return [Hash] a hash keyed on cookbook name containing versions def metadata_dependencies metadata.dependencies end def version(version=nil) metadata.version end def version=(version) version_line = raw_metadata[/^\s*version\s.*$/] current_version = version_line[/('|").*("|')/].gsub(/('|")/,"") if current_version != version.to_s new_version_line = version_line.gsub(current_version, version.to_s) new_content = raw_metadata.gsub(version_line, new_version_line) save_metadata(new_content) end version end # Stamp the metadata with a SHA1 of the last git commit hash in a comment # # @param [String] SHA1 hash of commit to stamp def stamp_commit(commit) commit_line = "#sha1 '#{commit}'" new_content = raw_metadata.gsub(/#sha1.*$/, commit_line) if new_content[/#sha1.*$/].nil? new_content += "\n#{commit_line}" end save_metadata(new_content) end def raw_metadata @raw_metadata ||= File.read(@metadata_path) end def metadata @metadata ||=get_metadata_from_file end def path File.join(@config.cookbook_directory, @name) end # Performs a berks install if never run, otherwise a berks update def sync_berksfile(update=false) return if berksfile.nil? update = false unless File.exist?("#{@berks_path}.lock") update ? berksfile_update : berksfile.install end # Determine if our dependency tree has changed after a berks update # # @return [TrueClass,FalseClass] def dependencies_changed_after_update? old_deps = dependencies sync_berksfile(true) new_deps = get_dependencies_from_lockfile # no need for further inspection is # number of dependencies have changed return true unless old_deps.count == new_deps.count old_deps.each do |k,v| if k != @name return true unless new_deps[k] == v end end return false end # Hash of all dependencies. Guaranteed to change if # dependencies of their versions change # # @param [environment_cookbook_name] Name of environmnt cookbook # @return [String] The generated hash def dependency_hash(environment_cookbook_name) # sync up with the latest on the server sync_latest_app_cookbooks(environment_cookbook_name) hash_src = '' dependencies.each do | k,v | hash_src << "#{k}::#{v}::" end Digest::SHA1.hexdigest(hash_src) end # sync our environmnt cookbook dependencies # with the latest on the server # # @return [Hash] the list of cookbooks updated def sync_latest_app_cookbooks(environment_cookbook_name) result = {} #Get a list of all the latest cookbook versions on the server latest_server_cookbooks = chef_server_cookbooks(1) env_cookbook = Cookbook.new(environment_cookbook_name, @config) # iterate each cookbook in our environment cookbook env_cookbook.metadata_dependencies.each do |key| cb_key = key[0] # cookbook name # skip if its this cookbook or if its not on the server next if cb_key == @name || !latest_server_cookbooks.has_key?(cb_key) latest_version = latest_server_cookbooks[cb_key]['versions'][0]['version'] # if the server as a later version than this cookbook # we do a berks update to true up with the server if dependencies.keys.include?(cb_key) && latest_version > dependencies[cb_key].to_s berksfile_update(cb_key) result[cb_key] = latest_version end end # return the updated cookbooks result end private def chef_server_cookbooks(versions = 'all') Chef::Config[:ssl_verify_mode] = :verify_none rest = Chef::REST.new( @config.chef_server_url, @config.node_name, @config.client_key ) rest.get_rest("/cookbooks?num_versions=#{versions}") end def berksfile @berksfile ||= get_berksfile end def get_berksfile return unless File.exist?(@berks_path) Berkshelf.set_format :null Berkshelf::Berksfile.from_file(@berks_path) end def save_metadata(content) File.open(@metadata_path, 'w') do |out| out << content end # force subsequent usages of metadata to refresh from file @metadata=nil @raw_metadata = nil end def get_metadata_from_file metadata = Chef::Cookbook::Metadata.new metadata.from_file(@metadata_path) metadata end def get_synced_dependencies_from_lockfile # install to make sure we have a lock file berksfile.install get_dependencies_from_lockfile end # get the version of every dependent cookbook from berkshelf def get_dependencies_from_lockfile berks_deps = berksfile.list deps = {} berks_deps.each {|dep| deps[dep.name] = dep.locked_version} deps end def berksfile_update(cookbooks = nil) cookbooks.nil? ? berksfile.update : berksfile.update(cookbooks) @dependencies = nil end end end
This class is part of a larger "promote" gem that we use to promote cookbooks along our delivery pipeline. The class represents an individual cookbook and can be instantiated given an individual cookbook name and a config.
The config class (not shown here) is an extremely basic data class that encapsulates the properties of the chef environment.
Some assumptions
This code will only work given a couple conditions are true. One is fairly common and the other not so much. Even if these conditions do not exist in your environment you should be able to groc the logic.
Using Berkshelf
You are using berkshelf to manage your dependencies. This class uses the berkshelf API to find dependencies of other cookbooks.
Uning the Environment Cookbook Pattern
We use the environment cookbook pattern to govern which cookbooks are considered part of a deployable application. You can read more about this pattern here but an environment cookbook is very simple. It is just a metadata.rb and a Bekshelf.lock. The metadata file includes a depends for each of my cookbooks responsible for infrastructure used in the app. The Berkshelf.lock file uses that to build the entire tree of not only "my own" cookbooks but all third party cookbooks as well.
The dependency hash
For the purposes of this post, the dependency_hash method of the above class is the most crucial. This hash represents all cookbook dependencies and their version at the current point in time. If I were to persist this hash (more on that in a bit) and then later regenerate it with a hash that is at all different from the persisted one, I know something has changed. I don't really care where. Any change at all anywhere in a cookbook's dependency tree means it is at risk of breaking and should be tested.
The "Last Known Good" environment
Every time we test a cookbook and it passes, we store its current dependency hash in a special chef environment called LKG (last known good). This is really a synthetic environment with the same structure as any chef environment but the version constraints may look a bit strange:
{ name: "LKG_platform", chef_type: "environment", json_class: "Chef::Environment", cookbook_versions: { p_rabbitmq: "1.1.15.2d88c011bfb691e9de0df593e3f25f1e5bf7cbba", p_elasticsearch: "1.1.79.2dd2472fbd47bf32db4ad83be45acb810b788a90", p_haproxy: "1.1.47.75dfc55b772fa0eb08b6fbdee02f40849f37ca77", p_win: "1.1.121.fa2a465e8b964a1c0752a8b59b4a5c1db9074ee5", p_couchbase: "1.1.92.325a02ab22743eccc17756539d3fac57ce5ef952", elasticsearch_sidec: "0.1.7.baab30d04181368cd7241483d9d86d2dffe730db", p_orchestration: "0.1.48.3de2368d660a4c7dfd6dba2bc2c61beeea337f70" } }
We update this synthetic environment any time a cookbook test runs and passes. Here is a snippet of our Rakefile that does this:
cookbook = Promote::Cookbook.new(cookbook_name, config) environment = Promote::EnvironmentFile.new("LKG", config) lkg_versions = environment.cookbook_versions cookbook_version = "#{cookbook.version.to_s}.#{hash}" lkg_versions[cookbook_name] = cookbook_version environment.write_cookbook_versions(lkg_versions)
This is just using the cookbook and environment classes of our Promote gem to write out the new versions and following this, the environment like all of our environments is committed to source control.
The dirty check
We have now seen how we generate a dependency hash and how we persist that on successful tests. Here is where we actually compare our current dependency hash and the last known hash. We need to determine before running integration tests on a given cookbook if it is any different than the last time we succesfully tested it. If we previously tested it and neither it nor any of its dependencies (no matter how deep) have changed, we really do not need or want to test it again.
def is_dirty( source_environment, lkg_environment, cookbook_name, dependency_hash = nil ) # Get the environment of the last good tests lkg_env_file = EnvironmentFile.new(lkg_environment, config) # if this cookbook is not in the last good tests, we are dirty unless lkg_env_file.cookbook_versions.has_key?(cookbook_name) return true end # get the current environment under test source_env_file = EnvironmentFile.new(source_environment, config) # get the version of this cookbook from the last good tests # it will look like 1.1.15.2d88c011bfb691e9de0df593e3f25f1e5bf7cbba # thi is the normal version followed by the dependency hash tested lkg_parts = lkg_env_file.cookbook_versions[cookbook_name].split('.') # Get the "plain" cookbook version lkg_v = lkg_parts[0..2].join('.') # the dependency hash last tested lkg_hash = target_parts[3] # we are dirty if our plain version or dependency hash has changed return source_env_file.cookbook_versions[cookbook_name] != lkg_v || dependency_hash != lkg_hash end
This method takes the environment currently under test, our LKG environment that has all the versions and their dependency hashes last tested, the cookbook to be tested and its current dependency hash. It returns whether or not our cookbook is dirty and needs testing.
Prior to calling this code, our build pipeline edits our "test" environment (the "source_environment" above). This environment holds the cookbook versions of all cookbooks being tested including the last batch of commits. We use a custom kitchen provisioner plugin to ensure Test-Kitchen uses these versions and not just the versions in each cookbooks berkshelf lock file. As I discussed in my dependency post from last year, its important that all changes are tested together.
So there you have it. Now Hurry up and Wait for your dirty cookbooks to run their tests!