In search of a light weight windows vagrant box by Matt Wrock

Update 2: See this new post that includes everything here and also automates everything via Packer and Boxstarter.

Update: More on packaging windows 10 Hyper-V and comparison of LZMA vs. GZIP compression can be found in this more recent post. My conclusion is that gzip is bigger but better.

Windows... No getting around it - its a beast compared to the size of its linux step sibling. When I hear colleagues complaining about pulling down a couple hundred MB to get their linux vagrant box up and running, I feel like a third world scavenger of disk space while they whine about their first world 100mb problems. While this post does not solve this problem, it does provide a way to deliver a reasonably sized windows box weighing in under 3GB.

This post will explain how to:

  • Obtain a free evaluation server 2012 R2 ISO
  • Prepare a minimally sized VM image
  • Package the VM for use via vagrant both in VirtualBox and Hyper-V
  • Allow the box to be accessed by others over the web. The world wide web.

I may follow up with a more automated approach but here I'll be walking through the process fairly manually. However my instructions will largely be command line samples so one should be able to cobble together a script that performs all of this. I will be doing just that.

Downloading an evaluation ISO

The Technet evaluation center provides free evaluation copies of the latest operating systems. As of this post, the available operating systems are:

  • Server 2012R2
  • Server 2012
  • Server 2008 R2
  • Windows 8.1
  • Windows 8
  • Windows 7

These are fully functional copies with a limited lifetime. I believe the client SKUs are 90 days and the server SKUs are 180 days. Server 2012 R2 is definitely 180 days. Once the evaluation period expires, you can download a new evaluation and the trial period begins anew. Some may say like a new spring. Others will choose to describe it differently but either way it is perfectly legit.

The evaluation center provides several formats for download including ISO, VHD, and Azure VM. I prefer to download the ISO since it is much smaller than the VHD and it can be used to create both a Hyper-V and VirtualBox VM.

Create the VM

I am assuming here that you are familiar with how to do this on your preferred hypervisor platform. However I recommend that you choose the following file formats:

VHD (gen 1) on Hyper-V

Unless you know that this vagrant box will only be used on windows 8.1 or server 2012 R2 hosts, a generation 2 .vhdx format will not work on older OS versions. Chances are that most vagrant use cases will not be taking advantage of the generation 2 features anyways.

.vmdk on Virtual Box

The reason for this will become more apparent when we get to packaging the VM. The short answer here is that this is going to put you in a better position for a more optimally compressed (likely much more) box file.

Note: You cannot run Hyper-V and VirtualBox on the same machine. You can create second boot record, but I have had too many bad experiences uninstalling VirtualBox on windows and ending up with a corrupted network stack. So I am using Hyper-V on my windows box and VirtualBox on my Ubuntu machine.

When the ISO installation begins, you will be given a choice of Standard or DataCenter editions as well as whether to install with the GUI or not. I am installing Server 2012 R2 Standard with GUI.

Preparing the image

Here we want to ensure our VM can interact with vagrant with as little friction as possible and be as small as possible.

Configuring the vagrant user

By default, vagrant handles all authentication with a user named 'vagrant' and password 'vagrant'. So we will rename the built-in administrator to 'vagrant' and change the password to 'vagrant'.

First thing that needs to happen here is to turn off the password complexity policy that is enabled by default on server SKUs. To do this:

  1. Run gpedit from a command prompt
  2. Navigate to Computer Configuration -> Windows Settings -> Security Settings -> Account Policies -> Password Policy
  3. Select 'Password must meet complexity requirements' and disable

Good riddens complexity!

Now we jump to powershell and rename the administrator account and change its password to vagrant:

$admin=[adsi]"WinNT://./Administrator,user"
$admin.psbase.rename("vagrant")
$admin.SetPassword("vagrant")
$admin.UserFlags.value = $admin.UserFlags.value -bor 0x10000
$admin.CommitChanges()

This may be obvious but that second to the last line instructs the password to never expire. Its a good idea to reboot at this point since you are currently logged in as this user and things can and will get weird eventually if you don't.

Configure WinRM

While server 2012 R2 will have powershell remoting automatically enabled, we need to slightly tweak the winrm configuration to play nice with vagrant's ruby flavored winrm implementation:

winrm set winrm/config/client/auth '@{Basic="true"}'
winrm set winrm/config/service/auth '@{Basic="true"}'
winrm set winrm/config/service '@{AllowUnencrypted="true"}'

Allow Remote Desktop connections

Vagrant users may use the vagrant rdp command to access a vagrant vm so we want to make sure this is turned on:

$obj = Get-WmiObject -Class "Win32_TerminalServiceSetting" -Namespace root\cimv2\terminalservices
$obj.SetAllowTsConnections(1,1)

Enable CredSSP (2nd hop) authentication

Users may want to use powershell remoting to access this server and need to authenticate to other network resources. Particularly on a vagrant box, this may be necessary to access synced folders as a network share. In order to do this in a powershell remoting session, CredSSP authentication must be enabled on both sides. So we will enable it on the server:

Enable-WSManCredSSP -Force -Role Server

Enable firewall rules for RDP and winrm outside of subnet

Server SKUs will not allow remote desktop connections by default and non domain users making winrm connections will be refused if they are outside of the local subnet. So lets turn that on:

Set-NetFirewallRule -Name WINRM-HTTP-In-TCP-PUBLIC -RemoteAddress Any
Set-NetFirewallRule -Name RemoteDesktop-UserMode-In-TCP -Enabled True

Shrink PageFile

The page file can sometimes be the largest file on disk with the least amount of usage. It all depends on your configuration but lets trim ours down to half a gb.

$System = GWMI Win32_ComputerSystem -EnableAllPrivileges
$System.AutomaticManagedPagefile = $False
$System.Put()

$CurrentPageFile = gwmi -query "select * from Win32_PageFileSetting where name='c:\\pagefile.sys'"
$CurrentPageFile.InitialSize = 512
$CurrentPageFile.MaximumSize = 512
$CurrentPageFile.Put()

You will need to reboot to see the page file change size but you can continue on without issue and reboot later.

Change the powershell execution policy

We want to make sure users of the vagrant box can run powershell scripts without any friction. We'll assume that all scripts are safe within the blissful confines of our vagrant environment:

Set-ExecutionPolicy -ExecutionPolicy Unrestricted

Unrestricted. To some its an execution policy. Others - a lifestyle.

Install Important Windows Updates

Not gonna bore you with the details here. Simply install all important windows updates (dont bother with optional updates), reboot and repeat until you get the final green check mark indicating you have gotten them all.

Cleanup WinSXS update debris

Now that we have installed those updates there are gigabytes (not many but enough) of backup and rollback files lying on disk that we dont care about. We are not concerned with uninstalling any of the updates. New in windows 8.1/2012 R2 (and back ported to win 7/2008R2) there is a command that will get rid of all of this unneeded data:

Dism.exe /online /Cleanup-Image /StartComponentCleanup /ResetBase

This may take several minutes, but if you can make it through the updates, you can make it through this.

Additional disk cleanup

On windows client SKUs you have probably used the disk cleanup tool to remove temp files, error reports, web cache files, etc. This tool is not available by default on server SKUs but you can turn it on. We will go ahead and do that:

Add-WindowsFeature -Name Desktop-Experience

This will require a reboot to take effect, but once that completes, the "Disk Cleanup" button should be available on the property page from the root of our C: drive or we can launch from a command line:

C:\Windows\System32\cleanmgr.exe /d c:

Go ahead and check all the boxes. Chances are this wont get rid of much on this young strapping box but might as well jettison all of this stuff.

Features on Demand

One new feature in server 2012 R2 is the ability to remove features not being used. Every windows installation regardless of the features enabled (IIS, Active Directory, Telnet, etc) has almost all of these features available on disk. However the majority will never be used and there is alot of space to be saved here by removing them.

Its really up to your own discretion as to what features you will need. In this post we are going to remove practically everything other than the bare essentials for running a GUI based server OS. If you plan to use the box for web work, you may want to go ahead and enable IIS and any related web features you think you will need now. Once a feature is removed, some features are easier to restore than others. For instance, the telnet-client feature will just be downloaded from windows update server if you choose to add it later. However, if you need to re-add the desktop-experience feature, you will need installation media handy and must point to a source .wim file. I really don't know what the logic is that determines what can be downloaded from the public windows update service and what cannot but its good to be aware of this.

First we will remove (not uninstall yet) the features that are currently enabled that we do not need:

@('Desktop-Experience',
  'InkAndHandwritingServices',
  'Server-Media-Foundation',
  'Powershell-ISE') | Remove-WindowsFeature

Now we can iterate over every feature that is not installed but 'Available' and physically uninstall them from disk:

Get-WindowsFeature | 
? { $_.InstallState -eq 'Available' } | 
Uninstall-WindowsFeature -Remove

At this point its a good idea to reboot the vm and then rerun the above command just to make sure you got everything. For some reason I find that the InkAndHandwritingServices feature is still lying around. A fantastic feature I have no doubt but trust me, you don't need it.

Defragment the drive

Now we will defrag and possibly retrim the drive using the powershell command:

Optimize-Volume -DriveLetter C

Purge any hidden data

Its possible there is unallocated space on disk with data remaining that can hinder the ability to compact the disk and also limit the compression algorithm's ability to fully optimize the size of the final box file. There is a sysinternals tool, sdelete, that will "0 out" this unused space:

wget http://download.sysinternals.com/files/SDelete.zip -OutFile sdelete.zip
[System.Reflection.Assembly]::LoadWithPartialName("System.IO.Compression.FileSystem")
[System.IO.Compression.ZipFile]::ExtractToDirectory("sdelete.zip", ".") 
./sdelete.exe -z c:

Shutdown

Stop-Computer

Packaging the box

At this point you should hopefully have a total disk size of roughly 7.5GB. For a windows server, thats pretty good. We are now ready to package up our vm into a vagrant .box file that can be consumed by others. The process for doing so is going to be different on Hyper-V and Virtualbox so I will cover both separately.

Hyper-V

First thing, compact the vhd:

Optimize-VHD -Path C:\path\to\my.vhd -Mode Full

On my laptop, the VHD just shrank from 13 to 8 GB.

Next we will export the VM:

Export-VM -Name win2012R2 -Path C:\dev\vagrant\Win2k12R2\test

Note that my VM name is win2012R2 and after this command completes, I should have a directory named c:\dev\vagrant\Win2k12R2\test\win2012R2 with three subdirectories:

  • Snapshots
  • Virtual Hard Disks
  • Virtual Machines

We can delete the Snapshots directory:

Remove-Item C:\dev\vagrant\Win2k12R2\test\win2012R2\Snapshots -Recurse -Force

Now there are two text files that we want to create in the root c:\dev\vagrant\Win2k12R2\test\win2012R2 directory. The first is a small bit of JSON:

{
  "provider": "hyperv"
}

Save this as metadata.json. Lastly, and this is optional, we will create an initial Vagrantfile:

# -*- mode: ruby -*-
# vi: set ft=ruby :

VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.vm.guest = :windows
  config.vm.communicator = "winrm"
end

By adding this to the Box, it ensures that a user can run vagrant up without any initial vagrant file, and have a succesful windows experience. However, it is customary for most vagrant projects to include these settings in their project Vagrantfile in which case having this file in the box file is redundant.

We are now ready to create the final box file. This is expected to be a .tar archive with a .box extension. The name (before the extension) doesn't matter but keep it short with no spaces. I found that not just any tar archiver will work on windows and this can be a big stumbling point on a windows system where tarring is not "normal" and you may not even have tar.exe or bsdtar.exe installed.

A note of caution: You obviously have vagrant installed and vagrant comes with a version of bsdtar.exe compiled for the windows platform. However, as of version 1.6.5, this executable will not work on tar files with a final size greater than 2GB. Sadly, ours will be greater than that. Also, if you have GIT installed, it too has a tar.exe installed with similar limitations. Many may assume they can create a .tar using the popular 7zip. Technically this is true but I have not been succesful in having the final .tar play nice with vagrant. I have had success with two available solutions:

  • The version of tar.exe that is available with cygwin. I'm not a big cygwin fan and this tar requires several command switches be properly set to produce an archive that will work with vagrant: tar --numeric-owner --group=0 --owner=0 -czvmf win2kr2.box * is what worked for me. The options here ensure the ACLs are cleared out and the m option clears the timestamps.
  • (recommended) Nikhil Benesch has been kind enough to make available a version of bsdtar.exe that works great and can be downloaded here.

So assuming you have Nikhil's bsdtar on your path, run this inside the same directory that holds your exported Hyper-V directories and the two text files (metadata.json and Vagrantfile) you created:

bsdtar --lzma -cvf package.box *

Very important: the --lzma switch instructs bsdtar to use the lzma compression algorithm. This is the same algorithm 7zip uses as its default. This can dramatically reduce the size of the final box file (like 40%) over gzip. It also dramatically increases the time (and CPU) bsdtar will take to create the file. So some patience is needed but in my opinion it is well worth it. This is a one time cost and it will NOT impact decompression time which is what really matters.

VirtualBox

We will be following roughly the same process we used for Hyper-V with a couple deviations. Note that the instructions that follow differ from the guidance given in the vagrant documentation as well as most blogs. These instructions are optimized for a small file size. You can use the vagrant package command to produce virtualbox packages but that will produce a much larger file (3.9GB vs 2.75GB) than the steps that follow.

The first main difference is that we will not compact the .vmdk. Virtualbox can only compact its native .vdi files. This is ok since the final compression step will take care of this for us.

First step: export the virtualbox VM:

vboxmanage export win2012r2 -o box.ovf

Here win2012r2 is the name of the vm in the virtualbox GUI. It is important that the name of the outputted .ovf file is box.ovf. Do not use a different name.

Now for something not so intuitive: delete the exported .vmdk file that was created. It is dead to us. Regardless of the initial file type you use to create a virtualbox vm, the export command will always convert the drive to a compressed .vmdk. This compression is not ideal. What you want to do after deleting this file is copy the original .vmdk file of the vm to the same directory where your box.ovf file exists and rename it to the same name of the .vmdk you just deleted. This should be box-disk1.vmdk.

This is why we used a .vmdk (the native vmware format) in the first place. Later when this file is imported back to virtualbox by vagrant, it is going to expect a vmdk file. However, if we used the one compressed by the export command, we are stuck with the compression it yields. By using the original uncompressed .vmdk, we allow the archiver to use lzma and can get a MUCH smaller file.

Just as was shown with Hyper-V above, we want to add the same two text files to this export directory but their contents will be slightly different. The first one, metadata.json should contain:

{
  "provider": "virtualbox"
}

Next our Vagrantfile should look like:

# -*- mode: ruby -*-
# vi: set ft=ruby :

# Vagrantfile API/syntax version. Don't touch unless you know what you're doing!
VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.vm.guest = :windows
  config.vm.communicator = "winrm"
  config.vm.base_mac = "080027527704"
  config.vm.provider "virtualbox" do |v|
    v.gui = true
  end
end

Your file should have one key difference. Make sure that the MAC address matches the one in the settings of your VM. I'm pretty sure its gonna be different. So you should now have four files in your export directory:

  • box.ovf
  • box-disk1.vmdk
  • metadata.json
  • Vagrantfile

Navigate to this directory. I'm on an ubuntu desktop so I'm just gonna use the native tar. If you are on a windows host. Follow the same instructions for the compression above for Hyper-V. On my linux box I will package using:

tar --lzma -cvf package.box *

Again, the name of the box file other than the extension is not important. Also, this may take a long time. Maybe an hour. Its worth it. Its a one time cost and while much longer than producing a gzip archive, you should end up with a 2.75 GB file that will take just as long to import as the same 3.9GB file gzip would produce.

Its a win-win!

Do I have to create the box twice?

No. You can create the box in Virtual box and then clone to a .vhd based box. Then you can create a new Hyper-V VM and instead of creating a new drive, use the existing .vhd from your virtualbox vm. Just make sure your Hyper-V vm is a generation 1 vm. Thats probably the easiest way to do it. You could also create the vm in Hyper-V and then create a .vhd based virtualbox vm using your Hyper-V vm instead of creating a new one. Then copy the vhd to a vmdk based vm. Granted this is still not an ideal or seamless migration but it is one way to get around having to run through the entire box preparation twice and ensuring the two boxes are identical.

Testing your box

Before going to the trouble of uploading your box to the web and making it public, you will want to test vagrant up on the package you just created locally. This is just three commands:

vagrant box add test-box /path/to/package.box
vagrant init test-box
vagrant up

This first places the box into your global vagrant cache. By default this is a folder named .vagrant.d in your home directory. You can move it by creating an environment variable VAGRANT_HOME that points to the directory where you want to keep it. I keep mine on a different drive.

Next, vagrant init creates a Vagrantfile in your current directory so that now any time you want to interact with this box, issuing vagrant commands from this directory will affect this box.

Vagrant up to Hyper-V

Finally vagrant up will create the box in your provider and start it. For Hyper-V, or any provider other than virtualbox, you need to either include a --provider option:

vagrant up --provider hyperv

or you can set an environment variable VAGRANT_DEFAULT_PROVIDER to hyperv.

Uploading your box to the world wide web

In order for your packages to be accessible to others, you are gonna want to do two things:

  1. Upload the .box files somewhere where they can be accessed via HTTP.
  2. Optionally, publish those URLs on VagrantCloud.

You have lots of options for where to place the box files. Amazon S3, Azure storage and just recently I have been using CenturyLink Cloud object storage. I believe dropbox works but I have not tried that. Here is a script I have used in the past to upload to my Azure storage account:

$storageAccountKey = Get-AzureStorageKey 'mystorageaccount' | %{ $_.Primary }
$context = New-AzureStorageContext -StorageAccountName 'mystorageaccount' -StorageAccountKey $storageAccountKey
Set-AzureStorageBlobContent -Blob 'package.box' -Container vhds -File 'c:\path\to\package.box' -Context $context -Force

Publishing your box to VagrantCloud.com

You can certainly share your vagrant box once accessible from a URL. You simply need to share a Vagrantfile with a url setting included:

config.vm.box_url = "https://wrock.ca.tier3.io/win2012r2-virtualbox.box"

This tells vagrant to fetch the image from the provided URL if it is not found locally. However, vagrant hosts a repository where you can store images for multiple providers and where you can add documentation and versioning. This repository is located at vagrantcloud.com and allows you to give a box a canonical identifier where users can get the latest version (or specify a specific version) of the box compatible with their hypervisor.

Simply create an account at vagrantcloud. Creating an account is free as long as you are fine with your box files being public and you host the actual boxes elsewhere like amazon, azure or CenturyLink Cloud. Once you have an account, you can create a box which initially is composed of just a name and description.

Next you can add box URLs for individual providers that point to an actual .box file. With a paid account you can host the box files on vagrant cloud.

Now your username and box name form the cannonical name of your box. Vagrant will automatically try to find boxes on vagrantcloud if the box cannot be found locally. Every box on vagrantcloud includes a command line example of how to invoke the box. So my box, Windows2012R2 can be installed by any vagrant user by running:

vagrant init mwrock/Windows2012R2

If I am on my windows box with Hyper-V, it will install the Hyper-V box (make sure to add --provider hyperv to your vagrant up command or set the VAGRANT_DEFAULT_PROVIDER environment variable). On my Ubuntu desktop with VirtualBox it will download the VirtualBox box. After it downloads the box, it caches it locally and does not need to download again.

So there you have it a free windows box under 3GB. Ok...ok, thats still crazy huge but I didn't name this blog Hurry up and wait! for no reason. A little hurry...a little wait...ok maybe alot of wait...whatever it takes to get the job done.

The 80s just called - they want their telnet client back by Matt Wrock

Telnet has been around ever since I was born. No..really..it was developed in 1968 and the very first protocol used on the ARPAnet. That’s right kids, when grandpa wanted to send an email, he used telnet.

I don’t think I have used Telnet for its intended use since the late nineties, but for years and years, enabling the stock Microsoft telnet client has been part of my routine setup script for any windows box I work with.

dism /Online /Get-FeatureInfo /FeatureName:telnet-client

For me and many of my colleagues, this is often the simplest, albeit crude, tool to help determine if a remote machine is listening on a specific port. Its certainly not the only tool, but one is nearly guaranteed that this can be found on any windows OS.  

On linux, Netcat is a similarly ubiquitous tool that is typically installed with most distributions:

nc -z -w1 boxstarter.com 80;echo $?

This will return 0, if the specified host is listening on port 80.

Why is this important?

Perhaps you are a web developer and your web site goes down. One key troubleshooting step is to determine if the web server is even up and listening on port 80. Or maybe SSL traffic is broken and you are wondering if the server is listening to port 443. The answer to these questions may very well tell you which of many possible paths is the best to pursue in finding the root of your problem.

So you whip out your command line console and simply run:

telnet myhost.com 80

If the machine is in fact listening on port 80, I will likely get a blank screen. Otherwise, the command will hang and eventually timeout. This always felt clunky, but it worked. Oh sure, since powershell became available, I could write a script that worked with the .net library to construct a raw socket to reach an endpoint and thereby get the same information. But that’s just more code to write.

A better way

Since powershell version 4 which ships with Windows 8.1 and server 2012R2, there is a new cmdlet that provides a much more elegant means of getting this information.

C:\dev\WinRM [v1.3]> Test-NetConnection -ComputerName boxstarter.org -Port 80
WARNING: Ping to boxstarter.org failed -- Status: TimedOut
ComputerName           : boxstarter.org
RemoteAddress          : 168.62.20.37
RemotePort             : 80
InterfaceAlias         : vEthernet (Virtual External Switch)
SourceAddress          : 192.168.1.7
PingSucceeded          : False
PingReplyDetails (RTT) : 0 ms
TcpTestSucceeded       : True
C:\dev\WinRM [v1.3]>

So now I have the same information plus some other bits of useful data given to me in a much more easily consumable format. Not only can I see that a site is responding to TCP port 80 requests, I see the IP address that the host name resolves to and I notice that the server is configured not to respond to ping requests.

Some may complain that Test-NetConnection requires far too many key strokes. Well there is a built in alias that points to this cmdlet allowing you to shorten the above command to:

TNC boxstarter.org -port 80

And if you don’t like having to include the -Port parameter name, the -CommonTCPPort is the next parameter in the default parameter order which takes the possible values of "HTTP,RDP,SMB,WINRM". So this means you get the same result as the command above using:

TNC boxstarter.org HTTP

So lose the telnet, and remember TNC – and welcome to the twenty first century!

Help raise the Chocolatey experiment to the Chocolatey experience and support the Chocolatey Kickstarter! by Matt Wrock

Mmm…Chocolatey…

As described on the homepage of Chocolatey.org, Chocolatey is like Apt-Get for windows. Now to some, that will immediately paint a clear picture of what Chocolatey is and why it puts the “awesome” into windows just like it does on my winawesomedows system. Others may be asking, “what get?” Or maybe, “that’s ok. I’m not in the market for an apartment right now.” For those who are not familiar with package management systems on other operating systems, Chocolatey makes it easy to find, download and install software. I cant remember the last time I searched for the windows git installer and went through all the screens in the install wizard. I just pop open a command line and type: ‘cinst git’. Curious what you can install via chocolatey? Check out chocolatey.org and you’ll find the over 2300 packages available for download and install.

Beyond this immediate value of installation ease, Chocolatey makes it easy to create packages and provides a platform that anyone can leverage to support private and alternate repositories other than the public community feed on chocolatey.org.

Chocolatey launches a Kickstarter

About a week and a half ago, Rob Reynolds (@ferventcoder) and team launched a kickstarter for chocolatey to help raise funds to not only preserve the value chocolatey currently provides to many many windows users but to help fund some greatly needed enhancements.

Chocolatey costs

While the benefits of chocolatey are free to anyone with access to a windows command line (or powershell console), there is a cost. Hosting these packages entails monthly storage and bandwidth expenses. There is also a huge time investment on the part of Chocolatey contributors and especially Rob Reynolds to address support questions and add features. I’m personally on the chocolatey email groups and I can tell you that I see a constant stream of emails from Rob every day addressing issues, merging PRs, and announcing new features. Gary Park (@gep13) is another individual who immediately comes to mind as an avid supporter. While no one has officially stated this, I can only imagine that asking Rob (husband and father of two) to continue to personally front the recurring costs and invest this amount of time is not sustainable and certainly not scalable.

Professional level offering

In addition to recurring expenses and some first-line support assistance, there are a slew of “professional grade” features the team would like to add to make Chocolatey a more polished experience for those needing to support an enterprise or other business critical infrastructure. These would include, enhanced security, better support for private feeds and other slick features as described in this image from the kickstarter:

New feature: Package moderation

Unrelated to the kickstarter but adding to the evidence that funding can only make things better is a new feature just launched the afternoon prior to this writing – package moderation. One of the most prevelant criticisms of Chocolatey is the fact that it potentially exposes users to malware. This is absolutely true. When you download a chocolatey package, you are downloading software over the open internet and you likely do not know the individual who created the package. The package may state that it installs one thing, but nothing stops it from doing something else or generally doing a sloppy job of installing what it is advertising.

To be clear, I do not believe that this truth implies that chocolatey should not be included in ones tool set. Its not the only package management system with these flaws and there are steps individuals and businesses can take today to protect themselves like pinning to known good package versions or hosting their own chocolatey feed.

Package moderation is one of many other features that stand to enhance the overall security story of Chocolatey. New packages, including package updates must now be approved by a Chocolatey moderator in order to be publicly visible and available to others. Here are some criteria that moderators use to deem a package approved:

  1. Is the package named appropriately? 
  2. Is the title appropriate? 
  3. Does it have all the links? ProjectUrl at the very least. 
  4. Is the description sufficient to explain the software? 
  5. Are the authors pointed to the actual authors and not the package maintainers? 
  6. Does the package look generally safe for consumption? 
  7. Are links in the package to download software using the appropriate location? 
  8. Does the package generally meet the guidelines set forth? 
  9. Does the install and uninstall scripts make sense or are there variables being used that don't work? 
  10. Does the package actually work?

Of course this feature does incur more time on the part of the core chocolatey team and is an example of another area of Chocolatey that this kickstarter aims to support.

Call to action!

So in the spirit of this blog, HurryUpAndWait, I encurage you to hurry up and offer what you feel comfortable contributing and then wait for the success of this kickstrarter! Do you work for a business that uses chocolatey? Perhaps it uses the chocolatey chef cookbook or the chocolatey puppet module. If you have access to those who make financial decisions in these organizations, let them know about how they can help to make chocolatey a more business friendly option that stands to improve its ability to automate.

 

Cloud Automation in a Windows World via InfoQ by Matt Wrock

This week InfoQ published an article I wrote entitled "Cloud Automation in a Windows World." The article could just as well been given the title "Automation in a Windows World" since there is nothing exclusive to cloud in the article. The article is a survey of automation strategies and tools used for windows provisioning and environment definition and management.

When it comes to compute resource automation, windows has not traditionally been known to be a leader in that space. This article discusses this historical gap and illustrates some of the ways in which it is closing.

I discuss ways in which we automate windows at CenturyLink Cloud and I point to a variety of tools that anyone can use today to assist in windows environment automation, be it a cloud or your grandma's PC. Among other technologies I mention Chef, Powershell DSC, Chocolatey, Boxstarter, Vagrant, and the future of windows containers.

If this all sounds interesting, please give it a read.

Configure and test windows infrastructure using Powershell technologies DSC and Pester running from Chef and Test-Kitchen by Matt Wrock

Update: see this post for the latest update on getting up and running with Test-Kitchen on windows.

About a week ago I attended the 2014 Chef Summit. I got to meet a bunch of new and interesting people and also met several who I had interacted with online but had never seen in person. One new person I met was Jay Mundrawala (@jdmundrawala). Jay works for chef and built a Test-Kitchen Busser for Pester (as a personal oss contribution and not as part of his job at Chef). You might ask…a What for What? Well this post is going to attempt to answer that and explain why I think it is important.

Pester

Pester is a unit testing framework for Powershell. It was originally created by Scott Muc (@scottmuc) a few years back. I joined in in 2012 to add support for Mocking and now development has largely been taken over by Dave Wyatt (@MSH_Dave). It is a BDD style approach to writing and running unit tests for powershell. However, as we will see here, you can write more than just unit tests. You can write a suite of tests to ensure your infrastructure is built and runs as intended.

The whole idea of writing tests for powershell is new to a lot of long time scripters. However, as just mentioned, this framework has been around for a few years but is just now starting to gain some popularity among the powershell community and in fact the Powershell team at Microsoft is now beginning to use it themselves.

Many entrenched in the Chef ecosystem have undoubtedly been exposed to rspec and rspec derivative tools for writing tests for their chef recipes and other ruby gems. Pester is very much inspired by rspec and many familiar with rspec who take a first look at Pester may not immediately notice the difference. There are indeed several differences but the primary difference is one is written in and for ruby and the other powershell.

Test-Kitchen

Test kitchen is a tool that is widely used within the Chef community but can also be used by other Configuration management tools like Puppet. Test kitchen is not a test framework per se but it is a sort of meta framework that provides a plugin architecture around configuration management scripts that makes it easy to use one or more of many testing frameworks with your infrastructure management scripts.

There are issues specific to configuration management that make such a tool as Test-kitchen very useful. In addition to simply running tests, Test-Kitchen can manage the creation and destruction of a VM or other computing resource where tests can be run in a repeatable, disposable and rebuildable manner. Again, this is managed by another plugin family of provisioners. Some may use the vagrant driver, others docker, vsphere, EC2, etc. Using Test-kitchen, I can watch as an instance is provisioned, built, tested ad then destroyed without any side effects impacting my local environment.

The plugin that manages different test frameworks is called the busser. This plugin is responsible for “bussing” code from your local machine to a virtual test instance. Jay’s busser, like all the others simply make sure that Pester gets installed on the system where you want your tests to run. Since Pester is a powershell based tool. You are typically going to be running Pester tests on a windows machine and the cool thing here is that you can write them in “pure” powershell. No need to wrap all of your powershell inside of ruby language constructs. Its all 100% powershell here.

Enter DSC – Microsoft’s Desired State Configuration

This is an interesting one because it is both a product (or API) of a specific technology vendor and a long time philosophical approach to infrastructure management. Some also incorrectly interpret it as a competitor trying to unseat  tools like chef or Puppet. There is indeed some overlap between DSC and other configuration management tools but the easiest way to groc how DSC fits into the CM landscape is as an API for writing resources specifically for windows infrastructure. Chef, Puppet and other tools provide a broad range of features to help you oversee and codify your infrastructure. The DSC surface area is really much simpler. DSC as it stands today consists of a constantly growing set of resources that can be leveraged in your configuration management tool of choice.

What do I mean by “resource?” Resource is a ubiquitous term in the popular CM tools used to provide an abstraction or DSL over a concrete piece of infrastructure (user, group, machine, file, firewall rule, etc) The resource descries how you want this infrastructure to look and does so in code that can be reviewed, tested, linted and source controlled.

You can use straight up DSC to execute these resources which offers a bare bones approach, or you can wrap them inside of a Chef recipe that can live alongside of non-DSC resources. Now the DSC resource for your windows roles and features, sql server HA, registry keys sits inside of your larger Chef infrastructure of nodes, environments, attributes, etc.

Chef making it easy to execute DSC resources

An initial reaction to this by many would be users of DSC is, why would I use Chef? Don’t I have to learn Ruby to work with that? Well because Chef is a full featured, mature configuration management solution, you get access to all of the great reporting, and server management features of chef. If you have a mixed windows/linux shop, you can manage everything with chef. Finally, it can be a bit unwieldy using raw DSC on its own. Before you can execute DSC resources, they must be downloaded and installed. Chef makes that super easy. And as we will see with test-kitchen, now you can plug your powershell based tests right into your chef workflow.

A real world example of executing DSC resources with chef and testing with Pester

We are going to follow a typical chef workflow of writing a cookbook to build a server. In our case it will be an IIS powered web server that hosts a Nuget package feed. Nuget is a windows package management specification very similar to ruby Gems. Its also the same specification behind windows Chocolatey packages similar to apt-get/yum/rpm for linux. Our web server will provide a rest based feed similar to rubygems.org that one can use to discover nuget packages.

Welcome to the bleeding edge

Before we get started let me point out that testing cookbooks on windows has not historically been well supported but there is more interest than ever in it today. There is very active development that is driving to make this possible but it is still not available from the latest stable version of Test-Kitchen. During this year’s Chef Summit, this exact topic was discussed. The creator and maintainer of Test-Kitchen, Fletcher Nichols was present as well as several others either interested in windows support  or actively working to provide first class support for windows like Salim Afiune. I was there as well and I think everyone left with a clear understanding that this work needs to come together in a future version of Test-Kitchen in the near future. I blogged on the current state of this tooling just a couple months ago. This may be seen as a continuation of that post with a specific bend towards powershell and DSC.

I will walk you through how to get your environment configured so that you can do this testing today and I will certainly update this post once the tooling is officially released.

Environment setup

I am going to assume that you do not have any of the necessary tools needed to run through the sample cookbook I am about to show. So you can pick and choose what you need to add to your system. I am also assuming you are using the ruby embedded with chefDK. If you have another ruby versioning environment, chances are you know what to do. Note: this environment does not need to be a windows box.

ChefDK

First and foremost you need chef. The easiest way to get chef along with many of the popular tools in its ecosystem like test-kitchen is to install the Chef development kit. There are downloads available for windows, mac and several linux distributions.

Vagrant

This tutorial will use Vagrant to instantiate a machine to run the cookbook and execute the tests. You can download vagrant from VagrantUp and like chef, it has downloads for all of the popular platforms.

A hypervisor

You will need something that your vagrant flavored VM can run in. Many prefer the free and feature complete VirtualBox. If you run on windows and are currently using versions 8/2012 and above, you may use Hyper-V already on your box. Note you cannot run both on the same boot instance.

Git

You will be using git to download some of the tools I am about to mention.

The WinRM Test-Kitchen fork

This will eventually and hopefully soon be merged into the authoritative test-kitchen repo. This fork has been largely developed by Salim Afiune and can be found here. There is still active development here. Currently I have my own fork of this fork working to improve performance of winrm based file transfers. My fork hopes to dramatically improve upload times of cookbooks to the test instance. The cookbook in this tutorial should just take a couple minutes to upload using my fork compared to nearly an hour and we hope to get the perf much more faster than that. Note that WinRM has no equivalent SCP functionality so implementing this is a bit crude. Here is how you can use and install my fork:

git clone -b one_session https://github.com/mwrock/test-kitchen
copy-item test-kitchen\lib `
  C:\opscode\chefdk\embedded\apps\test-kitchen `
  -recurse -force
copy-item test-kitchen\support `
  C:\opscode\chefdk\embedded\apps\test-kitchen `
  -recurse -force
cd test-kitchen
C:\opscode\chefdk\embedded\bin\gem build test-kitchen.gemspec
C:\opscode\chefdk\embedded\bin\gem install test-kitchen-1.3.0.gem

The Winrm based Kitchen-Vagrant plugin fork

Salim has also ported his enhancements to the popular Kitchen-Vagrant Test-Kitchen plugin. Since this tutorial uses vagrant, you will need this fork. Note that if you plan to use Hyper-V or a non VirtualBox hypervisor, please use my fork that includes recent changes to make vagrant and the winrm test kitchen work outside of VirtualBox. Here is how to get and install this:

git clone -b Transport https://github.com/mwrock/kitchen-vagrant
cd kitchen-vagrant
C:\opscode\chefdk\embedded\bin\gem build kitchen-vagrant.gemspec
C:\opscode\chefdk\embedded\bin\gem install kitchen-vagrant-0.16.0.gem

The dsc_nugetserver repository containing a sample cookbook and pester tests

This can simply be cloned from https://github.com/mwrock/dsc_nugetserver.

DSC in a chef recipe

Similar to the WinRM Test-Kitchen work, the DSC recipe work done by the folks at chef is still in fairly early development. There is a dsc_script resource available in the latest chef client release as of this post. There is also a community cookbook that represents a prototype of work that will be evolved into the core chef client. This cookbook contains the dsc_resource resource.

I intentionally wrote the dsc_nugetserver cookbook almost entirely from DSC resources. Lets take a look in the default recipe and observe the two flavors of the dsc resource.

dsc_script

dsc_script  "webroot" do
  code <<-EOH
    File webroot
    {
      DestinationPath="C:\\web"
      Type="Directory"
    }
  EOH
end

This is what is currently supported by the official chef-client and ships with the latest version. They really just wrap the DSC Configuration syntax supported by powershell today. The benefit that you get using it inside of a chef recipe are that you can now use the dsc_script as just another resource in your wider library of cookbooks. Chef also does some leg work for you. You do not need to worry about where the resource is installed and you do not need to compile the resource before use.

dsc_resource

dsc_resource "http firewall rule" do
  resource_name :xfirewall
  property :name, "http"
  property :ensure, "Present"
  property :state, "Enabled"
  property :direction, "Inbound"
  property :access, "Allow"
  property :protocol, "TCP"
  property :localport, "80"
end

This is really similar if not the same as dsc_scipt just with different syntax. Note the use of the property DSL. dsc_resource also does a much better job at finding the correct resource. While I believe that dsc_script only works with the official microsoft preinstalled resources, the community dsc cookbook can locate the newer experimental resources that are being distributed as part of the community resource kit waves.

Using the resource_kit recipe to download and install all of the current resource wave kit modules

I have included a recipe that will download the latest batch of resource wave dsc resources. I basically just copied this from one of chef’s own cookbook examples and replaced the download url with the latest resource wave. Once this recipe runs, literally all dsc resources are available for you to use.

Whatif Bug affecting most resources used within chef

There is a bug in both of the dsc resource flavors that will cause most resources to crash. If the dsc resource either does not support ShouldProcess of if the underlying call to powershell DSC’s Set-TargetResource results in the function throwing an error, these chef resources currently to not provide graceful failure for these scenarios. So as is, the resource will break when called. The chef team knows about this and has a fix that will be released in a future release.

In the meantime, I have forked the community dsc_resource in the dsc cookbook and commented out a single line. I can consume this fork from any cookbook by adding this to the Berksfile:

source "https://supermarket.getchef.com"

metadata

cookbook 'dsc' , git: 'https://github.com/mwrock/dsc'

Converging the recipe

The sample cookbook comes with both a .kitchen.yml file that includes a pointer to an evaluation copy of windows 8.1 for testing. I would have included a 2012 box instead but my 2012 vagrant box is Hyper-V only and I have not had time to add virtual box.

So running:

kitchen converge

Should create a windows box for testing and converge that box to run the sample recipe.

[2014-10-13T02:34:38-07:00] INFO: Getting PowerShell DSC resource 'xfirewall'
[2014-10-13T02:35:26-07:00] INFO: DSC Resource type 'xfirewall' Configuration completed successfully
[2014-10-13T02:35:29-07:00] INFO: Chef Run complete in 534.665725 seconds
[2014-10-13T02:35:29-07:00] INFO: Removing cookbooks/dsc_nugetserver/files/default/NugetServer.zip from the cache; it is no longer needed by chef-client.
[2014-10-13T02:35:29-07:00] INFO: Running report handlers
[2014-10-13T02:35:29-07:00] INFO: Report handlers complete
Finished converging <default-windows-81> (13m6.61s).
-----> Kitchen is finished. (13m11.87s)
C:\dev\dsc_nugetserver [master]>

Note that there is a chance the kitchen converge will fail shortly after creating the box and just before downloading the chef client. My suspicion is that this is because the windows 8.1 box is hard at work installing updates and the initial winrm call times out. I have always had success immediately calling kitchen converge again.

So once this completes, you should be able to open a local browser and point at your test box to see the nuget server informational home page:

Testing the recipe with Pester

Here are the tests we will run with Pester:

describe "default recipe" {

  it "should expose a nuget packages feed" {
    $packages = Invoke-RestMethod -Uri "http://localhost/nuget/Packages"
    $packages.Count | should not be 0
    $packages[0].Title.InnerText | should be 'elmah'
  }

  context "firewall" {

    $rule = Get-NetFirewallRule | ? { $_.InstanceID -eq 'http' }
    $filter = Get-NetFirewallPortFilter | ? { $_.InstanceID -eq 'http' }

    it "should filter port 80" {
      $filter.LocalPort | should be 80
    }
    it "should be enabled" {
      $rule.Enabled | should be $true
    }
    it "should allow traffic" {
      $rule.Action | should be "Allow"
    }
    it "should apply to inbound traffic" {
      $rule.Direction | should be "Inbound"
    }    
  }
}

This is 100% powershell. No ruby to see here.

This is first going to test our nuget server website. If all went as we intended, an http call to the root of localhost should reach our nuget server and it should behave like a nuget feed. So here we expect the Packages feed to return some packages and knowing what the first package should be, we test that its name is what we expect.

Because Test-Kitchen runs tests on the converged node, we need to be sure that the outside world can reach our entry point. So we go ahead and test that we opened the firewall correctly.

kitchen verify

The kitchen Pester busser now installs Pester:

C:\dev\dsc_nugetserver [master]> kitchen verify
-----> Starting Kitchen (v1.3.0)
-----> Setting up <default-windows-81>...
       Successfully installed thor-0.19.0
       Successfully installed busser-0.6.2
       2 gems installed
       Plugin pester installed (version 0.0.6)
-----> Running postinstall for pester plugin
-----> [pester] Installing PsGet
Downloading        PsGet from https://github.com/psget/psget/raw/master/PsGet/PsGet.psm1
PsGet is installed and ready to use
       USAGE:
           PS> import-module PsGet
           PS> install-module PsUrl

       For more details:
           get-help install-module
       Or visit http://psget.net
-----> [pester] Installing Pester

Then it runs our tests:

-----> Running pester test suite
-----> [pester] Running
Executing all tests in 'C:\tmp\busser\suites\pester'Describing        default recipe
[+] should expose a nuget packages feed 4.02s   Context        firewall
[+] should filter port 80 3.18s           
[+] should be enabled 16ms
[+] should allow traffic 12ms
[+] should apply to inbound traffic 13ms
Tests completed in 7.23s
       Passed: 5 Failed: 0
       Finished verifying <default-windows-81> (0m22.55s).
-----> Kitchen is finished. (0m59.74s)
C:\dev\dsc_nugetserver [master]>

Bugs regarding Execution Policy

One issue I ran into both with the dsc_resource resource and the Pester busser was a failure to bypass the ExecutionPolicy of the Powershell.exe process. This means if no one has explicitly set an execution policy on the box which they would not have if this is a newly provisioned machine and unless this is windows server 2012R2 which implements a new default ExecutionPolicy of RemoteSigned instead of Undefined, the converge will fail complaining that the execution of scripts are not allowed to run. Since the test vagrant box used here is windows 8.1, it is susceptible to this bug.

You can work around this by setting the execution policy in the recipe as is done in the sample:

powershell_script "set execution policy" do
  code <<-EOH
    Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Force
    if(Test-Path "$env:SystemRoot\\SysWOW64") {
      Start-Process "$env:SystemRoot\\SysWOW64\\WindowsPowerShell\\v1.0\\powershell.exe" -verb runas -wait -argumentList "-noprofile -WindowStyle hidden -noninteractive -ExecutionPolicy bypass -Command `"Set-ExecutionPolicy RemoteSigned`""
    }
    EOH
end

We set the policy for both the 64 and 32 bit shells since chef-client is a 32 bit process.

I have filed an issue with the dsc cookbook here and submitted a pull request for the busser here.

Testing on windows keeps getting better

We are still not where we need to be but we are making progress. This is a big step I think to adding accesibility to the work coming out of the Microsoft DSC initiatives. Here you have all the tools you need to not only execute DSC resources but also test them. A big thanks to Jay and Salim for their work here with Test Kitchen and the Pester busser!

If you want to learn more about DSC or Chef and particularly DSC in Chef. Pay attention to Steven Murawski's blog. Steven is a Chef Community manager and has done a ton of work with DSC at his previous employer Stack Exchange, the home of StackOverflow.com.