A look under the hood at Powershell Remoting through a cross plaform lens / by Matt Wrock

Many Powershell enthusiasts don't realize that when they are using commands like New-PsSession and streaming pipelines to a powershell runspace on a remote machine, they are actually writing a binary message wrapped in a SOAP envelope that leverages a protocol with the namesake of Windows Vista. Not much over a year ago I certainly wasn't. This set of knowledge all began with needing to transfer files from a linux machine to a windows machine. In a pure linux world there is a well known tool for this called SCP. In Windows we map drives or stream bytes to a remote powershell session. How do we get a file (or command for that matter) from one of these platforms to the other?

I was about to take a plunge to go deeper than I really wanted into a pool where I did not really care to swim. And today I emerge with a cross platform "partial" implementation of Powershell Remoting in Ruby. No not just WinRM but a working PSRP client.

In this post I will cover how PSRP differs from its more familiar cross platform cousin WinRM, why its of value and how one can give it a try. Hopefully this will provide an interesting perspective into what Powershell Remoting looks like from an implementor's point of view.

In the beginning there was WinRM

While PSRP is a different protocol from WinRM (Windows Remote Management) with its own spec. It cannot exist or be explained without WinRM. WinRM is a SOAP based web service defined by a protocol called Web Services Management Protocol Extensions for Windows Vista (WSMV). I love that name. This protocol defines several different message types for performing different tasks and gathering different kinds of information on a remote instance. I'm going to focus here on the messages involved with invoking commands and collecting their output.

A typical WinRM based conversation for invoking commands goes something like this:

  1. Send a Create Shell message and get the shell id from the response
  2. Create a command in the shell sending the command and any arguments and grab the command id from the response
  3. Send a request for output on the command id which may return streams (stdout and/or stderr) containing base64 encoded text.
  4. Keep requesting output until the command state is done and examine the exit code.
  5. Send a command termination signal
  6. Send a delete shell message

The native windows tool (which nobody uses anymore) to speak pure winrm is winrs.exe.

C:\dev\winrm [winrm-v2]> winrs -r:http://192.168.137.10:5985 -u:vagrant -p:vagrant ipconfig

Windows IP Configuration

Ethernet adapter Ethernet:

   Connection-specific DNS Suffix  . : mshome.net
   Link-local IPv6 Address . . . . . : fe80::c11b:f734:5bd4:ab03%3
   IPv4 Address. . . . . . . . . . . : 192.168.137.10
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Default Gateway . . . . . . . . . : 192.168.137.1

You can turn on analytical event log messages or watch a wireshark transcript of the communication. One thing is for sure, you will see a lot of XML and alot of namespace definitions. Its not fun to debug but you'll learn to appreciate it after examining PSRP transcripts.

Here's an example create command message:

<s:Envelope
 xmlns:s="http://www.w3.org/2003/05/soap-envelope"
 xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing"
 xmlns:wsman="http://schemas.dmtf.org/wbem/wsman/1/wsman.xsd">
 <s:Header>
 <wsa:To>
 http://localhost:80/wsman
 </wsa:To>
 <wsman:ResourceURI s:mustUnderstand="true">
 http://schemas.microsoft.com/wbem/wsman/1/windows/shell/cmd
 </wsman:ResourceURI>
 <wsa:ReplyTo>
 <wsa:Address s:mustUnderstand="true">
 http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous
 </wsa:Address>
 </wsa:ReplyTo>
 <wsa:Action s:mustUnderstand="true">
 http://schemas.microsoft.com/wbem/wsman/1/windows/shell/Command
 </wsa:Action>
 <wsman:MaxEnvelopeSize s:mustUnderstand="true">153600</wsman:MaxEnvelopeSize>
 <wsa:MessageID>
 uuid:F8671978-E928-49DA-ADB8-5BF97EDD9535</wsa:MessageID>
 <wsman:Locale xml:lang="en-US" s:mustUnderstand="false" />
 <wsman:SelectorSet>
 <wsman:Selector Name="ShellId">
 uuid:0A442A7F-4627-43AE-8751-900B509F0A1F
 </wsman:Selector>
 </wsman:SelectorSet>
 <wsman:OptionSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <wsman:Option Name="WINRS_CONSOLEMODE_STDIN">TRUE</wsman:Option>
 <wsman:Option Name="WINRS_SKIP_CMD_SHELL">FALSE</wsman:Option>
 </wsman:OptionSet>
 <wsman:OperationTimeout>PT60.000S</wsman:OperationTimeout>
 </s:Header>
 <s:Body>
 <rsp:CommandLine
 xmlns:rsp="http://schemas.microsoft.com/wbem/wsman/1/windows/shell">
 <rsp:Command>del</rsp:Command>
 <rsp:Arguments>/p</rsp:Arguments>
 <rsp:Arguments>
 d:\temp\out.txt
 </rsp:Arguments>
 </rsp:CommandLine>
 </s:Body>
</s:Envelope>

Oh yeah. Thats good stuff. This runs del /p d:\temp\out.txt.

Powershell over WinRM

When you invoke a command over WinRM, you are running inside of a cmd.exe style shell. Just as you would inside a local cmd.exe, you can always run powershell.exe and pass it commands. Why would anyone ever do this? Usually its because they are using a cross platform WinRM library and its just the only way to do it.

There are popular libraries written for ruby, python, java, Go and others. Some of these abstract the extra powershell.exe call and make it feel like a true native powershell repl experience. The fact is that this works quite well and so why bothering implementing a separate protocol? As I'll cover in a bit, PSRP is much more complicated than vanila WSMV so if you can get away with the simpler protocol, great.

The limitations of WinRM

There are a few key limitations with WinRM. Many of these limitations are the same limitations involved with cmd.exe:

Multiple shells

You have to open two shells (processes). First the command shell and then startup a powershell instance. This can be a performance suck especially if you need to run several commands.

Maximum command length

The command line length is limited to 8k inside cmd.exe. Now you may ask, why in the world would you want to issue a command greater than 8192 characters? There are a couple common use cases here:

  1. You may have a long script (not just a single command) you want to run. However, this script is typically fed to the -command or -EncodedCommand argument of powershell.exe so this entire script needs to stay within the 8k threshold. Why not just run the script as a file? Ha!...Glad you asked.
  2. WinRM has no native means of copying files like SCP. So the common method of copying files via WinRM is to base64 encode a file's contents and create a command that appends 8k chunks to a file.

#2 is what sparked my interest in all of this. I just wanted to copy a damn file, So Shawn Neal, Fletcher Nichol and I wrote a ruby gem that leveraged WinRM to do just that. It basically does this alot:

"a whole bunch of base64 text" >> c:\some\file.txt

It turns out that 8k is not a whole lot of data and if you want to copy hundreds of megabytes or more, grab a book. We added some algorithms to make this as fast as possible like compressing multiple files before transferring and extracting them on the other end. However, you just cant get around the 8k transfer size and no performance trick is gonna make that fast.

More Powershell streams than command streams

Powershell supports much more than just stdout and stderr. Its got progress, verbose, etc, etc. The WSMV protocol has no rules for transmitting these other streams. So this means all streams other than the output stream is sent on stderr.

This can confuse some WinRM libraries and cause commands that are indeed successful to "appear" to fail. The trick is to "silence" these streams. For example the ruby WinRM gem prepends all powershell scripts with:

$ProgressPreference = "SilentlyContinue"

Talking with Windows Nano

The ruby WinRM gem uses the -EncodedCommand to send powershell command text to powershell.exe. This is a convenient way of avoiding quote hell and base64ing text that will be transferred inside XML markup. Well Nano's powershell.exe has no EncodedCommand argument and so the current ruby WinRM v1 gem cannot talk powershell with Windows Nano Server. Well that simply can't be. We have to be able to talk to Nano.

Introducing PSRP

So without further ado let me introduce PSRP. PSRP supports many message types for extracting all sorts of metadata about runspaces and commands. A full implementation of PSRP could create a rich REPL experience on non windows platforms. However in this post I'm gonna limit the discussion to messages involved in running commands and receiving their output.

As I mentioned before, PSRP cannot exist without WinRM. I did not just mean that in a philosophical sense, it literally sits on top of the WSMV protocol. Its sort of a protocol inside a protocol. Running commands and receiving their response includes the same exchange illustrated above and issuing the same WSMV messages. The key differences is that instead of issuing commands in these messages in plain text and recieving simple base64 encoded raw text output, the powershell commands are packaged as a binary PSRP message (or sequence of message fragments) and the response includes one or more binary fragments that are then "defragmented" into a single binary message.

PSRP Message Fragment

A complete WSMV SOAP envelope can only be so big. This size limitation is specified on the server via the MaxEnvelopeSizeKB setting. This defaults to 512 on 2012R2 and Nano server. So a very large pipeline script or a very large pipeline output must be split into fragments.

The PSRP spec illustrates a fragment as:

All fragments have an object id representing the message being fragmented and each fragment of that object will have incrementing fragment ids starting at 0. E and S are each a single bit flag that indicates if the fragment is an End fragment and if it is a Start fragment. So if an entire message fits into one fragment, both E and S will be 1.

The blob (the interesting stuff) is the actual PSRP message and of course the blob length of the blob in bytes. So the idea here is that you chain the blobs ob all fragments with the same object id in the order of fragment id and that aggregated blob is the PSRP message.

Here is an implementation of a message fragment written in ruby and here is how we unwrap several fragments into a message.

PSRP messages

There are 41 different types of PSRP messages. Here is the basic structure as illustrated in the PSRP spec:

Destination signifies who the message is for: client or server. Message type is a integer representing which of the 41 possible message types this is and RPID and PID both represent runspace_id and pipeline_id respectively. The data has the "meat" of the message and its structure is determined by the message type. The data is XML. Many powershellers are familiar with CLIXML. Thats the basic format of the message data. So in the case of a create_pipeline message, this will include the CLIXML representation of the powershell cmdlets and arguments to run. It can be quite verbose but always beautiful. The symmetric nature of XML really shines here.

Here is an implementation of PSRP message in ruby.

A "partial" implementation in Ruby

So as far as I am aware, the WinRM ruby gem has the first open source implementation of PSRP. Its not officially released yet, but the source is fully available and works (at least integration tests are passing). Why am I labeling it a "partial" implementation?

As I mentioned earlier, PSRP provides many message structures for listing runspaces, commands and gathering lots of metadata. The interests of the WinRM gem are simple and aims to adhere to the same basic interface it uses to issue WSMV messges (however we have rewritten the classes and methods for v2). Essentially we want to provide an SSH like experience where a user issues a command string and gets back string based standard output and error streams as well as an exit code. This really is a "dumbed down" rendering of what PSRP is capable of providing.

The possibilities are very exciting and perhaps we will add more true powershell REPL features in the future but today when one issues a powershell script, we are basically constructing CLIXML that emits the following command:

Invoke-Expression -Command "your command here" | Out-String -Stream

This means we do not have to write a CLIXML serializer/deserializer but we reap most of the benefits of running commands directly in powershell. No more multi shell lag, no more comand length limitations and hello Nano Server. In fact, our repo provides a Vagrantfile that provisions Windows Nano server for running integration tests.

Give it a try!

I have complete confidence that there are major flaws in the implementation as it is now. I've been testing all along the way but I'm just about to start really putting it through the ringer. I can guarantee that Write-Host "Hello World!" works flawlessly. The Hello juxtaposed starkly against the double pointed 'W' and ending in the minimalistic line on top of a dot (!) is pretty amazing. The readme in the winrm-v2 branch has been updated to document the code as it stands now and assuming you have git, ruby and bundler installed, here is a quick rundown of how to run some powershell using the new PSRP implementation:

git clone https://github.com/WinRb/WinRM
git fetch
git checkout winrm-v2
bundle install
bundle exec irb

require 'winrm'
opts = { 
  endpoint: "http://myhost:5985/wsman",
  user: 'administrator',
  password: 'Pass@word1'
}
conn = WinRM::Connection.new(opts)
conn.shell(:powershell) do |shell|
  shell.run('$PSVersionTable') do |stdout, stderr|
    STDOUT.print stdout
    STDERR.print stderr
  end
end

The interfaces are not entirely finalized so things may still change. The next steps are to refactor the winrm-fs and winrm-elevated gems to use this new winrm gem and also make sure that it works with vagrant and test-kitchen. I cant wait to start collecting benchmark data comparing file copy speeds using this new version and the one in use today!