Asynchrony in Powershell

As part of our Octopus Deploy migration effort we are writing a powershell module that we use to automatically bootstrap the Tentacle installation into Octopus. This involves maintaining metadata about machines and environments outside of Octopus. The reason we need this capability is to adhere to the “cattle vs. pets” approach to hardware. We want to be able to destroy and recreate our machines at will and have them show up again in Octopus ready to receive deployments.

Our initial implementation cycled through one machine at a time, installing Tentacle, registering it with Octopus (with the same security certificate so that Octopus recognizes it as the same machine), then moving on to the next machine. This is fine for small environments with few machines, but not awesome for larger environments with many machines. If it takes 2m to install Tentacle and I have 30 machines, I’m waiting an hour to be able to use the environment. With this problem in mind I decided to figure out how we could parallelize the boostrapping of machines in our Powershell module.

Start-Job

Start-Job is one of a family of Powershell functions created to support asynchrony. Other related functions are Get-Job, Wait-Job, Receive-Job, and Remove-Job. In it’s most basic form, Start-Job accepts a script block as a parameter and executes it on a background thread.

# executes "dir" on a background thread.
$job = Start-Job -ScriptBlock { dir } 

The job object returned by Start-Job gives you useful information such as the job id, name, and current state. You can run Get-Job to get a list of running jobs, Wait-Job to wait on one or more jobs to complete, Receive-Job to get the output of each job, and Remove-Job to delist jobs in the current Powershell session.

Complexity

If that’s all there was to it, I wouldn’t be writing this blog post. I’d just tweet the link to the Start-Jobs msdn page and call it done. My scenario is that I need to bootstrap machines using code defined in my Powershell module, but run those commands in a background process. I also need to collate and log the output of those processes as well as report on the succes/failure of each job.

When you call Start-Job in Powershell it creates a new session in which currently loaded modules are not automatically loaded. If you have your powershell module in the $PsModulePath you’re probably okay. However, there is a difference between the version of the module I’m currently working on and testing vs. the one I have on my machine for normal use.

Start-Job has an additional parameter for a script block used to initialize the new Powershell session prior to executing your background process. The difficulty is that while you can pass arguments to the background process script block, you cannot pass arguments to the initialization script. Here’s how you make it all work.

Setup Code

<br /># Store the working module path in an environment variable so that the new powershell session can locate the correct version of the module.
# The environment variable will not persist beyond the current powershell session so we don't have to worry about poluting our machine state.
$env:OctobootModulePath = (get-module Octoboot).Path
$init =  {
    # When initializing the new session, use the -Force parameter in case a different version of the module is already loaded by a profile.
    import-module $env:OctobootModulePath -Force
}

# create a parameterized script block
$scriptBlock = {
    Param(
          $computerName,
          $environment,
          $roles,
          $userName,
          $password,
          $apiKey,
    )

    Install-Tentacle -computer $computerName `
        -environment $environment `
        -roles $roles `
        -userName $userName -password $password `
        -apiKey $apiKey
}

# I like to use an -Async switch on the controlling function. Debugging issues is easier in a synchronous context than in an async context. Making the async functionality optional is a win.
if ($async) {
    $job = Start-Job `
        -ScriptBlock $scriptBlock `
        -InitializationScript $init `
        -Name "Install Tentacle on $($computerName)" `
        -ArgumentList @(
             $computerName, $environment, $roles
             $userName, $password,
             $apiKey) -Debug:$debug
} else {
    Install-Tentacle -computer $computerName `
          -environment $environment `
          -roles $roles `
          -userName $userName -password $password `
          -apiKey $apiKey      
}

The above code is in a loop in the controlling powershell function. After I’ve kicked off all of the jobs I’m going to execute, I just need to wait on them to finish and collect their results.

Finalization Code

<br />if ($async) {
    $jobs = get-job
    $jobs | Wait-Job | Receive-Job
    $jobs | foreach {
        $job = $_
        write-host "$($job.Id) - $($job.Name) - $($job.State)"
    }
    $jobs | remove-job
}

Since each individual job is now running in parallel, bootstrapping large environments doesn’t take much longer than bootstrapping smaller ones. The end result is that hour is now reduced to a few minutes.

Leave a Reply

%d bloggers like this: