Automatically Resize an Azure VM

In this “how to” post, I will show you how to use Azure Automation to scale up (increase) or scale down (decrease) the size of an Azure virtual machine.
 

The Scenario

In the cloud world, we are supposed to, as I teach, deploy “an army of ants, not a squad of giants”. When we deploy lots of small instances, each one has little individual value, so losing one to a failure matters little. We also gain the granular ability to scale in or out, depending on workload (see virtual machine scale sets). That ability is financially beneficial because machines run, and are therefore paid for, only while there is a load for them to service.
There are times when an application cannot scale out or in, and the only choice is to add more performance by changing the processor, RAM, etc. I can think of three examples of this:

  • Monolithic Application: A line-of-business application is coded to run on one machine only. There are times when this machine must use more resources than normal, so you want to optimize costs when the load is lower. We could use metrics alerts + Automation to trigger a change of virtual machine spec/size depending on resource usage.
  • Pending Disaster Recovery: Some applications, such as Active Directory with 2+ domain controllers in the forest, don’t support the replication of domain controller machines. In this case, you need to run 1+ additional domain controllers in the DR site. We could run the domain controllers as low-end machines and use a Recovery Plan + Automation to scale up the machine in the event of a disaster.
  • Scheduled Demand: A virtual machine is lightly used outside of business hours but is heavily used during peak hours. One could schedule the virtual machine to scale up before the work day starts and scale down at the end of the day. This ensures that the application is (mostly) available all of the time but the cost of the virtual machine is associated with the demand.

The DR scenario is fairly simple; when things are normal, the virtual machine is low spec because there is little-to-no load. When a recovery plan is triggered, the virtual machine spec is increased to a normal operational level.
The monolithic application scenario can be more interesting. It may be that there are stages of spec/size increase or decrease with a minimum and maximum level. This is the scenario that I will demonstrate.
 

 
You should note two things:

  • There will be downtime while the virtual machine is resized. A resize requires a reboot.
  • In my example, I will be changing the size of the virtual machine inside of a single series of Azure virtual machines. If you want to change series, then you might need to add a Stop-AzureRmVM command before changing the virtual machine size; this is because not all sizes are available while a machine is still running. I did not add this command because it would slow down the script.
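
For the cross-series case mentioned above, the stop, resize, and start sequence might look something like the following. This is a minimal sketch, not part of my runbooks: the function name Resize-VMWithDeallocate is hypothetical, and it assumes the runbook has already logged in to Azure. Stop-AzureRmVM deallocates the machine by default, so expect a longer outage than with an in-series resize.

```powershell
# Hypothetical helper (not in the original runbooks): resize a VM by deallocating it first.
function Resize-VMWithDeallocate
{
    param (
        [Parameter(Mandatory = $true)] [string] $ResourceGroupName,
        [Parameter(Mandatory = $true)] [string] $VMName,
        [Parameter(Mandatory = $true)] [string] $NewSize
    )

    # Deallocate the VM so that sizes from other series can be selected.
    # -Force suppresses the confirmation prompt, which a runbook cannot answer.
    Stop-AzureRmVM -ResourceGroupName $ResourceGroupName -Name $VMName -Force | Out-Null

    # Same resize steps as the in-series runbooks
    $VM = Get-AzureRmVM -ResourceGroupName $ResourceGroupName -Name $VMName
    $VM.HardwareProfile.VmSize = $NewSize
    Update-AzureRmVM -VM $VM -ResourceGroupName $ResourceGroupName | Out-Null

    # The VM was explicitly stopped above, so start it again after the update
    Start-AzureRmVM -ResourceGroupName $ResourceGroupName -Name $VMName | Out-Null
}
```

You would call this in place of the ResizeVM function when the target size is in a different series.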

Azure Automation

The key to automating the solution is to use Azure Automation. You will deploy an automation account in the Azure Portal. The automation account will execute two runbooks (PowerShell scripts). One is to increase the virtual machine size and the other is to decrease the virtual machine size.
In my example, I will allow the virtual machine to move between three sizes in the B-Series of Azure virtual machines. There is a minimum size and a maximum size, allowing me to set a minimum level of performance and a maximum spend.
Tip: Make sure you run the Update Azure Modules command under Shared Resources > Modules in the automation account or you will get strange PowerShell errors when you run the runbooks below.

The Scale-Up Runbook

Here is the scale-up runbook. Some things to note:

  • The name and resource group of the virtual machine are stored in variables: $ResourceGroupName and $VMName.
  • A switch command is used to determine the current size of the machine and move it up to the next size.
  • Once at the maximum allowed size, the virtual machine will not be increased again.
$connectionName = "AzureRunAsConnection"
try
{
    # Get the connection "AzureRunAsConnection "
    $servicePrincipalConnection=Get-AutomationConnection -Name $connectionName
    "Logging in to Azure..."
    Add-AzureRmAccount `
        -ServicePrincipal `
        -TenantId $servicePrincipalConnection.TenantId `
        -ApplicationId $servicePrincipalConnection.ApplicationId `
        -CertificateThumbprint $servicePrincipalConnection.CertificateThumbprint
}
catch {
    if (!$servicePrincipalConnection)
    {
        $ErrorMessage = "Connection $connectionName not found."
        throw $ErrorMessage
    } else{
        Write-Error -Message $_.Exception
        throw $_.Exception
    }
}
#Script Runs Here
function ResizeVM ($FuncRGName, $FuncVMName, $FuncNewSize)
{
    Write-Output "Upgrading $FuncVMName to $FuncNewSize ... this will require a reboot"
    $FuncVM = Get-AzureRmVM -ResourceGroupName $FuncRGName -Name $FuncVMName
    $FuncVM.HardwareProfile.VmSize = $FuncNewSize
    Update-AzureRmVM -VM $FuncVM -ResourceGroupName $FuncRGName
}
$ResourceGroupName = "petri2"
$VMName = "vm-petri2-01"
Write-Output "Starting the scale up process"
$VM = Get-AzureRmVM -ResourceGroupName $ResourceGroupName -Name $VMName
$CurrentSize = $VM.HardwareProfile.VmSize
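switch ($CurrentSize)
{
    'Standard_B1ms' { ResizeVM $ResourceGroupName $VMName "Standard_B2ms" }
    'Standard_B2ms' { ResizeVM $ResourceGroupName $VMName "Standard_B4ms" }
    'Standard_B4ms' { Write-Output "The VM is at the max allowed size for this application" }
    # Any other size is outside the allowed range, so report it instead of doing nothing silently
    default { Write-Output "The VM size $CurrentSize is not managed by this runbook; no change was made" }
}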
Write-Output "The resizing runbook is finished"

The Scale-Down Runbook

This runbook is almost identical to the previous one:

  • Some notifications are changed.
  • The switch command uses the reverse logic, moving the machine down to the next smaller size.
  • Once at the minimum spec, the virtual machine won’t be downsized anymore.
$connectionName = "AzureRunAsConnection"
try
{
    # Get the connection "AzureRunAsConnection "
    $servicePrincipalConnection=Get-AutomationConnection -Name $connectionName
    "Logging in to Azure..."
    Add-AzureRmAccount `
        -ServicePrincipal `
        -TenantId $servicePrincipalConnection.TenantId `
        -ApplicationId $servicePrincipalConnection.ApplicationId `
        -CertificateThumbprint $servicePrincipalConnection.CertificateThumbprint
}
catch {
    if (!$servicePrincipalConnection)
    {
        $ErrorMessage = "Connection $connectionName not found."
        throw $ErrorMessage
    } else{
        Write-Error -Message $_.Exception
        throw $_.Exception
    }
}
#Script Runs Here
function ResizeVM ($FuncRGName, $FuncVMName, $FuncNewSize)
{
    Write-Output "Downgrading $FuncVMName to $FuncNewSize ... this will require a reboot"
    $FuncVM = Get-AzureRmVM -ResourceGroupName $FuncRGName -Name $FuncVMName
    $FuncVM.HardwareProfile.VmSize = $FuncNewSize
    Update-AzureRmVM -VM $FuncVM -ResourceGroupName $FuncRGName
}
$ResourceGroupName = "petri2"
$VMName = "vm-petri2-01"
Write-Output "Starting the scale down process"
$VM = Get-AzureRmVM -ResourceGroupName $ResourceGroupName -Name $VMName
$CurrentSize = $VM.HardwareProfile.VmSize
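switch ($CurrentSize)
{
    'Standard_B1ms' { Write-Output "The VM is at the min allowed size for this application" }
    'Standard_B2ms' { ResizeVM $ResourceGroupName $VMName "Standard_B1ms" }
    'Standard_B4ms' { ResizeVM $ResourceGroupName $VMName "Standard_B2ms" }
    # Any other size is outside the allowed range, so report it instead of doing nothing silently
    default { Write-Output "The VM size $CurrentSize is not managed by this runbook; no change was made" }
}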
Write-Output "The resizing runbook is finished"
The Scale Up and Scale Down Runbooks in Azure Automation [Image Credit: Aidan Finn]
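
Because the two runbooks differ only in the direction of the switch logic, the size path could also be kept in one ordered list that both runbooks share. The following is a sketch of that idea rather than what I deployed; the $SizeLadder variable and the Get-AdjacentVMSize function are hypothetical names. Note that, unlike the switch command, the lookup here is case-sensitive.

```powershell
# Ordered from smallest to largest; the first and last entries act as the min and max sizes.
$SizeLadder = @('Standard_B1ms', 'Standard_B2ms', 'Standard_B4ms')

# Hypothetical helper: returns the size one step up or down the ladder,
# or $null when no move is possible.
function Get-AdjacentVMSize
{
    param (
        [Parameter(Mandatory = $true)] [string] $CurrentSize,
        [Parameter(Mandatory = $true)] [ValidateSet('Up', 'Down')] [string] $Direction,
        [string[]] $Ladder = $SizeLadder
    )

    # Find the current position (case-sensitive match); -1 means the size is not on the ladder
    $Index = [array]::IndexOf($Ladder, $CurrentSize)
    if ($Index -lt 0) { return $null }

    if ($Direction -eq 'Up') { $Target = $Index + 1 } else { $Target = $Index - 1 }

    # Already at the minimum or maximum allowed size
    if ($Target -lt 0 -or $Target -ge $Ladder.Count) { return $null }

    return $Ladder[$Target]
}

Get-AdjacentVMSize -CurrentSize 'Standard_B2ms' -Direction Up     # Standard_B4ms
Get-AdjacentVMSize -CurrentSize 'Standard_B2ms' -Direction Down   # Standard_B1ms
```

A runbook would pass the returned size to ResizeVM, and write a "min/max allowed size" message when the result is $null.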

Triggering the Runbooks

One could always start the runbooks manually, trigger them from a recovery plan, or even schedule them. In this case, I will automatically trigger my runbooks based on a virtual machine performance alert.
You will require at least two alerts:

  • One for when CPU usage is too high, which will scale up the machine.
  • Another for when CPU usage drops, which will scale down the machine.

You could also add alerts for memory usage and more.
In my first example, I will alert for when CPU usage has been over 80 percent for more than 5 minutes; this prevents brief spikes from causing a needless size increase. An email will be sent to subscription owners, etc., to let them know that the process is being triggered.

Alert for When a Virtual Machine’s CPU Usage Is High [Image Credit: Aidan Finn]
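
To see why the 5-minute window filters out brief spikes, here is a small illustration. It is only a sketch: Azure evaluates the alert rule itself, the Test-SustainedHighCpu function is a hypothetical name, and I am assuming the rule averages the CPU readings across the window.

```powershell
# Hypothetical helper for illustration only; Azure evaluates alert rules itself.
function Test-SustainedHighCpu
{
    param (
        # For example, one CPU percentage reading per minute across the alert window
        [Parameter(Mandatory = $true)] [double[]] $CpuSamples,
        [double] $Threshold = 80
    )

    # Average the readings across the window and compare with the threshold
    $Average = ($CpuSamples | Measure-Object -Average).Average
    return ($Average -gt $Threshold)
}

Test-SustainedHighCpu -CpuSamples 20, 25, 100, 30, 22   # one-minute spike: False (average 39.4)
Test-SustainedHighCpu -CpuSamples 85, 90, 88, 92, 86    # sustained load: True (average 88.2)
```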
 
I will add an action to start a user-defined runbook from my automation account. In this case, I will start the scale-up runbook.
 

 
Note the option to add parameters. Instead of using variables for the virtual machine name and resource group, I could pass those values into a generic script!
Using an Azure Virtual Machine Metric Alert to Trigger a Runbook [Image Credit: Aidan Finn]
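
A generic version that takes the virtual machine name and resource group as parameters might look like this. Treat it as a sketch under a few assumptions: the Invoke-ScaleUp name is hypothetical, the login block from the runbooks above is assumed to have run already, and I have wrapped the logic in a function so that it can be tried locally. In a real runbook, the param block would sit at the top of the script instead.

```powershell
# Hypothetical generic scale-up step; assumes the runbook has already logged in to Azure.
function Invoke-ScaleUp
{
    param (
        [Parameter(Mandatory = $true)] [string] $ResourceGroupName,
        [Parameter(Mandatory = $true)] [string] $VMName
    )

    $VM = Get-AzureRmVM -ResourceGroupName $ResourceGroupName -Name $VMName

    # Same B-Series path as the scale-up runbook; $null means "do not resize"
    switch ($VM.HardwareProfile.VmSize)
    {
        'Standard_B1ms' { $NewSize = 'Standard_B2ms' }
        'Standard_B2ms' { $NewSize = 'Standard_B4ms' }
        default         { $NewSize = $null }
    }

    if (-not $NewSize)
    {
        Write-Output "$VMName is at the max allowed size or an unmanaged size; no change was made"
        return
    }

    Write-Output "Upgrading $VMName to $NewSize ... this will require a reboot"
    $VM.HardwareProfile.VmSize = $NewSize
    Update-AzureRmVM -VM $VM -ResourceGroupName $ResourceGroupName | Out-Null
}
```

With this approach, one runbook can serve many virtual machines; each alert supplies its own values for ResourceGroupName and VMName.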
 
I’ll then create a similar alert for under 80 percent utilization that will trigger the scale-down runbook. And that’s it! Azure will now automatically increase and decrease the size of my virtual machine based on CPU utilization or any other metric that I specify.
A Drop in CPU Utilization Automatically Resizes an Azure Virtual Machine [Image Credit: Aidan Finn]