Deploying Virtual Machine Scale Sets
In this guide, we explore how to deploy Virtual Machine Scale Sets (VMSS) in Azure and understand how scaling works in the cloud. You'll learn the differences between vertical and horizontal scaling and see how VMSS leverages horizontal scaling to manage a group of load-balanced VMs.
Scaling in Azure
Azure supports two primary scaling strategies:
• Vertical scaling involves adjusting the capacity of an individual resource. For example, you might resize a virtual machine (VM) to a larger size (scale up) or to a smaller one (scale down). Vertical scaling is capped by the largest instance size available (as a hypothetical example, 96 vCPUs and 96 GB of RAM), and resizing may require downtime.
• Horizontal scaling means adding or removing instances rather than modifying individual resource capacity. Adding instances is known as scale out, while removing instances is called scale in.
VM Scale Sets in Azure use horizontal scaling to distribute the load across a group of VMs automatically.
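To make the contrast concrete, here is a minimal Azure CLI sketch of the two strategies. The resource group, VM, and scale set names are hypothetical placeholders, and the functions are only defined here, not run, so nothing changes until you call one after signing in with `az login`.

```shell
# Hypothetical resource names; replace them with your own.
RG="my-resource-group"
VM="my-vm"
VMSS="my-scale-set"

# Vertical scaling: change the size of one existing VM (scale up or down).
# $1 is the target VM size. Resizing may restart the VM.
scale_up_vm() {
  az vm resize --resource-group "$RG" --name "$VM" --size "$1"
}

# Horizontal scaling: change how many instances the scale set runs
# (scale out or in). $1 is the desired instance count.
scale_out_vmss() {
  az vmss scale --resource-group "$RG" --name "$VMSS" --new-capacity "$1"
}
```

For example, `scale_up_vm Standard_D4s_v5` would resize the single VM (the size is just an illustrative choice), while `scale_out_vmss 3` would set the scale set to three instances.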
Overview of Virtual Machine Scale Sets
Azure Virtual Machine Scale Sets (VMSS) let you create and manage a group of load-balanced VMs with identical configurations or, with recent updates, a mix of VM configurations and OS images. VMSS supports scaling on a schedule, in response to performance metrics, or manually on demand. Distributing instances across availability zones also helps keep the workload available if one zone fails.
Key details about VMSS:
- Up to 1,000 instances can be deployed when using a marketplace (platform) image or a custom image from an Azure Compute Gallery.
- The limit is 600 instances when deploying from a custom managed image.
Deploying a Virtual Machine Scale Set via the Azure Portal
Deploying a VMSS resembles deploying a single VM, with additional scaling-specific options. Follow these steps:
In the Azure portal, search for "VMSS" or "Virtual Machine Scale Set" and select it.
Select the option to create a new Virtual Machine Scale Set. For instance, you might choose a resource group named "RGHA" and name your scale set "VMSS-HA01" (resource names cannot contain spaces). When configuring, select more than one availability zone so that instances are spread across zones.
Choose an orchestration mode based on your needs:
- Select Flexible if you need to run varied VM types or mix configurations.
- Select Uniform if you prefer all instances to be identical.
In this example, we choose Uniform and select a Standard VM image.
Set your admin credentials and configure network options. You can also configure settings related to scaling. For load balancing, you may opt for the Azure Load Balancer or Application Gateway. In our demonstration, we choose not to use either option.
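The portal choices so far can also be expressed with the Azure CLI. The following is a sketch under stated assumptions rather than the exact deployment from the walkthrough: the region (`eastus`), the `Ubuntu2204` image alias, the `azureuser` admin name, the initial count of two, and SSH key authentication are all illustrative choices, and the scale set name is written with a hyphen because resource names cannot contain spaces. The function is only defined here; call it after `az login`.

```shell
RG="RGHA"
VMSS="VMSS-HA01"
LOCATION="eastus"   # assumed region; pick one that offers availability zones

create_scale_set() {
  # The resource group must exist before the scale set is created.
  az group create --name "$RG" --location "$LOCATION"

  # Uniform orchestration, instances spread over zones 1-3, and no load
  # balancer (an empty --load-balancer value asks the CLI not to create one).
  # --generate-ssh-keys is an assumption; the walkthrough uses a password.
  az vmss create \
    --resource-group "$RG" \
    --name "$VMSS" \
    --orchestration-mode Uniform \
    --image Ubuntu2204 \
    --instance-count 2 \
    --zones 1 2 3 \
    --admin-username azureuser \
    --generate-ssh-keys \
    --load-balancer ""
}
```

Running `create_scale_set` creates the resource group first, then the scale set. If you do want a load balancer in front of the instances, dropping the `--load-balancer ""` line should let the CLI apply its default behavior.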
Configure the scaling settings:
- Set the initial number of instances.
- Define the scaling policy by specifying the minimum and maximum instance counts.
- Configure thresholds; for example, add one instance (scale out) if average CPU utilization exceeds 75% for 10 minutes, and remove one instance (scale in) if it drops below 25%. A cooldown period keeps another scaling action from firing immediately after the previous one.
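The same scaling policy can be sketched with `az monitor autoscale`, assuming a scale set named `VMSS-HA01` already exists in resource group `RGHA`. The autoscale setting name `cpu-autoscale`, the bounds of 1 to 10 instances, the default count of 2, and the 5-minute cooldown are illustrative values, not settings confirmed by the walkthrough.

```shell
RG="RGHA"
VMSS="VMSS-HA01"
PROFILE="cpu-autoscale"   # hypothetical name for the autoscale setting

configure_autoscale() {
  # Autoscale setting: minimum, maximum, and default instance counts.
  az monitor autoscale create \
    --resource-group "$RG" \
    --resource "$VMSS" \
    --resource-type Microsoft.Compute/virtualMachineScaleSets \
    --name "$PROFILE" \
    --min-count 1 --max-count 10 --count 2

  # Scale out by one instance when average CPU exceeds 75% over 10 minutes.
  az monitor autoscale rule create \
    --resource-group "$RG" \
    --autoscale-name "$PROFILE" \
    --condition "Percentage CPU > 75 avg 10m" \
    --scale out 1

  # Scale in by one instance when average CPU falls below 25% over
  # 10 minutes, then wait 5 minutes before the next scaling action.
  az monitor autoscale rule create \
    --resource-group "$RG" \
    --autoscale-name "$PROFILE" \
    --condition "Percentage CPU < 25 avg 10m" \
    --scale in 1 \
    --cooldown 5
}
```

Pairing every scale-out rule with a scale-in rule matters: without the second rule, the scale set would grow under load and never shrink again.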
On the Management tab, review the default settings for health monitoring and additional options. In the Advanced tab, you can enable the option to scale beyond 100 instances and enforce an even distribution of instances across availability zones.
Finally, click Review and Create to deploy your scale set.
Verifying the Deployment and Testing Autoscaling
After deployment, navigate to the VMSS resource page and verify the running instances. Depending on your configuration, you might initially see one or two instances. The activity log also records scaling operations, such as scaling in from two instances to one.
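If you prefer the command line to the portal, the same checks might look like the sketch below. The resource names are assumptions, and the `--query` projection simply picks a few commonly present fields from each activity log event; adjust it if your output differs.

```shell
RG="RGHA"
VMSS="VMSS-HA01"

# List the instances currently in the scale set.
list_instances() {
  az vmss list-instances --resource-group "$RG" --name "$VMSS" --output table
}

# Show recent activity log events for the resource group.
# $1 is how far back to look, such as 1h or 2d (defaults to 1h).
recent_activity() {
  az monitor activity-log list \
    --resource-group "$RG" \
    --offset "${1:-1h}" \
    --query "[].{time:eventTimestamp, operation:operationName.localizedValue, status:status.value}" \
    --output table
}
```

Calling `recent_activity 2d` widens the window to the last two days, which helps when a scale-in event happened overnight.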
Connecting to an Instance for Stress Testing
Since scale set instances do not receive public IP addresses by default, a jumpbox—a VM within the same virtual network that has a public IP—is required to connect to them.
Identify a jumpbox VM from the list of connected devices within your virtual network.
Connect to the jumpbox VM via SSH using its public IP address. For example, in your terminal:
ssh <admin-username>@<jumpbox-public-ip>
After entering the password and being authenticated, use this jumpbox to connect to your scale set instance. Identify the internal IP of the scale set instance (e.g., 10.0.0.6) and connect via SSH or Bastion.
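As an alternative to logging in twice, OpenSSH can hop through the jumpbox in a single command with its `-J` (ProxyJump) option. This sketch assumes the same admin username exists on both machines; `azureuser` and the jumpbox address are placeholders, and `10.0.0.6` is the example internal IP from above.

```shell
JUMP_USER="azureuser"              # assumed admin username on both machines
JUMP_HOST="<jumpbox-public-ip>"    # placeholder: the jumpbox's public IP
INSTANCE_IP="10.0.0.6"             # example internal IP of a scale set instance

# Connect to a scale set instance through the jumpbox in one step.
# $1 optionally overrides the instance's internal IP.
connect_via_jumpbox() {
  ssh -J "${JUMP_USER}@${JUMP_HOST}" "${JUMP_USER}@${1:-$INSTANCE_IP}"
}
```

`connect_via_jumpbox` connects to the default instance, and `connect_via_jumpbox 10.0.0.7` would target another one. The internal address never has to be exposed publicly, because only the jumpbox needs a public IP.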
Installing and Running Stress on the Instance
Once connected to your scale set instance, update the package list and install the stress testing tool:
sudo apt update
sudo apt install stress -y
Below is an example of the installation process:
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
stress
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 18.4 kB of archives.
After this operation, 55.3 kB of additional disk space will be used.
Fetched 18.4 kB in 0s (533 kB/s)
Selecting previously unselected package stress.
(Reading database ... 58930 files and directories currently installed.)
Preparing to unpack .../stress_1.0.4-6_amd64.deb ...
Unpacking stress (1.0.4-6) ...
Setting up stress (1.0.4-6) ...
Processing triggers for install-info (6.7.0.dfsg.2-5) ...
Processing triggers for man-db (2.9.1-1) ...
Before running tests, refer to the manual page of the stress tool:
man stress
A common usage example is:
stress -c 2 -t 600 -v
In this command:
• -c 2 spawns two CPU stress workers.
• -t 600 sets the test duration to 600 seconds.
• -v enables verbose output.
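One thing to watch: two workers only saturate two cores. On an instance with four vCPUs, for example, two workers would hold average CPU near 50%, which would not cross a 75% scale-out threshold. A small wrapper like this one, which is an illustrative helper and not part of the stress tool, defaults the worker count to the number of cores reported by `nproc`.

```shell
# Run stress with one CPU worker per core unless told otherwise.
# $1 = number of workers (default: core count from nproc)
# $2 = duration in seconds (default: 600)
run_cpu_stress() {
  workers="${1:-$(nproc)}"
  stress --cpu "$workers" --timeout "${2:-600}" --verbose
}
```

`run_cpu_stress` loads every core for ten minutes, and `run_cpu_stress 2 300` reproduces a two-worker run for five minutes. The long options `--cpu`, `--timeout`, and `--verbose` are equivalent to `-c`, `-t`, and `-v`.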
To monitor the impact on CPU usage, open another terminal session (via the jumpbox) and run:
htop
You should observe CPU usage rising significantly, nearing 100% due to the stress test.
As the CPU usage climbs, the autoscale policy initiates the creation of additional instances. You can monitor these scaling activities in the activity log.
If the load remains high, additional instances will be added until the maximum limit defined during configuration (e.g., 10 instances) is reached.
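To watch the scale-out from a terminal instead of refreshing the portal, a simple polling loop can help. This is a sketch with assumed resource names; it checks the instance count once a minute and stops when the count reaches a target or the attempts run out.

```shell
RG="RGHA"
VMSS="VMSS-HA01"

# Print the current number of instances in the scale set.
instance_count() {
  az vmss list-instances --resource-group "$RG" --name "$VMSS" \
    --query "length(@)" --output tsv
}

# Poll once a minute until the scale set has at least $1 instances.
# $2 = maximum number of checks (default: 30). Returns 1 on timeout.
wait_for_instances() {
  target="$1"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    count="$(instance_count)"
    echo "$(date +%T) instances: ${count:-unknown}"
    # Treat an empty result (for example, a failed query) as zero.
    [ "${count:-0}" -ge "$target" ] && return 0
    i=$((i + 1))
    sleep 60
  done
  return 1
}
```

For instance, `wait_for_instances 3` returns once a third instance has been added. Autoscale evaluates metrics over a window, so expect several minutes to pass before the count changes.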
Conclusion
In this article, you learned how to deploy and configure Virtual Machine Scale Sets in Azure. By combining VMSS with autoscale rules (and, optionally, a load balancer in front of the instances), your application can scale automatically based on CPU utilization thresholds so that capacity follows demand.
Note
Autoscaling is a powerful feature of Azure that helps maintain performance and availability under varying load conditions. Always monitor the resource usage and adjust your scaling rules as needed.
This concludes our deep dive into administering virtual machine scale sets. For more insights on Azure solutions, please refer to the Azure Documentation.