Inspect infrastructure performance indicators, including CPU, memory, disk, and network

In this guide, you’ll discover how to monitor four critical performance metrics in Azure:

Metric	Why It Matters	Azure Tools
CPU	Measures compute load and identifies processing bottlenecks	Azure Monitor, Azure Metrics
Memory	Tracks RAM consumption to prevent slowdowns and crashes	Azure Monitor, Azure Metrics
Disk	Monitors IOPS, latency, and throughput for data operations	Azure Storage Metrics, Azure Monitor
Network	Analyzes bandwidth, latency, and packet loss for connectivity	Azure Network Watcher, Network Performance Monitor

Understanding these indicators is essential for maintaining optimal performance, minimizing downtime, and preparing for the AZ-400 exam. Tracking them also helps you keep the user experience smooth and keep costs under control.

The image is an infographic titled "Infrastructure Performance Indicators," highlighting four key performance indicators (KPIs): CPU usage, memory utilization, disk performance, and network activity.

CPU Performance

CPU performance reflects the percentage of processing capacity your workloads consume. Sustained high CPU can lead to slow response times and application failures.

The image illustrates CPU performance, highlighting that it is typically expressed as a percentage of total available CPU capacity.

Use Azure Monitor and Azure Metrics to collect real-time and historical CPU data:

The image lists tools for monitoring CPU usage in Azure, specifically Azure Monitor and Azure Metrics.

High CPU usage often signals a busy application or a resource-intensive process. If it remains above 80% for extended periods, you may experience:

Slow response times
Increased processing latency
Application crashes

The image illustrates the impact of high CPU usage, showing slow performance with a warning symbol and an application crash with an error message.

Practical Example: Alerting on CPU Spikes

Steps to create a CPU alert in the Azure portal:
1. Go to Azure Monitor > Alerts
2. Create a new alert rule
3. Select the target resource (VM or App Service plan)
4. Under Condition, choose the CPU signal ("Percentage CPU" for a VM, "CPU Percentage" for an App Service plan)
5. Set the threshold (e.g., average CPU > 80% over 5 minutes)
6. Define an action group for notifications
7. Review and create
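
To make the threshold in step 5 concrete, here is a minimal sketch of a "sustained breach" check in plain Python. It is not how Azure Monitor is implemented; the function name `sustained_breach`, the one-sample-per-minute spacing, and the sample data are assumptions made for illustration.

```python
from statistics import mean

def sustained_breach(samples, threshold=80.0, window=5):
    """Return True if the average of the last `window` samples exceeds `threshold`.

    `samples` is a list of CPU percentages, oldest first, assumed to be
    one reading per minute (an assumption for this sketch).
    """
    if len(samples) < window:
        return False  # not enough data to evaluate the window yet
    return mean(samples[-window:]) > threshold

# Hypothetical readings: a one-minute spike versus sustained load.
spike = [35, 40, 95, 42, 38, 36]
sustained = [60, 82, 85, 90, 88, 91]

print(sustained_breach(spike))      # False
print(sustained_breach(sustained))  # True
```

Averaging over the window is what keeps a single spike from paging anyone, while five busy minutes in a row still trigger the alert.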

By proactively tracking CPU trends, you can right-size VMs or refactor code before performance degrades.

Memory Utilization

Memory utilization shows how much RAM your applications consume. Excessive memory usage can trigger slowdowns or out-of-memory errors.

The image illustrates memory utilization issues, showing slow performance with a warning symbol and an application crash with an error message.

How to Monitor Memory

1. In the Azure portal, go to Azure Monitor > Metrics.
2. Select your target resource (e.g., VM, Web App).
3. Add a memory metric to a chart (for example, Available Memory Bytes for a VM).
4. Configure an alert on critical thresholds.

The image is a flowchart illustrating a practical example of memory monitoring in three steps: navigating Azure metrics, selecting a relevant resource, and choosing a memory usage metric.

Review memory usage graphs over time to uncover leaks or inefficient allocation:

The image shows a memory monitoring graph with available memory data over time, highlighting average, 5th, and 10th percentile values. It emphasizes identifying trends and spikes in memory consumption.
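
One way to turn "identify trends" into a check is to fit a straight line to the available-memory readings and look at its slope. The sketch below does this with a plain least-squares fit. The helper names, the 5 MB-per-sample cutoff, and the sample values are hypothetical, and a steady decline is only a hint of a leak, not proof.

```python
def slope_per_sample(values):
    """Least-squares slope of `values` against their index (units per sample)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

def looks_like_leak(available_mb, min_drop_per_sample=5.0):
    """Flag a steady decline in available memory (cutoff is an assumption)."""
    return slope_per_sample(available_mb) <= -min_drop_per_sample

# Hypothetical available-memory readings in MB, oldest first.
steady = [2048, 2040, 2055, 2047, 2051, 2044]
leaking = [2048, 2030, 2015, 1995, 1980, 1962]

print(looks_like_leak(steady))   # False
print(looks_like_leak(leaking))  # True
```

A slope is less sensitive to one-off spikes than comparing the first and last reading, which is why it suits the "review over time" advice above.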

Remediation Tips:

Optimize code to release unused memory
Scale up the VM or App Service plan if needed

Disk Performance

Disk performance metrics gauge how efficiently your storage layer handles read/write operations, which is vital for data-intensive workloads.

The image illustrates disk performance, showing a diagram of data being read from and written to a storage disk, with accompanying text explaining the concept.

Key Disk Metrics

Metric	Description
IOPS	Input/Output Operations per Second
Latency	Time taken for each read/write request
Throughput	Volume of data transferred per second
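
The three metrics in the table are related, and a small sketch can show how each is derived from raw counters. The `DiskSample` shape and its field names are hypothetical, chosen only for illustration; real tools expose these values directly.

```python
from dataclasses import dataclass

@dataclass
class DiskSample:
    """Cumulative counters at one point in time (hypothetical shape)."""
    timestamp_s: float   # seconds since an arbitrary start
    ops_completed: int   # total read + write operations so far
    bytes_moved: int     # total bytes read + written so far
    busy_time_ms: float  # total time spent servicing those operations

def disk_metrics(prev, curr):
    """Derive IOPS, throughput (MB/s), and mean latency (ms) between two samples."""
    elapsed = curr.timestamp_s - prev.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    ops = curr.ops_completed - prev.ops_completed
    iops = ops / elapsed
    throughput_mb_s = (curr.bytes_moved - prev.bytes_moved) / elapsed / 1_000_000
    latency_ms = (curr.busy_time_ms - prev.busy_time_ms) / ops if ops else 0.0
    return {"iops": iops, "throughput_mb_s": throughput_mb_s, "latency_ms": latency_ms}

# Two hypothetical samples taken 60 seconds apart.
before = DiskSample(0.0, 10_000, 500_000_000, 40_000.0)
after = DiskSample(60.0, 40_000, 2_300_000_000, 190_000.0)

print(disk_metrics(before, after))
# {'iops': 500.0, 'throughput_mb_s': 30.0, 'latency_ms': 5.0}
```

In this example each operation moves 60 KB on average, which is why 500 IOPS corresponds to 30 MB/s: throughput is IOPS multiplied by the average I/O size.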

The image is a diagram titled "Disk Performance," showing three components: Input/Output Operations per Second (IOPS), Latency, and Throughput, with a note that throughput measures the amount of data transferred per second.

Poor disk performance manifests as slow file operations and timeouts:

The image illustrates a decline in disk performance, represented by a downward graph and arrows, with a label indicating "Poor disk performance."

The image is a diagram about disk performance, highlighting issues like slow response times and increased latency, which impact user experience.

Monitoring Disk Performance

Enable metrics for IOPS, latency, and throughput on your storage account or managed disk.
Use Azure Monitor and Azure Storage Metrics to chart and alert.
Set thresholds (e.g., latency > 20 ms) to trigger notifications.
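
A latency threshold is often easier to live with when it is evaluated against a percentile rather than every single request. The following is a rough sketch of that idea using a nearest-rank percentile. The function names, the choice of the 95th percentile, and the sample latencies are assumptions, not Azure defaults.

```python
def percentile(values, pct):
    """Nearest-rank percentile for an integer `pct` between 1 and 100."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Integer ceiling of pct% of the sample count, at least 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]

def latency_alert(latencies_ms, threshold_ms=20.0, pct=95):
    """Return True when the chosen percentile exceeds the threshold."""
    return percentile(latencies_ms, pct) > threshold_ms

# Hypothetical per-request latencies in milliseconds.
one_outlier = [4, 5, 6, 5, 7, 6, 5, 48, 6, 5, 4, 6, 5, 7, 6, 5, 6, 4, 5, 6]
degraded = [18, 22, 25, 21, 30, 19, 24, 26, 23, 28]

print(percentile(one_outlier, 95))  # 7
print(latency_alert(one_outlier))   # False
print(latency_alert(degraded))      # True
```

The single 48 ms request does not fire the alert, but a disk that is consistently slow does.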

The image is a diagram illustrating disk performance monitoring using Azure Monitor and Azure Storage Metrics to track IOPS, latency, and throughput.

Remediation Strategies:

Upgrade to Premium or Ultra disks
Use striping and caching for high-throughput scenarios
Implement an in-memory or CDN cache for hot data

Network Performance

Network performance determines how swiftly and reliably data travels across your Azure resources and to end users.

Key metrics:

Bandwidth: Maximum data transfer rate
Latency: Round-trip time between endpoints
Packet Loss: Percentage of dropped packets
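
As an illustration of how these three metrics are computed, the sketch below summarizes a batch of probe results and converts a byte count into link utilization. The function names, the use of `None` for a lost probe, and the numbers are all assumptions for this example.

```python
def summarize_probes(rtts_ms):
    """Summarize probe results; None marks a probe that got no reply."""
    sent = len(rtts_ms)
    if sent == 0:
        raise ValueError("no probes recorded")
    replies = [r for r in rtts_ms if r is not None]
    loss_pct = 100 * (sent - len(replies)) / sent
    avg_rtt_ms = sum(replies) / len(replies) if replies else None
    return {"sent": sent, "loss_pct": loss_pct, "avg_rtt_ms": avg_rtt_ms}

def utilization_pct(bytes_transferred, seconds, link_mbps):
    """Share of the link's bandwidth used over the interval, in percent."""
    rate_mbps = bytes_transferred * 8 / seconds / 1_000_000
    return rate_mbps / link_mbps * 100

# Hypothetical round-trip times in ms; two probes were lost.
probes = [12.0, 14.0, None, 13.0, 15.0, None, 12.0, 14.0, 13.0, 15.0]

print(summarize_probes(probes))
# {'sent': 10, 'loss_pct': 20.0, 'avg_rtt_ms': 13.5}
print(utilization_pct(750_000_000, 60, 1000))  # 10.0
```

Lost probes are excluded from the latency average so that packet loss and round-trip time stay separate signals.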

Poor network health can cause application delays, timeouts, and degraded user satisfaction.

Monitoring with Azure Network Watcher

Enable Network Watcher in your subscription.
Use Connection Monitor to assess latency and packet loss.
Review bandwidth usage on each virtual NIC.

The image illustrates a practical example of network performance monitoring using Azure Network Watcher, focusing on bandwidth, latency, and packet loss.

Connection Monitor in Azure Network Watcher, which replaced the retired Network Performance Monitor, provides end-to-end visibility:

The image is a slide titled "Practical Example of Network Performance Monitoring," showing a diagram of network issues and listing corrective actions: optimizing network configurations, increasing bandwidth, and implementing QoS policies.

Remediation Tips:

Optimize routing, peering, and gateway configurations
Increase bandwidth allocation for high-traffic workloads
Apply QoS policies to prioritize mission-critical packets

Benefits and Common Challenges

Proactive monitoring helps you:

Detect issues before they impact users
Optimize resource allocation and reduce costs
Maintain consistent performance under load

However, you may encounter:

Alert fatigue from too many notifications
Difficulty selecting the most relevant metrics
Balancing performance improvements with budget constraints
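
A common way to reduce alert fatigue is to suppress repeat notifications for the same problem during a cooldown period. The class below is a minimal sketch of that idea. `AlertThrottle`, the 15-minute cooldown, and the alert keys are hypothetical and not part of any Azure API.

```python
class AlertThrottle:
    """Suppress repeat notifications for the same alert key within a cooldown."""

    def __init__(self, cooldown_s=900):
        self.cooldown_s = cooldown_s
        self._last_sent = {}  # alert key -> time of last notification

    def should_notify(self, key, now_s):
        last = self._last_sent.get(key)
        if last is not None and now_s - last < self.cooldown_s:
            return False  # still inside the cooldown for this key
        self._last_sent[key] = now_s
        return True

# Hypothetical alert events as (key, time in seconds).
throttle = AlertThrottle(cooldown_s=900)
events = [("vm1/cpu", 0), ("vm1/cpu", 300), ("vm2/cpu", 310), ("vm1/cpu", 1000)]
sent = [(key, t) for key, t in events if throttle.should_notify(key, t)]

print(sent)
# [('vm1/cpu', 0), ('vm2/cpu', 310), ('vm1/cpu', 1000)]
```

Keying the cooldown by resource and metric keeps a noisy VM from hiding a new problem on a different one.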

The image is a diagram titled "Common Challenges," highlighting three issues: managing alert fatigue, identifying relevant metrics, and balancing performance and cost.
