AZ-400: Designing and Implementing Microsoft DevOps Solutions

Analyze Metrics

Introduction

In this lesson, you’ll learn how to analyze metrics for the AZ-400 exam by inspecting Azure infrastructure performance and leveraging telemetry data. Effective monitoring helps you optimize resource health, detect issues early, and ensure your applications run smoothly.

By the end of this tutorial, you'll be able to:

  • Track critical infrastructure metrics (CPU, memory, disk, network)
  • Configure Azure monitoring services and alerts
  • Analyze usage and application performance telemetry
  • Build custom dashboards and follow best practices for Azure performance management

Understanding core infrastructure metrics lets you proactively manage Azure resources and avoid bottlenecks.

Key Metrics Overview

| Metric | Definition | Azure Monitor Metric Name | Typical Threshold |
|---|---|---|---|
| CPU Usage | Percentage of CPU capacity in use | Percentage CPU | 70% |
| Memory Utilization | Ratio of committed vs. available memory | Available Memory | 80% |
| Disk I/O | Read/write operations per second | Disk Read/Write Ops/Sec | Varies by workload |
| Network Throughput | Inbound/outbound bytes per second | Network In/Out Bytes | Varies by workload |
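
The memory row pairs an availability metric with a utilization threshold, so a conversion step is needed before the two can be compared. The sketch below assumes the metric is reported as available bytes and that total memory is known from the VM size. The names `memory_utilization_percent`, `THRESHOLDS`, and `breached` are hypothetical helpers for illustration, not part of any Azure SDK.

```python
def memory_utilization_percent(available_bytes: int, total_bytes: int) -> float:
    """Return used memory as a percentage of total memory."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    used = total_bytes - available_bytes
    return 100.0 * used / total_bytes


# Thresholds taken from the table; disk and network have no fixed value there.
THRESHOLDS = {"cpu_percent": 70.0, "memory_percent": 80.0}


def breached(readings: dict[str, float]) -> list[str]:
    """Return the names of readings that exceed their threshold."""
    return [
        name
        for name, value in readings.items()
        if name in THRESHOLDS and value > THRESHOLDS[name]
    ]


GIB = 1024 ** 3
readings = {
    "cpu_percent": 64.0,
    # Assumed example: 2 GiB available on a 16 GiB VM.
    "memory_percent": memory_utilization_percent(2 * GIB, 16 * GIB),
}
print(breached(readings))
```

With these sample values, only memory crosses its threshold, because 14 GiB of 16 GiB in use is 87.5%.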

The image is a slide titled "Inspecting Infrastructure Performance Indicators, Including CPU, Memory, Disk, and Network," listing four key metrics: Understanding Key Metrics for Azure Performance Management, CPU Performance, Memory Utilization, and Disk Performance.

Note

Azure Monitor collects most platform metrics like these automatically. Use it to visualize trends, set alerts, and automate scaling actions.

Practical Monitoring Scenarios

  • Scenario 1: Scale out a compute cluster when CPU usage exceeds 75% for 5 minutes
  • Scenario 2: Trigger an alert on sustained disk latency spikes in a database VM
  • Scenario 3: Throttle network-intensive workloads to prevent bandwidth saturation
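
Scenario 1 can be read as a sustained-threshold check. The following is a minimal sketch of that logic, assuming one CPU sample per minute in chronological order. `Sample` and `should_scale_out` are hypothetical names. Real autoscale rules typically compare an aggregate (such as the average) over the window rather than every individual sample, so treat this as one possible interpretation.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    minute: int         # minutes since the start of observation
    cpu_percent: float  # average CPU for that minute


def should_scale_out(
    samples: list[Sample],
    threshold: float = 75.0,
    window_minutes: int = 5,
) -> bool:
    """True if each of the most recent `window_minutes` samples exceeds the threshold."""
    if len(samples) < window_minutes:
        return False  # not enough history to call the breach sustained
    recent = samples[-window_minutes:]
    return all(s.cpu_percent > threshold for s in recent)


sustained = [Sample(i, v) for i, v in enumerate([60, 80, 82, 79, 90, 77])]
spiky = [Sample(i, v) for i, v in enumerate([80, 82, 70, 90, 77])]
print(should_scale_out(sustained), should_scale_out(spiky))
```

Requiring every sample in the window to breach avoids scaling on a single short spike, at the cost of reacting more slowly than an average-based rule.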

The image is a slide titled "Inspecting Infrastructure Performance Indicators, Including CPU, Memory, Disk, and Network," listing three points: practical examples of performance monitoring, benefits of proactive performance, and common challenges.

Benefits & Challenges

| Benefit | Challenge |
|---|---|
| Early issue detection | Alert fatigue if thresholds too strict |
| Optimized resource utilization | Data overload without proper filtering |
| Reduced downtime and faster MTTR | Misconfigured alerts can mask real issues |

Telemetry data provides deeper insights into application usage and performance.

The image is a slide titled "Analyzing Metrics by Using Collected Telemetry, Including Usage and Application Performance," listing four topics related to Azure: introduction to telemetry, key services, monitoring services, and configuring alerts.

Configuring Alerts

  1. Navigate to Azure Monitor > Alerts
  2. Create an Alert Rule for a selected metric
  3. Define Action Groups to notify, log, or trigger automation
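
The same three steps can also be expressed as a resource definition instead of portal clicks. The sketch below builds a dictionary shaped like a `Microsoft.Insights/metricAlerts` ARM resource that ties a CPU metric to an action group. The function name `metric_alert_resource`, the rule name, and the placeholder resource IDs are assumptions for illustration, and the field names should be checked against the current ARM template reference before use.

```python
import json


def metric_alert_resource(name: str, scope_id: str, action_group_id: str) -> dict:
    """Build a metric alert definition shaped like an ARM metricAlerts resource."""
    return {
        "type": "Microsoft.Insights/metricAlerts",
        "apiVersion": "2018-03-01",
        "name": name,
        "location": "global",
        "properties": {
            "description": "Average CPU above 70% over 5 minutes",
            "severity": 2,
            "enabled": True,
            "scopes": [scope_id],            # resource being monitored
            "evaluationFrequency": "PT1M",   # how often the rule runs
            "windowSize": "PT5M",            # look-back period per evaluation
            "criteria": {
                "odata.type": "Microsoft.Azure.Monitor.SingleResourceMultipleMetricCriteria",
                "allOf": [
                    {
                        "name": "HighCpu",
                        "criterionType": "StaticThresholdCriterion",
                        "metricName": "Percentage CPU",
                        "operator": "GreaterThan",
                        "threshold": 70,
                        "timeAggregation": "Average",
                    }
                ],
            },
            # Action group that notifies, logs, or triggers automation.
            "actions": [{"actionGroupId": action_group_id}],
        },
    }


# Placeholder IDs: substitute real subscription, resource group, and names.
alert = metric_alert_resource(
    "high-cpu-alert",
    "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Compute/virtualMachines/<vm>",
    "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Insights/actionGroups/<group>",
)
print(json.dumps(alert, indent=2))
```

Keeping alert rules as data like this makes them reviewable in source control alongside the rest of the pipeline.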

Warning

Avoid creating too many alert rules. Prioritize critical metrics to reduce noise and ensure timely response.

Building Custom Dashboards

  • Pin metric charts from multiple resources
  • Use Workbooks for interactive reports
  • Share dashboards with your team via the Azure portal

Monitor end-to-end application health by analyzing real usage metrics, dependencies, and response times.
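
To make "response times" concrete, here is a small sketch that summarizes request telemetry into a failure rate and a 95th-percentile duration. The record shape (`name`, `duration_ms`, `success`) is an assumed, simplified stand-in for exported request telemetry, and `percentile` and `summarize` are hypothetical helpers that use the nearest-rank method.

```python
import math


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(requests: list[dict]) -> dict:
    """Summarize request count, failure rate, and p95 duration."""
    durations = [r["duration_ms"] for r in requests]
    failures = sum(1 for r in requests if not r["success"])
    return {
        "count": len(requests),
        "failure_rate": failures / len(requests),
        "p95_ms": percentile(durations, 95),
    }


requests = [
    {"name": "GET /orders", "duration_ms": 120.0, "success": True},
    {"name": "GET /orders", "duration_ms": 95.0, "success": True},
    {"name": "POST /orders", "duration_ms": 480.0, "success": False},
    {"name": "GET /orders", "duration_ms": 110.0, "success": True},
]
print(summarize(requests))
```

Percentiles are usually more informative than averages here, because a few slow requests can hide behind a healthy-looking mean.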

The image is a slide titled "Analyzing Metrics by Using Collected Telemetry, Including Usage and Application Performance," listing topics like monitoring application performance, analyzing usage metrics, custom dashboards in Azure Monitor, and best practices.

Best Practices

  • Enable Application Insights for distributed tracing
  • Review Smart Detection's Failure Anomalies alerts to catch abnormal rises in failed requests
  • Leverage Live Metrics Stream during load testing
  • Tag resources consistently for grouped telemetry analysis
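
The tagging practice pays off when telemetry is grouped by tag value. The sketch below assumes each resource record carries a `tags` dictionary and a `cpu_percent` reading. The record shape and the `average_by_tag` helper are hypothetical, and resources without the tag are collected under `"untagged"` so that gaps in tagging stay visible.

```python
from collections import defaultdict
from statistics import mean


def average_by_tag(resources: list[dict], tag: str, metric: str) -> dict[str, float]:
    """Average `metric` across resources that share the same value for `tag`."""
    groups: dict[str, list[float]] = defaultdict(list)
    for res in resources:
        key = res.get("tags", {}).get(tag, "untagged")
        groups[key].append(res[metric])
    return {key: mean(vals) for key, vals in groups.items()}


resources = [
    {"name": "vm-web-01", "tags": {"env": "prod"}, "cpu_percent": 72.0},
    {"name": "vm-web-02", "tags": {"env": "prod"}, "cpu_percent": 58.0},
    {"name": "vm-test-01", "tags": {"env": "dev"}, "cpu_percent": 20.0},
    {"name": "vm-legacy", "tags": {}, "cpu_percent": 40.0},
]
print(average_by_tag(resources, "env", "cpu_percent"))
```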
