Monitoring Azure AI services helps you maintain performance, reliability, and security for production workloads. Azure provides four integrated monitoring features (Metrics, Diagnostic Settings, Logs, and Alerts) that help you track health, analyze behavior, and respond to incidents. This guide shows where to find those features in the Azure portal and how to use them effectively.
Key monitoring components at a glance:
| Component | Purpose | Typical use case |
|---|---|---|
| Alerts | Notify or automate when conditions occur | Notify on usage spikes, trigger remediation runbooks |
| Metrics | Numeric, time-series measurements (e.g., response time, request count) | Real-time dashboards and trend analysis |
| Diagnostic Settings | Configure export of logs and platform metrics to destinations | Centralize logs to Log Analytics, archive to Storage, or forward to Event Hubs |
| Logs | Detailed, timestamped records for auditing and troubleshooting | Forensics, compliance, and custom alerting with KQL |
[Figure: slide titled "Monitoring Azure AI Services Activity" showing the four monitoring components (Alerts, Metrics, Diagnostic Settings, and Logs), each with an icon and a short description.]
This article walks through the Azure portal to locate these features for an Azure AI (Cognitive) resource (for example, Language or Vision services), and explains how to combine them for operational monitoring and security posture.

Locate Metrics in the Azure portal

Steps to view metrics for a Cognitive Services / Azure AI resource:
  1. In the Azure portal, open your Cognitive Services or specific Azure AI resource (e.g., Language service).
  2. In the left-hand menu, go to Monitoring > Metrics.
  3. Select a metric (Total Calls, Latency, Throttled Requests, etc.), choose aggregation (Sum, Average, Count), and set the time range.
  4. Add additional metrics to the chart to compare trends and spot correlations.
Metrics are ideal for dashboards, real-time monitoring, and identifying sudden spikes or gradual performance degradation.
Example: viewing the “Total Calls” metric for a deployed language resource:
[Figure: metrics chart of "Total Calls" for the resource ai102cogservices909 with the aggregation set to Sum. The line is flat at zero across the day (no call activity).]
Tips:
  • Combine related metrics (e.g., Total Calls + Throttled Requests) to detect capacity or throttling issues.
  • Pin metrics charts to Azure dashboards for consolidated operational views.
  • Use appropriate aggregations for your scenario (Sum for totals, Average or Max for latency).
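
The tips above can be illustrated with a small, self-contained sketch. It does not call Azure; it only shows how a time grain and an aggregation change what a chart tells you, and how combining call volume with throttling exposes a capacity problem. The sample values, field layout, and the names `SAMPLES`, `bucket_start`, and `aggregate` are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical raw samples: (timestamp, total_calls, throttled_calls, latency_ms).
# In practice these values would come from Azure Monitor, not a literal list.
SAMPLES = [
    (datetime(2024, 1, 1, 9, 0), 120, 0, 180.0),
    (datetime(2024, 1, 1, 9, 1), 150, 3, 210.0),
    (datetime(2024, 1, 1, 9, 5), 400, 60, 950.0),
    (datetime(2024, 1, 1, 9, 6), 380, 72, 1020.0),
]


def bucket_start(ts, grain):
    """Round a timestamp down to the start of its time-grain bucket."""
    seconds = int(grain.total_seconds())
    offset = (ts.hour * 3600 + ts.minute * 60 + ts.second) % seconds
    return ts.replace(microsecond=0) - timedelta(seconds=offset)


def aggregate(samples, grain=timedelta(minutes=5)):
    """Group samples by time grain and apply a fitting aggregation per metric."""
    buckets = defaultdict(list)
    for ts, calls, throttled, latency in samples:
        buckets[bucket_start(ts, grain)].append((calls, throttled, latency))

    rows = []
    for start in sorted(buckets):
        points = buckets[start]
        total_calls = sum(p[0] for p in points)       # Sum suits counters
        total_throttled = sum(p[1] for p in points)
        latencies = [p[2] for p in points]
        rows.append({
            "start": start,
            "total_calls": total_calls,
            "throttled": total_throttled,
            # Ratio of throttled to total calls highlights capacity pressure.
            "throttle_ratio": total_throttled / total_calls if total_calls else 0.0,
            "avg_latency_ms": sum(latencies) / len(latencies),  # typical latency
            "max_latency_ms": max(latencies),                   # worst case
        })
    return rows


if __name__ == "__main__":
    for row in aggregate(SAMPLES):
        print(row)
```

In this made-up data the second five-minute bucket shows call volume, throttling ratio, and latency all rising together, which is the kind of correlation the combined chart is meant to surface.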

Diagnostic settings and Logs

Diagnostic settings determine where resource logs and metrics are exported for deeper analysis, retention, or integration with SIEMs.
Diagnostic settings can export:
  • Resource logs: request/response logs and resource-specific events.
  • Platform metrics (where applicable).
  • Audit logs and capability-specific categories (for example, Azure OpenAI Request Usage).
Destinations supported:
| Destination | Use case |
|---|---|
| Log Analytics workspace | Query and analyze logs with Kusto Query Language (KQL); build custom dashboards and log alerts |
| Storage account | Long-term archival and compliance retention |
| Event Hub | Stream logs to third-party analytics or SIEMs (Splunk, external systems) |
| Partner solutions | Forward to available partner monitoring/analytics integrations |
To configure diagnostic settings:
  1. Open your resource in the Azure portal.
  2. Select Diagnostic settings > Add diagnostic setting.
  3. Choose the log categories you need (Audit Logs, Request and Response Logs, Trace Logs, Azure OpenAI Request Usage, etc.).
  4. Select one or more destinations (Log Analytics, Storage, Event Hub, Partner).
  5. Save the diagnostic setting.
[Figure: the Azure "Diagnostic setting" configuration page. Log categories (Audit Logs, Request and Response Logs, Azure OpenAI Request Usage, Trace Logs) and metric options are on the left; destination checkboxes (Send to Log Analytics workspace, Archive to a storage account, Stream to an event hub, Send to partner solution) are on the right. The top toolbar includes Save, Discard, Delete, and Feedback actions.]
Note: No logs are exported by default. You must create a diagnostic setting and choose at least one destination (Log Analytics, Storage, Event Hub, etc.) before logs are collected for analysis and retention.
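
The same configuration can also be expressed as data rather than portal clicks. The sketch below builds a dictionary shaped like the request body of an Azure Monitor diagnostic setting and refuses to build one without a destination. The helper name `build_diagnostic_setting` is hypothetical, and the category identifiers (`Audit`, `RequestResponse`, `Trace`) are assumptions based on the portal labels above; check the categories your resource actually exposes before relying on them.

```python
import json

# Assumed category identifiers matching the portal labels in this guide.
# Confirm the exact names for your resource before using them.
DEFAULT_LOG_CATEGORIES = ("Audit", "RequestResponse", "Trace")


def build_diagnostic_setting(log_categories=DEFAULT_LOG_CATEGORIES,
                             workspace_id=None,
                             storage_account_id=None,
                             event_hub_rule_id=None,
                             event_hub_name=None,
                             include_metrics=True):
    """Return a dict shaped like a diagnostic setting body (illustrative only)."""
    # A setting without a destination would collect nothing, so reject it early.
    if not any((workspace_id, storage_account_id, event_hub_rule_id)):
        raise ValueError("a diagnostic setting needs at least one destination")

    destinations = {
        "workspaceId": workspace_id,
        "storageAccountId": storage_account_id,
        "eventHubAuthorizationRuleId": event_hub_rule_id,
        "eventHubName": event_hub_name,
    }
    # Keep only the destinations that were actually supplied.
    properties = {key: value for key, value in destinations.items() if value is not None}
    properties["logs"] = [{"category": c, "enabled": True} for c in log_categories]
    if include_metrics:
        properties["metrics"] = [{"category": "AllMetrics", "enabled": True}]
    return {"properties": properties}


if __name__ == "__main__":
    # Placeholder resource ID; substitute your own workspace.
    body = build_diagnostic_setting(
        workspace_id="/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
                     "/providers/Microsoft.OperationalInsights/workspaces/<workspace-name>"
    )
    print(json.dumps(body, indent=2))
```

Keeping the setting as data makes it easier to review which log categories leave the resource and where they go, which matters when request/response logs may contain sensitive content.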
Logs stored in Log Analytics are queryable using Kusto Query Language (KQL). Use KQL to:
  • Perform forensics and investigations (who accessed what and when).
  • Satisfy compliance and retention requirements.
  • Create custom dashboards and log-based alert rules.
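
As a rough illustration of the kind of query involved, the sketch below assembles a KQL string in Python. It assumes logs land in the shared `AzureDiagnostics` table and that the columns `ResourceProvider`, `TimeGenerated`, `Category`, `OperationName`, `ResultSignature`, and `DurationMs` are populated for your resource; if your diagnostic setting uses resource-specific tables, the table and column names will differ. The function name `build_request_summary_query` is hypothetical.

```python
def build_request_summary_query(hours=24, category=None):
    """Return a KQL string that counts logged operations by result code."""
    if hours <= 0:
        raise ValueError("hours must be positive")

    lines = [
        "AzureDiagnostics",
        '| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"',
        f"| where TimeGenerated > ago({int(hours)}h)",
    ]
    if category:
        # Only pass trusted values here; this sketch does no escaping.
        lines.append(f'| where Category == "{category}"')
    lines += [
        "| summarize Requests = count(), AvgDurationMs = avg(DurationMs)"
        " by OperationName, ResultSignature",
        "| order by Requests desc",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_request_summary_query(hours=1, category="RequestResponse"))
```

The resulting text can be pasted into the Logs blade of the workspace or used as the condition of a log alert rule.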
Carefully consider data sensitivity before exporting request/response logs. Avoid sending Personally Identifiable Information (PII) or secrets to destinations unless you have proper data governance and encryption in place.

Alerts: detect and respond

Azure Monitor alerts let you create rules that notify teams or trigger automation when metrics or logs meet defined conditions.
Alert types:
| Alert type | Triggers on | Use case |
|---|---|---|
| Metric alerts | Numeric metric thresholds or trends | High error rates, CPU or request count thresholds |
| Log alerts | Results of a KQL query | Detect suspicious patterns in logs, failed authentication attempts |
| Activity Log alerts | Azure activity events | Resource creation, role changes, subscription-level events |
Typical alert rule workflow:
  1. Define the scope (select the resource(s) to monitor).
  2. Define the condition (metric threshold or KQL query and evaluation frequency).
  3. Define actions by associating an Action Group (email, SMS, webhook, Logic App, Azure Function, Teams, PagerDuty, etc.).
  4. Provide alert details (severity, description) and create the rule.
Use cases:
  • Notify DevOps on usage spikes or quota exhaustion.
  • Trigger automated remediation (e.g., scale-out, restart services).
  • Escalate security incidents to on-call via PagerDuty or Teams.
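
To make the alert rule workflow concrete, here is a minimal simulation of its logic: a rule with a scope, a condition, and a severity is evaluated against a window of metric values, and a breach is handed to every action in an action group. This is a teaching sketch, not how Azure Monitor is implemented. The rule fields, operator names, and the functions `evaluate_rule` and `dispatch` are hypothetical.

```python
import operator
from statistics import mean

OPERATORS = {
    "GreaterThan": operator.gt,
    "GreaterThanOrEqual": operator.ge,
    "LessThan": operator.lt,
    "LessThanOrEqual": operator.le,
}
AGGREGATIONS = {"Total": sum, "Average": mean, "Maximum": max, "Minimum": min}

# Hypothetical rule mirroring the workflow: scope, condition, details.
RULE = {
    "name": "high-throttling",
    "scope": "<resource-id>",          # placeholder, not a real resource
    "metric": "Throttled Requests",
    "aggregation": "Total",
    "operator": "GreaterThan",
    "threshold": 50,
    "severity": 2,
}


def evaluate_rule(rule, window_values):
    """Return an alert dict if the aggregated window breaches the threshold, else None."""
    if not window_values:
        return None  # no data in the window, nothing to evaluate
    observed = AGGREGATIONS[rule["aggregation"]](window_values)
    if OPERATORS[rule["operator"]](observed, rule["threshold"]):
        return {"rule": rule["name"], "severity": rule["severity"], "observed": observed}
    return None


def dispatch(alert, action_group):
    """Call every action in the action group with the alert and collect the results."""
    return [action(alert) for action in action_group]


if __name__ == "__main__":
    # Stand-in actions; real ones would send email, call a webhook, and so on.
    actions = [
        lambda a: f"email: {a['rule']} (sev {a['severity']})",
        lambda a: f"webhook: observed={a['observed']}",
    ]
    alert = evaluate_rule(RULE, [10, 30, 25])
    if alert:
        for result in dispatch(alert, actions):
            print(result)
```

Separating the condition from the action group is the point of the design: the same group can be reused by many rules, and a rule can change its threshold without touching who gets notified.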

Putting it together

  • Metrics: Best for real-time numeric monitoring and dashboards.
  • Diagnostic Settings + Logs: Centralize and retain logs for deep analysis, compliance, and alerting using KQL.
  • Alerts: Bridge monitoring and operations by notifying teams and invoking automated responses.
A recommended monitoring approach:
  1. Enable Metrics and pin key charts to an Azure dashboard.
  2. Configure Diagnostic Settings to send resource logs to a Log Analytics workspace (and archive critical logs to Storage).
  3. Create log- and metric-based alerts for operational and security thresholds.
  4. Automate common remediations via Action Groups connected to Logic Apps or Functions.
  5. Regularly review dashboard trends, alert history, and log queries to refine detection and reduce noise.
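
Step 5 can be supported with a simple heuristic over exported alert history. The sketch below flags rules that fire often but resolve within minutes, which usually points to a threshold that is too tight. The record format, the thresholds, and the names `HISTORY` and `noisy_rules` are hypothetical; adapt them to however you export your alert history.

```python
from collections import Counter

# Hypothetical alert history records: (rule name, minutes until the alert resolved).
HISTORY = [
    ("high-latency", 2), ("high-latency", 1), ("high-latency", 3),
    ("high-latency", 2), ("quota-near-limit", 45), ("auth-failures", 30),
]


def noisy_rules(history, min_firings=3, max_avg_minutes=5):
    """Return (rule, firing count) pairs for rules that fire often and resolve quickly."""
    firings = Counter(name for name, _ in history)
    durations = {}
    for name, minutes in history:
        durations.setdefault(name, []).append(minutes)

    flagged = [
        (name, count)
        for name, count in firings.items()
        if count >= min_firings
        and sum(durations[name]) / len(durations[name]) <= max_avg_minutes
    ]
    # Most frequent offenders first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, count in noisy_rules(HISTORY):
        print(f"{name}: fired {count} times, consider tuning")
```

Rules flagged this way are candidates for a higher threshold, a longer evaluation window, or a lower severity rather than outright removal.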
Monitoring is the foundation for keeping Azure AI services reliable, performant, and secure. With metrics, diagnostics, logs, and alerts configured, you can build dashboards, runbooks, and automated responses to maintain service health.
