Configuration
In this guide, we will review the Alertmanager configuration and explain its primary components. The Alertmanager configuration (typically stored in alertmanager.yml) is divided into three main sections: the global configuration, routing rules, and receivers. Each section plays a distinct role in how alerts are processed and forwarded.
Global Configuration
The global configuration sets default parameters for the entire Alertmanager setup. Settings defined here (such as the SMTP server used for email notifications) serve as defaults that individual receivers can override. This structure mirrors the global configuration approach in Prometheus.
Below is an example of a global configuration block with routing rules and receivers defined:
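global:
  smtp_smarthost: 'mail.example.com:25'
  smtp_from: '[email protected]'
route:
  receiver: staff
  group_by: ['alertname', 'job']
  routes:
    # Deprecated regex form; still accepted by Alertmanager.
    - match_re:
        job: (node|windows)
      receiver: infra-email
    # matchers takes a list of matcher expressions, not a label map.
    - matchers:
        - job = "kubernetes"
      receiver: k8s-slack
receivers:
  # The staff and infra-email receivers are omitted here for brevity;
  # every receiver referenced by a route must be defined in this list.
  - name: k8s-slack
    slack_configs:
      - channel: '#alerts'
        text: 'https://exampl.com/alerts/{{ .GroupLabels.app }}'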
Key components in this configuration include:
- Global Settings: Define SMTP parameters and other default configurations.
- Route Section: Establishes matching rules to determine which alerts are sent to which receivers.
- Receivers Section: Contains notifier configurations (e.g., Slack, email) that manage how alerts are delivered.
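To make the idea of global defaults and overrides concrete, here is a minimal sketch, not a recommended production setup. The host names and email addresses are hypothetical placeholders. The first receiver relies on the global SMTP settings, while the second overrides them using the smarthost and from fields of its email_configs entry:

```yaml
global:
  smtp_smarthost: 'mail.example.com:25'
  smtp_from: 'alerts@example.com'          # hypothetical sender address
route:
  receiver: staff
receivers:
  # Inherits smtp_smarthost and smtp_from from the global section.
  - name: staff
    email_configs:
      - to: 'staff@example.com'            # hypothetical recipient
  # Overrides the global SMTP defaults for this receiver only.
  - name: infra-email
    email_configs:
      - to: 'infra@example.com'            # hypothetical recipient
        smarthost: 'smtp.infra.example.com:587'
        from: 'infra-alerts@example.com'
```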
Routing Section
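The routing section assigns alerts to specific receivers based on matching criteria. Routes select alerts with matchers, a list of expressions that supports both equality (=, !=) and regular-expression (=~, !~) comparisons. The older match (exact values) and match_re (regular expressions) fields are still accepted but are deprecated in favor of matchers.
Matching Rules
Use matchers for new configurations; it covers both exact and regular-expression comparisons. You will still encounter match and match_re in older configurations. All conditions listed on a route must hold for the route to match.
For example, the following route configuration uses both match_re and matchers: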
route:
  routes:
    - match_re:
        job: (node|windows)
      receiver: infra-email
    - matchers:
        - job = "kubernetes"
        - severity = "ticket"
      receiver: k8s-slack
Once an alert matches the defined criteria, it is forwarded to the corresponding receiver as specified in the receivers section.
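As a sketch of how the same rules look when written only with matchers expressions: the receiver names are reused from the example above, while the top-level staff receiver and the severity != "info" condition are illustrative assumptions. Regular-expression matchers are anchored, so "node|windows" matches only those two exact values.

```yaml
route:
  receiver: staff                  # assumed fallback receiver
  routes:
    # =~ is a regular-expression match, equivalent to the match_re block.
    - matchers:
        - job =~ "node|windows"
      receiver: infra-email
    # Multiple matchers are ANDed: both conditions must hold.
    - matchers:
        - job = "kubernetes"
        - severity != "info"       # illustrative negative match
      receiver: k8s-slack
```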
Default (Fallback) Route
The top-level route acts as the default (fallback) route: it catches any alert that does not match a more specific nested route. In the example below, alerts that do not match any nested route are grouped by alertname and job, then sent to the receiver named staff:
route:
  receiver: staff
  group_by: ['alertname', 'job']
  routes:
    - match_re:
        job: (node|windows)
      receiver: infra-email
    - matchers:
        - job = "kubernetes"
      receiver: k8s-slack
Nested Routes (Subroutes)
Alertmanager supports nested routes (subroutes) to allow for additional filtering. With nested routes, a parent route can first match alerts with a specific label (such as job), and subroutes can further refine the selection based on additional criteria, like severity.
Consider the following configuration. Alerts with the label job: kubernetes are routed to k8s-email by default; however, if an alert also includes severity: pager, it is forwarded to k8s-pager:
route:
  routes:
    - matchers:
        - job = "kubernetes"
      receiver: k8s-email
      routes:
        - matchers:
            - severity = "pager"
          receiver: k8s-pager
Processing steps include:
- The alert is first evaluated against the parent route (matching job: kubernetes).
- If the alert also has the label severity: pager, the subroute matches and the alert is sent to the k8s-pager receiver.
- Alerts without severity: pager (the label is missing or has another value) fall back to the parent route's receiver (k8s-email).
Managing Multiple Teams with Subroutes
You can further refine routing for multiple teams by using subroutes. This approach ensures that alerts reach the appropriate team's notifiers. The following configuration demonstrates how to set up parent routes based on a team label, with subroutes handling alert severity:
route:
  routes:
    # Database team configuration
    - match:
        team: database
      receiver: database-pager
      routes:
        - match:
            severity: page
          receiver: database-pager
        - match:
            severity: email
          receiver: database-email
    # API team configuration
    - match:
        team: api
      receiver: api-pager
      routes:
        - match:
            severity: page
            env: dev
          receiver: api-ticket
        - match:
            severity: page
          receiver: api-pager
        - match:
            severity: ticket
          receiver: api-ticket
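In this configuration:
- Alerts for the database team default to database-pager, unless they carry severity: email, in which case they go to database-email.
- Alerts for the API team default to api-pager. Subroutes are checked in order, so page-severity alerts from the dev environment go to api-ticket instead of paging, other page-severity alerts go to api-pager, and ticket-severity alerts go to api-ticket.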
Reloading the Alertmanager Configuration
After making changes to your alertmanager.yml file, Alertmanager does not automatically reload the updated configurations. You must take one of the following actions to apply changes:
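- Restart the Alertmanager process.
- Send a SIGHUP signal. For example:
sudo killall -HUP alertmanager
- Send an HTTP POST request to the /-/reload endpoint. For example, assuming Alertmanager listens on its default port 9093 on the local host:
curl -X POST http://localhost:9093/-/reload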
Important
For production environments, validate your configuration file before reloading (for example, with amtool check-config alertmanager.yml) to avoid disruptions in alert processing.
Matching Multiple Routes
By default, Alertmanager stops processing routes after the first match is found. In the example below:
route:
  routes:
    - receiver: alert-logs
    - matchers:
        - job = "kubernetes"
      receiver: k8s-email
Because the first route (alert-logs) has no matching criteria, it matches every alert, so no alert is ever evaluated against the Kubernetes-specific route. To allow an alert to trigger multiple receivers, use the continue property on a route. Setting continue: true instructs Alertmanager to keep evaluating the subsequent sibling routes even after a match.
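A minimal sketch of the same routes with continue enabled follows. The top-level staff receiver is an assumption added so that the fragment has a fallback. Every alert is delivered to alert-logs, and Kubernetes alerts are additionally delivered to k8s-email:

```yaml
route:
  receiver: staff                # assumed fallback receiver
  routes:
    # Matches every alert, then lets evaluation continue to the next sibling.
    - receiver: alert-logs
      continue: true
    # Still evaluated because the previous route set continue: true.
    - matchers:
        - job = "kubernetes"
      receiver: k8s-email
```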
Alert Grouping
Alert grouping consolidates multiple alerts into a single notification based on specified label values. The group_by field controls how alerts are bundled, which can be particularly useful when organizing alerts by team or other criteria.
For example, grouping alerts by team ensures that alerts for the same team are sent together:
route:
  receiver: fallback-pager
  group_by: [team]
  routes:
    - match:
        team: infra
      group_by: [region, env]
      receiver: infra-email
      routes:
        - match:
            severity: page
          receiver: infra-pager
In this configuration:
- The default route groups alerts based on the team label.
- A nested route for the infrastructure team further groups alerts by region and env.
- Child routes inherit the grouping rules unless they define their own group_by values.
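To illustrate how group_by decides which alerts share a notification, the sketch below repeats a trimmed version of the grouping configuration and annotates it with a few hypothetical alerts. The alert names and label values are invented for illustration only.

```yaml
route:
  receiver: fallback-pager
  group_by: [team]
  routes:
    - matchers:
        - team = "infra"
      group_by: [region, env]
      receiver: infra-email

# Hypothetical firing alerts and the groups they would fall into:
#   {team="infra", region="us", env="prod", alertname="DiskFull"}
#       -> infra-email, group {region="us", env="prod"}
#   {team="infra", region="us", env="prod", alertname="HighLoad"}
#       -> infra-email, same group as above (one combined notification)
#   {team="infra", region="eu", env="prod", alertname="DiskFull"}
#       -> infra-email, separate group {region="eu", env="prod"}
#   {team="api", alertname="HighLatency"}
#       -> fallback-pager, group {team="api"}
```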
This comprehensive overview of the Alertmanager configuration covers global settings, routing strategies (including nested routes), receivers, and alert grouping. By understanding and leveraging these configurations, you can customize alert notifications to align with your infrastructure and operational requirements.
For more detailed information, visit the Alertmanager Documentation.