With the observability service enabled, you can use Red Hat Advanced Cluster Management for Kubernetes (RHACM) to gain insight into and optimize your managed clusters. If a managed cluster is a Red Hat OpenShift Container Platform (RHOCP) 4.8+ or KS cluster, you can see alerts from all the managed clusters in the hub cluster.
You can also configure the hub cluster to forward alerts to an external notification system.
In this blog post, I show how to use amtool to manage RHACM alerts.
amtool
amtool is a CLI tool for interacting with the alertmanager API. It's bundled with all releases of Alertmanager. You can install it locally with the following command:
go install github.com/prometheus/alertmanager/cmd/amtool@latest
Note: If you can access the observability-alertmanager-0 pod directly, you can use the amtool binary that is bundled with that pod. Run the amtool alert --alertmanager.url=http://localhost:9093 command to list alerts.
Connect to RHACM Alertmanager
RHACM exposes the alertmanager API through a route. You can get the alertmanager URL by using the following command:
oc get route alertmanager -n open-cluster-management-observability -o jsonpath="{.spec.host}"
Before you connect to alertmanager, you also need to pass a bearer token to amtool. You can get the bearer token from the RHACM console (Configure client) or, if you are logged in to your OCP cluster, fetch the token with the following command: oc whoami -t
You can create a config file in YAML format in one of the two default config locations: $HOME/.config/amtool/config.yml or /etc/amtool/config.yml. View the following example syntax:
alertmanager.url: https://alertmanager-open-cluster-management-observability.apps.xxx
http.config.file: $HOME/.config/amtool/http_config.yml
The http.config.file parameter points to a file that must be in http_config format. View the following example syntax:
authorization:
  type: Bearer
  credentials: sha256~xxxxxxx
tls_config:
  insecure_skip_verify: true
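If you set this up on more than one machine, the two files can be generated from the route URL and the token. The following is only a sketch under my own assumptions: write_amtool_config is a hypothetical helper name, not something shipped with amtool or RHACM, and it writes the expanded directory path into http.config.file rather than a literal $HOME.

```shell
# Sketch: write the two amtool config files described above.
# write_amtool_config is a hypothetical helper, not part of amtool or RHACM.
write_amtool_config() {
  config_dir="$1"   # for example "$HOME/.config/amtool"
  am_url="$2"       # for example "https://<alertmanager route host>"
  token="$3"        # for example the output of: oc whoami -t

  mkdir -p "$config_dir" || return 1

  # Main amtool config: the Alertmanager URL and the HTTP client config to use.
  cat > "$config_dir/config.yml" <<EOF
alertmanager.url: $am_url
http.config.file: $config_dir/http_config.yml
EOF

  # HTTP client config: bearer token plus the same TLS setting as the example above.
  cat > "$config_dir/http_config.yml" <<EOF
authorization:
  type: Bearer
  credentials: $token
tls_config:
  insecure_skip_verify: true
EOF

  # The file holds a credential, so restrict it to the owner.
  chmod 600 "$config_dir/http_config.yml"
}
```

On a machine that is logged in to the hub cluster, you could call it like this:
write_amtool_config "$HOME/.config/amtool" "https://$(oc get route alertmanager -n open-cluster-management-observability -o jsonpath='{.spec.host}')" "$(oc whoami -t)"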
Configuration
You can use amtool to view the current alertmanager configuration, for example by running amtool config show. View the following alertmanager configuration sample:
global:
  resolve_timeout: 5m
  http_config: {}
  smtp_hello: localhost
  smtp_require_tls: true
  slack_api_url: <secret>
  pagerduty_url: https://events.pagerduty.com/v2/enqueue
  hipchat_api_url: https://api.hipchat.com/
  opsgenie_api_url: https://api.opsgenie.com/
  wechat_api_url: https://qyapi.weixin.qq.com/cgi-bin/
  victorops_api_url: https://alert.victorops.com/integrations/generic/20131114/alert/
route:
  receiver: default-receiver
  group_by:
  - alertname
  - cluster
  repeat_interval: 45m
receivers:
- name: default-receiver
  slack_configs:
  - send_resolved: true
    http_config: {}
    api_url: <secret>
    ...
Examples
Continue reading to learn how you can use amtool to manage alerts.
1. View all active alerts with the following command:
$ amtool alert
Alertname          Starts At                Summary                                           State
KubeCPUOvercommit  2021-10-27 07:47:32 UTC  Cluster has overcommitted CPU resource requests.  active
2. View all active alerts with extended output by running the following command:
$ amtool alert -o extended
Labels Annotations Starts At Ends At Generator URL State
alertname="KubeCPUOvercommit" cluster="cyang2-kind" severity="warning" description="Cluster has overcommitted CPU resource requests for Pods and cannot tolerate node failure." summary="Cluster has overcommitted CPU resource requests." 2021-10-27 07:47:32 UTC 2021-11-10 13:50:02 UTC http://prometheus-k8s-0:9090/graph?g0.expr=sum%28namespace_cpu%3Akube_pod_container_resource_requests%3Asum%29+%2F+sum%28kube_node_status_allocatable%7Bresource%3D%22cpu%22%7D%29+%3E+%28count%28kube_node_status_allocatable%7Bresource%3D%22cpu%22%7D%29+-+1%29+%2F+count%28kube_node_status_allocatable%7Bresource%3D%22cpu%22%7D%29&g0.tab=1 active
3. Silence a specific alert with the following command:
$ amtool silence add alertname=KubeCPUOvercommit --comment=acked
290bb29e-6457-47b0-b10d-140b10418c4c
4. Silence all alerts that match a label.
RHACM adds the cluster label to each alert and uses this label to identify where the alert comes from. So you can silence all alerts from one cluster, or combine the cluster label with other matchers, by using the following commands:
$ amtool silence add cluster="local-cluster" --comment=acked
48ccecdc-abb1-4196-83fd-593ba010ddf3
$ amtool silence add alertname="KubeCPUOvercommit" cluster=~".+1" --comment=acked
18abf36d-b01e-46a0-ba1a-b814acb4bae0
Regex matching is also supported. The =~ syntax (similar to Prometheus) represents a regex match, and you can combine it with a direct match. The second command above adds a silence that matches alerts that have the alertname="KubeCPUOvercommit" label and a cluster label whose value ends in 1.
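To get a feel for which cluster names a pattern such as .+1 would catch before you add the silence, you can try the pattern locally. The sketch below is an approximation, not how amtool itself evaluates matchers: regex_matches is a hypothetical helper, and grep -E (POSIX ERE) stands in for the Go regex syntax that Alertmanager uses. What it does mirror is that Alertmanager anchors regex matchers, so the pattern must match the whole label value. The cluster names in the loop are made up for illustration.

```shell
# Sketch: approximate how a =~ matcher treats a label value.
# regex_matches is a hypothetical helper; grep -E stands in for Go's regex engine.
regex_matches() {
  value="$1"
  pattern="$2"
  # Anchor the pattern on both sides so it has to match the entire value.
  printf '%s\n' "$value" | grep -Eq "^(${pattern})\$"
}

# Try the pattern from the example against a few made-up cluster names.
for cluster in cluster1 local-cluster 1 cluster10; do
  if regex_matches "$cluster" '.+1'; then
    echo "$cluster: silenced"
  else
    echo "$cluster: not silenced"
  fi
done
```

Only cluster1 is reported as silenced here. The value 1 does not match because .+ needs at least one character before the final 1, and cluster10 does not match because the value has to end in 1.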
5. View silences with the following command:
$ amtool silence query
ID                                    Matchers                       Ends At                  Created By  Comment
290bb29e-6457-47b0-b10d-140b10418c4c  alertname="KubeCPUOvercommit"  2021-11-10 14:54:44 UTC  chuyang     acked
6. Expire a silence with the following command:
$ amtool silence expire 290bb29e-6457-47b0-b10d-140b10418c4c
7. Expire all silences using the following command:
$ amtool silence expire $(amtool silence query -q)
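The commands above combine naturally into a small maintenance-window routine: silence one cluster, do the work, then expire only that cluster's silences instead of all of them. Here is a minimal sketch, assuming the config file from earlier is in place. silence_cluster, unsilence_cluster, and the AMTOOL override variable are names I made up for illustration; the amtool subcommands and the --duration, --comment, and -q flags are the standard ones.

```shell
# Sketch: silence one managed cluster for a maintenance window, then lift it.
# Both functions are hypothetical helpers. AMTOOL can be overridden
# (for a dry run, set AMTOOL=echo); by default the real amtool binary is used.
silence_cluster() {
  cluster="$1"
  duration="${2:-1h}"   # fall back to one hour if no duration is given
  "${AMTOOL:-amtool}" silence add "cluster=$cluster" \
    --duration="$duration" \
    --comment="maintenance on $cluster"
}

unsilence_cluster() {
  cluster="$1"
  # Query only the IDs (-q) of silences that match this cluster label.
  ids="$("${AMTOOL:-amtool}" silence query -q "cluster=$cluster")"
  # Nothing to expire: avoid calling "silence expire" with no arguments.
  [ -n "$ids" ] || return 0
  # $ids is left unquoted on purpose so that each ID becomes its own argument.
  "${AMTOOL:-amtool}" silence expire $ids
}
```

For example, silence_cluster local-cluster 2h before an upgrade and unsilence_cluster local-cluster afterwards. Running with AMTOOL=echo first prints the amtool arguments without contacting alertmanager.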
Conclusion
In conclusion, RHACM supports the use of amtool to manage RHACM alerts. I hope this blog is helpful to you!