Last updated: 2024-07-17


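I installed Prometheus earlier, but monitoring without alerting is not enough, so I paired it with Alertmanager and DingTalk to get machine alerts. These notes record how to set it up and use it.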

Set up the DingTalk webhook

[Screenshots of the DingTalk webhook setup steps: 1719115086240.webp, 1719115097542.webp, 1719115107161.webp, 1719115111401.webp, 1719115115249.webp, 1719115119177.webp, 1719115123767.webp, 1719115128206.webp]

Set up prometheus-webhook-dingtalk

GitHub: https://github.com/timonwong/prometheus-webhook-dingtalk/

Download prometheus-webhook-dingtalk

~# wget https://github.com/timonwong/prometheus-webhook-dingtalk/releases/download/v2.0.0/prometheus-webhook-dingtalk-2.0.0.linux-amd64.tar.gz
~# tar xf prometheus-webhook-dingtalk-2.0.0.linux-amd64.tar.gz -C /usr/local/
~# ln -sv /usr/local/prometheus-webhook-dingtalk-2.0.0.linux-amd64/ /usr/local/prometheus-webhook-dingtalk
'/usr/local/prometheus-webhook-dingtalk' -> '/usr/local/prometheus-webhook-dingtalk-2.0.0.linux-amd64/'

prometheus-webhook-dingtalk usage help

usage: prometheus-webhook-dingtalk [<flags>]

Flags:
  -h, --help                    Show context-sensitive help (also try --help-long and --help-man).
      --web.listen-address=:8060
                                The address to listen on for web interface.
      --web.enable-ui           Enable Web UI mounted on /ui path
      --web.enable-lifecycle    Enable reload via HTTP request.
      --config.file=config.yml  Path to the configuration file.
      --log.level=info          Only log messages with the given severity or above. One of: [debug, info, warn, error]
      --log.format=logfmt       Output format of log messages. One of: [logfmt, json]
      --version                 Show application version.

prometheus-webhook-dingtalk configuration file

~# cat /usr/local/prometheus-webhook-dingtalk/config.yml
## Request timeout
timeout: 5s

## Customizable templates path
templates:
  - contrib/templates/legacy/template.tmpl

## You can also override default template using `default_message`
## The following example to use the 'legacy' template from v0.3.0
default_message:
  title: '{{ template "legacy.title" . }}'
  text: '{{ template "legacy.content" . }}'

## Targets, previously was known as "profiles"
targets:
  webhook1:
    url: https://oapi.dingtalk.com/robot/send?access_token=2c12e095bf94e7fdde88cf3379023f800ecc26a44d25f3002a781e1cee825ad4
    # secret for signature
    secret: SEC01944594567bfc02c1888dacbdf8115b4b6725b39fa26bd300bd3455fdc20e3b
  webhook_mention_all:
    url: https://oapi.dingtalk.com/robot/send?access_token=2c12e095bf94e7fdde88cf3379023f800ecc26a44d25f3002a781e1cee825ad4
    secret: SEC01944594567bfc02c1888dacbdf8115b4b6725b39fa26bd300bd3455fdc20e3b
    mention:
      all: true
  webhook_mention_users:
    url: https://oapi.dingtalk.com/robot/send?access_token=2c12e095bf94e7fdde88cf3379023f800ecc26a44d25f3002a781e1cee825ad4
    mention:
      mobiles: ['13618666666']
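
The `secret` values above are the signing keys that DingTalk issues when a custom robot uses the signature security setting. prometheus-webhook-dingtalk does the signing itself, so nothing here has to be written by hand. The Python sketch below only illustrates, as far as I understand DingTalk's documented scheme, how the `timestamp` and `sign` query parameters are derived. The token and secret are placeholders, and `dingtalk_sign` and `sign_url` are helper names made up for this example.

```python
import base64
import hashlib
import hmac
import time
import urllib.parse
from typing import Optional


def dingtalk_sign(secret: str, timestamp_ms: int) -> str:
    """Return the URL-encoded signature for a secret and a millisecond timestamp."""
    # The signed string is "<timestamp>\n<secret>", keyed by the secret (HMAC-SHA256).
    string_to_sign = f"{timestamp_ms}\n{secret}".encode("utf-8")
    digest = hmac.new(secret.encode("utf-8"), string_to_sign, hashlib.sha256).digest()
    # The raw digest is Base64-encoded, then URL-encoded for the query string.
    return urllib.parse.quote_plus(base64.b64encode(digest))


def sign_url(webhook_url: str, secret: str, timestamp_ms: Optional[int] = None) -> str:
    """Append timestamp and sign parameters to a robot webhook URL (hypothetical helper)."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    signature = dingtalk_sign(secret, timestamp_ms)
    return f"{webhook_url}&timestamp={timestamp_ms}&sign={signature}"


# Placeholder values, not real credentials.
print(sign_url("https://oapi.dingtalk.com/robot/send?access_token=TOKEN",
               "SECexample", 1700000000000))
```

Because the timestamp is part of the signed string, a large clock drift on the host can cause DingTalk to reject requests, so keeping the machine's time synchronized is worthwhile.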

Add the dingtalk.service unit file

~# cat /lib/systemd/system/dingtalk.service
[Unit]
Description=dingtalk
Documentation=https://github.com/timonwong/prometheus-webhook-dingtalk/
After=network.target

[Service]
Restart=on-failure
WorkingDirectory=/usr/local/prometheus-webhook-dingtalk
ExecStart=/usr/local/prometheus-webhook-dingtalk/prometheus-webhook-dingtalk --config.file=/usr/local/prometheus-webhook-dingtalk/config.yml

[Install]
WantedBy=multi-user.target

Enable dingtalk at boot and start it

~# systemctl enable dingtalk
Created symlink /etc/systemd/system/multi-user.target.wants/dingtalk.service → /lib/systemd/system/dingtalk.service.
~# systemctl start dingtalk
~# systemctl status dingtalk
● dingtalk.service
     Loaded: loaded (/lib/systemd/system/dingtalk.service; disabled; vendor preset: enabled)
     Active: active (running) since Wed 2021-12-01 14:29:35 CST; 4s ago
       Docs: https://github.com/timonwong/prometheus-webhook-dingtalk/
   Main PID: 26590 (prometheus-webh)
      Tasks: 7 (limit: 7069)
     Memory: 2.5M
     CGroup: /system.slice/dingtalk.service
             └─26590 /usr/local/prometheus-webhook-dingtalk/prometheus-webhook-dingtalk --config.file=/usr/local/prometheus>

Dec 01 14:29:35 nacos-03 systemd[1]: Started dingtalk.service.
Dec 01 14:29:35 nacos-03 prometheus-webhook-dingtalk[26590]: level=info ts=2021-12-01T06:29:35.918Z caller=main.go:60 msg=">
Dec 01 14:29:35 nacos-03 prometheus-webhook-dingtalk[26590]: level=info ts=2021-12-01T06:29:35.918Z caller=main.go:61 msg=">
Dec 01 14:29:35 nacos-03 prometheus-webhook-dingtalk[26590]: level=info ts=2021-12-01T06:29:35.918Z caller=coordinator.go:8>
Dec 01 14:29:35 nacos-03 prometheus-webhook-dingtalk[26590]: level=info ts=2021-12-01T06:29:35.919Z caller=coordinator.go:9>
Dec 01 14:29:35 nacos-03 prometheus-webhook-dingtalk[26590]: level=info ts=2021-12-01T06:29:35.919Z caller=main.go:98 compo>
Dec 01 14:29:35 nacos-03 prometheus-webhook-dingtalk[26590]: ts=2021-12-01T06:29:35.920Z caller=main.go:114 component=confi>
Dec 01 14:29:35 nacos-03 prometheus-webhook-dingtalk[26590]: level=info ts=2021-12-01T06:29:35.920Z caller=web.go:210 compo>

Verify the dingtalk listening port

~# lsof -i :8060
COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
prometheu 26590 root    3u  IPv6 100982      0t0  TCP *:8060 (LISTEN)
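
A listening port only shows that the process is up. A further check, sketched below in Python under some assumptions, is to post a hand-built Alertmanager-style payload straight to one of the targets. The `/dingtalk/<target>/send` path matches the URLs used in alertmanager.yml; the alert name, label values, and `externalURL`/`generatorURL` values are invented for illustration, and the payload shape follows Alertmanager's version 4 webhook format as I understand it.

```python
import json
import urllib.request
from datetime import datetime, timezone


def target_url(host: str, port: int, target: str) -> str:
    """Build the send URL for a target defined under `targets:` in config.yml."""
    return f"http://{host}:{port}/dingtalk/{target}/send"


def build_test_payload(alertname: str = "TestAlert") -> dict:
    """Return a minimal payload shaped like an Alertmanager webhook notification."""
    labels = {"alertname": alertname, "severity": "warning", "instance": "test-host"}
    annotations = {"summary": "Manual test alert",
                   "description": "Sent by hand to test the webhook"}
    return {
        "version": "4",
        "groupKey": f'{{}}:{{alertname="{alertname}"}}',
        "status": "firing",
        "receiver": "test",
        "groupLabels": {"alertname": alertname},
        "commonLabels": labels,
        "commonAnnotations": annotations,
        "externalURL": "http://localhost:9093",  # made-up value
        "alerts": [{
            "status": "firing",
            "labels": labels,
            "annotations": annotations,
            "startsAt": datetime.now(timezone.utc).isoformat(),
            "endsAt": "0001-01-01T00:00:00Z",
            "generatorURL": "http://localhost:9090/graph",  # made-up value
        }],
    }


def send_test_alert(url: str) -> int:
    """POST the test payload and return the HTTP status code."""
    body = json.dumps(build_test_payload()).encode("utf-8")
    req = urllib.request.Request(url, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status


# Example (needs the service running and a valid robot token):
# send_test_alert(target_url("192.168.174.105", 8060, "webhook1"))
```

If the call succeeds but no message arrives in the group, the service log (`journalctl -u dingtalk`) is the first place to look.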

Configure Alertmanager

Edit alertmanager.yml

~# cat /usr/local/alertmanager/alertmanager.yml
global:
  resolve_timeout: 5m

route:
  group_by: ['alertname', 'severity', 'namespace']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 10s
  receiver: 'dingding.webhook1'
  routes:
  - receiver: 'dingding.webhook1'
    match:
      team: DevOps
    group_wait: 10s
    group_interval: 15s
    repeat_interval: 3h
  - receiver: 'dingding.webhook.all'
    match:
      team: SRE
    group_wait: 10s
    group_interval: 15s
    repeat_interval: 3h

receivers:
- name: 'dingding.webhook1'
  webhook_configs:
  - url: 'http://192.168.174.105:8060/dingtalk/webhook1/send'
    send_resolved: true
- name: 'dingding.webhook.all'
  webhook_configs:
  - url: 'http://192.168.174.105:8060/dingtalk/webhook_mention_all/send'
    send_resolved: true
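
How the routing tree above picks a receiver can be sketched in a few lines of Python. This is a deliberately simplified model (exact-match `match` blocks, a single level of child routes, first match wins, no `continue`), not Alertmanager's actual implementation; `ROUTES`, `DEFAULT_RECEIVER`, and `pick_receiver` are names invented here.

```python
# Simplified model of the route tree from alertmanager.yml above.
DEFAULT_RECEIVER = "dingding.webhook1"
ROUTES = [
    {"match": {"team": "DevOps"}, "receiver": "dingding.webhook1"},
    {"match": {"team": "SRE"}, "receiver": "dingding.webhook.all"},
]


def pick_receiver(labels: dict) -> str:
    """Return the receiver of the first child route whose match labels all equal the alert's."""
    for route in ROUTES:
        if all(labels.get(key) == value for key, value in route["match"].items()):
            return route["receiver"]
    # No child route matched, so the alert stays on the root route.
    return DEFAULT_RECEIVER


print(pick_receiver({"alertname": "HighCPU", "team": "SRE"}))     # dingding.webhook.all
print(pick_receiver({"alertname": "HighCPU", "team": "DevOps"}))  # dingding.webhook1
print(pick_receiver({"alertname": "HighCPU"}))                    # dingding.webhook1
```

Label values are compared exactly, so an alert labeled `team: sre` would fall through to the root receiver. Alerts that stay on the root route also use its `repeat_interval: 10s`, which resends very often and is probably only suitable while testing.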

Restart the Alertmanager service

~# systemctl restart alertmanager.service 

Verify the alert message

[Screenshot of the alert message received in DingTalk: 1719125810513.webp]
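
To trigger an end-to-end notification without waiting for a real rule to fire, a test alert can be pushed into Alertmanager through its v2 HTTP API. The Python sketch below assumes Alertmanager listens on its default port 9093 on the same host as the webhook service; the alert name and annotation text are invented, and the `team` label is chosen so that the alert follows the SRE route defined in alertmanager.yml.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone


def build_alerts(team: str, minutes: int = 5) -> list:
    """Return a one-element alert list in the shape accepted by POST /api/v2/alerts."""
    now = datetime.now(timezone.utc)
    return [{
        "labels": {"alertname": "ManualTest", "severity": "warning", "team": team},
        "annotations": {"summary": "Manual end-to-end test"},
        "startsAt": now.isoformat(),
        # Once endsAt has passed, Alertmanager treats the alert as resolved.
        "endsAt": (now + timedelta(minutes=minutes)).isoformat(),
    }]


def post_alerts(base_url: str, alerts: list) -> int:
    """POST alerts to Alertmanager and return the HTTP status code."""
    req = urllib.request.Request(
        f"{base_url}/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status


# Example (assumed address; adjust to your Alertmanager):
# post_alerts("http://192.168.174.105:9093", build_alerts("SRE"))
```

With `send_resolved: true` on the receivers, a second message should follow once the test alert's `endsAt` time has passed.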