章工运维 章工运维
首页
  • linux
  • windows
  • 中间件
  • 监控
  • 网络
  • 存储
  • 安全
  • 防火墙
  • 数据库
  • 系统
  • docker
  • 运维工具
  • other
  • elk
  • K8S
  • ansible
  • Jenkins
  • GitLabCI_CD
  • 随笔
  • 面试
  • 工具
  • 收藏夹
  • Shell
  • python
  • golang
友链
  • 索引

    • 分类
    • 标签
    • 归档
    • 首页 (opens new window)
    • 关于我 (opens new window)
    • 图床 (opens new window)
    • 评论 (opens new window)
    • 导航栏 (opens new window)
周刊
GitHub (opens new window)

章工运维

业精于勤,荒于嬉
首页
  • linux
  • windows
  • 中间件
  • 监控
  • 网络
  • 存储
  • 安全
  • 防火墙
  • 数据库
  • 系统
  • docker
  • 运维工具
  • other
  • elk
  • K8S
  • ansible
  • Jenkins
  • GitLabCI_CD
  • 随笔
  • 面试
  • 工具
  • 收藏夹
  • Shell
  • python
  • golang
友链
  • 索引

    • 分类
    • 标签
    • 归档
    • 首页 (opens new window)
    • 关于我 (opens new window)
    • 图床 (opens new window)
    • 评论 (opens new window)
    • 导航栏 (opens new window)
周刊
GitHub (opens new window)
  • linux

  • windows

  • 中间件

  • 网络

  • 安全

  • 存储

  • 防火墙

  • 数据库

  • 系统

  • docker

  • other

  • 监控

    • zabbix

    • prometheus

      • prometheus监控、告警与存储
      • 使用docker-compose搭建promethes+grafana监控系统
      • prometheus添加钉钉消息告警
      • alertmanager实现某个时间段静默某些告警项
        • 【0】解决思路分析
        • 【1】API
          • API Version
          • POST /api/v2/alerts
          • GET /api/v2/alerts
          • GET /api/v2/alerts/groups
        • 【2】API使用实践
          • (2.1)基本案例
        • 【3】最佳实践(定时静默)
          • (3.1)立马构造出故障的告警信息
          • (3.2)构建好静默,获取静默语句
          • (3.3)使用生成静默规则
          • (3.4)自动化脚本
      • prometheus设置表达式触发告警
      • 监控MySQL运行状态:MySQLD Exporter
      • alertmanager设置定时静默告警脚本
      • 构建高大上的黑盒监控平台
      • prometheus使用一个redis_exporter监控所有redis实例
      • alertmanager-webhook与API
  • 运维
  • 监控
  • prometheus
章工运维
2023-01-11
目录

alertmanager实现某个时间段静默某些告警项

# 【0】解决思路分析

需求:我需要每天屏蔽某个时间段的某些告警项

比如:凌晨4点会异地备份,导致流量报警,我想屏蔽每天4点-4点30分的该告警项

思路:

直接操作是没有这个步骤的,曲线救国吧;

(1)定时任务,每天凌晨四点的 silences

(2)定时任务中使用 alertmanager api 来建设

# 【1】API

官网:AlertManager的API · prometheus · 看云 (opens new window)

# API Version

AlertManager有两套API,v1与v2,不过两套API的内部逻辑基本是一致的,调用哪套都没有关系。v1没有相关的文档,不过我们可以找到v2的相关文档。
API-v2的swagger文件的链接为:

alertmanager/openapi.yaml at main · prometheus/alertmanager · GitHub (opens new window)

把这个文件的内容拷贝到 https://editor.swagger.io (opens new window) 里面,便可以查看API。下面罗列了v2版本的所有API:

# Alert
GET    /api/v2/alerts
POST   /api/v2/alerts

# AlertGroup
GET    /api/v2/alerts/groups  

# General
GET    /api/v2/status

# Receiver
GET    /api/v2/receivers

# Silence
GET    /api/v2/silences
POST   /api/v2/silences
GET    /api/v2/silence/{silenceID}
DELETE /api/v2/silence/{silenceID}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18

其中最重要的是Alert与AlertGroup的那三个API,接下来我们详细地讲解一下

使用其中提供的url:http://127.0.0.1:9090/api/v1/alerts ,可以获取到报警的json信息,即可获得json的格式

[{"annotations":
{"description":
"localhost:9100 of job exporter has been down for more than 5 minutes.",
"summary":"Instance localhost:9100 down"},
"endsAt":"2021-11-25T08:13:56.026Z",
"fingerprint":"d44e90ffc89b2ea1",
"receivers":[{"name":"mail"}],
"startsAt":"2021-11-25T07:36:11.026Z",
"status":{"inhibitedBy":[],"silencedBy":[],"state":"active"},
"updatedAt":"2021-11-25T16:09:56.030+08:00",
"generatorURL":"http://localhost.localdomain:9090/graph?g0.expr=up+%3D%3D+0\u0026g0.tab=1",
"labels":{"alertname":"InstanceDown","instance":"localhost:9100","job":"exporter","severity":"page"}
}]
1
2
3
4
5
6
7
8
9
10
11
12
13

编写好的json文件可以使用curl语句进行测试

curl -i -k -H "Content-type: application/json" -X POST -d
[{"annotations":
{"description":
"localhost:9100 of job exporter has been down for more than 5 minutes.",
"summary":"Instance localhost:9100 down"},
"endsAt":"2021-11-25T08:13:56.026Z",
"fingerprint":"d44e90ffc89b2ea1",
"receivers":[{"name":"mail"}],
"startsAt":"2021-11-25T07:36:11.026Z",
"status":{"inhibitedBy":[],"silencedBy":[],"state":"active"},
"updatedAt":"2021-11-25T16:09:56.030+08:00",
"generatorURL":"http://localhost.localdomain:9090/graph?g0.expr=up+%3D%3D+0\u0026g0.tab=1",
"labels":{"alertname":"InstanceDown","instance":"localhost:9100","job":"exporter","severity":"page"}
}]
"http://192.168.217.22:9093/api/v1/alerts"
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15

# POST /api/v2/alerts

Body参数示例如下:

[
  {
    "labels": {"label": "value", ...},
    "annotations": {"label": "value", ...},
    "generatorURL": "string",
    "startsAt": "2020-01-01T00:00:00.000+08:00", # optional
    "endsAt": "2020-01-01T01:00:00.000+08:00" # optional
  },
  ...
]
1
2
3
4
5
6
7
8
9
10

Body参数是一个数组,里面是一个个的告警。其中startsAt与endsAt是可选参数,且格式必须是上面的那种,不能是时间戳。

# GET /api/v2/alerts

Query参数如下,以下参数用来过滤告警

参数名 类型 默认值 是否必须 其他说明
active bool true optional -
silenced bool true optional -
inhibited bool true optional -
unprocessed bool true optional -
filter array[string] 无 optional -
receiver string 无 optional -

其返回值如下:

[
  {
    "labels": {"label": "value", ...},
    "annotations": {"label": "value", ...},
    "generatorURL": "string",
    "startsAt": "2020-01-01T00:00:00.000+08:00", 
    "endsAt": "2020-01-01T01:00:00.000+08:00",
    "updatedAt": "2020-01-01T01:00:00.000+08:00",
    "fingerprint": "string"
    "receivers": [{"name": "string"}, ...],
    "status": {
      "state": "active", # active, unprocessed, ...
      "silencedBy": ["string", ...],
      "inhibitedBy": ["string", ...]
  },
  ...
]
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17

# GET /api/v2/alerts/groups

Query参数如下,以下参数用来过滤告警

参数名 类型 默认值 是否必须 其他说明
active bool true optional -
silenced bool true optional -
inhibited bool true optional -
unprocessed bool true optional -
filter array[string] 无 optional -
receiver string 无 optional -

其返回值如下:

[
  { 
    "labels": {"label": "value", ...},
    "receiver": {"name": "string"},  # 注意与alert的receivers的区别
    "alerts": [alert1, alert2, ...]  # alert的Json结构与 `GET /api/v2/alerts` 返回值中的结构一致
  },
  ...
]
1
2
3
4
5
6
7
8

# 【2】API使用实践

参考:实践:AlertManager · prometheus · 看云 (opens new window)

# (2.1)基本案例

构造测试数据:

# 构造测试数据

aa='[
    {
        "Labels": {
            "alertname": "NodeCpuPressure",
            "IP": "192.168.2.101"
        },
        "Annotations": {
            "summary": "NodeCpuPressure, IP: 192.168.2.101, Value: 90%, Threshold: 85%"
        },
        "StartsAt": "2020-02-17T23:00:00.000+08:00", 
        "EndsAt": "2023-02-18T23:00:00.000+08:00"
    }
]'

# 执行 POST
curl http://127.0.0.1:9093/api/v2/alerts -XPOST -H'Content-Type: application/json' -d"$aa"
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18

如果是使用V1版本,可以不用加  -H'Content-Type: application/json'

595b1be45e21be8e.png

怎么取消掉?把结束时间修改一下就好了;

我们可以通过  curl -X GET localhost:9093/api/v2/alerts  获取

e5fbd0fb17d4e6eb.png

然后修改其时间

427e43321ae4916c.png

然后我们在上述的操作前:

活跃且没被静默的 Alert

操作后:

# 【3】最佳实践(定时静默)

注意,静默条目所生成、显示的时间是UTC时间

# (3.1)立马构造出故障的告警信息

# (3.2)构建好静默,获取静默语句

最终结果:

curl '127.0.0.1:9093/api/v2/silences' \
  -H 'Connection: keep-alive' \
  -H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36' \
  -H 'Content-Type: application/json' \
  -H 'Accept: */*' \
  -H 'Origin: http://127.0.0.1:9093' \
  -H 'Referer: http://127.0.0.1:9093/' \
  -H 'Accept-Language: zh-CN,zh;q=0.9' \
  -H 'Cookie: grafana_session=a504e4a78501efe7009fa8b7587d5fb4' \
  --data-raw '{"matchers":[{"name":"alertname","value":"磁盘读吞吐过高","isRegex":false},{"name":"instance","value":"127.0.0.1:9182","isRegex":false},{"name":"job","value":"测试_win","isRegex":false},{"name":"name","value":"测试数据库鸭[47.103.57.124]","isRegex":false},{"name":"severity","value":"warning","isRegex":false}],"startsAt":"2022-03-17T07:44:45.446Z","endsAt":"2022-03-17T09:44:45.446Z","createdBy":"guochaoqun","comment":"tmp","id":null}' \
  --compressed \
  --insecure
1
2
3
4
5
6
7
8
9
10
11
12

# (3.3)使用生成静默规则

V1方式

curl -X POST http://127.0.0.1:9093/api/v1/silences -d'{"matchers":[{"name":"alertname","value":"磁盘读吞 吐过高","isRegex":false},{"name":"instance","value":"47.103.57.124:9182","isRegex":false},{"name":"job","value":"test","isRegex":false},{"name":"name","value":"test库[47.103.57.124]","isRegex":false},{"name":"severity","value":"warning","isRegex":false}],"startsAt":"2022-03-17T07:44:45.446Z","endsAt":"2022-03-17T09:44:45.446Z","createdBy":"guochaoqun","comment":"tmp","id":null}' 
1

V2 方式,就必须要加 -H

curl '127.0.0.1:9093/api/v2/silences' \-H 'Content-Type: application/json' \
  --data '{"matchers":[{"name":"alertname","value":"磁盘读吞吐过高","isRegex":false},{"name":"instance","value":"127.0.0.1:9182","isRegex":false},{"name":"job","value":"测试_win","isRegex":false},{"name":"name","value":"测试库[47.103.57.124]","isRegex":false},{"name":"severity","value":"warning","isRegex":false}],"startsAt":"2022-03-17T07:44:45.446Z","endsAt":"2022-03-17T09:44:45.446Z","createdBy":"guochaoqun","comment":"tmp","id":null}' \
  --compressed \
  --insecure
1
2
3
4

# (3.4)自动化脚本

now_date=`date +%F --date="-1 day"`
curl 'http://127.0.0.1:9093/api/v2/silences' \
-H 'Content-Type: application/json' \
--data '{"matchers":[{"name":"alertname","value":"磁盘读吞吐过高","isRegex":false},{"name":"instance","value":"47.103.57.124:9182","isRegex":false},{"name":"job","value":"金游世界_win","isRegex":false},{"name":"name","value":"8833主库_详细对局[47.103.57.124]","isRegex":false},{"name":"severity","value":"warning","isRegex":false}],"startsAt":"'${now_date}'T20:10:45.446Z","endsAt":"'${now_date}'T20:40:45.446Z","createdBy":"guochaoqun","comment":"tmp","id":null}' \
--compressed --insecure
1
2
3
4
5

原文链接:

https://www.cnblogs.com/gered/p/15946822.html

微信 支付宝
上次更新: 2024/12/30, 21:04:26

← prometheus添加钉钉消息告警 prometheus设置表达式触发告警→

最近更新
01
shell脚本模块集合
05-13
02
生活小技巧(认知版)
04-29
03
生活小技巧(防骗版)
04-29
更多文章>
Theme by Vdoing | Copyright © 2019-2025 | 点击查看十年之约 | 鄂ICP备2024072800号
  • 跟随系统
  • 浅色模式
  • 深色模式
  • 阅读模式