
mail-sender's Introduction

mail-sender

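Nightingale's design philosophy is to push alert events into Redis and leave it at that. Separate senders then read the events from Redis and deliver them. There are too many ways to deliver alerts for one project to support them all, so contributions from the community are welcome.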

Email is the most common way to deliver alerts, so I wrote mail-sender as a reference implementation.
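The queue-and-sender split described above can be sketched as follows. This is only an illustration of the pattern, not mail-sender's actual code: the `Event` fields, the `Queue` and `Sender` interfaces, and the in-memory queue standing in for Redis are all hypothetical, and the subject format is loosely modeled on the subjects that appear in mail-sender's logs.

```go
package main

import (
	"errors"
	"fmt"
)

// Event is a hypothetical, trimmed-down alert event.
// The real payload stored in Redis carries many more fields.
type Event struct {
	Sname     string // strategy name
	Endpoint  string
	Priority  int
	EventType string // e.g. "alert" or "recovery"
}

// Queue abstracts the event source. A real sender would pop from Redis;
// this sketch uses an in-memory slice so it runs without any server.
type Queue interface {
	Pop() (Event, bool)
}

// Sender abstracts the delivery channel (mail, SMS, chat, ...).
type Sender interface {
	Send(subject, body string) error
}

type memQueue struct{ events []Event }

func (q *memQueue) Pop() (Event, bool) {
	if len(q.events) == 0 {
		return Event{}, false
	}
	e := q.events[0]
	q.events = q.events[1:]
	return e, true
}

// recordingSender remembers subjects instead of sending real mail.
type recordingSender struct{ sent []string }

func (s *recordingSender) Send(subject, body string) error {
	if subject == "" {
		return errors.New("empty subject")
	}
	s.sent = append(s.sent, subject)
	return nil
}

// subject renders a one-line summary of the event.
func subject(e Event) string {
	return fmt.Sprintf("[P%d %s] %s - %s", e.Priority, e.EventType, e.Sname, e.Endpoint)
}

// drain pops events until the queue is empty and hands each one to the
// sender. It returns how many events were delivered without error.
func drain(q Queue, s Sender) int {
	delivered := 0
	for {
		e, ok := q.Pop()
		if !ok {
			return delivered
		}
		if err := s.Send(subject(e), e.Sname); err != nil {
			fmt.Println("send failed:", err)
			continue
		}
		delivered++
	}
}

func main() {
	q := &memQueue{events: []Event{
		{Sname: "test_alert", Endpoint: "172.18.73.150", Priority: 3, EventType: "recovery"},
	}}
	s := &recordingSender{}
	fmt.Println(drain(q, s), s.sent)
}
```

Keeping the queue and the delivery channel behind small interfaces is what lets a different sender (SMS, chat, and so on) reuse the same loop.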

compile

cd $GOPATH/src
mkdir -p github.com/n9e
cd github.com/n9e
git clone https://github.com/n9e/mail-sender.git
cd mail-sender
./control build

Once the build finishes, you have the binary.

configuration

Reading alert events requires the Redis connection address, and sending mail requires the SMTP settings. Edit etc/mail-sender.yml directly to set both.
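As a rough illustration of what the SMTP part of that file has to provide, the sketch below models the `smtp` keys that a user posted in one of the issues (`host`, `port`, `user`, `pass`, `insecureSkipVerify`) as a Go struct with a basic sanity check. The struct, its `Validate` and `Addr` methods, and the example values are assumptions made for illustration. The Redis keys are not shown anywhere in this README, so they are left out.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strconv"
)

// SMTPConfig is a hypothetical in-memory form of the smtp section of
// etc/mail-sender.yml. Field names follow the keys seen in a user's config.
type SMTPConfig struct {
	Host               string
	Port               int
	User               string
	Pass               string
	InsecureSkipVerify bool
}

// Validate catches the most common omissions before any network call.
func (c SMTPConfig) Validate() error {
	switch {
	case c.Host == "":
		return errors.New("smtp.host is empty")
	case c.Port <= 0 || c.Port > 65535:
		return fmt.Errorf("smtp.port %d is out of range", c.Port)
	case c.User == "":
		return errors.New("smtp.user is empty")
	case c.Pass == "":
		return errors.New("smtp.pass is empty")
	}
	return nil
}

// Addr returns the host:port pair used when dialing the server.
func (c SMTPConfig) Addr() string {
	return net.JoinHostPort(c.Host, strconv.Itoa(c.Port))
}

func main() {
	// Placeholder values, not real credentials.
	cfg := SMTPConfig{Host: "smtp.example.com", Port: 465, User: "alerts@example.com", Pass: "secret"}
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("will dial", cfg.Addr())
}
```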

pack

After building, you can package the binary together with the configuration files and ship the archive to production:

tar zcvf mail-sender.tar.gz mail-sender etc/mail.html etc/mail-sender.yml

test

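Edit etc/mail-sender.yml and fill in the relevant settings, then check that SMTP works by running ./mail-sender -t [email protected]. The program automatically reads the configuration files in the etc directory and sends a test email to [email protected].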

run

If the test email is delivered, deploy it to production and run it under a process supervisor such as systemd or supervisor. An example systemd unit:

$ cat mail-sender.service
[Unit]
Description=Nightingale mail sender
After=network-online.target
Wants=network-online.target

[Service]
User=root
Group=root

Type=simple
ExecStart=/home/n9e/mail-sender
WorkingDirectory=/home/n9e

Restart=always
RestartSec=1
StartLimitInterval=0

[Install]
WantedBy=multi-user.target

mail-sender's People

Contributors

ulricqin

mail-sender's Issues

Email alerts fail in an intranet environment

runner.cwd: /root/gopath/src/github.com/didi/nightingale
runner.hostname: **************
parse configuration file: /root/gopath/src/github.com/didi/nightingale/etc/mail-sender.yml
panic: dial tcp: lookup mail.*******.com on [::1]:53: read udp [::1]:41768->[::1]:53: read: connection refused

goroutine 1 [running]:
github.com/n9e/mail-sender/config.TestSMTP(0xc000085460, 0x1, 0x1)
C:/Users/12396/Desktop/mail-sender-master/config/funcs.go:48 +0x685
main.main()
C:/Users/12396/Desktop/mail-sender-master/main.go:55 +0xcb

Email sending fails

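Hello.
While trying out Nightingale, I tested alert email delivery. The alert email was not sent, and mail-sender produced no corresponding log entries.
My steps were:

  1. Ran the -t test; the test email arrived.
  2. Started mail-sender; it started successfully.
  3. Simulated an alert; monapi logged it, and the alert also appeared on the web page.
  4. Checked the mailbox; no alert email arrived, and the mail-sender log did not change.

Nightingale version: 3.1.6

The monapi log follows.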

2020-11-24 10:19:12.906605 INFO alarm/event_consumer.go:183 converge max counts: 1 reached, currend: 1, event hashid: 707578489827945943
2020-11-24 10:19:12.907895 INFO alarm/event_consumer.go:283 set event status succ, event hasid: 707578489827945943, status: converge
2020-11-24 10:19:12.908797 INFO alarm/event_consumer.go:290 set event_cur status succ, event hashid: 707578489827945943, status: converge
2020-11-24 10:22:13.006206 INFO alarm/event_consumer.go:183 converge max counts: 1 reached, currend: 1, event hashid: 707578489827945943
2020-11-24 10:22:13.008335 INFO alarm/event_consumer.go:283 set event status succ, event hasid: 707578489827945943, status: converge
2020-11-24 10:22:13.009272 INFO alarm/event_consumer.go:290 set event_cur status succ, event hashid: 707578489827945943, status: converge
2020-11-24 10:22:52.814817 INFO alarm/event_merge.go:63 hset event to mon-merge succ, event: &{Id:95 Sid:3 Sname:test_alert Nid:14 NodePath:A_group.B_project.C_app.server_resource CurNodePath:A_group.B_project.C_app.server_resource Endpoint:172.18.73.150 Priority:3 EventType:recovery Category:1 Status:0 HashId:707578489827945943 Etime:1606184570 Value:proc.port.listen: 1 Info: proc.port.listen(all,180s) = 0 Created:2020-11-24 10:22:52.811910391 +0800 CST Detail:[{"metric":"proc.port.listen","tags":{"port":"9999","service":"port_9999"},"points":[{"timestamp":1606184570,"value":1.000000,"extra":""},{"timestamp":1606184560,"value":0.000000,"extra":""},{"timestamp":1606184550,"value":0.000000,"extra":""},{"timestamp":1606184540,"value":0.000000,"extra":""},{"timestamp":1606184530,"value":0.000000,"extra":""},{"timestamp":1606184520,"value":0.000000,"extra":""},{"timestamp":1606184510,"value":0.000000,"extra":""},{"timestamp":1606184500,"value":0.000000,"extra":""},{"timestamp":1606184490,"value":0.000000,"extra":""},{"timestamp":1606184480,"value":0.000000,"extra":""},{"timestamp":1606184470,"value":0.000000,"extra":""},{"timestamp":1606184460,"value":0.000000,"extra":""},{"timestamp":1606184450,"value":0.000000,"extra":""},{"timestamp":1606184440,"value":0.000000,"extra":""},{"timestamp":1606184430,"value":0.000000,"extra":""},{"timestamp":1606184420,"value":0.000000,"extra":""},{"timestamp":1606184410,"value":0.000000,"extra":""},{"timestamp":1606184400,"value":0.000000,"extra":""}]}] Users:[1] Groups:[] Runbook: NeedUpgrade:0 AlertUpgrade:{"users":"[1]","groups":"[]","duration":60,"level":1} RecvUserIDs:[1] RealUpgrade:false WorkGroups:[] CurNid:}
2020-11-24 10:23:50.846811 INFO alarm/event_consumer.go:283 set event status succ, event hasid: 707578489827945943, status: send
2020-11-24 10:23:50.847067 INFO alarm/event_merge.go:145 hdel events succ, eventStringsHashKey: [mon-merge {"id":95,"sid":3,"sname":"test_alert","nid":14,"node_path":"A_group.B_project.C_app.server_resource","cur_node_path":"A_group.B_project.C_app.server_resource","endpoint":"172.18.73.150","priority":3,"event_type":"recovery","category":1,"status":0,"hashid":707578489827945943,"etime":1606184570,"value":"proc.port.listen: 1","info":" proc.port.listen(all,180s) = 0","created":"2020-11-24T10:22:52.811910391+08:00","detail":"[{\"metric\":\"proc.port.listen\",\"tags\":{\"port\":\"9999\",\"service\":\"port_9999\"},\"points\":[{\"timestamp\":1606184570,\"value\":1.000000,\"extra\":\"\"},{\"timestamp\":1606184560,\"value\":0.000000,\"extra\":\"\"},{\"timestamp\":1606184550,\"value\":0.000000,\"extra\":\"\"},{\"timestamp\":1606184540,\"value\":0.000000,\"extra\":\"\"},{\"timestamp\":1606184530,\"value\":0.000000,\"extra\":\"\"},{\"timestamp\":1606184520,\"value\":0.000000,\"extra\":\"\"},{\"timestamp\":1606184510,\"value\":0.000000,\"extra\":\"\"},{\"timestamp\":1606184500,\"value\":0.000000,\"extra\":\"\"},{\"timestamp\":1606184490,\"value\":0.000000,\"extra\":\"\"},{\"timestamp\":1606184480,\"value\":0.000000,\"extra\":\"\"},{\"timestamp\":1606184470,\"value\":0.000000,\"extra\":\"\"},{\"timestamp\":1606184460,\"value\":0.000000,\"extra\":\"\"},{\"timestamp\":1606184450,\"value\":0.000000,\"extra\":\"\"},{\"timestamp\":1606184440,\"value\":0.000000,\"extra\":\"\"},{\"timestamp\":1606184430,\"value\":0.000000,\"extra\":\"\"},{\"timestamp\":1606184420,\"value\":0.000000,\"extra\":\"\"},{\"timestamp\":1606184410,\"value\":0.000000,\"extra\":\"\"},{\"timestamp\":1606184400,\"value\":0.000000,\"extra\":\"\"}]}]","users":"[1]","groups":"[]","runbook":"","need_upgrade":0,"alert_upgrade":"{\"users\":\"[1]\",\"groups\":\"[]\",\"duration\":60,\"level\":1}","recv_user_ids":[1],"real_upgrade":false,"work_groups":null,"cur_nid":""}]
2020-11-24 10:23:50.849086 INFO notify/notify.go:78 sendMail: &{Id:95 Sid:3 Sname:test_alert Nid:14 NodePath:A_group.B_project.C_app.server_resource CurNodePath:A_group.B_project.C_app.server_resource Endpoint:172.18.73.150 Priority:3 EventType:recovery Category:1 Status:0 HashId:707578489827945943 Etime:1606184570 Value:proc.port.listen: 1 Info: proc.port.listen(all,180s) = 0 Created:2020-11-24 10:22:52.811910391 +0800 CST Detail:[{"metric":"proc.port.listen","tags":{"port":"9999","service":"port_9999"},"points":[{"timestamp":1606184570,"value":1.000000,"extra":""},{"timestamp":1606184560,"value":0.000000,"extra":""},{"timestamp":1606184550,"value":0.000000,"extra":""},{"timestamp":1606184540,"value":0.000000,"extra":""},{"timestamp":1606184530,"value":0.000000,"extra":""},{"timestamp":1606184520,"value":0.000000,"extra":""},{"timestamp":1606184510,"value":0.000000,"extra":""},{"timestamp":1606184500,"value":0.000000,"extra":""},{"timestamp":1606184490,"value":0.000000,"extra":""},{"timestamp":1606184480,"value":0.000000,"extra":""},{"timestamp":1606184470,"value":0.000000,"extra":""},{"timestamp":1606184460,"value":0.000000,"extra":""},{"timestamp":1606184450,"value":0.000000,"extra":""},{"timestamp":1606184440,"value":0.000000,"extra":""},{"timestamp":1606184430,"value":0.000000,"extra":""},{"timestamp":1606184420,"value":0.000000,"extra":""},{"timestamp":1606184410,"value":0.000000,"extra":""},{"timestamp":1606184400,"value":0.000000,"extra":""}]}] Users:[1] Groups:[] Runbook: NeedUpgrade:0 AlertUpgrade:{"users":"[1]","groups":"[]","duration":60,"level":1} RecvUserIDs:[1] RealUpgrade:false WorkGroups:[] CurNid:}

Thanks.

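No alert emails are received after configuring email alerts in Nightingale

1. Nightingale was built from source and email alerting was configured. mail-sender.yml is set up as follows: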

smtp:
  host: "smtp.qq.com"
  port: 465
  user: "13443*[email protected]"
  pass: "**********"
  insecureSkipVerify: true

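Running the test ./mail-sender -t 13443*[email protected] works: the email is sent and received.

2. After the user's email address was set in Nightingale, the alert strategies triggered many alerts, but the configured mailbox received none of the alert emails.
3. I created many new alert strategies. All of them fired, and the notification result shows as sent.
4. The mail-sender logs show "Authentication unsuccessful", as below. Since the test succeeds, why does authentication fail here, and how can it be fixed?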

2020-08-05 14:59:52.928990 INFO cron/sender.go:85 hashid: 671834325638967801: subject: [P1 告警]测试CPU idle - 10.07.24.210(Ubuntu-210), tos: [13443*[email protected] 13443*[email protected]], error: 535 5.7.3 Authentication unsuccessful
2020-08-05 14:59:52.929017 INFO cron/sender.go:86 hashid: 671834325638967801: endpoint: 10.07.24.210(Ubuntu-210), metric: cpu.idle, tags: 
2020-08-05 14:59:54.746297 INFO cron/sender.go:85 hashid: 906849324607340805: subject: [P1 告警]测试CPU idle - 192.168.44.1(笔记本电脑), tos: [13443*[email protected] 13443*[email protected]], error: 535 5.7.3 Authentication unsuccessful
2020-08-05 14:59:54.746608 INFO cron/sender.go:86 hashid: 906849324607340805: endpoint: 192.168.44.1(笔记本电脑), metric: cpu.idle, tags: 
2020-08-05 14:59:59.330719 INFO cron/sender.go:85 hashid: 144856238769219143: subject: [P1 告警]测试CPU idle - 10.07.29.80(GPU-TEST), tos: [13443*[email protected] 13443*[email protected]], error: 535 5.7.3 Authentication unsuccessful

