tencentyun / tencentcloud-exporter
TencentCloud Prometheus Exporter
License: MIT License
Symptom: metrics with a specific label cannot be found in Prometheus
Steps:
reload ${namespace} instances every 300 minutes
Logs:
level=info ts=2021-07-20T08:57:08.074Z caller=qcloud_exporter.go:86 msg="Starting qcloud_exporter" version="(version=, branch=, revision=)"
level=info ts=2021-07-20T08:57:08.074Z caller=qcloud_exporter.go:87 msg="Build context" build_context="(go=go1.16.5, user=, date=)"
level=info ts=2021-07-20T08:57:08.080Z caller=qcloud_exporter.go:94 msg="Load config ok"
level=info ts=2021-07-20T08:57:08.634Z caller=cache.go:76 msg="Reload metric meta cache" namespace=QCE/CMONGO num=34
level=info ts=2021-07-20T08:57:09.521Z caller=cache.go:104 msg="Reload instance cache" num=4 changed=4
level=info ts=2021-07-20T08:57:09.522Z caller=product.go:227 msg="Init all query ok" Namespace=QCE/CMONGO numMetric=34 numSeries=160
level=info ts=2021-07-20T08:57:09.522Z caller=collector.go:117 msg="Create product collecter ok" Namespace=QCE/CMONGO
level=info ts=2021-07-20T08:57:09.522Z caller=collector.go:124 msg="reload QCE/CMONGO instances every 300 minutes"
level=info ts=2021-07-20T08:57:09.765Z caller=cache.go:76 msg="Reload metric meta cache" namespace=QCE/REDIS_MEM num=75
level=info ts=2021-07-20T08:57:10.187Z caller=cache.go:104 msg="Reload instance cache" num=16 changed=16
level=info ts=2021-07-20T08:57:11.838Z caller=product.go:227 msg="Init all query ok" Namespace=QCE/REDIS_MEM numMetric=26 numSeries=3896
level=info ts=2021-07-20T08:57:11.838Z caller=collector.go:117 msg="Create product collecter ok" Namespace=QCE/REDIS_MEM
level=info ts=2021-07-20T08:57:12.339Z caller=cache.go:76 msg="Reload metric meta cache" namespace=QCE/CDB num=302
level=info ts=2021-07-20T08:57:12.748Z caller=cache.go:104 msg="Reload instance cache" num=135 changed=135
level=info ts=2021-07-20T08:57:12.754Z caller=product.go:227 msg="Init all query ok" Namespace=QCE/CDB numMetric=11 numSeries=1485
level=info ts=2021-07-20T08:57:12.754Z caller=collector.go:117 msg="Create product collecter ok" Namespace=QCE/CDB
level=info ts=2021-07-20T08:57:12.988Z caller=cache.go:76 msg="Reload metric meta cache" namespace=QCE/NAT_GATEWAY num=6
level=info ts=2021-07-20T08:57:13.405Z caller=cache.go:104 msg="Reload instance cache" num=1 changed=1
level=info ts=2021-07-20T08:57:13.405Z caller=product.go:227 msg="Init all query ok" Namespace=QCE/NAT_GATEWAY numMetric=6 numSeries=6
level=info ts=2021-07-20T08:57:13.405Z caller=collector.go:117 msg="Create product collecter ok" Namespace=QCE/NAT_GATEWAY
level=info ts=2021-07-20T08:57:13.405Z caller=collector.go:124 msg="reload QCE/NAT_GATEWAY instances every 300 minutes"
level=info ts=2021-07-20T08:57:13.405Z caller=collector.go:131 msg="Create all product collecter ok" num=4
level=info ts=2021-07-20T08:57:13.405Z caller=qcloud_exporter.go:114 msg="Listening on" address=:9123
After this the log contains only "Start collect ...... Collect done" entries, with nothing useful.
level=info ts=2021-08-12T11:34:35.968Z caller=qcloud_exporter.go:86 msg="Starting qcloud_exporter" version="(version=, branch=, revision=)"
level=info ts=2021-08-12T11:34:36.044Z caller=qcloud_exporter.go:87 msg="Build context" build_context="(go=go1.16.5, user=, date=)"
level=info ts=2021-08-12T11:34:36.091Z caller=qcloud_exporter.go:94 msg="Load config ok"
level=info ts=2021-08-12T11:34:36.734Z caller=cache.go:76 msg="Reload metric meta cache" namespace=QCE/CMONGO num=34
level=info ts=2021-08-12T11:34:37.634Z caller=cache.go:104 msg="Reload instance cache" num=4 changed=4
level=info ts=2021-08-12T11:34:37.635Z caller=product.go:227 msg="Init all query ok" Namespace=QCE/CMONGO numMetric=34 numSeries=160
level=info ts=2021-08-12T11:34:37.635Z caller=collector.go:117 msg="Create product collecter ok" Namespace=QCE/CMONGO
level=info ts=2021-08-12T11:34:37.635Z caller=collector.go:124 msg="reload QCE/CMONGO instances every 300 minutes"
level=info ts=2021-08-12T11:34:37.882Z caller=cache.go:76 msg="Reload metric meta cache" namespace=QCE/REDIS_MEM num=77
level=info ts=2021-08-12T11:34:38.328Z caller=cache.go:104 msg="Reload instance cache" num=16 changed=16
level=info ts=2021-08-12T11:34:39.985Z caller=product.go:227 msg="Init all query ok" Namespace=QCE/REDIS_MEM numMetric=26 numSeries=2720
level=info ts=2021-08-12T11:34:39.985Z caller=collector.go:117 msg="Create product collecter ok" Namespace=QCE/REDIS_MEM
level=info ts=2021-08-12T11:34:40.379Z caller=cache.go:76 msg="Reload metric meta cache" namespace=QCE/CDB num=302
level=info ts=2021-08-12T11:34:40.736Z caller=cache.go:104 msg="Reload instance cache" num=134 changed=134
level=info ts=2021-08-12T11:34:40.744Z caller=product.go:227 msg="Init all query ok" Namespace=QCE/CDB numMetric=11 numSeries=1474
level=info ts=2021-08-12T11:34:40.744Z caller=collector.go:117 msg="Create product collecter ok" Namespace=QCE/CDB
level=info ts=2021-08-12T11:34:40.992Z caller=cache.go:76 msg="Reload metric meta cache" namespace=QCE/NAT_GATEWAY num=8
level=info ts=2021-08-12T11:34:41.195Z caller=cache.go:104 msg="Reload instance cache" num=1 changed=1
level=info ts=2021-08-12T11:34:41.195Z caller=product.go:227 msg="Init all query ok" Namespace=QCE/NAT_GATEWAY numMetric=8 numSeries=8
level=info ts=2021-08-12T11:34:41.195Z caller=collector.go:117 msg="Create product collecter ok" Namespace=QCE/NAT_GATEWAY
level=info ts=2021-08-12T11:34:41.195Z caller=collector.go:124 msg="reload QCE/NAT_GATEWAY instances every 300 minutes"
level=info ts=2021-08-12T11:34:41.195Z caller=collector.go:131 msg="Create all product collecter ok" num=4
level=info ts=2021-08-12T11:34:41.195Z caller=qcloud_exporter.go:114 msg="Listening on" address=:9123
Image used: boringcat/qcloud-exporter:v2.3.0
FROM golang:alpine as builder
ARG VERSION
RUN set -xe \
    ; [ -z "${VERSION}" ] && apk add --update curl jq \
    && VERSION=$(curl -s https://api.github.com/repos/tencentyun/tencentcloud-exporter/releases/latest | jq -r .name) \
    ; VERSION=${VERSION##*v} \
    && wget https://github.com/tencentyun/tencentcloud-exporter/archive/refs/tags/v${VERSION}.tar.gz -O /tmp/v${VERSION}.tar.gz \
    && tar xf /tmp/v${VERSION}.tar.gz tencentcloud-exporter-${VERSION} \
    && cd tencentcloud-exporter-${VERSION} \
    && go build -o /qcloud-exporter cmd/qcloud-exporter/qcloud_exporter.go
FROM alpine
COPY --from=builder /qcloud-exporter /usr/local/bin/qcloud-exporter
ENTRYPOINT [ "/usr/local/bin/qcloud-exporter" ]
EXPOSE 9123
Part of the configuration:
products:
  - namespace: QCE/REDIS_MEM
    all_instances: true
    extra_labels: [InstanceName, WanIp]
    instance_filters:
      Status: 2
    relod_interval_minutes: 300
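How should the configuration be written to get a CVM's IP information as a label?
Add support for the monitoring metrics of the Tencent Cloud Elasticsearch product.
As in the title.
Currently only one tenant's data can be monitored. If I have several accounts that all need monitoring, what should I do?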
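Hello, I have been testing this project recently. We imported 66 CDB (MySQL) metrics for six CDB instances. With ratelimit set to 10, we get an error saying the per-second request rate limit was exceeded.
I then read the code. As I understand it, ratelimit should be global and throttle calls to getMonitorDataByMultipleKeys, but it does not seem to work. I also saw a check in the rateLimitCheck function, if sleepCount > some value, after which it no longer sleeps. My understanding is that the rate limit then stops taking effect? After I raised this parameter to 1000 seconds, the data was retrieved normally.
I don't know Go well, so I would like to ask what this parameter means, and whether the rate limit is global or only takes effect within a single goroutine. Thanks.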
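To monitor multiple regions, do I need to deploy multiple instances?
Could you support the Tencent Cloud ClickHouse product, and list how the exported metrics map to the metrics in the ClickHouse documentation?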
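According to the documentation at https://mc.qcloudimg.com/static/qc_doc/ef1ccf096001bd855aac0cc56d30a9a2/6140.v20161103185912.pdf the MySQL namespace is now called qce/cdb. However, using tc_namespace: QCE/CDB fails immediately with:
FATA[0000] not support product [cdb] yet, need monitor api code. source="qcloud_exporter.go:56"
After changing it to QCE/mysql it works.
According to the README:
tc_namespace: xxx/CVM # namespace (xxx is any name made of a-z; the cvm after it is fixed and is the name of each product)
So it seems QCE can be replaced with any name at all?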
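The error message above mentions only [cdb], which suggests that the part after the slash is what selects the product. The sketch below shows how such a lookup could work. It is an assumption for illustration, not the exporter's code, and productName is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// productName is a hypothetical helper: it ignores the prefix before the
// slash and returns the lower-cased product part of a namespace.
func productName(namespace string) (string, error) {
	parts := strings.Split(namespace, "/")
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", fmt.Errorf("namespace %q is not of the form prefix/product", namespace)
	}
	return strings.ToLower(parts[1]), nil
}

func main() {
	for _, ns := range []string{"QCE/CDB", "xxx/CVM", "QCE/mysql"} {
		name, err := productName(ns)
		fmt.Println(ns, name, err)
	}
}
```

Under this assumption, QCE/CVM and xxx/CVM resolve to the same product, which would match the README comment, while an unsupported product part is rejected regardless of the prefix.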
See the logs:
❯ go run cmd/qcloud-exporter/qcloud_exporter.go --config.file="/Users/chaim/Work/own/tencent_exporter/cdn.yml"
level=info ts=2020-09-09T07:23:49.424Z caller=qcloud_exporter.go:85 msg="Starting qcloud_exporter" version="(version=, branch=, revision=)"
level=info ts=2020-09-09T07:23:49.424Z caller=qcloud_exporter.go:86 msg="Build context" build_context="(go=go1.14.2, user=, date=)"
level=info ts=2020-09-09T07:23:49.425Z caller=qcloud_exporter.go:93 msg="Load config ok"
level=info ts=2020-09-09T07:23:52.878Z caller=cache.go:65 msg="Reload metric meta cache" namespace=QCE/CDN num=8
level=info ts=2020-09-09T07:23:52.878Z caller=product.go:176 msg="Init all query ok" Namespace=QCE/CDN numMetric=8 numSeries=16
level=info ts=2020-09-09T07:23:52.878Z caller=collector.go:109 msg="Create product collecter ok" Namespace=QCE/CDN
level=info ts=2020-09-09T07:23:52.878Z caller=collector.go:112 msg="Create all product collecter ok" num=1
level=info ts=2020-09-09T07:23:52.878Z caller=qcloud_exporter.go:113 msg="Listening on" address=:9123
level=info ts=2020-09-09T07:24:03.121Z caller=collector.go:69 msg="Start collect......" name=QCE/CDN
level=info ts=2020-09-09T07:24:03.121Z caller=collector.go:70 msg="test <--->" QCE/CDN=(MISSING)
level=warn ts=2020-09-09T07:24:05.673Z caller=repository.go:200 msg="Instance has not monitor data" metric=BackOriginFailRate dimension="map[domain:xxx.com projectId:0]"
// getNodeSeries builds one series per Redis node of the given instance.
func (h *redisMemHandler) getNodeSeries(m *metric.TcmMetric, ins instance.TcInstance) ([]*metric.TcmSeries, error) {
	var series []*metric.TcmSeries
	resp, err := h.nodeRepo.GetNodeInfo(ins.GetInstanceId())
	if err != nil {
		return nil, err
	}
	for _, node := range resp.Response.Redis {
		// Query labels: the instance key plus the node's ID and role.
		ql := map[string]string{
			h.monitorQueryKey: ins.GetMonitorQueryKey(),
			"rnodeid":         *node.NodeId,
			"rnoderole":       *node.NodeRole,
		}
		s, err := metric.NewTcmSeries(m, ql, ins)
		if err != nil {
			return nil, err
		}
		series = append(series, s)
	}
	return series, nil
}
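For public network load balancers, the collected data does not match the data shown in the console, nor the result of calling the online API at https://console.cloud.tencent.com/api/explorer?Product=monitor&Version=2018-07-24&Action=GetMonitorData&SignVersion=
Data for the Redis cluster latency metrics cannot be retrieved. The log reports: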
level=warn ts=2020-09-09T08:30:54.080Z caller=repository.go:200 msg="Instance has not monitor data" metric=LatencySetMin dimension=map[instanceid:crs-8xx]
level=warn ts=2020-09-09T08:30:54.080Z caller=repository.go:200 msg="Instance has not monitor data" metric=LatencySetMin dimension=map[instanceid:crs-hxx]
level=warn ts=2020-09-09T08:30:54.080Z caller=repository.go:200 msg="Instance has not monitor data" metric=LatencySetMin dimension=map[instanceid:crs-6xx]
level=warn ts=2020-09-09T08:30:54.080Z caller=repository.go:200 msg="Instance has not monitor data" metric=LatencySetMin dimension=map[instanceid:crs-2xx]
level=warn ts=2020-09-09T08:30:54.991Z caller=repository.go:200 msg="Instance has not monitor data" metric=LatencyOtherMin dimension=map[instanceid:crs-8xx]
level=warn ts=2020-09-09T08:30:54.991Z caller=repository.go:200 msg="Instance has not monitor data" metric=LatencyOtherMin dimension=map[instanceid:crs-hxx]
level=warn ts=2020-09-09T08:30:54.991Z caller=repository.go:200 msg="Instance has not monitor data" metric=LatencyOtherMin dimension=map[instanceid:crs-6xx]
level=warn ts=2020-09-09T08:30:54.991Z caller=repository.go:200 msg="Instance has not monitor data" metric=LatencyOtherMin dimension=map[instanceid:crs-2xx]
level=warn ts=2020-09-09T08:30:54.993Z caller=repository.go:200 msg="Instance has not monitor data" metric=LatencyGetMin dimension=map[instanceid:crs-8xx]
level=warn ts=2020-09-09T08:30:54.993Z caller=repository.go:200 msg="Instance has not monitor data" metric=LatencyGetMin dimension=map[instanceid:crs-hxx]
level=warn ts=2020-09-09T08:30:54.993Z caller=repository.go:200 msg="Instance has not monitor data" metric=LatencyGetMin dimension=map[instanceid:crs-6xx]
level=warn ts=2020-09-09T08:30:54.993Z caller=repository.go:200 msg="Instance has not monitor data" metric=LatencyGetMin dimension=map[instanceid:crs-2xx]
level=warn ts=2020-09-09T08:30:55.249Z caller=repository.go:200 msg="Instance has not monitor data" metric=LatencyMin dimension=map[instanceid:crs-2xx]
level=warn ts=2020-09-09T08:30:55.249Z caller=repository.go:200 msg="Instance has not monitor data" metric=LatencyMin dimension=map[instanceid:crs-8xx]
level=warn ts=2020-09-09T08:30:55.249Z caller=repository.go:200 msg="Instance has not monitor data" metric=LatencyMin dimension=map[instanceid:crs-hxx]
level=warn ts=2020-09-09T08:30:55.249Z caller=repository.go:200 msg="Instance has not monitor data" metric=LatencyMin dimension=map[instanceid:crs-6xx]
level=warn ts=2020-09-09T08:30:55.575Z caller=repository.go:200 msg="Instance has not monitor data" metric=CacheHitRatioMin dimension=map[instanceid:crs-2xx]
In addition, data for the CLB layer-7 load balancing metrics cannot be retrieved either: "Instance has not monitor data"
level=info ts=2021-09-09T05:14:45.639Z caller=collector.go:72 msg="Start collect......" name=QCE/CDB
level=error ts=2021-09-09T05:14:55.089Z caller=repository.go:148 msg="[TencentCloudSDKError] Code=InvalidParameterValue, Message=there are no valid statistics type, RequestId=d4b2c621-762a-41e8-9870-e96e8978b089"
level=error ts=2021-09-09T05:15:00.024Z caller=repository.go:148 msg="[TencentCloudSDKError] Code=InvalidParameterValue, Message=there are no valid statistics type, RequestId=892c85aa-deab-46f6-aa28-146e25840046"
level=error ts=2021-09-09T05:15:03.204Z caller=repository.go:148 msg="[TencentCloudSDKError] Code=InvalidParameterValue, Message=there are no valid statistics type, RequestId=f87b7462-1c5d-4fd5-8fdd-dd0dd150a861"
level=error ts=2021-09-09T05:15:03.556Z caller=repository.go:148 msg="[TencentCloudSDKError] Code=InvalidParameterValue, Message=there are no valid statistics type, RequestId=f06ba41a-c2e7-4dae-89d2-32eefd95fd90"
level=error ts=2021-09-09T05:15:03.953Z caller=repository.go:148 msg="[TencentCloudSDKError] Code=InvalidParameterValue, Message=there are no valid statistics type, RequestId=0a824971-f5dc-40af-ba07-4076114b610b"
level=error ts=2021-09-09T05:15:15.008Z caller=repository.go:148 msg="[TencentCloudSDKError] Code=InvalidParameterValue, Message=there are no valid statistics type, RequestId=f45a2991-e93b-4b28-b50f-5dcda295e6fe"
level=error ts=2021-09-09T05:15:19.961Z caller=repository.go:148 msg="[TencentCloudSDKError] Code=InvalidParameterValue, Message=there are no valid statistics type, RequestId=0c091660-0fb1-4ef1-a813-a0c289f3cf8d"
level=error ts=2021-09-09T05:15:22.941Z caller=repository.go:148 msg="[TencentCloudSDKError] Code=InvalidParameterValue, Message=there are no valid statistics type, RequestId=072e7f69-b63b-4fd4-928f-8689ca08eb91"
level=error ts=2021-09-09T05:15:23.351Z caller=repository.go:148 msg="[TencentCloudSDKError] Code=InvalidParameterValue, Message=there are no valid statistics type, RequestId=4585bd22-1f45-49c3-b1f0-f8eaef4fc4e9"
level=error ts=2021-09-09T05:15:23.676Z caller=repository.go:148 msg="[TencentCloudSDKError] Code=InvalidParameterValue, Message=there are no valid statistics type, RequestId=5830313c-0613-424f-8886-6147d45527f9"
level=error ts=2021-09-09T05:15:34.810Z caller=repository.go:148 msg="[TencentCloudSDKError] Code=InvalidParameterValue, Message=there are no valid statistics type, RequestId=fe9562fb-d8b7-496f-8ee6-6c7439e49fe9"
level=error ts=2021-09-09T05:15:39.680Z caller=repository.go:148 msg="[TencentCloudSDKError] Code=InvalidParameterValue, Message=there are no valid statistics type, RequestId=41bad52c-cbca-48b4-8029-42c13b37a9ce"
level=error ts=2021-09-09T05:15:42.682Z caller=repository.go:148 msg="[TencentCloudSDKError] Code=InvalidParameterValue, Message=there are no valid statistics type, RequestId=9861bd54-4fec-49fa-8014-a9559c730f58"
level=error ts=2021-09-09T05:15:43.253Z caller=repository.go:148 msg="[TencentCloudSDKError] Code=InvalidParameterValue, Message=there are no valid statistics type, RequestId=4167bc28-a65f-4537-9995-2aae1ce2bf75"
level=error ts=2021-09-09T05:15:43.540Z caller=repository.go:148 msg="[TencentCloudSDKError] Code=InvalidParameterValue, Message=there are no valid statistics type, RequestId=9cc0f77e-3a10-4b9d-bfa0-4426ff3e1b9b"
level=info ts=2021-09-09T05:15:46.414Z caller=collector.go:82 msg="Collect done" name=QCE/CDB duration_seconds=60.774686349
https://github.com/tencentyun/tencentcloud-exporter/blob/master/readme.md
git clone http://git.code.oa.com/rig/tencentcloud-exporter.git
go build cmd/qcloud-exporter/qcloud_exporter.go
This should be the GitHub address.
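I tried QCE/CES. All instances are exported, but only with the instance ID and not the instance name. That is not very friendly for us. We would like the instance name, or even the tags, which would be more convenient.
For several metrics, Cloud Monitor itself does not support this dimension. However, the exporter could request the instance list through the tag API at startup and generate the corresponding dimensions. Could filtering the instance list by tag be implemented this way?
In CDB monitoring, instanceid="" and insttype="" are not populated? Shouldn't these be filled in automatically?
Suggestion: allow instances to be excluded by tag. With too many metrics, collection easily times out.
Including monitoring data for instances, consumer groups, and topics.
Can a purchased TDSQL private cloud deployment be connected to this metrics monitoring?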
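While using the exporter I found that instance_filters is defined as map[string]string,
but in the example configuration and in the configuration details in README.md it is written as
instance_filters: // optional; when all_instances is enabled, filter on the fields of each instance
  - ProjectId: 1
    Status: 1
which by rights parses as []map[string]string.
In my tests, a value written as []map[string]string cannot be parsed,
and a value written as map[string]string does not filter anything. I hope this can be fixed.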
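To make the two shapes concrete, the sketch below shows what filtering could mean for each. It is not the exporter's implementation. matchesAll and matchesAny are hypothetical helpers, and treating a list of maps as alternatives (a match on any one entry is enough) is an assumption.

```go
package main

import "fmt"

// matchesAll handles the map[string]string shape: every filter key must be
// present in the instance fields with exactly the same value.
func matchesAll(fields, filters map[string]string) bool {
	for k, want := range filters {
		if got, ok := fields[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// matchesAny handles the []map[string]string shape: the instance passes if
// it satisfies at least one entry. An empty list filters nothing out.
func matchesAny(fields map[string]string, filters []map[string]string) bool {
	if len(filters) == 0 {
		return true
	}
	for _, f := range filters {
		if matchesAll(fields, f) {
			return true
		}
	}
	return false
}

func main() {
	ins := map[string]string{"ProjectId": "1", "Status": "1"}
	fmt.Println(matchesAll(ins, map[string]string{"ProjectId": "1", "Status": "1"})) // true
	fmt.Println(matchesAll(ins, map[string]string{"Status": "2"}))                   // false
	fmt.Println(matchesAny(ins, []map[string]string{{"Status": "2"}, {"ProjectId": "1"}})) // true
}
```

Note that the values are compared as strings, so a YAML value such as Status: 1 would have to be decoded as the string "1" for a map[string]string filter to match.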
As in the title.
ts=2022-12-07T14:30:53.855Z caller=repository.go:200 level=debug msg="this instance may not have metric data" metric=Bandwidth dimension="map[domain:xxx projectId:xxx]"
Configuration file:
credential:
  region: ap-guangzhou
products:
  - namespace: QCE/CDN
    only_include_metrics:
      - RequestsHitRate
      - FluxHitRate
      - HttpStatus4xxRate
      - HttpStatus403Rate
      - HttpStatus5xxRate
      - Bandwidth
      - BackOriginBandwidth
      - BackOriginHttp4xx
      - BackOriginHttp5xx
      - BackOriginHttp403
      - BackOriginHttp404
      - BackOriginRequests
    custom_query_dimensions:
      - projectId: xxx
        domain: xxx
      - projectId: xxx
        domain: xxx
      - projectId: xxx
        domain: xxx
      - projectId: xxx
        domain: xxx
rate_limit: 10
qce_redis_bigvaluemin_sum{instance_name="ak-php-saas-21",instanceid="crs-rv962yek",zone=""} 8.416
instance_name is correct
zone is empty
credential:
  region: ap-beijing
products:
  - namespace: QCE/CDN
    all_metrics: true
    all_instances: true
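With this configuration nothing is scraped. The metrics page shows the scrape as successful, but the corresponding metrics are missing, every one of them.
Could the VOD service be supported?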
ts=2022-08-28T14:25:53.928Z caller=product.go:99 level=error msg="create metric series err" err="Get \"http://cos.ap-guangzhou.myqcloud.com/\": invalid character '<' looking for beginning of value" Namespace=QCE/COS name=InternalTrafficUp
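I deployed an exporter inside k8s and scrape it with Prometheus. The scrape occasionally times out, and scraping it manually also occasionally times out. The response times are very regular, all close to multiples of 30: 30s, 60s, and so on.
Could it be that the request to Tencent Cloud timed out and was retried after 30s? Or could a gateway retry be causing it? If so, does the SDK print a corresponding log? How should I troubleshoot this?
Our company is integrating Tencent Cloud Redis, MySQL, and MongoDB monitoring into a local Grafana, and I did not know whether ready-made dashboards existed. I could not find any online, so I built some myself and published them on the Grafana website: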
redis grafana dashboard:
https://grafana.com/grafana/dashboards/13231
mongo grafana dashboard:
https://grafana.com/grafana/dashboards/13237
mysql dashboard:
https://grafana.com/grafana/dashboards/13270/
Error:
level=debug ts=2022-01-26T08:59:54.420Z caller=repository.go:181 msg="response data point not match series" metric=BytesSent dimension="map[InstanceId:cdbro-k2xxxxxx InstanceType:3]"
Investigation:
At line 27 of pkg/metric/series.go, the TcmSeries id is generated from the ql Labels, which contain only InstanceId.
func NewTcmSeries(m *TcmMetric, ql Labels, ins instance.TcInstance) (*TcmSeries, error) {
	id, err := GetTcmSeriesId(m, ql)
	if err != nil {
		return nil, err
	}
	s := &TcmSeries{
		Id:          id,
		Metric:      m,
		QueryLabels: ql,
		Instance:    ins,
	}
	return s, nil
}
In the buildSamples method of pkg/metric/repository.go, however, ql is taken from the points.Dimensions returned by the API.
func (repo *TcmMetricRepositoryImpl) buildSamples(
	m *TcmMetric,
	points *monitor.DataPoint,
) (*TcmSamples, map[string]string, error) {
	ql := map[string]string{}
	for _, dimension := range points.Dimensions {
		if *dimension.Value != "" {
			ql[*dimension.Name] = *dimension.Value
		}
	}
	sid, e := GetTcmSeriesId(m, ql)
	if e != nil {
		return nil, ql, fmt.Errorf("get series id fail")
	}
	s, ok := m.Series[sid]
	if !ok {
		return nil, ql, fmt.Errorf("response data point not match series")
	}
	samples, e := NewTcmSamples(s, points)
	if e != nil {
		return nil, ql, fmt.Errorf("this instance may not have metric data")
	}
	return samples, ql, nil
}
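The data returned by the API for a read-only replica looks like this:
"DataPoints": [
  {
    "Dimensions": [
      {
        "Name": "InstanceType",
        "Value": "3"
      },
      {
        "Name": "InstanceId",
        "Value": "cdbro-k2xxxxxx"
      }
    ]
The response contains an extra InstanceType dimension, so the TcmSeries ids generated in the two places do not match.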
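The mismatch can be reproduced in isolation. In the sketch below, seriesID is a hypothetical stand-in for GetTcmSeriesId, whose real id format is not shown in this issue, and restrict is one possible workaround: build the response-side label set from only the keys that were used in the query.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// seriesID is a hypothetical stand-in for GetTcmSeriesId: a deterministic
// id built from the metric name and the sorted label pairs.
func seriesID(metric string, labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+"="+labels[k])
	}
	return metric + "{" + strings.Join(parts, ",") + "}"
}

// restrict keeps only the labels whose names were used in the query, so
// extra dimensions returned by the API cannot change the id.
func restrict(labels map[string]string, queryKeys []string) map[string]string {
	out := make(map[string]string, len(queryKeys))
	for _, k := range queryKeys {
		if v, ok := labels[k]; ok {
			out[k] = v
		}
	}
	return out
}

func main() {
	query := map[string]string{"InstanceId": "cdbro-k2xxxxxx"}
	resp := map[string]string{"InstanceId": "cdbro-k2xxxxxx", "InstanceType": "3"}

	want := seriesID("BytesSent", query)
	fmt.Println(want == seriesID("BytesSent", resp))                                   // false
	fmt.Println(want == seriesID("BytesSent", restrict(resp, []string{"InstanceId"}))) // true
}
```

Whatever the real id format is, any id derived from the full label set will differ once the response carries a dimension that the query did not.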
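At the moment this exporter seems to support monitoring only a single metric, the one defined by tc_metric_name. More often, though, and this includes other exporters in the Prometheus community, an exporter targets a single instance but exports many kinds of metrics for it.
For example, a virtual machine exporter emits CPU/memory, network, and IO metrics together, and a MySQL exporter emits slow queries, QPS, and other metrics together.
Would you consider releasing an exporter that covers multiple metrics for a single Tencent Cloud instance?
All the major cloud vendors advertise their open source work, so this project should be kept going. Tencent Cloud recommends its managed Prometheus monitoring service... but it does not serve traditional users like me well. A while ago I tried connecting Prometheus to mongo, redis, elastic and other metrics myself, and the result was incomplete. I even asked support whether an exporter like this existed... I hope it keeps being updated. I would also like to try integrating it myself now...
Through the CDN monitor API it is basically impossible to get timely metric information. It has to go through CDN's native API.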