robustperception / azure_metrics_exporter
Azure metrics exporter for Prometheus
License: Apache License 2.0
I have this configuration:
azure.yml:
credentials:
  subscription_id: <secret>
  tenant_id: <secret>
  client_id: <secret>
  client_secret: <secret>
targets:
  - resource: "/resourceGroups/<resource_group>/providers/Microsoft.Compute/virtualMachines/<machine_name>"
    metrics:
      - name: "Percentage CPU"
      - name: "Network In"
      - name: "Network Out"
  - resource: "/resourceGroups/<resource_group>/providers/Microsoft.Compute/virtualMachines/<machine_name>"
    metrics:
      - name: "Http2xx"
      - name: "Http5xx"
When I attempt to get the metrics, I get this output:
$ curl IP_ADDRESS:9276/metrics
# HELP network_in_bytes_average network_in_bytes_average
# TYPE network_in_bytes_average gauge
network_in_bytes_average{resource_group="<resource_group>",resource_name="<machine_name>"} 0
# HELP network_in_bytes_max network_in_bytes_max
# TYPE network_in_bytes_max gauge
network_in_bytes_max{resource_group="<resource_group>",resource_name="<machine_name>"} 0
# HELP network_in_bytes_min network_in_bytes_min
# TYPE network_in_bytes_min gauge
network_in_bytes_min{resource_group="<resource_group>",resource_name="<machine_name>"} 0
# HELP network_in_bytes_total network_in_bytes_total
# TYPE network_in_bytes_total gauge
network_in_bytes_total{resource_group="<resource_group>",resource_name="<machine_name>"} 0
# HELP network_out_bytes_average network_out_bytes_average
# TYPE network_out_bytes_average gauge
network_out_bytes_average{resource_group="<resource_group>",resource_name="<machine_name>"} 0
# HELP network_out_bytes_max network_out_bytes_max
# TYPE network_out_bytes_max gauge
network_out_bytes_max{resource_group="<resource_group>",resource_name="<machine_name>"} 0
# HELP network_out_bytes_min network_out_bytes_min
# TYPE network_out_bytes_min gauge
network_out_bytes_min{resource_group="<resource_group>",resource_name="<machine_name>"} 0
# HELP network_out_bytes_total network_out_bytes_total
# TYPE network_out_bytes_total gauge
network_out_bytes_total{resource_group="<resource_group>",resource_name="<machine_name>"} 0
# HELP percentage_cpu_percent_average percentage_cpu_percent_average
# TYPE percentage_cpu_percent_average gauge
percentage_cpu_percent_average{resource_group="<resource_group>",resource_name="<machine_name>"} 6.2375
# HELP percentage_cpu_percent_max percentage_cpu_percent_max
# TYPE percentage_cpu_percent_max gauge
percentage_cpu_percent_max{resource_group="<resource_group>",resource_name="<machine_name>"} 6.49
# HELP percentage_cpu_percent_min percentage_cpu_percent_min
# TYPE percentage_cpu_percent_min gauge
percentage_cpu_percent_min{resource_group="<resource_group>",resource_name="<machine_name>"} 6.09
# HELP percentage_cpu_percent_total percentage_cpu_percent_total
# TYPE percentage_cpu_percent_total gauge
percentage_cpu_percent_total{resource_group="<resource_group>",resource_name="<machine_name>"} 24.95
And I get this error from azure_metrics_exporter:
azure_metrics_exporter --config.file /opt/azure-exporter/azure.yml --web.listen-address=":9276"
2019/04/05 13:45:53 azure_metrics_exporter listening on port :9276
2019/04/05 13:46:02 Failed to get metrics for target /resourceGroups/<resource_group>/providers/Microsoft.Compute/virtualMachines/<machine_name>: Unable to query metrics API with status code: 400
Is there something wrong with my config file?
I'm running the exporter to collect Azure Application Gateway metrics. I see a time offset of 4 minutes for all metrics retrieved from Azure Monitor.
I found the GetTimes function that applies the query delay in the code:
// GetTimes - Returns the endTime and startTime used for querying Azure Metrics API
func GetTimes() (string, string) {
// Make sure we are using UTC
now := time.Now().UTC()
// Use query delay of 3 minutes when querying for latest metric data
endTime := now.Add(time.Minute * time.Duration(-3)).Format(time.RFC3339)
startTime := now.Add(time.Minute * time.Duration(-4)).Format(time.RFC3339)
return endTime, startTime
}
The exposed metrics endpoint does not provide any timestamp information which might explain the offset in prometheus.
...
# HELP throughput_bytespersecond_total throughput_bytespersecond_total
# TYPE throughput_bytespersecond_total gauge
throughput_bytespersecond_total{resource_group="some-rg",resource_name="some-resource"} 74
...
What am I doing wrong?
I've noticed while looking at the config code that there's no counterpart to the ResourceGroup
validation. Is there a reason for this or would a PR to do so be helpful?
azure_metrics_exporter/config/config.go
Lines 80 to 96 in a13576c
Hi,
I would like to collect memory-related metrics (current usage or available memory bytes, whatever is available) for VMs. Is this possible?
Thanks,
Would it be possible to publish this project as a docker container on a public repo? I understand it's still under development, but it would make it easier for us to kick the tires!
Thanks!
I am wondering if it is possible to add a "resource type" label to the exposed metrics. For example, what I currently get is as follows: it shows "resource_group" and "resource_name" but no resource type.
Resource type could be useful information when a user wants to filter for interesting metrics.
# HELP bitsinpersecond_countpersecond_average bitsinpersecond_countpersecond_average
# TYPE bitsinpersecond_countpersecond_average gauge
bitsinpersecond_countpersecond_average{resource_group="bca-dev-westeurope",resource_name="azure-to-c3-mash-up-connection"} 0
bitsinpersecond_countpersecond_average{resource_group="bca-dev-westeurope",resource_name="azure-to-ims-connection"} 0
One of the resources related to the above metrics in Azure looks like this:
{
  "id": "/subscriptions/15b4d43c-7a12-42ea-8184-cedd7e6f229a/resourceGroups/bca-dev-westeurope/providers/Microsoft.Network/connections/azure-to-ims-connection",
  "identity": null,
  "kind": null,
  "location": "westeurope",
  "managedBy": null,
  "name": "azure-to-ims-connection",
  "plan": null,
  "properties": null,
  "resourceGroup": "bca-dev-westeurope",
  "sku": null,
  "tags": null,
  "type": "Microsoft.Network/connections"
},
It would be nice to get metrics like the following:
# HELP bitsinpersecond_countpersecond_average bitsinpersecond_countpersecond_average
# TYPE bitsinpersecond_countpersecond_average gauge
bitsinpersecond_countpersecond_average{resource_group="bca-dev-westeurope", resource_type="connections", resource_name="azure-to-c3-mash-up-connection"} 0
bitsinpersecond_countpersecond_average{resource_group="bca-dev-westeurope", resource_type="connections", resource_name="azure-to-ims-connection"} 0
Currently the client id/secret needs to be put in the configuration file. It would be nice if it was allowed to read this from the environment instead.
If this would be acceptable, I can send in a PR.
Hi!
The exporter's default port is currently 9276. Unfortunately, this port is already taken by another exporter on the Default port allocations wiki, and azure_metrics_exporter is not listed there.
Although it would be a breaking change, maybe it would be better to follow the wiki procedure (sooner rather than later). I propose to take the next available port on the wiki, and then change the default port in the exporter accordingly.
I could take care of that. What do you think @brian-brazil ?
Thanks!
I'm currently facing an issue where the azure-exporter is unable to find the API version for Virtual Machine Scale Sets. Perhaps there should be a way to manually specify the API version to use?
Error message:
azure-exporter | 2021/06/03 21:22:32 Failed to get resource info: No api version found for type: RESOURCE_GROUP/providers/virtualMachineScaleSets
My config (minus credentials section) is:
targets:
  - resource: "/resourceGroups/RESOURCE_GROUP/providers/Microsoft.Compute/virtualMachineScaleSets/RESOURCE_NAME"
    metric_namespace: "Microsoft.Compute/virtualMachineScaleSets"
    metrics:
      - name: "Percentage CPU"
      - name: "Network In Total"
      - name: "Network Out Total"
I've also tried this without specifying the metric_namespace, and the same error is returned.
.\azure_metrics_exporter.exe --list.definitions successfully returns:
Available Metrics:
2021/06/03 17:21:02 - Percentage CPU
2021/06/03 17:21:02 - Network In
2021/06/03 17:21:02 - Network Out
2021/06/03 17:21:02 - Disk Read Bytes
2021/06/03 17:21:02 - Disk Write Bytes
2021/06/03 17:21:02 - Disk Read Operations/Sec
2021/06/03 17:21:02 - Disk Write Operations/Sec
2021/06/03 17:21:02 - CPU Credits Remaining
2021/06/03 17:21:02 - CPU Credits Consumed
2021/06/03 17:21:02 - Data Disk Read Bytes/sec
2021/06/03 17:21:02 - Data Disk Write Bytes/sec
2021/06/03 17:21:02 - Data Disk Read Operations/Sec
2021/06/03 17:21:02 - Data Disk Write Operations/Sec
2021/06/03 17:21:02 - Data Disk Queue Depth
2021/06/03 17:21:02 - Data Disk Bandwidth Consumed Percentage
2021/06/03 17:21:02 - Data Disk IOPS Consumed Percentage
2021/06/03 17:21:02 - Data Disk Target Bandwidth
2021/06/03 17:21:02 - Data Disk Target IOPS
2021/06/03 17:21:02 - Data Disk Max Burst Bandwidth
2021/06/03 17:21:02 - Data Disk Max Burst IOPS
2021/06/03 17:21:02 - Data Disk Used Burst BPS Credits Percentage
2021/06/03 17:21:02 - Data Disk Used Burst IO Credits Percentage
2021/06/03 17:21:02 - OS Disk Read Bytes/sec
2021/06/03 17:21:02 - OS Disk Write Bytes/sec
2021/06/03 17:21:02 - OS Disk Read Operations/Sec
2021/06/03 17:21:02 - OS Disk Write Operations/Sec
2021/06/03 17:21:02 - OS Disk Queue Depth
2021/06/03 17:21:02 - OS Disk Bandwidth Consumed Percentage
2021/06/03 17:21:02 - OS Disk IOPS Consumed Percentage
2021/06/03 17:21:02 - OS Disk Target Bandwidth
2021/06/03 17:21:02 - OS Disk Target IOPS
2021/06/03 17:21:02 - OS Disk Max Burst Bandwidth
2021/06/03 17:21:02 - OS Disk Max Burst IOPS
2021/06/03 17:21:02 - OS Disk Used Burst BPS Credits Percentage
2021/06/03 17:21:02 - OS Disk Used Burst IO Credits Percentage
2021/06/03 17:21:02 - Inbound Flows
2021/06/03 17:21:02 - Outbound Flows
2021/06/03 17:21:02 - Inbound Flows Maximum Creation Rate
2021/06/03 17:21:02 - Outbound Flows Maximum Creation Rate
2021/06/03 17:21:02 - Premium Data Disk Cache Read Hit
2021/06/03 17:21:02 - Premium Data Disk Cache Read Miss
2021/06/03 17:21:02 - Premium OS Disk Cache Read Hit
2021/06/03 17:21:02 - Premium OS Disk Cache Read Miss
2021/06/03 17:21:02 - VM Cached Bandwidth Consumed Percentage
2021/06/03 17:21:02 - VM Cached IOPS Consumed Percentage
2021/06/03 17:21:02 - VM Uncached Bandwidth Consumed Percentage
2021/06/03 17:21:02 - VM Uncached IOPS Consumed Percentage
2021/06/03 17:21:02 - Network In Total
2021/06/03 17:21:02 - Network Out Total
.\azure-metrics-exporter.exe --list.namespaces successfully returns:
2021/06/03 17:29:24 Resource: /resourceGroups/RESOURCE_GROUP/providers/Microsoft.Compute/virtualMachineScaleSets/RESOURCE_NAME
Available namespaces:
2021/06/03 17:29:24 - Microsoft.Compute/virtualMachineScaleSets
In an environment with thousands of metrics names and a variety of exporters and targets it is tough to find out what each metric represents without looking at the job label. So I am using a system of prefixes to identify the metric names easily.
Is there any way to add a custom prefix to the metric names collected by azure_metrics_exporter?
Hi!
When calling ./azure_metrics_exporter --list.definitions
with a resource_group configuration, no output appears. This is "normal", as I see the feature is not implemented for resource groups.
I propose to implement it. It will need a small refactor of the Collect function, i.e. extracting the include/exclude logic for resource groups into its own function, in order to avoid code duplication.
I can start to work on that PR next week.
Thanks!
What is the license on this code?
Right now the metric names are a bit all over the place, since we just take what we get from Azure and use that as the name.
Could we set the namespace to azure? This would be a breaking change of course. As a motivation, the current approach seems to hurt discoverability of the metrics, in my environment at least, since the majority of exporters/applications apply some prefix to their names. People could first query for {job="azure-metrics"} and see what is available, but from what I've seen this doesn't happen.
I can see a previous issue was raised for this under #40 and is marked as fixed; however, I appear to be getting the same issue.
I have the following config for example:
resource_groups:
  - resource_group: "test-rg"
    resource_types:
      - "Microsoft.Sql/servers/databases"
    metrics:
      - name: "dtu_used"
The above will only return scrapes for a single database and not all databases; in the logs I am also getting a 400 error for all the other databases in the same resource group.
In azure.go, there are currently 15 uses of log.Fatalf. Which of them are reasonable to keep as fatal? For example, partial configuration errors (such as pointing to non-existing resources, which gives a 404) will currently quit the exporter.
Some errors might make sense to quit on (invalid credentials, perhaps), but most do not. Happy to do the changes when there is agreement on what is best.
Ping @brian-brazil
The status code changes from 200 to 401 with an ExpiredAuthenticationToken error.
(After changing apiVersion to 2018-01-01.)
2018/06/12 05:39:47 azure_metrics_exporter listening on port :9276
2018/06/12 05:39:57 GET https://management.azure.com/...
...
2018/06/12 06:45:00 GET https://management.azure.com/...
2018/06/12 06:45:00 Unable to query metrics API with status code: 401
The exporter goes down one hour after starting.
At present this module doesn't appear to support exporting different metrics for resource groups that have multiple resource IDs; for example, we should be able to do the following:
resource_groups:
  - resource_group: "test-rg"
    resource_types:
      - "Microsoft.Sql/servers/databases"
    metrics:
      - name: "storage_percent"
      - name: "cpu_percent"
      - name: "allocated_data_storage"
      - name: "workers_percent"
      - name: "physical_data_read_percent"
      - name: "log_write_percent"
      - name: "sessions_percent"
      - name: "xtp_storage_percent"
      - name: "storage"
      - name: "connection_successful"
      - name: "connection_failed"
      - name: "blocked_by_firewall"
      - name: "deadlock"
  - resource_group: "test-rg"
    resource_types:
      - "Microsoft.Sql/servers/elasticPools"
    metrics:
      - name: "cpu_percent"
At the moment the config expects the following:
resource_groups:
  - resource_group: "test-rg"
    resource_types:
      - "Microsoft.Sql/servers/databases"
      - "Microsoft.Sql/servers/elasticPools"
    metrics:
      - name: "storage_percent"
      - name: "cpu_percent"
      - name: "allocated_data_storage"
      - name: "workers_percent"
      - name: "physical_data_read_percent"
      - name: "log_write_percent"
      - name: "sessions_percent"
      - name: "xtp_storage_percent"
      - name: "storage"
      - name: "connection_successful"
      - name: "connection_failed"
      - name: "blocked_by_firewall"
      - name: "deadlock"
Resource types have different metrics, and this causes 400s.
Hi guys,
I'm having an issue when attempting to pull metrics from an Azure Application Gateway V2.
Specifically, when pulling MatchedCount I get the following error:
panic: runtime error: index out of range
goroutine 31 [running]:
main.(*Collector).extractMetrics(0xd11b48, 0xc0000ff800, 0xc000280000, 0x2ba, 0xc000110640, 0x12e, 0xc0000449c0, 0x4, 0x4, 0xc8, ...)
        C:/GoProjects/src/github.com/RobustPerception/azure_metrics_exporter/main.go:71 +0xd2d
main.(*Collector).batchCollectResources(0xd11b48, 0xc0000ff800, 0xc000044a40, 0x1, 0x1)
        C:/GoProjects/src/github.com/RobustPerception/azure_metrics_exporter/main.go:130 +0x3a6
main.(*Collector).Collect(0xd11b48, 0xc0000ff800)
        C:/GoProjects/src/github.com/RobustPerception/azure_metrics_exporter/main.go:206 +0x848
github.com/RobustPerception/azure_metrics_exporter/vendor/github.com/prometheus/client_golang/prometheus.(*Registry).Gather.func2(0xc0001849d0, 0xc0000ff800, 0x991220, 0xd11b48)
        C:/GoProjects/src/github.com/RobustPerception/azure_metrics_exporter/vendor/github.com/prometheus/client_golang/prometheus/registry.go:383 +0x68
created by github.com/RobustPerception/azure_metrics_exporter/vendor/github.com/prometheus/client_golang/prometheus.(*Registry).Gather
        C:/GoProjects/src/github.com/RobustPerception/azure_metrics_exporter/vendor/github.com/prometheus/client_golang/prometheus/registry.go:381 +0x2f0
My config is as follows:
credentials:
  subscription_id:
  client_id:
  client_secret:
  tenant_id:
targets:
  - resource: "/resourceGroups/group/providers/Microsoft.Network/applicationGateways/gateway"
    metrics:
      - name: "MatchedCount"
Hi,
I have an issue with the Azure exporter metric names. My problem is that I want to get, for example, the "cpu_percent" metric value for a SQL DB but also for a PostgreSQL DB. The metric name is the same for both services, and this seems to be the root cause of the issue.
My configuration:
resource_groups:
  - resource_group: "rg-test"
    resource_types:
      - Microsoft.Sql/servers/databases
    metrics:
      - name: "cpu_percent"
  - resource_group: "rg-test"
    resource_types:
      - Microsoft.DBforPostgreSQL/servers
    metrics:
      - name: "cpu_percent"
With this configuration, I receive the error:
- collected metric cpu_percent_percent_max label:<name:"resource_group" value:"rg-test" > label:<name:"resource_name" value:"test-postgres" > gauge:<value:0 > was collected before with the same name and label values
But using my PR #44 (which adds an additional label to avoid this error), I get:
- collected metric storage_percent_percent_max label:<name:"resource_group" value:"rg-test" > label:<name:"resource_name" value:"test-postgres" > gauge:<value:22.86 > has label dimensions inconsistent with previously collected metrics in the same metric family
I've also tried with this configuration (because I was not sure of the correct syntax):
resource_groups:
  - resource_group: "rg-test"
    resource_types:
      - Microsoft.Sql/servers/databases
      - Microsoft.DBforPostgreSQL/servers
    metrics:
      - name: "cpu_percent"
But same issue.
I'm able to get the metric if in my configuration I put only "Microsoft.Sql/servers/databases" or "Microsoft.DBforPostgreSQL/servers". I mean, individually it works, but not when both are set together.
The cpu metric is:
# HELP cpu_percent_percent_total cpu_percent_percent_total
# TYPE cpu_percent_percent_total gauge
cpu_percent_percent_total{resource_group="rg-test",resource_name="test-postgres"} 0
Regarding the way the metric name is built, why not do the same as the AWS CloudWatch exporter? There, each metric includes the service name.
Example: for the RDS database metrics: aws_rds_database_connections_sum
In the case of the Azure metrics exporter, the name of the DB CPU percent metric is: cpu_percent_percent_total
Resource types could have one of these formats:
Why not define the metric name like this (in my example, the cpu percent metric exists for each resource type)?
"Azure" at the beginning is not required, but at least we keep the same naming logic as the AWS exporter, and it's easier to identify the metric origin when your Prometheus is gathering metrics from multiple providers.
So a regex removing "Microsoft", replacing "/" with "_", and lowercasing the string should be enough IMO.
@brian-brazil What do you think ?
Thanks
Hi,
I am trying to set up the exporter as a Kubernetes application, using a ConfigMap to create the configuration file. However, when I deploy the application I get the following error.
Failed to get token: Did not get status code 200, got: 404 with body:
Here is the configmap I created for the configuration.
apiVersion: v1
data:
  azure.yml: |
    active_directory_authority_url: "https://login.microsoftonline.com/"
    resource_manager_url: "https://management.azure.com/"
    credentials:
      client_id: "4xxxxxxx-4xxx-4xxx-8xxx-2xxxxxxxxxxx"
    resource_groups:
      - resource_group: "my_resource_group_name"
        resource_types:
          - "Microsoft.Compute/virtualMachines"
        metrics:
          - name: "Disk Read Bytes"
          - name: "Disk Write Bytes"
kind: ConfigMap
metadata:
  name: azure-exporter
Any suggestions? Do we have any sample configuration for k8s?
Exporter exiting when no data exists for one metric.
panic: runtime error: index out of range
goroutine 16 [running]:
panic(0x3d3640, 0xc42000a0f0)
/usr/local/Cellar/go/1.7.5/libexec/src/runtime/panic.go:500 +0x1a1
main.(*Collector).Collect(0x678d90, 0xc420327500)
/usr/local/Cellar/go/1.7.5/bin/src/github.com/RobustPerception/azure_metrics_exporter/main.go:71 +0xd97
github.com/RobustPerception/azure_metrics_exporter/vendor/github.com/prometheus/client_golang/prometheus.(*Registry).Gather.func2(0xc420396ad0, 0xc420327500, 0x63b720, 0x678d90)
/usr/local/Cellar/go/1.7.5/bin/src/github.com/RobustPerception/azure_metrics_exporter/vendor/github.com/prometheus/client_golang/prometheus/registry.go:383 +0x63
created by github.com/RobustPerception/azure_metrics_exporter/vendor/github.com/prometheus/client_golang/prometheus.(*Registry).Gather
/usr/local/Cellar/go/1.7.5/bin/src/github.com/RobustPerception/azure_metrics_exporter/vendor/github.com/prometheus/client_golang/prometheus/registry.go:384 +0x326
Config:
- resource: "/resourceGroups/xXXXX-rg/providers/Microsoft.ServiceBus/namespaces/XXXX"
  metrics:
    - name: "SuccessfulRequests"
    - name: "ServerErrors"
    - name: "UserErrors"
    - name: "IncomingRequests"
    - name: "IncomingMessages"
    - name: "OutgoingMessages"
    - name: "ActiveConnections"
    - name: "Size"
    - name: "Messages"
    - name: "ActiveMessages"
    - name: "DeadletteredMessages"
    - name: "ScheduledMessages"
    - name: "ThrottledRequests"
ThrottledRequests -> This is the culprit. If it is configured on its own, or removed from the list, there is no failure.
2018/11/12 12:55:14 Resource: XXXXX
Available Metrics:
2018/11/12 12:55:14 - SuccessfulRequests
2018/11/12 12:55:14 - ServerErrors
2018/11/12 12:55:14 - UserErrors
2018/11/12 12:55:14 - ThrottledRequests
2018/11/12 12:55:14 - IncomingRequests
2018/11/12 12:55:14 - IncomingMessages
2018/11/12 12:55:14 - OutgoingMessages
2018/11/12 12:55:14 - ActiveConnections
2018/11/12 12:55:14 - Size
2018/11/12 12:55:14 - Messages
2018/11/12 12:55:14 - ActiveMessages
2018/11/12 12:55:14 - DeadletteredMessages
2018/11/12 12:55:14 - ScheduledMessages
It would be ideal if the endpoints could be overridden for alternate Azure environments like Gov. I believe just the TLD changes to .us (instead of .com), but I need to validate that. I'll try to work on a PR, but wanted to mention it here as well.
I'm trying to get database metrics using resource group filtering which was added recently.
Targeting the database directly using the following resource:
/resourceGroups/SOMERG/providers/Microsoft.Sql/servers/SERVER/databases/DATABASE
works fine but when I try retrieving the metrics with resource group filtering it fails with:
collected metric storage_percent_percent_total label: label: gauge: was collected before with the same name and label values
I'm using the following configuration:
resource_groups:
I wasn't sure what the resource_types should be and tried specifying different combinations, but none displayed anything except this one.
I have a running exporter retrieving cosmos db metrics. It is running fine every 30 secs (no rate limit is hit). There are strange "gaps" (value 0) when trying to graph some of the metrics like Documents count and Storage/index size.
You can check here: https://imgur.com/49G9x7L
This is for the document count in a Cosmos DB instance, but the same can be observed for other storage-related metrics. The same metrics in the Azure portal have no gaps. The number-of-requests metric, though, has no gaps. I'm not sure if this is related to the exporter itself or to the Azure API. The strange thing is that it usually takes 4-5 minutes to expose the correct size value again. I didn't see any errors in the exporter logs.
The other thing that looks wrong is related to how many databases you have in a Cosmos DB instance and what the exporter retrieves/exposes. In my example I have 2 databases in a single Cosmos DB instance, and the value (for storage size) that the exporter exposes is for one of the databases, not for all of them as I would expect. If there is no label showing which database the metric value is for (or any configuration targeting a specific database in the config file), I think it should expose the total size of all databases, not just one of them.
I didn't find a way to retrieve metrics for specific queues/topics in a Service Bus instance, or for specific databases/collections in Cosmos DB. It would be helpful to have this data exported as well.
I noticed that the Azure portal uses a different API than the one used here for fetching metrics, which might be interesting to use instead: https://management.azure.com/batch?api-version=2017-03-01
This apparently takes a list of URLs and methods and returns a list of results. A single-request body looks like this:
{
  "requests": [
    {
      "relativeUrl": "/subscriptions/<subscription-id>/<resource-id>/providers/microsoft.Insights/metrics?...",
      "httpMethod": "GET"
    }
  ]
}
It would be interesting to see if this has some performance/quota benefits?
Hi!
In the cloud, ops teams will often manage their resources by tag (providing information like "client", "business unit", "environment", etc.). Being able to filter metrics (in graphs, silences, etc.) on these would be really useful. Having these tags automatically converted into metric labels would avoid manually labelling metrics with this information (and thus duplicating configuration).
I propose to add a feature to automatically convert resource tags into metric labels. It would need to work with the 3 configuration mechanisms currently supported by the exporter (by target, by resource group, by tag).
I (or my colleague) could start to work on that next week!
In azure.yml I have configured the access-credentials part at the top, but I don't understand how to write the targets part below:
targets:
resource_groups:
resource_tags:
Hi
As part of an ongoing effort to fetch Azure metrics for multiple providers (Cosmos DB, VM scale sets, Web Sites, etc.), our team (Walmart Labs) has done a PoC with this exporter.
As per the initial analysis and feedback from customers, the following are the main concerns:
Multi-dimensional metrics are missing: for example, in the case of Cosmos DB, most of the available metrics have dimensions (AvailableStorage, etc.). It seems there is no provision to fetch multi-dimensional metrics at the moment.
If we need to fetch metrics from multiple subscriptions, multiple instances of the exporter need to be deployed (one for each subscription). Is there any specific reason that we can configure only one subscription in azure.yml?
As per the current workflow, all resource groups/resources (or tags) for which metrics need to be scraped must be configured in the YAML file. Since we have subscriptions with more than 50 resource groups and multiple resources underneath, the azure.yml file would be very tedious to write for all of the subscriptions.
As per the current design, the calls to fetch metrics and then publish them to Prometheus seem to be executed serially. Would it be better to parallelize the fetching and publishing? Any thoughts on this?
We are working on an enhancement for multi-dimensional metrics and on making the config file as simple as possible to configure: for example, simply specify the list of subscriptions with the name and type of providers for which metrics need to be fetched, and we'll dynamically fetch resource groups/resources, fetch the available dimensions for metrics, formulate URLs, and fetch all metrics. I will keep updating this thread as we make progress. Please have a look at the above points and share your thoughts.
How is Docker invoked, and how is a config file passed?
That's missing from the documentation.
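For what it's worth, a sketch of one possible invocation, assuming the robustperception/azure_metrics_exporter image and an illustrative mount path (not documented behaviour):

```shell
# Mount the config into the container and point --config.file at it;
# 9276 is the exporter's default port.
docker run -d -p 9276:9276 \
  -v "$PWD/azure.yml:/config/azure.yml:ro" \
  robustperception/azure_metrics_exporter \
  --config.file=/config/azure.yml
```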
It would be nice to have Azure managed identities support.
With this feature we would no longer have to create a service principal to connect to the Azure API.
I am getting a "no token found" error when I enable some metrics in azure.yml. It looks like this happens with metrics whose names contain a "/", like "Disk Read Operations/Sec", in the exporter config. Is this a bug in the Azure metrics exporter?
---
credentials:
  subscription_id: xxxx
  client_id: xxxx
  client_secret: xxxx
  tenant_id: xxxx
targets:
  - resource: "/resourceGroups/myvm/providers/Microsoft.Compute/virtualMachines/myvmm/"
    metrics:
      - name: "Percentage CPU"
      - name: "Network In"
      - name: "Network Out"
      - name: "Disk Read Bytes"
      - name: "Disk Write Bytes"
      # - name: "Disk Read Operations/Sec"          # This doesn't work
      # - name: "Disk Write Operations/Sec"         # This doesn't work
      - name: "CPU Credits Remaining"
      - name: "CPU Credits Consumed"
      # - name: "Per Disk Read Bytes/sec"           # This doesn't work
      # - name: "Per Disk Write Bytes/sec"          # This doesn't work
      # - name: "Per Disk Read Operations/Sec"      # This doesn't work
      # - name: "Per Disk Write Operations/Sec"     # This doesn't work
      - name: "Per Disk QD"
      # - name: "OS Per Disk Read Bytes/sec"        # This doesn't work
      # - name: "OS Per Disk Write Bytes/sec"       # This doesn't work
      # - name: "OS Per Disk Read Operations/Sec"   # This doesn't work
      # - name: "OS Per Disk Write Operations/Sec"  # This doesn't work
      - name: "OS Per Disk QD"
Could you please add more information to the README on how to install and configure this exporter?
I'm wondering whether a Puppet module is already available, but I could not find anything as of yet.
I will work on one otherwise.
Hello,
I am running two docker containers for dev and prod, from docker-compose:
version: "3.6"
services:
  azure-metrics-exporter_dev:
    image: robustperception/azure_metrics_exporter
    network_mode: host
    restart: always
    command:
      - '--config.file=/config/dev.yaml'
    volumes:
      - "/azure-metrics-exporter/config:/config:rw"
  azure-metrics-exporter_prod:
    image: robustperception/azure_metrics_exporter
    network_mode: host
    restart: always
    command:
      - '--config.file=/config/prod.yaml'
      - '--web.listen-address=":9275"'
    volumes:
      - "/azure-metrics-exporter/config:/config:rw"
Metrics look good from dev at localhost:9276/metrics, but the second container is restarting all the time. Docker logs output:
azure-metrics-exporter# docker logs b9d7asdecbd0
2021/04/01 10:31:25 azure_metrics_exporter listening on port ":9275"
2021/04/01 10:31:25 Error starting HTTP server: listen tcp: address tcp/9275": unknown port
It looks like it is not able to read the port. But in docker-compose.yaml I am using the command:
- '--web.listen-address=":9275"'
Tested with separate configs and SPNs per subscription.
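The log line shows the quotes arriving literally in the address (`tcp/9275": unknown port`), so one guess is that docker-compose passes the inner double quotes through to the flag; a sketch of the command entry without them, which may resolve it:

```yaml
command:
  - '--config.file=/config/prod.yaml'
  - '--web.listen-address=:9275'
```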
The header x-ms-ratelimit-remaining-subscription-reads contains the remaining allowed read-request quota, and would be very useful to expose. See https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-manager-request-limits.
I'm happy to implement this, but would like some feedback on this approach:
- In AzureClient.getMetricValue, read and parse the header if present (storing it on the AzureMetricValueResponse).
- In Collector.Collect, check the minimum value returned from all gets, and finally create a metric from this minimum.
Sounds reasonable?
There is a write quota header too, but it is only returned on writes, so to collect it we'd need to make some form of write and the service principal would need to be allowed to do so. I'd consider this out of scope initially. Main concern for us at least is reads.
We made an open-source service for finding and using Prometheus exporters easily, which references your repository.
If you have any questions or improvement requests about this, feel free to open an issue ticket here -> https://github.com/NexClipper/exporterhub.io
Thanks for your contribution :)
When copying the Azure resource ID from the portal, it includes extra information at the beginning, triggering a 404 when calling the API.
e.g:
/subscriptions/xxxx-xxxx-xxxx/resourceGroups/service-rg/providers/Microsoft.ServiceBus/namespaces/servicebus
when it is expected by the exporter like this:
/resourceGroups/service-rg/providers/Microsoft.ServiceBus/namespaces/servicebus
The subscription/subscription-id part is not needed; otherwise we get:
{"message":"No HTTP resource was found that matches the request URI 'https://management.azure.com/subscripti........
Suggestion: the documentation should be updated.
Would it be possible to add Azure billing monitoring? It would be helpful if we had billing data.
Microsoft has billing APIs for access to Azure usage and rates:
https://docs.microsoft.com/en-us/rest/api/consumption/
https://docs.microsoft.com/en-us/rest/api/consumption/usagedetails/lis
https://docs.microsoft.com/en-us/azure/cost-management-billing/manage/usage-rate-card-overview
Hi,
I'm not able to get the Azure storage account UsedCapacity metric; the exporter always gives me back 0 (it should be 181.8 MiB).
I can see differences between the request made by the Azure portal to get the UsedCapacity metric and the one made by the exporter.
Azure portal:
"/subscriptions/XXXXXXXXX/resourceGroups/rg-dev/providers/Microsoft.Storage/storageAccounts/devwww/providers/microsoft.Insights/metrics?timespan=2019-11-11T15:35:00.000Z/2019-11-12T15:35:00.000Z&interval=FULL&metricnames=UsedCapacity&aggregation=average&metricNamespace=microsoft.storage%2Fstorageaccounts&validatedimensions=false&api-version=2019-07-01"
Prometheus Exporter:
/subscriptions/XXXXXXXXXXXXXX/resourceGroups/rg-dev/providers/Microsoft.Storage/storageAccounts/devwww/providers/microsoft.insights/metrics?aggregation=Total%2CAverage%2CMinimum%2CMaximum&api-version=2018-01-01&metricnames=UsedCapacity×pan=2019-11-12T15%3A41%3A53Z%2F2019-11-12T15%3A42%3A53Z
The exporter gives me:
# HELP usedcapacity_bytes_average usedcapacity_bytes_average
# TYPE usedcapacity_bytes_average gauge
usedcapacity_bytes_average{resource_group="rg-dev",resource_name="devwww"} 0
# HELP usedcapacity_bytes_max usedcapacity_bytes_max
# TYPE usedcapacity_bytes_max gauge
usedcapacity_bytes_max{resource_group="rg-dev",resource_name="devwww"} 0
# HELP usedcapacity_bytes_min usedcapacity_bytes_min
# TYPE usedcapacity_bytes_min gauge
usedcapacity_bytes_min{resource_group="rg-dev",resource_name="devwww"} 0
# HELP usedcapacity_bytes_total usedcapacity_bytes_total
# TYPE usedcapacity_bytes_total gauge
usedcapacity_bytes_total{resource_group="rg-dev",resource_name="devwww"} 0
I've modified the exporter's code to generate a request with parameters like the Azure portal's:
/subscriptions/XXXXXXX/resourceGroups/rg-dev/providers/Microsoft.Storage/storageAccounts/probstgwww/providers/microsoft.Insights/metrics?aggregation=average&api-version=2019-07-01&autoadjusttimegrain=true&interval=PT5M&metricNamespace=microsoft.storage%2Fstorageaccounts&metricnames=UsedCapacity×pan=2019-11-12T15%3A36%3A48Z%2F2019-11-12T15%3A37%3A48Z&validatedimensions=false
But the exporter still gives me 0.
My configuration:
credentials:
  subscription_id: XXXXXXX
  client_id: XXXXXX
  client_secret: XXXXXX
  tenant_id: XXXXXX
resource_groups:
  - resource_group: "rg-dev"
    resource_types:
      - Microsoft.Storage/storageAccounts
    metrics:
      - name: "UsedCapacity"
Any idea?
Thanks
Currently the total, average, min and max aggregations are all always used. Would it make sense to allow choosing which of these are relevant in the config file? If not configured, it would keep returning all of them.
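For illustration, a per-metric selection could look like this (the aggregations key is hypothetical, sketching the proposal; resource path borrowed from the report above):

```yaml
targets:
  - resource: "/resourceGroups/rg-dev/providers/Microsoft.Storage/storageAccounts/devwww"
    metrics:
      - name: "UsedCapacity"
        aggregations:      # hypothetical key for this proposal
          - "Average"
          - "Maximum"
```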
Use case: a user who needs a more dynamic configuration strategy would want to collect metrics based on resource tags, without having to hardcode resource names or resource groups in the configuration.
The feature would add the possibility to use a tag name and a tag value as a filter for the resources to be monitored. The implementation would be similar to the resource group feature.
I would use the filter feature of the resources list API.
I can take care of that and create a PR to add this feature.
I propose to add the feature in the listDefinitions method in a subsequent PR.
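A possible configuration shape for the proposed feature (all keys hypothetical), which would translate into the resources list API's `$filter` expression, e.g. `tagName eq 'monitoring' and tagValue eq 'enabled'`:

```yaml
resource_tags:                        # hypothetical key for the proposed feature
  - resource_tag_name: "monitoring"
    resource_tag_value: "enabled"
    resource_types:
      - "Microsoft.Compute/virtualMachines"
    metrics:
      - name: "Percentage CPU"
```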
For example, DBforPostgreSQL has a metric called io_consumption_percent. The reported unit is Percent, so the Prometheus metric names are set to io_consumption_percent_percent_average, etc.
azure_metrics_exporter/main.go, line 64 in 56371e2
Would a check with strings.HasSuffix before appending the unit be acceptable, or are there better ways to fix it?
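The suggested check could be sketched like this (an illustrative function, not the exporter's actual code):

```go
package main

import (
	"fmt"
	"strings"
)

// normalizedName appends the unit suffix only when the metric name does not
// already end with it, avoiding names like io_consumption_percent_percent.
// A sketch of the proposed strings.HasSuffix check.
func normalizedName(metric, unit string) string {
	suffix := "_" + strings.ToLower(unit)
	if strings.HasSuffix(metric, suffix) {
		return metric
	}
	return metric + suffix
}

func main() {
	fmt.Println(normalizedName("io_consumption_percent", "Percent")) // io_consumption_percent
	fmt.Println(normalizedName("network_in", "Bytes"))               // network_in_bytes
}
```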
I am trying to fetch metrics from multiple resource types using the config below, but it is not working. Can you please let me know how to achieve this?
targets:
resource_groups:
  - resource_group: "RG-AZURE"
    resource_types:
      - "Microsoft.Network/loadBalancers"
    metrics:
      - name: "PacketCount"
resource_groups:
  - resource_group: "RG-AZURE"
    resource_types:
      - "Microsoft.DBforPostgreSQL/servers"
    metrics:
      - name: "active_connections"
      - name: "storage_used"
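Note that YAML does not allow the same resource_groups key twice at the same level; one shape that keeps both resource types under a single list (a sketch; verify the key names against the exporter's README) could be:

```yaml
resource_groups:
  - resource_group: "RG-AZURE"
    resource_types:
      - "Microsoft.Network/loadBalancers"
    metrics:
      - name: "PacketCount"
  - resource_group: "RG-AZURE"
    resource_types:
      - "Microsoft.DBforPostgreSQL/servers"
    metrics:
      - name: "active_connections"
      - name: "storage_used"
```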
Hi,
I am unable to add a dimension for an Event Hub namespace, say a particular Event Hub entity within the namespace. Is there a way to achieve this?
Hello,
I am facing issues while trying to run the exporter; I am wondering if I have missed some necessary prerequisites or configuration changes:
$ ./azure_metrics_exporter --config.file="azure.yml"
2021/01/12 04:53:14 Using managed identity
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x40 pc=0x827dd5]
goroutine 1 [running]:
main.(*AzureClient).getAccessToken(0xc0000662c0, 0x0, 0x0)
/home/tsourab/go/src/github.com/RobustPerception/azure_metrics_exporter/azure.go:224 +0x355
main.main()
/home/tsourab/go/src/github.com/RobustPerception/azure_metrics_exporter/main.go:321 +0xdf
azure.yml
---
active_directory_authority_url: "https://login.microsoftonline.com/"
resource_manager_url: "https://management.azure.com/"
credentials:
  subscription_id: "<subscription ID here>"
resource_groups:
  - resource_group: "tsourab-metric"
    resource_types:
      - "Microsoft.Compute/virtualMachines"
    resource_name_include_re:
      - "metrics"
    metrics:
      - name: "CPU Credits Consumed"
OS: Fedora 33
Kernel: 5.8.15-301.fc33.x86_64
Go version: go version go1.15.6 linux/amd64
Hello Team.
Do you have an example of azure.yml for getting metrics from PaaS services like Azure SQL or Azure Cache for Redis?
All the best.
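Not an official example, but a sketch using the per-resource target form shown earlier in this thread; the metric names (cpu_percent, connectedclients) come from Azure Monitor's supported-metrics lists and should be verified for your resources:

```yaml
credentials:
  subscription_id: <secret>
  tenant_id: <secret>
  client_id: <secret>
  client_secret: <secret>
targets:
  - resource: "/resourceGroups/<rg>/providers/Microsoft.Sql/servers/<server>/databases/<db>"
    metrics:
      - name: "cpu_percent"
  - resource: "/resourceGroups/<rg>/providers/Microsoft.Cache/Redis/<cache>"
    metrics:
      - name: "connectedclients"
```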
The total number of metrics is about 60.
If we decrease it to 5 or 6, it works.
What has your experience been?
We have a situation where we need to expose the LB Data Path Availability of individual ports, but with the current configuration it is shown in aggregate; metrics are not exposed per port.
Here is our Configuration:
targets:
resource_groups:
  - resource_group: "<%= p('resource_group') %>"
    resource_types:
      - "Microsoft.Network/loadBalancers"
    resource_name_exclude_re:
      - ".*service-fabrik.*"
      - ".*pgoutboundconnectionhelper.*"
    metrics:
      - name: "PacketCount"
      - name: "VipAvailability"
      - name: "DipAvailability"
      - name: "ByteCount"
      - name: "SYNCount"
      - name: "SnatConnectionCount"
      - name: "AllocatedSnatPorts"
      - name: "UsedSnatPorts"
Can you please let us know whether we can adjust our configuration to get the availability of individual ports, or is this not supported yet?