prometheus / cloudwatch_exporter Goto Github PK

View Code? Open in Web Editor NEW

877.0 877.0 323.0 647 KB

Metrics exporter for Amazon AWS CloudWatch

License: Apache License 2.0

Java 99.00% Dockerfile 0.74% Makefile 0.26%

cloudwatch_exporter's People

Contributors

Stargazers

Watchers

Forkers

brian-brazil aravindtj discordianfish slikk66 r4k justone rikardev alars-alit draxly originalrecipe jinty javierzunzunegui craigday jfindley wehkamp tobstarr visarya pete0emerson shidel-dev linearregression sthammana cz-moteki-takahiro jianghaitao1221 anthonywc gozer moshebs yuecong atex-jenkins shencan yusha2016 vaishalisutaria lisahelm mnadel imartyn mobcrush sahilg1 button jamessoubry no3am klaus-placr alkalinecoffee lzaldivargs appop bryantrobbins dbonatto sivakumar077 practo meshuga monikaka1996 rblumen-desk mkurdziel sansible kppullin zfq308 nningego vkalladath yaacov rodrigorfk danielm223 deepsonune mikhailadvani aaronmell alaqelsplk friedrich-at-adobe subuk monzo pluralsight venkatamutyala etsangsplk magictour mjrlee oferziss mornor timerope karlhungus zegl spasam unboundev jwenz723 jimdo pirmintapken wwwkang8 jimbeck nwiizo msiuts micrub shamimgeek nishanth-pinnapareddy kiwivogel steven-aerts vkatsikaros seehait danihodovic danielwhatmuff swestcott noqcks yext redpangzi666 kaisermario moolen

cloudwatch_exporter's Issues

Multiple same HELP/TYPE lines for DynamoDB metrics

We're pulling DynamoDB metrics from CloudWatch, and we're interested in individual table metrics (e.g. ConsumedReadCapacity for a table foo_production), and the global secondary index metrics on a table (e.g. ConsumedReadCapacity for global secondary index bar on the foobar_production table).

Our configuration for cloudwatch-exporter looks like this:

- aws_namespace: "AWS/DynamoDB"
  aws_metric_name: "ProvisionedReadCapacityUnits"
  aws_dimensions:
    - TableName
  aws_dimension_select_regex:
    TableName:
      - "(.*)_production"

- aws_namespace: "AWS/DynamoDB"
  aws_metric_name: "ProvisionedReadCapacityUnits"
  aws_dimensions:
    - TableName
    - GlobalSecondaryIndexName
  aws_dimension_select_regex:
    TableName:
      - "(.*)_production"

However, this leads to two occurrences of the same HELP and TYPE lines for these metrics:

# HELP aws_dynamodb_provisioned_read_capacity_units_sum CloudWatch metric AWS/DynamoDB ProvisionedReadCapacityUnits Dimensions: [TableName] Statistic: Sum Unit: Count
# TYPE aws_dynamodb_provisioned_read_capacity_units_sum gauge
...
# HELP aws_dynamodb_provisioned_read_capacity_units_sum CloudWatch metric AWS/DynamoDB ProvisionedReadCapacityUnits Dimensions: [TableName, GlobalSecondaryIndexName] Statistic: Sum Unit: Count
# TYPE aws_dynamodb_provisioned_read_capacity_units_sum gauge

Prometheus doesn't like this: text format parsing error in line 111: second HELP line for metric name "aws_dynamodb_provisioned_read_capacity_units_sum"

I was hoping #11 would fix this, but of course even with a custom HELP text, the TYPE will still be duplicated, and thus be invalid.

Is there a way to get around this, and have both table and global secondary index data in Prometheus?

Add optional timestamps to exposition data

I made a quick hack which added timestamps to the exposition format (I hope I use the right term). Like in the example below.

# HELP aws_elb_healthy_host_count_average CloudWatch metric AWS/ELB HealthyHostCount Dimensions: [LoadBalancerName] Statistic: Average Unit: Count
# TYPE aws_elb_healthy_host_count_average gauge
aws_elb_healthy_host_count_average{job="aws_elb",load_balancer_name="aaa",} 1.0 1455192180000
aws_elb_healthy_host_count_average{job="aws_elb",load_balancer_name="bbb",} 1.0 1455192180000
aws_elb_healthy_host_count_average{job="aws_elb",load_balancer_name="ccc",} 1.0 1455192180000

The advantage of this is that the timestamps in prometheus will match up with CloudWatch, which makes it easier if only a subset of data is exported and CloudWatch is needed for drilldown. Due to this I think this would be a really good option to be able to configure.

However, collector.MetricFamilySamples.Sample does not support the addition of timestamps, and write004 for instance does not either have support for this.

Would this be a desired feature, and in that case what would be the best way to implement it?

Support filtering by tags

The only way to filter metrics right now is by using aws_dimension_select(_regex), which requires the user to have control over the name of those dimensions and encode the necessary information in it to assign a cloudwatch_exporter.
I'd argue, it's common to have multiple stacks or environment in one AWS account which makes filtering by the existing dimensions cumbersome. More over, if CloudFormation is used, you can't even name some things like ElastiCache (or lose the possibility to update the cluster).

If the cloudwatch_exporter could use tags for filtering, a user can just provide a tag to define which metrics should be scraped by which exporter. Since the CloudWatch API itself doesn't support filtering by tags, the exporter would have to, depending on the metric source ("aws_namespace"). I'll use AWS/ELB as an example:

Get all ELBs in a region
Run describe-tags on the ELBs to find the one with the provided tag(s)
Use the ELB's name as a value for the LoadBalancerName dimension filter

Document aws_dimension_select_regex in README

I was looking for a way to export CloudWatch metrics for a subset of our DynamoDB tables to Prometheus. I had almost given up on this official exporter because it seemed like it could not do that, until I found out about aws_dimension_select_regex by reading the source code.

This is how I now fetch CloudWatch metrics for our production DynamoDB tables:

{
  "region": "us-east-1",
  "metrics": [
    {
      "aws_namespace": "AWS/DynamoDB",
      "aws_metric_name": "ProvisionedReadCapacityUnits",
      "aws_dimensions": [
        "TableName"
      ],
      "aws_dimension_select_regex": {
        "TableName": [ "(.*)_production" ]
      }
    }
  ]
}

It would be sweet if aws_dimension_select_regex was mentioned in the README.

[k8s] mounting into "/" is prohibited and cannot find "java"

Hi,

I tried to put cloudwatch_exporter with k8s at AWS. I hit this error:

$ kubectl logs cloudwatch-556263178-eqd6t 
Timestamp: 2016-08-17 01:03:19.851546984 +0000 UTC
Code: System error

Message: mounting into / is prohibited

I share my configmap, deployment, and service as:

$ cat ./cloudwatch-configmap.yaml | nc termbin.com 9999
http://termbin.com/v80k
$ cat ./cloudwatch-deployment.yaml.share | nc termbin.com 9999
http://termbin.com/ndop
$ cat ./cloudwatch-service.yaml | nc termbin.com 9999
http://termbin.com/6rym

Also, after modifying / recompiling prom/cloudwatch-exporter Dockerfile and re-deploy into AWS/k8s, it keeps complaining cannot find "java". I suspect the default Dockerfile was written couple days back during openjdk still on 1.7. Now it's 1.8 but alternatives does not get proper set up. This is my fix:

$ cat Dockerfile | nc termbin.com 9999
http://termbin.com/856g

BTW, the base image for my k8s is debian jessie.

trustAnchors parameter must be non-empty

Hi, I've done what the readme has said. I am supplying the credentials and calling the jar correctly however this error seems to be causing an issue and there is very little about it. Is this something to do with the java version I am using?

Any help in understanding what is going on and what I may need to do to fix it is appreciated

openjdk version "1.8.0_111"
OpenJDK Runtime Environment (build 1.8.0_111-8u111-b14-3~14.04.1-b14)
OpenJDK 64-Bit Server VM (build 25.111-b14, mixed mode)

Jan 12, 2017 4:57:31 PM io.prometheus.cloudwatch.CloudWatchCollector collect
WARNING: CloudWatch scrape failed
com.amazonaws.SdkClientException: Unable to execute HTTP request: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-emp
ty
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:970)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:675)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:649)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:632)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$300(AmazonHttpClient.java:600)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:582)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:446)
        at com.amazonaws.services.cloudwatch.AmazonCloudWatchClient.doInvoke(AmazonCloudWatchClient.java:931)
        at com.amazonaws.services.cloudwatch.AmazonCloudWatchClient.invoke(AmazonCloudWatchClient.java:907)
        at com.amazonaws.services.cloudwatch.AmazonCloudWatchClient.listMetrics(AmazonCloudWatchClient.java:652)
        at io.prometheus.cloudwatch.CloudWatchCollector.getDimensions(CloudWatchCollector.java:179)
        at io.prometheus.cloudwatch.CloudWatchCollector.scrape(CloudWatchCollector.java:312)
        at io.prometheus.cloudwatch.CloudWatchCollector.collect(CloudWatchCollector.java:377)
        at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.findNextElement(CollectorRegistry.java:73)
        at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:88)
        at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:58)
        at java.util.Collections.list(Collections.java:5240)
        at io.prometheus.client.exporter.common.TextFormat.write004(TextFormat.java:17)
        at io.prometheus.client.exporter.MetricsServlet.doGet(MetricsServlet.java:41)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
        at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:648)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
        at org.eclipse.jetty.server.Server.handle(Server.java:365)
        at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485)
        at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:926)
        at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:988)
        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:635)
        at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
        at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
        at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:627)
        at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:51)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
        at java.lang.Thread.run(Thread.java:745)

Cannot connect via proxy

Hi,

I'm trying to use this exporter but cannot connect via the proxy. I have set the environment variables http_proxy/https_proxy and supplied the proxy as arguments to java but it still doesn't work:

Mar 21, 2016 1:25:58 PM com.amazonaws.http.AmazonHttpClient executeHelper
INFO: Unable to execute HTTP request: connect timed out
java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
..
Mar 21, 2016 1:25:58 PM io.prometheus.cloudwatch.CloudWatchCollector collect
WARNING: CloudWatch scrape failed
com.amazonaws.AmazonClientException: Unable to execute HTTP request: connect timed out
..

Starting with:
$ java -Dhttp.useProxy=true -Dhttps.useProxy=true -Dhttps.proxyHost=http://proxy-dev.abc.com -Dhttps.proxyPort=3128 -Dhttp.proxyHost=http://proxy-dev.abc.com -Dhttp.proxyPort=3128 -jar ./target/cloudwatch_exporter-0.2-SNAPSHOT-jar-with-dependencies.jar 9106 config.yml

Surely I am not the only one using a proxy? How can I fix this?

Thanks
Stef

CloudWatch scrape failed cannot be cast to java.util.List

Dec 22, 2016 8:39:36 PM io.prometheus.cloudwatch.CloudWatchCollector collect
WARNING: CloudWatch scrape failed
java.lang.ClassCastException: java.lang.String cannot be cast to java.util.List
at io.prometheus.cloudwatch.CloudWatchCollector.metricIsInAwsDimensionSelectRegex(CloudWatchCollector.java:240)
at io.prometheus.cloudwatch.CloudWatchCollector.useMetric(CloudWatchCollector.java:207)
at io.prometheus.cloudwatch.CloudWatchCollector.getDimensions(CloudWatchCollector.java:187)
at io.prometheus.cloudwatch.CloudWatchCollector.scrape(CloudWatchCollector.java:312)
at io.prometheus.cloudwatch.CloudWatchCollector.collect(CloudWatchCollector.java:377)
at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.findNextElement(CollectorRegistry.java:73)
at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.(CollectorRegistry.java:65)
at io.prometheus.client.CollectorRegistry.metricFamilySamples(CollectorRegistry.java:56)
at io.prometheus.client.exporter.MetricsServlet.doGet(MetricsServlet.java:41)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)

failed working in AWS china region

2016-03-28 09:22:09.887:INFO:oejs.Server:jetty-8.y.z-SNAPSHOT
2016-03-28 09:22:09.935:INFO:oejs.AbstractConnector:Started [email protected]:9106
Mar 28, 2016 9:22:14 AM com.amazonaws.http.AmazonHttpClient executeHelper
INFO: Unable to execute HTTP request: monitoring.cn-north-1.amazonaws.com: Name or service not known
java.net.UnknownHostException: monitoring.cn-north-1.amazonaws.com: Name or service not known
    at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
    at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:922)
    at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1316)
    at java.net.InetAddress.getAllByName0(InetAddress.java:1269)
    at java.net.InetAddress.getAllByName(InetAddress.java:1185)
    at java.net.InetAddress.getAllByName(InetAddress.java:1119)
    at com.amazonaws.SystemDefaultDnsResolver.resolve(SystemDefaultDnsResolver.java:27)
    at com.amazonaws.http.DelegatingDnsResolver.resolve(DelegatingDnsResolver.java:38)
    at org.apache.http.impl.conn.DefaultClientConnectionOperator.resolveHostname(DefaultClientConnectionOperator.java:259)
    at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:159)
    at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:304)
    at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:611)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:446)
    at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
    at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:769)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:506)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:318)
    at com.amazonaws.services.cloudwatch.AmazonCloudWatchClient.invoke(AmazonCloudWatchClient.java:886)
    at com.amazonaws.services.cloudwatch.AmazonCloudWatchClient.listMetrics(AmazonCloudWatchClient.java:665)
    at io.prometheus.cloudwatch.CloudWatchCollector.getDimensions(CloudWatchCollector.java:161)
    at io.prometheus.cloudwatch.CloudWatchCollector.scrape(CloudWatchCollector.java:294)
    at io.prometheus.cloudwatch.CloudWatchCollector.collect(CloudWatchCollector.java:359)
    at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.findNextElement(CollectorRegistry.java:73)
    at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.<init>(CollectorRegistry.java:65)
    at io.prometheus.client.CollectorRegistry.metricFamilySamples(CollectorRegistry.java:56)
    at io.prometheus.client.exporter.MetricsServlet.doGet(MetricsServlet.java:41)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:648)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:365)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485)
    at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:926)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:988)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:635)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:627)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:51)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:745)

[Question] Exporting SQS metric

---
region: ap-south-1
metrics:
- aws_namespace: AWS/SQS
  aws_metric_name: ApproximateNumberOfMessagesVisible
  aws_dimensions: [QueueName]
  aws_statistics: [Average]

Deployed prometheus cloudwatch exporter in kubernetes using above config. I am trying to look for aws_sqs_approximate_number_of_messages_visible_average in the prometheus UI but it does not show there.
The logs shows no errors and IAM permission are correctly configured. I am able to access metrics from aws cli

Please suggest how to debug the issue.

Configured metrics not being scraped

I'm using a configuration that looks like the attached exhibit 1 below, and get back the metrics in exhibit 2. The lambda and dynamo metrics are missing. I would expect that the logs would show why the metrics are not being gathered, but there is no further information.

Why is this config breaking?

What part of the code would you add logging to, so that it's easier to fix broken configs in the future?
I'd happily submit a PR, but have no idea where to start.

Exhibit 1:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cloudwatch-config
data:
  config.yml: |
    region: us-west-2
    metrics:
    - aws_namespace: AWS/Lambda
      aws_metric_name: Errors
      aws_dimensions: [FunctionName, Resource, Version, Alias]
      aws_statistics: [Count]

    - aws_namespace: AWS/Lambda
      aws_metric_name: "Dead Letter Error"
      aws_dimensions: [FunctionName, Resource, Version, Alias]
      aws_statistics: [Count]

    - aws_namespace: AWS/Lambda
      aws_metric_name: Throttles
      aws_dimensions: [FunctionName, Resource, Version, Alias]
      aws_statistics: [Count]

    - aws_namespace: AWS/Lambda
      aws_metric_name: Duration
      aws_dimensions: [FunctionName, Resource, Version, Alias]
      aws_statistics: [Milliseconds]

    - aws_namespace: AWS/ES
      aws_metric_name: FreeStorageSpace
      aws_dimensions: [DomainName, ClientId]
      aws_statistics: [Minimum]

    - aws_namespace: AWS/ES
      aws_metric_name: SearchableDocuments
      aws_dimensions: [DomainName, ClientId]
      aws_statistics: [Sum, Minimum, Maximum]

    - aws_namespace: AWS/ES
      aws_metric_name: CPUUtilization
      aws_dimensions: [DomainName, ClientId]
      aws_statistics: [Maximum, Average]

    - aws_namespace: AWS/ES
      aws_metric_name: "ClusterStatus.yellow"
      aws_dimensions: [DomainName, ClientId]
      aws_statistics: [Minimum, Maximum]

    - aws_namespace: AWS/ES
      aws_metric_name: "ClusterStatus.red"
      aws_dimensions: [DomainName, ClientId]
      aws_statistics: [Minimum, Maximum]

    - aws_namespace: AWS/DynamoDB
      aws_metric_name: ReadThrottleEvents
      aws_dimensions: [TableName, GlobalSecondaryIndexName]
      aws_statistics: [Sum, SampleCount]

Exhibit 2:

 HELP cloudwatch_requests_total API requests made to CloudWatch
# TYPE cloudwatch_requests_total counter
cloudwatch_requests_total 81855.0
# HELP aws_es_free_storage_space_minimum CloudWatch metric AWS/ES FreeStorageSpace Dimensions: [DomainName, ClientId] Statistic: Minimum Unit: Megabytes
# TYPE aws_es_free_storage_space_minimum gauge
aws_es_free_storage_space_minimum{job="aws_es",instance="",domain_name="es-hp-id-dev-us-west-2",client_id="871386769552",} 212726.246
# HELP aws_es_searchable_documents_sum CloudWatch metric AWS/ES SearchableDocuments Dimensions: [DomainName, ClientId] Statistic: Sum Unit: Count
# TYPE aws_es_searchable_documents_sum gauge
aws_es_searchable_documents_sum{job="aws_es",instance="",domain_name="es-hp-id-dev-us-west-2",client_id="871386769552",} 1.2435096795E10
# HELP aws_es_searchable_documents_minimum CloudWatch metric AWS/ES SearchableDocuments Dimensions: [DomainName, ClientId] Statistic: Minimum Unit: Count
# TYPE aws_es_searchable_documents_minimum gauge
aws_es_searchable_documents_minimum{job="aws_es",instance="",domain_name="es-hp-id-dev-us-west-2",client_id="871386769552",} 1.243503844E9
# HELP aws_es_searchable_documents_maximum CloudWatch metric AWS/ES SearchableDocuments Dimensions: [DomainName, ClientId] Statistic: Maximum Unit: Count
# TYPE aws_es_searchable_documents_maximum gauge
aws_es_searchable_documents_maximum{job="aws_es",instance="",domain_name="es-hp-id-dev-us-west-2",client_id="871386769552",} 1.24351936E9
# HELP aws_es_cpuutilization_maximum CloudWatch metric AWS/ES CPUUtilization Dimensions: [DomainName, ClientId] Statistic: Maximum Unit: Percent
# TYPE aws_es_cpuutilization_maximum gauge
aws_es_cpuutilization_maximum{job="aws_es",instance="",domain_name="es-hp-id-dev-us-west-2",client_id="871386769552",} 6.0
# HELP aws_es_cpuutilization_average CloudWatch metric AWS/ES CPUUtilization Dimensions: [DomainName, ClientId] Statistic: Average Unit: Percent
# TYPE aws_es_cpuutilization_average gauge
aws_es_cpuutilization_average{job="aws_es",instance="",domain_name="es-hp-id-dev-us-west-2",client_id="871386769552",} 4.714285714285714
# HELP aws_es_cluster_status_yellow_minimum CloudWatch metric AWS/ES ClusterStatus.yellow Dimensions: [DomainName, ClientId] Statistic: Minimum Unit: Count
# TYPE aws_es_cluster_status_yellow_minimum gauge
aws_es_cluster_status_yellow_minimum{job="aws_es",instance="",domain_name="es-hp-id-dev-us-west-2",client_id="871386769552",} 0.0
# HELP aws_es_cluster_status_yellow_maximum CloudWatch metric AWS/ES ClusterStatus.yellow Dimensions: [DomainName, ClientId] Statistic: Maximum Unit: Count
# TYPE aws_es_cluster_status_yellow_maximum gauge
aws_es_cluster_status_yellow_maximum{job="aws_es",instance="",domain_name="es-hp-id-dev-us-west-2",client_id="871386769552",} 0.0
# HELP aws_es_cluster_status_red_minimum CloudWatch metric AWS/ES ClusterStatus.red Dimensions: [DomainName, ClientId] Statistic: Minimum Unit: Count
# TYPE aws_es_cluster_status_red_minimum gauge
aws_es_cluster_status_red_minimum{job="aws_es",instance="",domain_name="es-hp-id-dev-us-west-2",client_id="871386769552",} 0.0
# HELP aws_es_cluster_status_red_maximum CloudWatch metric AWS/ES ClusterStatus.red Dimensions: [DomainName, ClientId] Statistic: Maximum Unit: Count
# TYPE aws_es_cluster_status_red_maximum gauge
aws_es_cluster_status_red_maximum{job="aws_es",instance="",domain_name="es-hp-id-dev-us-west-2",client_id="871386769552",} 0.0
# HELP cloudwatch_exporter_scrape_duration_seconds Time this CloudWatch scrape took, in seconds.
# TYPE cloudwatch_exporter_scrape_duration_seconds gauge
cloudwatch_exporter_scrape_duration_seconds 0.382327836
# HELP cloudwatch_exporter_scrape_error Non-zero if this scrape failed.
# TYPE cloudwatch_exporter_scrape_error gauge
cloudwatch_exporter_scrape_error 0.0

Not getting EC2 data

I have setup the cloudwatch exporter using the default template and I can get ELB & Elasticache data but I don't get any EC2 data when I add the following:

- aws_namespace: AWS/EC2
  aws_metric_name: CPUUtilization
  aws_dimensions: [InstanceId]

I do get data if I run: aws cloudwatch list-metrics --namespace AWS/EC2 --metric-name CPUUtilization

So what I am doing wrong?

Also beside the example.yml, is there a template that cover most/all of the other name space?

Add a health check endpoint

As a user of the CloudWatch exporter I want to be able to deploy the exporter on ECS behind an ELB without burning money at huge scale

Current situation

The ELB needs an health check endpoint which responds with HTTP200
The only exposed path in the exporter is /metrics
Every request against /metrics does CloudWatch calls
Health checks (every 10s N CloudWatch requests) are expensive

Expected situation

The ELB needs an health check endpoint which responds with HTTP200
There is an exposed path /status which responds with HTTP200 reporting everything is fine (at least the Java application is available to respond)
Health check requests do not burn money
(I don't need to explain the huge AWS bill to my boss 😄 )

Expose period_seconds as a metric

I am currently exporting some DynamoDB metrics with the exporter:

     - aws_namespace: AWS/DynamoDB
       aws_metric_name: ConsumedReadCapacityUnits
       aws_dimensions: [Tablename]
       aws_statistics: [Sum]
     - aws_namespace: AWS/DynamoDB
       aws_metric_name: ConsumedWriteCapacityUnits
       aws_dimensions: [TableName]
       aws_statistics: [Sum]

In my charts i actually want to display the average amount of consumed capacity. Unfortunately just pulling the average value for these metrics from cloudwatch is not really helpful, as this gives incorrect data.
In my charts i now have to divide the Sum value by the period_seconds value (600 for my case). As this might change, it would be very helpful to have the period_seconds exposed as a metric to use it in the grafana dashboards.

What is the cpu / memory / size footprint of this program ?

Just wondering, because its java backed, what's the memory use and Docker image size for this program ?

Exception in thread "main" java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer

I have a config like this:

{
  "region": "eu-west-1",
  "metrics": [
    {"aws_namespace": "AWS/Kinesis", "aws_metric_name": "IncomingRecords",
     "aws_dimensions": ["StreamName"], "aws_statistics": ["Sum"],
     "period_seconds": 300}
  ]
}

When starting the exporter it results in:

Exception in thread "main" java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer
2015-02-24 14:56:05,359 DEBG 'cloudwatchexporter' stderr output:

    at io.prometheus.cloudwatch.CloudWatchCollector.<init>(CloudWatchCollector.java:110)
    at io.prometheus.cloudwatch.CloudWatchCollector.<init>(CloudWatchCollector.java:50)
    at io.prometheus.cloudwatch.WebServer.main(WebServer.java:15)

AWS ALB

Seems the target group dimension is not working?

aws_namespace: AWS/ApplicationELB
aws_metric_name: TargetResponseTime
aws_dimensions: [TargetGroup]
aws_dimension_select:
TargetGroup: [targetgroup/example/example-arn]
aws_statistics: [Average]

Rule help text override is ignored

According to the code it's possible to add a help tag to a JSON metrics object, which – I guess – should then be used in the metrics output for Prometheus. For example:

{
  "metrics": [
    {
      "help": "test",
      "aws_namespace": "AWS/DynamoDB",
      "aws_metric_name": "ProvisionedReadCapacityUnits",
      "aws_dimensions": [ ... ],
      "aws_dimension_select_regex": { ... }
    }
  ]
}

However, when compiling the final help text for Prometheus, this override is ignored. Instead of returning rule.help, a generated help string is always returned.

Scraping stops working from time to time

We have deployed the last version of the cloudwatch-exporter. We noticed that it stops getting logs from AWS sometimes and never recovers, having to restart it to fix it. What could be the cause? Maybe the size of the response? I included some logs below:

WARNING: CloudWatch scrape failed
Message: Read timed out). Response Code: 200, Response Text: OK
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleResponse(AmazonHttpClient.java:1525)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1035)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:721)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:704)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:672)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:654)
	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:518)
	at com.amazonaws.services.cloudwatch.AmazonCloudWatchClient.doInvoke(AmazonCloudWatchClient.java:965)
	at com.amazonaws.services.cloudwatch.AmazonCloudWatchClient.listMetrics(AmazonCloudWatchClient.java:684)
	at io.prometheus.cloudwatch.CloudWatchCollector.getDimensions(CloudWatchCollector.java:188)
	at io.prometheus.cloudwatch.CloudWatchCollector.scrape(CloudWatchCollector.java:329)
	at java.util.Collections.list(Collections.java:3688)
	at io.prometheus.client.exporter.MetricsServlet.doGet(MetricsServlet.java:40)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
	at org.eclipse.jetty.server.Server.handle(Server.java:365)
	at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485)
	at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:926)
	at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:988)
	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:635)
	at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
	at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:51)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
	at java.lang.Thread.run(Thread.java:745)
	at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:599)
	at com.amazonaws.transform.StaxUnmarshallerContext.nextEvent(StaxUnmarshallerContext.java:220)
com.amazonaws.SdkClientException: Unable to unmarshall response (ParseError at [row,col]:[1039,14]
	at com.amazonaws.services.cloudwatch.model.transform.ListMetricsResultStaxUnmarshaller.unmarshall(ListMetricsResultStaxUnmarshaller.java:30)
	at com.amazonaws.http.StaxResponseHandler.handle(StaxResponseHandler.java:101)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleResponse(AmazonHttpClient.java:1501)
	... 41 more
Sep 24, 2017 1:45:24 AM io.prometheus.cloudwatch.CloudWatchCollector collect
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1222)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:747)
	at com.amazonaws.services.cloudwatch.AmazonCloudWatchClient.invoke(AmazonCloudWatchClient.java:941)
	at io.prometheus.cloudwatch.CloudWatchCollector.collect(CloudWatchCollector.java:410)
	at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.findNextElement(CollectorRegistry.java:143)
	at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:158)
	at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:128)
	at io.prometheus.client.exporter.common.TextFormat.write004(TextFormat.java:22)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:648)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
	at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
	at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:627)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1039,14]
Message: Read timed out
	at com.sun.xml.internal.stream.XMLEventReaderImpl.peek(XMLEventReaderImpl.java:275)
	at com.amazonaws.services.cloudwatch.model.transform.DimensionStaxUnmarshaller.unmarshall(DimensionStaxUnmarshaller.java:40)
	at com.amazonaws.services.cloudwatch.model.transform.ListMetricsResultStaxUnmarshaller.unmarshall(ListMetricsResultStaxUnmarshaller.java:54)
	at com.amazonaws.http.StaxResponseHandler.handle(StaxResponseHandler.java:43)
	at com.amazonaws.http.response.AwsResponseHandlerAdapter.handle(AwsResponseHandlerAdapter.java:70)

Thanks!

Export empty metrics as 0

Thanks for this exporter, it's really great to consolidate all metrics in one Prometheus instance!

I'm having an issue while setting up alerts, though: if there are no 5XX errors, that metrics isn't exported at all.

I was trying to calculate the error percentage by doing something like aws_elb_httpcode_backend_5_xx_sum / aws_elb_request_count_sum, but as the metric isn't exported, i don't get any data - not even 0.

Is this something that can be fixed? Thanks!

Unable to use Docker image behind corporate proxy

I have trying to run the Docker image behind a corporate proxy. How can i pass in proxy username and proxy port to the docker image? I tried setting HTTP_PROXY and http_proxy environment variables, but they do not seem to work.
Thus, the container is not able to talk to the AWS api.

Scraping fails when trying to scrape a certain metric from more then one target of the same type

I'm trying to use the cloudwatch exporter in order to monitor several ec2 instances that are divided to 2 logic groups.
The cloudwatch configuration looks something like this:

...
...
# EC2 group 1
- aws_namespace: AWS/EC2
  aws_metric_name: CPUUtilization
  aws_dimensions: [InstanceId]
  aws_dimension_select:
   InstanceId: [i-X, i-Y, i-Z]
...
...
# EC2 group 2
- aws_namespace: AWS/EC2
  aws_metric_name: CPUUtilization
  aws_dimensions: [InstanceId]
  aws_dimension_select:
   InstanceId: [i-A, i-B, i-C]

After starting the container and looking at the prometheus I get the following error:

format parsing error in line xxx: second HELP line for metric name "aws_ec2_cpuutilization_sum"

and indeed the scraped data has 2 HELP lines per metric.

Shouldn't the exporter merge the 2 metrics to same area in the scraped data and have a single help line for the block?

Thanks!

Export empty instance label

Assuming one isn't already set, we should export and empty instance label. This means that when the cloudwatch exporter moves machines, the time series names won't change.

Scrape duration

Hi,

I've got an issue with scraping for Network metric of the Instances. A scrape takes a very long time, 18 seconds or more. Scraping other metrics, like cpu utilization or scraping the same metric but for an AutoScalingGroupName works fine.
Also I cannot get metrics for an individual instance, as if the select filter doesn't work. It only gives me all instances, even if the select filter is applied for an individual instance.

I've attached a file with the configuration of the scraper and the output.

Any ideas?
Thanks
Stef

scraping.txt

Docker images don't have tags

It looks like prom/cloudwatch-exporter in Hub is set to autobuild from master on this repository, and to tag the image as latest: https://hub.docker.com/r/prom/cloudwatch-exporter/builds/

If latest is the only tag available, there's no way to guarantee that two containers using the "latest" tag are actually running the same code. This is also a problem if you build your own container using "FROM prom/cloudwatch-exporter" - no way to guarantee you're starting from the same version.

It would be great if you could set up Hub to also autobuild based on repo tag (this is an option in the Type dropdown on the Build Settings tab in Hub) and then add version tags to this repo (v0.1, v0.2).

This isn't an official Docker request, just asking because we're using this container and it would make deploying it more deterministic. Thanks!

Deploying to kubernetes

Docs/yamls needed for easy kubernetes deployment

ES metrics not appearing

Hi - I am using the following config point, but see nothing for the ES metrics appearing when calling the metrics URL.

- aws_namespace: AWS/ES
      aws_metric_name: Nodes
      aws_dimensions: [ClientId, DomainName]
      aws_dimension_select:
        ClientId: [ 123456789 ]
        DomainName: [ myEsDomain ]
      aws_statistics: [ SampleCount ]
      period_seconds: 3600

I can query the metrics via aws cli, but using the same detail in the config still doesn't net anything.

Am I missing something or is this possibly a AWS issue? I am currently successfully pulling EC2 and RDS metrics.

ELB/ALB metrics not available

The current version doesn't support the new Application Load Balancer metrics. We migrated our Load Balancers from the Classic version to the new ALB and all of our metrics disappeared.

I used the following config which is similar to the example in the Readme.

region: us-east-1
metrics:

aws_namespace: AWS/ELB
aws_metric_name: RequestCount
aws_dimensions: [AvailabilityZone, LoadBalancerName]

I do have data in the Cloudwatch dashboard.

Adding the new TargetGroup dimension results in no data being returned even Classic Load Balancer metrics.

Export data to use in a graph

Not an issue, but it's more a question.
Currently when a specific event occurs in my application, a metric is recorded in CloudWatch with the value of 1.
I have thousands of events a day. And I want to display graph data about how many times this event occurs per hour (or per day) in Prometheus.
I thought I could use this tool for that, but I only see a single value per dimension (as stated in the docs: a gauge). Am I doing something wrong?

Bad idea to export label job

Due to how relabeling works, job is not available during relabeling, so we as users cannot do anything with the label. It cannot be mutated or copied into another label. It simply appears as label exported_job into Prometheus master.

Suggest changing the name of the exported job label to "namespace" which is both more appropriate and does not conflict at all with the built in autolabeling of metrics.

Thanks.

Unnecessarily hard to override config file location.

The Dockerfile specifies the path to the config file as part of the entrypoint:

ENTRYPOINT [ "java", "-jar", "/cloudwatch_exporter.jar", "9106", "/config.yml" ]

I run kubernetes and would like to mount the config as a volume; however, I can't mount the volume on the root as that would make the rest of the file system unreadable. I could override the entrypoint in the location, but then I'd have to specify the entire entrypoint command.

Suggestion 1: Use /etc/cloudwatch-exporter as the default path, analogous with other prometheus binaries.
Suggestion 2: Move /config.yml to a CMD statement instead, so it can more easily be overridden in kubernetes 'args' config, ie:

ENTRYPOINT [ "java", "-jar", "/cloudwatch_exporter.jar", "9106" ]
CMD [ "/config.yml" ]

(Dockerfile) /config.yml breaks kubernetes!

You cannot mount a ConfigMap into a file, only a directory in kubernetes.

This means that you can only run your config for cloudwatch in a ConfigMap if you override the entrypoint!

This should be overridable by an environment variable and ideally you shouldn't force people to need a config file.

https://medium.com/@kelseyhightower/12-fractured-apps-1080c73d481c

Getting metrics for dead ELBs might cost a lot of money

This might be also relative to other resources, but I felt the pain with ELBs.

Consider the following configuration:

aws_namespace: AWS/ELB
aws_metric_name: UnHealthyHostCount
aws_dimensions: [AvailabilityZone, LoadBalancerName]
aws_dimension_select_regex:
LoadBalancerName: [myELB-.*]

What happens in CloudWatchCollector is that ELBs that do not exist at the moment of scraping are also returned as part of the available metrics in the results of getDimensions(). Then the scrape() method calls an API per ELB and rules out the resources that have too old data points. You pay fr each API you call, so you just paid for an API that brought you metrics for an ELB that is dead already.

Imagine a development environment, where ELBs are constantly created and destroyed. The list gets longer with time and the costs go higher.

I handled this issue by forking the project and using the ELB API to filter out dead ELBs before the metrics API is called, and I will be happy to create a merge request for that, but the solution is specific to ELBs, and I guess the problem might be relevant to other resource types as well.

Thanks.

AWS/ELB metrics do not work

why I can't get metric HTTPCode_Backend_2XX, HTTPCode_Backend_4XX and HTTPCode_Backend_5XX data, thanks.

my config file:

---
region: us-west-1
metrics:
- aws_namespace: AWS/ELB
  aws_metric_name: HealthyHostCount
  aws_dimensions: [AvailabilityZone, LoadBalancerName]
  aws_statistics: [Average]

- aws_namespace: AWS/ELB
  aws_metric_name: UnHealthyHostCount
  aws_dimensions: [AvailabilityZone, LoadBalancerName]
  aws_statistics: [Average]

- aws_namespace: AWS/ELB
  aws_metric_name: RequestCount
  aws_dimensions: [AvailabilityZone, LoadBalancerName]
  aws_statistics: [Sum]

- aws_namespace: AWS/ELB
  aws_metric_name: Latency
  aws_dimensions: [AvailabilityZone, LoadBalancerName]
  aws_statistics: [Average]

- aws_namespace: AWS/ELB
  aws_metric_name: HTTPCode_Backend_2XX
  aws_dimensions: [AvailabilityZone, LoadBalancerName]
  aws_statistics: [Sum]

- aws_namespace: AWS/ELB
  aws_metric_name: HTTPCode_Backend_4XX
  aws_dimensions: [AvailabilityZone, LoadBalancerName]
  aws_statistics: [Sum]

- aws_namespace: AWS/ELB
  aws_metric_name: HTTPCode_Backend_5XX
  aws_dimensions: [AvailabilityZone, LoadBalancerName]
  aws_statistics: [Sum]

An issue gathering metrics from SQL queues

After configure a bunch of SQL metrics (ApproximateNumberOfMessagesVisible) for some queues, trying to recover these metrics, our Prometheus server show this message as a result of the CloudWatch exporter:

text format parsing error in line 54: second HELP line for metric name "aws_sqs_approximate_number_of_messages_visible_sum"

Here's a sample of these metrics

 53 aws_sqs_approximate_number_of_messages_visible_sum{job="aws_sqs",instance="",queue_name="provisioning-notify-queue-protesis",} 0.0
 54 # HELP aws_sqs_approximate_number_of_messages_visible_sum CloudWatch metric AWS/SQS ApproximateNumberOfMessagesVisible Dimensions: [QueueName] Statistic: Sum Unit: Count
 55 # TYPE aws_sqs_approximate_number_of_messages_visible_sum gauge

and here's a snippet of the metric configuration file:

 - aws_namespace: AWS/SQS
  aws_metric_name: ApproximateNumberOfMessagesVisible
  aws_dimensions:
  - QueueName
  aws_dimension_select:
    SQSName:
    - survela-provisioning-notify
    - survela-provisioning-check
  aws_statistics:
  - Sum
- aws_namespace: AWS/SQS
  aws_metric_name: ApproximateNumberOfMessagesVisible
  aws_dimensions:
  - QueueName
  aws_dimension_select:
    SQSName:
    - provisioning-notify-queue-pro
    - provisioning-check-queue-pro
  aws_statistics:
  - Sum

any help would be fantastic, thanks in advance

Outdated dependency in pom.xml

snakeyaml 1.17 appears to no longer be available, on 1.18 now.
https://oss.sonatype.org/content/repositories/snapshots/org/yaml/snakeyaml/

"AWS/Billing" metrics do not appear

Hi,

first of all thanks for your work! While testing the exporter, it turned out, that I was not able to fetch estimated charges (using a valid iam role). cloudwatch_exporter does not show any error in logs or scrape_error counter.

Here is a sample config:


---
region: us-east-1
metrics:
 - aws_namespace: AWS/Billing
   aws_metric_name: EstimatedCharges
   aws_dimensions: [ServiceName, LinkedAccount, Currency]
   aws_dimension_select:
     Currency: [USD]
   aws_statistics: [Sum]

Lots of requests for simple metrics

We get these:

---
region: ap-southeast-1
metrics:
 - aws_namespace: AWS/ECS
   aws_metric_name: CPUUtilization
   aws_dimensions: [ClusterName, ServiceName]
   aws_statistics: [SampleCount, Sum, Minimum, Maximum, Average]
 - aws_namespace: AWS/ECS
   aws_metric_name: MemoryUtilization
   aws_dimensions: [ClusterName, ServiceName]
   aws_statistics: [SampleCount, Sum, Minimum, Maximum, Average]
 - aws_namespace: AWS/ECS
   aws_metric_name: CPUReservation
   aws_dimensions: [ClusterName]
   aws_statistics: [Average]
 - aws_namespace: AWS/ECS
   aws_metric_name: MemoryReservation
   aws_dimensions: [ClusterName]
   aws_statistics: [Average]

This causes 58 requests upon every scrape.

Is there no way these can be batched?

add a CMD instruction for the dockerfile like the `prom/prometheus` or `prom/alertmanger` containers

AWS/S3 metrics do not work

With a barebones config file:


---
region: us-east-1
metrics:
- aws_namespace: AWS/S3
  aws_metric_name: BucketSizeBytes

I get no results. In fact, it doesn't even claim it's talking to AWS at all, and the request counter stays at zero.

curl -s 0:29992/metrics
# HELP cloudwatch_requests_total API requests made to CloudWatch
# TYPE cloudwatch_requests_total counter
cloudwatch_requests_total 0.0
# HELP cloudwatch_exporter_scrape_duration_seconds Time this CloudWatch scrape took, in seconds.
# TYPE cloudwatch_exporter_scrape_duration_seconds gauge
cloudwatch_exporter_scrape_duration_seconds 0.588527659
# HELP cloudwatch_exporter_scrape_error Non-zero if this scrape failed.
# TYPE cloudwatch_exporter_scrape_error gauge
cloudwatch_exporter_scrape_error 0.0

If I come back in 5 or ten seconds, cloudwatch_requests_total will increment by 1. But no errors, and no data.
I have fiddled about with various settings for dimensions, but never does it report any datas.

I can get this working fine for AWS/ELB resources, so I know I can talk to AWS.

Is aws_dimensions really optional?

The README suggests that the aws_dimensions value is optional, but all examples include at least one dimension, and in the case of AWS/ApplicationELB, I received no metrics until a value of "LoadBalancer" was included.

If the value is "optional" in the sense that it is not required for every namespace, can this be clarified in the docs?

Cheers,
...Bryan

Custom Metrics not being exported

I´ve configured a Custom Cloudwatch Metric like this:
metrics:
``

aws_namespace: Custom/PageStats
aws_metric_name: Page_Generation_Time
aws_statistics: [Average]
``

And its not being exported.

Is this supported?

Can't specify region for a metric

Would be nice if it was possible, as I am trying to get at AWS/Billing metrics from a region different from us-east-1

I can easily run 2 cloudwatch_exporters, but wouldn't it be nice to be able to just do:

region: us-west-2
metrics:
  # ELB Metrics
- aws_namespace: AWS/ELB
  aws_metric_name: HealthyHostCount
  aws_dimensions: [LoadBalancerName]
  aws_statistics: [Average,Maximum]

  [...]

   # Billing is strange, only exists in us-east-1
 - aws_namespace: AWS/Billing
   region: us-east-1
   range_seconds: 21600
   aws_metric_name: EstimatedCharges
   aws_dimensions: [ServiceName,Currency]
   aws_dimension_select:
     Currency: [USD]

Read timeout

Hi Team

I trued running the exporter and got the below error

Sep 05, 2017 12:30:12 PM io.prometheus.cloudwatch.CloudWatchCollector collect
WARNING: CloudWatch scrape failed
com.amazonaws.SdkClientException: Unable to unmarshall response (ParseError at [row,col]:[560,40]
Message: Read timed out). Response Code: 200, Response Text: OK
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleResponse(AmazonHttpClient.java:1525)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1222)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1035)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:747)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:721)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:704)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:672)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:654)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:518)
at com.amazonaws.services.cloudwatch.AmazonCloudWatchClient.doInvoke(AmazonCloudWatchClient.java:965)
at com.amazonaws.services.cloudwatch.AmazonCloudWatchClient.invoke(AmazonCloudWatchClient.java:941)
at com.amazonaws.services.cloudwatch.AmazonCloudWatchClient.listMetrics(AmazonCloudWatchClient.java:684)
at io.prometheus.cloudwatch.CloudWatchCollector.getDimensions(CloudWatchCollector.java:188)
at io.prometheus.cloudwatch.CloudWatchCollector.scrape(CloudWatchCollector.java:329)
at io.prometheus.cloudwatch.CloudWatchCollector.collect(CloudWatchCollector.java:410)
at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.findNextElement(CollectorRegistry.java:143)
at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:158)
at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:128)
at java.util.Collections.list(Collections.java:5240)
at io.prometheus.client.exporter.common.TextFormat.write004(TextFormat.java:22)
at io.prometheus.client.exporter.MetricsServlet.doGet(MetricsServlet.java:40)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:648)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:365)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485)
at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:926)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:988)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:635)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:627)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:51)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:748)

Any help please...

polling frequency

I don't seem to see any configuration for the polling frequency. Is this something which is actually modifiable? If not, is this the sort of pull request that would be welcome?

Dynamic config reloading

Any plans for this app to support reloading the config file during runtime, similar to prometheus' curl -X POST http://127.0.0.1:443/-/reload endpoint?

S3 metrics not being pulled

I've been using the docker container for a while and I thought everything was working correctly. I just noticed I'm not getting S3 metrics. I see data for EC2, ELB, Elasticache, RDS, etc. If I go into the cloudwatch console on AWS I can see S3 metrics. It just looks like the S3 numbers are not being pulled in.

Here is my test cloudwatch.yml file: