prometheus-ecs-discovery's Introduction

Prometheus Amazon ECS discovery

Prometheus has native Amazon EC2 discovery capabilities, but it cannot discover the ECS tasks that Prometheus should scrape. This program is a Prometheus File Service Discovery (file_sd_config) integration that bridges that gap.

Help

Run prometheus-ecs-discovery --help to get information.

The command line parameters that can be used are:

  • -config.cluster (string): the name of a cluster to scrape (defaults to scraping all clusters)
  • -config.scrape-interval (duration): interval at which to scrape the AWS API for ECS service discovery information (default 1m0s)
  • -config.scrape-times (int): how many times to scrape before exiting (0 = infinite)
  • -config.write-to (string): path of file to write ECS service discovery information to (default "ecs_file_sd.yml")
  • -config.role-arn (string): ARN of the role to assume when scraping the AWS API (optional)
  • -config.server-name-label (string): Docker label to define the server name (default "PROMETHEUS_EXPORTER_SERVER_NAME")
  • -config.job-name-label (string): Docker label to define the job name (default "PROMETHEUS_EXPORTER_JOB_NAME")
  • -config.path-label (string): Docker label to define the scrape path of the application (default "PROMETHEUS_EXPORTER_PATH")
  • -config.filter-label (string): Docker label (and optional value) to filter on, in the form NAME_OF_LABEL[=VALUE] (optional)
  • -config.port-label (string): Docker label to define the scrape port of the application (if missing an application won't be scraped) (default "PROMETHEUS_EXPORTER_PORT")
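As an illustration of the -config.filter-label semantics, here is a small Python sketch (not the program's actual Go code) of how a NAME_OF_LABEL[=VALUE] filter can be interpreted against a container's Docker labels; the label names below are hypothetical:

```python
def matches_filter(docker_labels: dict, filter_spec: str) -> bool:
    """Return True if the container's labels satisfy NAME_OF_LABEL[=VALUE].

    Without an "=VALUE" part, the mere presence of the label is enough;
    with a value, the label must exist and match exactly.
    """
    name, sep, value = filter_spec.partition("=")
    if not sep:  # no value given: presence check only
        return name in docker_labels
    return docker_labels.get(name) == value


labels = {"ENV_TAG": "mydev", "PROMETHEUS_EXPORTER_PORT": "2112"}
print(matches_filter(labels, "ENV_TAG"))        # presence check
print(matches_filter(labels, "ENV_TAG=mydev"))  # exact-value check
print(matches_filter(labels, "ENV_TAG=prod"))   # value mismatch
```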

Usage

First, build this program using the usual go get mechanism.

Then, run it as follows:

  • Ensure the program can write its output to a directory readable by your Prometheus master instance(s).
  • Export the usual AWS_REGION, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY into the program's environment, making sure the keys have access to the EC2 / ECS APIs (IAM policies should include ECS:ListClusters, ECS:ListTasks, ECS:DescribeTask, EC2:DescribeInstances, ECS:DescribeContainerInstances, ECS:DescribeTasks, ECS:DescribeTaskDefinition, ECS:DescribeClusters). If the program needs to assume a different role to obtain access, that role's ARN may be passed in via the -config.role-arn option. This option also allows cross-account access, depending on which account the role is defined in.
  • Start the program, using the command line option -config.write-to to point it at the directory your Prometheus master can read from.
  • Add a file_sd_config to your Prometheus master:
scrape_configs:
  - job_name: ecs
    file_sd_configs:
      - files:
          - /path/to/ecs_file_sd.yml
        refresh_interval: 10m
    # Drop unwanted labels using the labeldrop action
    metric_relabel_configs:
      - regex: task_arn
        action: labeldrop
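For reference, the file the program writes is standard file_sd target-group YAML. A hypothetical entry (addresses and label values invented for illustration) might look like:

```yaml
- targets:
  - 10.0.1.17:9100
  labels:
    task_arn: arn:aws:ecs:eu-west-1:123456789012:task/example/0123456789abcdef
    container_name: node-exporter
    job: ecs
```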

To scrape the containers, add the following Docker labels to them:

  • PROMETHEUS_EXPORTER_PORT: the container port Prometheus should scrape (mandatory)
  • PROMETHEUS_EXPORTER_SERVER_NAME: the host name to use; by default the IP is used (optional)
  • PROMETHEUS_EXPORTER_JOB_NAME: the job name (optional)
  • PROMETHEUS_EXPORTER_PATH: an alternative scrape path (optional)
  • PROMETHEUS_EXPORTER_SCHEME: an alternative scheme; the default is http (optional)

"Docker labels" here means the dockerLabels map in the ECS task definition JSON, like this:

{
  ...
  "containerDefinitions": [
    {
      ...
      "dockerLabels": {
        "PROMETHEUS_EXPORTER_PORT": "5000"
      }
    }
  ]
  ...
}

That's it. The program will begin scraping the AWS APIs and writing the discovery file (by default it does that every minute, and by default Prometheus will reload the file the minute it is written). After you reload your Prometheus master configuration, the discovery file will start feeding Prometheus new targets to scrape.

prometheus-ecs-discovery's People

Contributors

akuntsch, angulito, brennoo, bsingr, chicofranchico, dcelasun, ethervoid, filipeestacio, henninge, houqp, hugowetterberg, itays333, johscheuer, lfranchi, maxaf, nidhi-ag, ptqa, rhowe, rudd-o, vincepii, wachino, waderobson


prometheus-ecs-discovery's Issues

Can there be a release?

Hi,

This may not be the correct place to ask, but would it be possible to have a new release, considering the last one was in Dec 2019?

Thanks

Need to change path of ecs_file_sd.yml ?

Hi,

I need to change the default path of ecs_file_sd.yml; by default it is written to the root location. I read the docs and learned it can be changed with -config.write-to.

But I am getting an error. Could you please give me the full docker command for this?

It would be very helpful.

Is there a way to pass a list of clusters to scrape ?

Hello,
I am wondering if there is already a way to pass a list of clusters instead of only one.
It is possible with the AWS CLI; I'm not sure about the Go SDK implementation.
aws ecs describe-clusters --clusters cluster1 cluster2

I tried a bit, but no luck so far.

Error when fetching data for Fargate task

Hi!

First of all thanks for filling the gap prometheus has with ECS.

At work we are trialling Fargate as a way to release to production, but of course we can't do without monitoring.

I just tried your code via https://hub.docker.com/r/mthenw/prometheus-ecs-discovery/~/dockerfile/

But when I try to run it, it crashes when fetching data for Fargate tasks/services:

2017/12/16 20:32:53 Described task definition arn:aws:ecs:us-east-1:1234567890:task-definition/web-xxx-xxxx:5
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x6f5261]

goroutine 1 [running]:
main.AddContainerInstancesToTasks(0xc4200800a8, 0xc4200800b8, 0xc420196500, 0xf, 0x10, 0xf, 0x10, 0x0, 0x0, 0x0)
	/go/src/github.com/teralytics/prometheus-ecs-discovery/main.go:304 +0x1a1
main.GetAugmentedTasks(0xc4200800a8, 0xc4200800b8, 0xc42017e1b0, 0x6, 0x6, 0x2, 0x2, 0xc420115f6c, 0x2, 0x2)
	/go/src/github.com/teralytics/prometheus-ecs-discovery/main.go:455 +0x309
main.main.func1()
	/go/src/github.com/teralytics/prometheus-ecs-discovery/main.go:474 +0xa3
main.main()
	/go/src/github.com/teralytics/prometheus-ecs-discovery/main.go:504 +0x212

Can you help?

Thanks in advance.

[Feature request]: Scrape from dynamic ports

We are using dynamic ports for ECS, having more than one instance of the app on the same EC2 (not mandatory but it is possible and depends on the autoscaling group).

Could you please add a feature to discover what port the application is running on, instead of having to hard-code the port in the Docker label?

Thank you!

documentation miss: Docker label PROMETHEUS_EXPORTER_PORT should be defined in ECS Task definition

First of all: thanks, it works.

But the documentation lacks information on how to set the Docker label(s).
I tried to set PROMETHEUS_EXPORTER_PORT in the Dockerfile like

LABEL "PROMETHEUS_EXPORTER_PORT"="5000"

but it still failed to write a non-empty ecs_file_sd.yml. The log still showed

2021/06/12 04:00:51 Writing 0 discovered exporters to ecs_file_sd.yml

After poking around with ECS services I managed to add

"dockerLabels": {
        "PROMETHEUS_EXPORTER_PORT": "5000"
      }

to the ECS task definition JSON, and that is what really fixed the issue.

I'm somewhat new to AWS ECS, and maybe propagating the real Docker labels is possible, but the current documentation is misleading, e.g. the words

To scrape the containers add following docker labels to them:
    PROMETHEUS_EXPORTER_PORT specify the container port where prometheus scrapes (mandatory)

Multiple container ports

How can we use multiple container ports (i.e. PROMETHEUS_EXPORTER_PORT) for the same service?

Returning wrong port if multiple ports are exposed

If a service exposes multiple ports, the wrong port is sometimes returned. The issue is that NetworkBindings are sometimes returned in a different order than the ports defined in the task definition.

https://github.com/teralytics/prometheus-ecs-discovery/blob/master/main.go#L184

This means that extracting the exposed port based on its index in the returned array is not reliable. The solution is to use the port number instead of the port index:

PROMETHEUS_EXPORTER_PORT=4000

assuming that 4000 is a port exposed by the container.
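The fix described above — matching by container port number rather than by array index — can be sketched as follows (Python for illustration; the binding values are hypothetical):

```python
def host_port_for(network_bindings, container_port):
    """Find the host port bound to a given container port.

    network_bindings is a list of (container_port, host_port) pairs,
    in whatever order the API happened to return them.
    """
    for c_port, h_port in network_bindings:
        if c_port == container_port:
            return h_port
    return None  # the labelled port is not actually exposed


# NetworkBindings may come back in any order, so look up by port number:
bindings = [(8080, 32769), (4000, 32768)]
print(host_port_for(bindings, 4000))
```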

Unable to install due to missing dep on aws module

go.mod requires aws-sdk-go-v2 (https://github.com/aws/aws-sdk-go-v2/releases), but the build complains about a missing awserr package:

go get github.com/teralytics/prometheus-ecs-discovery
build github.com/teralytics/prometheus-ecs-discovery: cannot load github.com/aws/aws-sdk-go-v2/aws/awserr: module
github.com/aws/aws-sdk-go-v2@latest found (v1.2.0), but does not contain package github.com/aws/aws-sdk-go-v2/aws/awserr

It seems the only aws-sdk-go-v2 release containing awserr is v2.0.0-preview.5 (2018-09-27), so I am confused about how to make this work!

AWS ECS Permission error

When I added prometheus-ecs-discovery to an AWS ECS cluster, I created a policy following the README.md.

IAM Policy:
ECS:ListClusters
ECS:ListTasks
ECS:DescribeTask
EC2:DescribeInstances
ECS:DescribeContainerInstances
ECS:DescribeTasks
ECS:DescribeTaskDefinition

But when I run prometheus-ecs-discovery to monitor, I get this error log:

AccessDeniedException: xxxxx ecs:DescribeClusters x
status code: 400

Currently README.md does not list this permission.

Getting ECS cluster (instance/container) metrics into the prometheus dashbord

Hi,

I managed to build the prometheus-ecs-discovery container from the teralytics/prometheus-ecs-discovery code, and I pass the necessary AWS credentials when I run the Docker image.

Eg:

docker run -idt --name ecs-discovery -e AWS_REGION=eu-west-1 -e AWS_ACCESS_KEY_ID=XXXXXXXXXX -e AWS_SECRET_ACCESS_KEY=XXXXXXXXXXXXX ecs-discovery

The container starts successfully and I can also ssh into it.
When I run prometheus-ecs-discovery inside the container, it lists all the cluster details.

I have gone through many online documents, but I haven't found a proper way to get the cluster instance/container metrics into the Prometheus dashboard as a Prometheus target.

Please help me achieve this.

Thanks

Filtering based on labels

Any chance of adding the ability to filter based on a label? The problem is that my Prometheus endpoints are not always the same, so I would need separate jobs using separate sd files generated by this script.

Using a PROMETHEUS_EXPORTER_PORT other than 9090 becomes port 0 in file_sd_config

I have the following docker-compose configuration:

version: '3'
services:
  app:
    [...]
    ports:
      - "9090:9090"
    labels:
      PROMETHEUS_EXPORTER_PORT: "9090"

this correctly produces a file_sd_config

- targets:
  - 172.30.8.250:9090
  labels:
    [...]
    container_name: app
    

if instead of port 9090 I use any other, e.g. 8090

version: '3'
services:
  app:
    [...]
    ports:
      - "8090:9090"
    labels:
      PROMETHEUS_EXPORTER_PORT: "8090"

entry in file_sd_config is

- targets:
  - 172.30.8.250:0
  labels:
    [...]
    container_name: app
    

So the port is 0 instead of 8090, which prevents any metrics ingestion via ecs-discovery.
If I wget 172.30.8.250:8090 from the ecs-discovery container, it works and returns the expected metrics.

/prometheus # wget 172.30.8.250:8090 -O -
Connecting to 172.30.8.250:8090 (172.30.8.178:8090)
writing to stdout
# HELP system_connection Total connection result
# TYPE system_connection counter
# HELP queue_size Total of element in the queue to process
# TYPE queue_size gauge
queue_size{type="fdh",system="S3LAB.P3TW_AQO_AP",app="s3lab-ap-producer",} 0.0

Do you have any insights?

custom metrics paths

Use of the Docker label PROMETHEUS_EXPORTER_PORT on ECS tasks implies the tool is oriented toward discovering exporter tasks, presumably at the standard host:port/metrics path.

Can it also discover application scrape endpoints on custom paths? My web services incorporate Micrometer (via the Spring Boot Actuator framework) to publish Prometheus scrape endpoints on their regular operational ports, but at the path {service-name}/management/prometheus. Is there a mechanism in the discovery tool to customize the Prometheus metrics_path?
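For what it's worth, the README's PROMETHEUS_EXPORTER_PATH label is intended for exactly this. A task definition fragment (service path and port hypothetical) would look like:

```json
"dockerLabels": {
  "PROMETHEUS_EXPORTER_PORT": "8080",
  "PROMETHEUS_EXPORTER_PATH": "/my-service/management/prometheus"
}
```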

Docker Hub integration?

I'd like to use this project, but would prefer not to have to maintain my own image or use a version someone else has put together. Would you be interested in having an official image on Docker Hub for this project? It could certainly be set up as an automated build, letting us get updates pretty easily.

prometheus-ecs-discovery exits with 2 on SIGTERM

When I send TERM to prometheus-ecs-discovery, it exits with status 2. An exit status between 1 and 127 usually indicates an error, so we report this as a failure.

When a program ends because of a signal, it should exit with the exit code 128 + Signal number, see https://wiki.jenkins.io/display/JENKINS/Job+Exit+Status. When Docker wants to stop a container, it sends the TERM signal (TERM = 15) to the process in the container, so that process should terminate with code 143.

It would be great if prometheus-ecs-discovery would exit with 143 when receiving TERM instead of 2, so we don't get reports of failed processes just because we deployed it for example.
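The 128 + signal-number convention the report asks for can be sketched like this (Python for illustration; the actual program is written in Go):

```python
import signal
import sys


def exit_code_for(signum: int) -> int:
    """Conventional exit code for death-by-signal: 128 + signal number."""
    return 128 + signum


def handle_term(signum, frame):
    # SIGTERM is 15 on POSIX systems, so this exits with 143
    # as supervisors like Docker and Jenkins expect.
    sys.exit(exit_code_for(signum))


signal.signal(signal.SIGTERM, handle_term)
print(exit_code_for(signal.SIGTERM))
```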

Docker Labels doesn't follow the recommendations

The Docker labels used for Prometheus discovery do not conform to the official recommendations for Docker labels, see: https://docs.docker.com/config/labels-custom-metadata/#key-format-recommendations

  • PROMETHEUS_EXPORTER_PORT should be io.prometheus.port
  • PROMETHEUS_EXPORTER_PATH should be io.prometheus.path
  • PROMETHEUS_EXPORTER_SERVER_NAME should be io.prometheus.server-name

At least we should support changing the Docker labels via flags.

Output file is empty - no discovered exporters

Not sure why this is failing to write anything:

2018/03/01 12:33:54 Inspected cluster arn:aws:ecs:eu-west-2:123456789:cluster/qa_stuff1, found 2 tasks
2018/03/01 12:33:54 Inspected cluster arn:aws:ecs:eu-west-2:123456789:cluster/int_stuff1, found 2 tasks
2018/03/01 12:33:54 Inspected cluster arn:aws:ecs:eu-west-2:123456789:cluster/int_stuff2, found 1 tasks
2018/03/01 12:33:54 Inspected cluster arn:aws:ecs:eu-west-2:123456789:cluster/int_stuff3, found 1 tasks
2018/03/01 12:33:54 Described 1 tasks in cluster arn:aws:ecs:eu-west-2:123456789:cluster/int_stuff2
2018/03/01 12:33:54 Described 1 tasks in cluster arn:aws:ecs:eu-west-2:123456789:cluster/int_stuff3
2018/03/01 12:33:54 Inspected cluster arn:aws:ecs:eu-west-2:123456789:cluster/qa_stuff5, found 1 tasks
2018/03/01 12:33:54 Inspected cluster arn:aws:ecs:eu-west-2:123456789:cluster/int_stuff4, found 2 tasks
2018/03/01 12:33:54 Described 1 tasks in cluster arn:aws:ecs:eu-west-2:123456789:cluster/qa_stuff5
2018/03/01 12:33:54 Inspected cluster arn:aws:ecs:eu-west-2:123456789:cluster/int_stuff5, found 2 tasks
2018/03/01 12:33:54 Described 2 tasks in cluster arn:aws:ecs:eu-west-2:123456789:cluster/qa_stuff1
2018/03/01 12:33:54 Described 2 tasks in cluster arn:aws:ecs:eu-west-2:123456789:cluster/int_stuff1
2018/03/01 12:33:54 Inspected cluster arn:aws:ecs:eu-west-2:123456789:cluster/qa_stuff2, found 2 tasks
2018/03/01 12:33:54 Described 2 tasks in cluster arn:aws:ecs:eu-west-2:123456789:cluster/int_stuff4
2018/03/01 12:33:54 Inspected cluster arn:aws:ecs:eu-west-2:123456789:cluster/qa_stuff4, found 1 tasks
2018/03/01 12:33:54 Inspected cluster arn:aws:ecs:eu-west-2:123456789:cluster/qa_stuff3, found 1 tasks
2018/03/01 12:33:54 Described 1 tasks in cluster arn:aws:ecs:eu-west-2:123456789:cluster/qa_stuff4
2018/03/01 12:33:54 Described 1 tasks in cluster arn:aws:ecs:eu-west-2:123456789:cluster/qa_stuff3
2018/03/01 12:33:54 Described 2 tasks in cluster arn:aws:ecs:eu-west-2:123456789:cluster/int_stuff5
2018/03/01 12:33:54 Described 2 tasks in cluster arn:aws:ecs:eu-west-2:123456789:cluster/qa_stuff2
2018/03/01 12:33:54 Described task definition arn:aws:ecs:eu-west-2:123456789:task-definition/qa_service_task_definition:22
2018/03/01 12:33:54 Described task definition arn:aws:ecs:eu-west-2:123456789:task-definition/int_task_definition:40
2018/03/01 12:33:54 Described task definition arn:aws:ecs:eu-west-2:123456789:task-definition/int_definition:33
2018/03/01 12:33:54 Described task definition arn:aws:ecs:eu-west-2:123456789:task-definition/qa_task_definition:36
2018/03/01 12:33:54 Described task definition arn:aws:ecs:eu-west-2:123456789:task-definition/qa_task_definition:30
2018/03/01 12:33:54 Described task definition arn:aws:ecs:eu-west-2:123456789:task-definition/qa_task_definition:29
2018/03/01 12:33:54 Described task definition arn:aws:ecs:eu-west-2:123456789:task-definition/int_task_definition:47
2018/03/01 12:33:54 Described task definition arn:aws:ecs:eu-west-2:123456789:task-definition/qa_task_definition:30
2018/03/01 12:33:54 Described task definition arn:aws:ecs:eu-west-2:123456789:task-definition/int_task_definition:38
2018/03/01 12:33:54 Described task definition arn:aws:ecs:eu-west-2:123456789:task-definition/int_task_definition:42
2018/03/01 12:33:54 Described task definition arn:aws:ecs:eu-west-2:123456789:task-definition/int_stuff3_task_definition:45
2018/03/01 12:33:54 Described task definition arn:aws:ecs:eu-west-2:123456789:task-definition/int_task_definition:41
2018/03/01 12:33:54 Described task definition arn:aws:ecs:eu-west-2:123456789:task-definition/int_task_definition:48
2018/03/01 12:33:54 Described task definition arn:aws:ecs:eu-west-2:123456789:task-definition/qa_stuff3_task_definition:30
2018/03/01 12:33:54 Described task definition arn:aws:ecs:eu-west-2:123456789:task-definition/qa_task_definition:29
2018/03/01 12:33:54 Described 1 container instances in cluster arn:aws:ecs:eu-west-2:123456789:cluster/qa_stuff4
2018/03/01 12:33:54 Described 1 container instances in cluster arn:aws:ecs:eu-west-2:123456789:cluster/qa_stuff3
2018/03/01 12:33:55 Described 10 EC2 reservations
2018/03/01 12:33:55 Writing 0 discovered exporters to /tmp/ecs_file_sd.yml

InvalidSignatureException

Hi,

Can't figure out why I keep getting the following error. I have verified the credentials several times.
Any ideas?

2018/03/10 10:44:24 InvalidSignatureException: The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details.
status code: 400, request id: ffe87282-244f-11e8-8ec2-cdb807547e0f

Thanks

[feature request] add container/task labels to target labels

first things first: I love your work, thanks.

now to the feature request:
We would like to filter the metrics based on various tags we attached to the AWS ECS TaskDefinition or the Docker containers themselves. An example would be the environment name (we run several near-identical test environments in one test account and need to distinguish statistics between them).

I envision the target to look similar to this:

- targets:
  - 123.123.123.123:12345
  labels:
    task_arn: arn:aws:ecs:re-gion-1:1234567890:task/SomeTaskName/876535678765
    task_name: SomeTaskName
    job: OurJobName
    task_revision: "2"
    task_group: service:AwesomeServiceName-763456784
    cluster_arn: arn:aws:ecs:re-gion-1:1234567890:cluster/ClusterName
    container_name: main
    container_arn: arn:aws:ecs:re-gion-1:1234567890:container/09876543-12345678-09876543-123456789
    docker_image: image-name:v0.12.1
    docker_label_environment: test1
    docker_label_somethingelse: anothervalue
    task_tag_tagkey: tagvalue
    task_tag_environment: test1
    __metrics_path__: /metrics

Cannot connect to AWS: AccessDeniedException

Hi
I have an IAM policy with permission to some ECS clusters:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "ecs:DescribeCapacityProviders",
                "ecs:ListTagsForResource",
                "ecs:ListTasks",
                "ecs:DescribeServices",
                "ecs:DescribeTaskSets",
                "ecs:DescribeContainerInstances",
                "ecs:DescribeTasks",
                "ecs:DescribeClusters"
            ],
            "Resource": [
                "arn:aws:ecs:ap-southeast-1:111:cluster/aaa",
                "arn:aws:ecs:ap-southeast-1:111:cluster/bbb"
            ]
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "ecs:ListAccountSettings",
                "ecs:DescribeTaskDefinition",
                "ecs:ListClusters"
            ],
            "Resource": "*"
        }
    ]
}

I have exported the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY and AWS_REGION, but when I run the binary, I get these errors:

2023/02/13 09:58:58 Error listing tasks of cluster arn:aws:ecs:ap-southeast-1:111:cluster/ccc: operation error ECS: ListTasks, https response error StatusCode: 400, RequestID: abc, api error AccessDeniedException: User: arn:aws:sts::111:assumed-role/monitoring/i-0abc is not authorized to perform: ecs:ListTasks on resource: * because no identity-based policy allows the ecs:ListTasks action
2023/02/13 09:58:58 failed to call service: ECS, operation: ListTasks, error: https response error StatusCode: 400, RequestID: abc, api error AccessDeniedException: User: arn:aws:sts::111:assumed-role/monitoring/i-0abc is not authorized to perform: ecs:ListTasks on resource: * because no identity-based policy allows the ecs:ListTasks action
2023/02/13 09:58:58 Error listing tasks of cluster arn:aws:ecs:ap-southeast-1:111:cluster/aaa: operation error ECS: ListTasks, https response error StatusCode: 400, RequestID: abc, api error AccessDeniedException: User: arn:aws:sts::111:assumed-role/monitoring/i-0abc is not authorized to perform: ecs:ListTasks on resource: * because no identity-based policy allows the ecs:ListTasks action
2023/02/13 09:58:58 Error listing tasks of cluster arn:aws:ecs:ap-southeast-1:111:cluster/ddd: operation error ECS: ListTasks, https response error StatusCode: 400, RequestID: abc, api error AccessDeniedException: User: arn:aws:sts::111:assumed-role/monitoring/i-0abc is not authorized to perform: ecs:ListTasks on resource: * because no identity-based policy allows the ecs:ListTasks action
2023/02/13 09:58:58 Error listing tasks of cluster arn:aws:ecs:ap-southeast-1:111:cluster/eee: operation error ECS: ListTasks, https response error StatusCode: 400, RequestID: abc, api error AccessDeniedException: User: arn:aws:sts::111:assumed-role/monitoring/i-0abc is not authorized to perform: ecs:ListTasks on resource: * because no identity-based policy allows the ecs:ListTasks action
2023/02/13 09:58:58 Error listing tasks of cluster arn:aws:ecs:ap-southeast-1:111:cluster/fff: operation error ECS: ListTasks, https response error StatusCode: 400, RequestID: abc, api error AccessDeniedException: User: arn:aws:sts::111:assumed-role/monitoring/i-0abc is not authorized to perform: ecs:ListTasks on resource: * because no identity-based policy allows the ecs:ListTasks action
2023/02/13 09:58:58 Error listing tasks of cluster arn:aws:ecs:ap-southeast-1:111:cluster/bbb: operation error ECS: ListTasks, https response error StatusCode: 400, RequestID: abc, api error AccessDeniedException: User: arn:aws:sts::111:assumed-role/monitoring/i-0abc is not authorized to perform: ecs:ListTasks on resource: * because no identity-based policy allows the ecs:ListTasks action
2023/02/13 09:58:58 Error listing tasks of cluster arn:aws:ecs:ap-southeast-1:111:cluster/ggg: operation error ECS: ListTasks, https response error StatusCode: 400, RequestID: abc, api error AccessDeniedException: User: arn:aws:sts::111:assumed-role/monitoring/i-0abc is not authorized to perform: ecs:ListTasks on resource: * because no identity-based policy allows the ecs:ListTasks action

Service exits when fails to call ecs service

I've encountered a problem

2021/07/12 06:43:18 RequestError: send request failed
caused by: Post https://ecs.eu-west-1.amazonaws.com/: dial tcp 52.95.116.181:443: connect: connection refused

which caused the service to exit. I believe the RequestError could be caught and handled so that the service does not stop when AWS ECS is temporarily unavailable.
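The suggestion — catching transient request errors and retrying instead of exiting — could be sketched like this (Python for illustration; ConnectionError stands in for the AWS SDK's RequestError, and the flaky call is simulated):

```python
import time


def with_retries(fn, attempts=3, base_delay=0.1):
    """Call fn, retrying with exponential backoff on transient errors."""
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error
            time.sleep(base_delay * (2 ** attempt))


calls = {"n": 0}

def flaky_list_tasks():
    # Simulated outage: the first two calls fail, the third succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("connection refused")
    return ["task-1", "task-2"]

print(with_retries(flaky_list_tasks))
```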

main.go file bug

The ECS discovery build errors out during the following commands on a GitLab runner. Any idea what argument I am missing, or is it a bug in main.go? It was working fine previously.

FROM golang:1.10-alpine as builder
RUN apk add --update git
RUN go get -u github.com/teralytics/prometheus-ecs-discovery

# use prometheus as base
FROM prom/prometheus:v2.8.0

Step 3/17 : RUN go get -u github.com/teralytics/prometheus-ecs-discovery
---> Running in c02efb7885cc

github.com/teralytics/prometheus-ecs-discovery

src/github.com/teralytics/prometheus-ecs-discovery/main.go:91:28: not enough arguments in call to req.Send
have ()
want (context.Context)
src/github.com/teralytics/prometheus-ecs-discovery/main.go:297:25: not enough arguments in call to req.Send
have ()
want (context.Context)
src/github.com/teralytics/prometheus-ecs-discovery/main.go:373:27: not enough arguments in call to req.Send
have ()
want (context.Context)
src/github.com/teralytics/prometheus-ecs-discovery/main.go:422:27: not enough arguments in call to req.Send
have ()
want (context.Context)
src/github.com/teralytics/prometheus-ecs-discovery/main.go:493:30: not enough arguments in call to req.Send
have ()
want (context.Context)
src/github.com/teralytics/prometheus-ecs-discovery/main.go:507:42: not enough arguments in call to reqDescribe.Send
have ()
want (context.Context)
src/github.com/teralytics/prometheus-ecs-discovery/main.go:601:11: not enough arguments in call to svc.DescribeClustersRequest(&ecs.DescribeClustersInput literal).Send
have ()
want (context.Context)
The command '/bin/sh -c go get -u github.com/teralytics/prometheus-ecs-discovery' returned a non-zero code: 2

PROMETHEUS_EXPORTER_SCHEME not showing up in output file

If I understand correctly, this service generates a file containing "targets" that will be referenced from a "job" in the Prometheus config (which is not generated by this service).
Since ecs_file_sd.yml contains only targets, how is PROMETHEUS_EXPORTER_JOB_NAME relevant here?
Also, the scheme config appears to be at the job level. So again, since this service writes only targets, how is PROMETHEUS_EXPORTER_SCHEME relevant?
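For context: file_sd target groups may carry labels, and Prometheus honors certain special labels per target, which is how a per-target job name or scheme can still take effect. An illustrative (hypothetical) entry:

```yaml
- targets:
  - 10.0.1.17:8443
  labels:
    job: my-exporter           # overrides the scrape config's job name
    __scheme__: https          # per-target scheme
    __metrics_path__: /metrics # per-target scrape path
```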

The flag -config.filter-label does not work

{
        name      = "prometheus-ecs-discovery"
        command   = ["-config.write-to=/output/ecs_file_sd.yml -config.filter-label=ENV_TAG=mydev"]
        cpu       = 0
        essential = true
        image     = "tkgregory/prometheus-ecs-discovery:latest"
        logConfiguration = {
          logDriver = "awslogs"
          options = {
            awslogs-group         = "/ecs/eve-recommendation-service"
            awslogs-region        = "eu-west-1"
            awslogs-stream-prefix = "ecs"
          }
        }
        environment = [{
          name  = "AWS_REGION"
          value = "eu-west-1"
        }]

        memoryReservation = 1024
        mountPoints = [
          {
            sourceVolume : "config",
            containerPath : "/output"
          }
        ]
        portMappings = []
        volumesFrom  = []
      }
 dockerLabels = {
          PROMETHEUS_EXPORTER_PATH = "/metrics"
          PROMETHEUS_EXPORTER_PORT = "2112"
          ENV_TAG                  = "mydev"
        }

Cross account prometheus monitoring not working as expected.

We have an Account A where prometheus along with prometheus-ecs-discovery are installed and working properly. We need to achieve monitoring in different accounts (B,C,D ...) from account A and I guess -config.role-arn would help us to do so.

It only worked for us between account A and one other account B. We could not find a way to monitor C and D.

What I need to achieve is the following:

"command": [
               "-config.write-to=/etc/prometheus/data/ecs_file_sd.yml",
               "-config.role-arn=arn:aws:iam::Account_A_ID:role/ecs-discover-role"]

ecs-discover-role is trusted in accounts B, C and D; however, it is not able to see the clusters, and if I pass the ARN of a remote cluster in B, it outputs an error: InvalidParameterException: Identifier is Account_A_ID

It only works if I pass -config.role-arn=arn:aws:iam::Account_B_ID:role/service-role, so it is assumed by the role in account A; then I can pass the ARN of the remote cluster in account B, and it is discovered and ecs_file_sd is updated.

Multiple metric endpoints per task

We run a sidecar container alongside our application container; both the application and the sidecar have metric endpoints. Is there a way, through Docker labels, to specify multiple metric endpoints for a single task definition?

setup dockerLabels in ECS task definition

@chicofranchico @muravjov
I deployed the prometheus-ecs-discovery container on one of my AWS EC2 instances. It is running fine, but I keep getting the same "Writing 0 discovered exporters to /tmp/ecs_file_sd.yml" message.

The confusing part is this:
I tried to set up dockerLabels in my ECS task definition, but my AWS account shows a huge number of existing task definitions, across multiple clusters.
Do I need to modify each task definition with the above dockerLabels config, or create a new one and redeploy the prometheus-ecs-discovery container?

Please help me.

Originally posted by @gopkris2000 in #68 (comment)
