buildkite-plugins / ecr-buildkite-plugin
🔐 Login to an AWS ECR registry
License: MIT License
I should start off by saying I'm not sure if this belongs here or in the docker plugin repo, but here is my best guess.
I am running into a very strange issue using the env vars set by this plugin (namely ${DOCKER_CONFIG}) in my pipeline.
Judging by the fact no one else appears to have had this issue**, I'm guessing that mine is an unusual use case, but what I'm trying to do is mount my docker credentials from ECR into an instance.
By all appearances, a build step like this should work fine:
- name: ":docker: Building image"
  plugins:
    - improbable-eng/metahook:
        pre-command: |
          echo $DOCKER_CONFIG
          echo \$DOCKER_CONFIG
    - ecr#v2.4.0:
        login: true
        account-ids: '[ACCOUNT NUMBER]'
        region: us-east-1
        no-include-email: true
        assume_role:
          role_arn: "arn:aws:iam::[ACCOUNT NUMBER]:role/[ROLE NAME]"
    - docker#v3.8.0:
        image: moby/buildkit:master
        privileged: true
        userns: host
        always-pull: true
        propagate-environment: true
        mount-checkout: true
        mount-buildkite-agent: false
        tmpfs:
          - '/tmp'
        volumes:
          - '${HOME}/.aws:/home/user/.aws'
          - '${DOCKER_CONFIG}:/home/user/.docker'
        environment:
          - 'DOCKER_CONFIG=/home/user/.docker'
        tty: true
        debug: true
        entrypoint: buildctl-daemonless.sh
        command: [ 'build', '...more args here' ]
(Note: the improbable-eng/metahook plugin is for debugging and not needed to reproduce.)
What seems to be happening (and I have precisely zero guesses as to why) is that DOCKER_CONFIG has two different values: it holds the path to one temporary folder when the variables in the pipeline.yml are interpolated, and is reset somewhere to an entirely different folder when the command is invoked.
The result is that the improbable-eng/metahook pre-command in my build yields two completely different values, as the command appears to undergo expansion twice.
After spending far too much time trying to figure out what was going on here, I noticed this in the docs:
https://buildkite.com/docs/pipelines/secrets#anti-pattern-referencing-secrets-in-your-pipeline-yaml. (Which was not helpfully named, given I was just trying to reference a filesystem path.)
In any case, the suggestion in the documentation did not work: changing the volume mapping to '$${DOCKER_CONFIG}:/home/user/.docker' results in it becoming (the literal) ${DOCKER_CONFIG} when docker run is executed, yielding the error:
docker: Error response from daemon: create ${DOCKER_CONFIG}: "${DOCKER_CONFIG}" includes invalid characters for a local volume name, only "[a-zA-Z0-9][a-zA-Z0-9_.-]" are allowed. If you intended to pass a host directory, use absolute path.
So I'm just really confused.
Why is ${DOCKER_CONFIG} getting changed when it's already been set?
Why can't I simply reference an environment variable (set in the eponymous environment stage) in what is both a later stage AND a plugin defined after the ECR one?
If this isn't a bug (which would be upsetting), how would I work around it? The barriers I see are:
- The docker plugin is such that there's no ability to run commands outside the container in the same step
- improbable-eng/metahook seems a bit hacky.
** This may actually be the underlying cause of #32, but I'm not sure and that issue is stale (2+ years old), so I'm creating this new one.
I see that since version 2.0.0 this ECR plugin hooks into the step lifecycle at the environment stage. This breaks the integration with the cultureamp/aws-assume-role plugin.
To be specific we have many build steps that follow this pattern:
plugins:
  - cultureamp/aws-assume-role#v0.1.0:
      role: "arn:aws:iam::$AWS_ACCOUNT_ID:role/$AWS_ROLE"
  - ecr#v1.2.0:
      login: true
      account_ids: "$AWS_ACCOUNT_ID"
  - docker-compose#v2.5.1:
      …
First the aws-assume-role plugin will assume a role that has access to the AWS ECR. Then the ecr plugin will log in and gain access to the ECR. Later the docker-compose plugin will pull and push to the ECR.
This pipeline fails when the ecr plugin is upgraded to version 2.0.0: the docker-compose plugin is refused access to the ECR. As the ecr plugin now runs at the environment phase, it always runs before the aws-assume-role plugin, which runs at the pre-command phase, thus breaking the authorisation.
There's not much information shared in #26 to describe the reason for or benefit of this change.
Hi there! We are extensive users of this plugin and like it a lot :)
We did recently notice something, however, that has caused us to pin all our pipelines to version 1.1.4. We noticed that the current master branch version of the plugin echoes the docker login command, meaning the token it uses is displayed in our Buildkite job logs. For us this isn't ideal, as it would allow a developer with Buildkite access to authenticate on their dev machine to the respective ECR repository for up to 12 hours after the build step was run.
Would it be possible to either remove the echo or put it behind a configuration option so we can hide it?
ECR Public is a good alternative to Docker Hub to get around the rate-limiting on image pulls.
However, ECR Public has separate rate limits for unauthenticated (guest) and authenticated image pulls:
It'd be good for the plugin to support an option that can run the following:
aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws
(Note that the region always needs to be us-east-1.)
#!/bin/bash on macOS is still 3.2.x and doesn't include the mapfile command. This prevents bk local run on macOS with version 2.1 of this plugin.
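One way to avoid mapfile is a plain read loop, which bash 3.2 supports. The following is only a sketch of that idea, not the plugin's actual code: the function name read_lines and the array name LINES_READ are hypothetical, and the account ids are placeholders.

```shell
#!/bin/bash
# Read lines from stdin into the global array LINES_READ without mapfile,
# so the script also runs on bash 3.2 (the version shipped with macOS).
# read_lines and LINES_READ are hypothetical names used for this sketch.
read_lines() {
  LINES_READ=()
  local line
  # `|| [ -n "$line" ]` keeps a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    LINES_READ+=("$line")
  done
}

# Feed the function through a redirection rather than a pipe: a pipe would
# run it in a subshell and the array would be lost.
read_lines <<EOF
111111111111
222222222222
EOF
echo "${#LINES_READ[@]} account ids read"
```

The loop is slower than mapfile for large inputs, but for a short list of account ids that should not matter.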
Readme seems to use two different forms for the same option:
In the example:
account_ids: "0015615400570"
Later:
Options
...
account-ids (optional)
A consistent - or _ would be great :)
This is the documentation for the region option:
Set a specific region for ECR, defaults to the current
What does current mean here? My guess was that it would use whatever I had configured in my ~/.aws/config file. But I wasn't able to get this plugin to work until I specified the option in the pipeline config. Not a big deal, I'm just curious how this works.
It would be handy if the readme mentioned that this option does not apply to AWS CLI versions greater than or equal to 1.17.10.
Hi, there.
I am facing the following error when running 150 parallel agents with ecr-buildkite-plugin#v1.1.2.
Authenticating with AWS ECR | 6s
| An error occurred (ThrottlingException) when calling the GetAuthorizationToken operation (reached max retries: 4): Rate exceeded
| 🚨 Buildkite Error: The global pre-command hook exited with a status of 255
http://docs.aws.amazon.com/AmazonECR/latest/userguide/common-errors.html
I couldn't find a way to increase the number of retries for the aws ecr get-login command.
How about implementing a sleep and retry when aws ecr get-login fails?
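The sleep-and-retry idea could look roughly like the sketch below. It is an illustration only, not the plugin's implementation: the function name retry, its argument order, and the doubling delay are all assumptions made for this example.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached, doubling the
# delay after every failure. The name and arguments are hypothetical.
retry() {
  local max_attempts=$1 delay=$2 attempt=1 status=0
  shift 2
  while :; do
    status=0
    "$@" || status=$?
    if [ "$status" -eq 0 ]; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Failed after ${attempt} attempts (exit status ${status})" >&2
      return "$status"
    fi
    echo "Attempt ${attempt} failed (exit status ${status}); retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))
  done
}

# Example call (needs the AWS CLI, so it is left commented out):
#   retry 5 2 aws ecr get-login --no-include-email
```

With 150 agents starting at once, adding some random jitter to the delay would probably spread the requests out better than a fixed doubling; that refinement is left out to keep the sketch short.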
There is an error with this repository's Renovate configuration that needs to be fixed. As a precaution, Renovate will stop PRs until it is resolved.
Error type: Preset name not found within published preset config (monorepo:angularmaterial). Note: this is a nested preset so please contact the preset author if you are unable to fix it yourself.
Attempting to use this plug-in on a Windows-based agent appears to result in the following:
Running plugin github.com/buildkite-plugins/ecr-buildkite-plugin#v1.1.4 pre-command hook
> C:\buildkite-agent\plugins\github-com-buildkite-plugins-ecr-buildkite-plugin-v1-1-4\hooks\pre-command
Expected buildkite-agent to be in the form x.y.z
🚨 Error: The plugin github.com/buildkite-plugins/ecr-buildkite-plugin#v1.1.4 pre-command hook exited with status 1
I'm running buildkite-agent version 3.5.2 on Windows Server 1803. The AWS CLI and Git for Windows are installed, and with the exception of some custom tags, the buildkite-agent configuration file is the stock one generated by the automated PowerShell installer script (so cmd.exe is the default shell, etc.).
Role ARN assumption, as implemented, changes env vars for the duration of the job.
This can be an issue if a role is available to be assumed just for authenticating with ECR.
A workaround that is backwards compatible would be to introduce support for using AWS_PROFILE on the job step so that a profile can be configured with role_arn on the agent and used to authenticate with ECR.
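To illustrate the isolation this asks for, here is a small sketch that sets AWS_PROFILE only for a single command by running it in a subshell, so the rest of the job keeps its original environment. The function name with_aws_profile and the profile name ecr-login are hypothetical; the plugin does not provide this.

```shell
# with_aws_profile PROFILE COMMAND [ARGS...]
# Runs COMMAND with AWS_PROFILE set to PROFILE. The parentheses start a
# subshell, so the export does not leak into the calling shell.
# The function name is hypothetical.
with_aws_profile() {
  local profile=$1
  shift
  (
    export AWS_PROFILE="$profile"
    "$@"
  )
}

# Example call (needs the AWS CLI and a profile named ecr-login that is
# configured with role_arn on the agent, so it is left commented out):
#   with_aws_profile ecr-login aws ecr get-login-password --region us-east-1
```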
Hello, apologies if this question is somewhat dumb as I am new to buildkite. I am trying to use this ecr plugin, and for some reason it is unable to git checkout any of the version tags. The error I am getting is as follows:
# Plugin "github.com/buildkite-plugins/ecr-buildkite-plugin" will be checked out to "/var/lib/buildkite-agent/plugins/github-com-buildkite-plugins-ecr-buildkite-plugin-2-6-0"
# Switching to the temporary plugin directory
$ cd /var/lib/buildkite-agent/plugins/github-com-buildkite-plugins-ecr-buildkite-plugin-2-6-03887800207
$ git clone -v --recursive -- https://github.com/buildkite-plugins/ecr-buildkite-plugin .
Cloning into '.'...
POST git-upload-pack (175 bytes)
POST git-upload-pack (gzip 1402 to 662 bytes)
remote: Enumerating objects: 653, done.
remote: Counting objects: 100% (99/99), done.
remote: Compressing objects: 100% (57/57), done.
remote: Total 653 (delta 43), reused 74 (delta 32), pack-reused 554
Receiving objects: 100% (653/653), 161.76 KiB | 12.44 MiB/s, done.
Resolving deltas: 100% (296/296), done.
# Checking out `2.6.0`
$ git checkout -f 2.6.0
error: pathspec '2.6.0' did not match any file(s) known to git
My buildkite steps are written as follows:
steps:
  - label: "build-and-push-ECR"
    command: "build_and_push.sh"
    plugins:
      - ecr#2.6.0:
          login: true
          account-ids:
            - "public.ecr.aws"
I've tried both #2.6.0 and #2.7.0, but both fail to git checkout.
This plugin uses aws ecr get-login, which was quietly deprecated in aws/aws-cli@b83b3a5 (landed in 1.17.10 on 5th Feb, less than a month ago) and completely removed in awscli v2, which is released but not yet available in many distribution channels (package managers etc).
$ aws --version
aws-cli/2.0.2 Python/3.7.3 Linux/5.5.6-arch1-1 botocore/2.0.0dev6
$ aws ecr get-login
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
...
aws: error: argument operation: Invalid choice, valid choices are:
...
This plugin should instead use aws ecr get-login-password. However, unlike get-login, get-login-password doesn't give us the ECR registry URL (including AWS account ID & region), which should be passed to the docker login command. Deriving it will add some complexity.
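As a rough sketch of that derivation, assuming the standard private registry hostname form of account id, then dkr.ecr, then region, then amazonaws.com: the function name ecr_registry_host is hypothetical, and other AWS partitions may use a different domain suffix that this does not handle.

```shell
# ecr_registry_host ACCOUNT_ID REGION
# Prints the private ECR registry hostname for the given account and region.
# Hypothetical helper; it assumes the standard amazonaws.com suffix.
ecr_registry_host() {
  printf '%s.dkr.ecr.%s.amazonaws.com\n' "$1" "$2"
}

# Example use (needs the AWS CLI and docker, so it is left commented out):
#   host=$(ecr_registry_host 123456789012 us-east-1)
#   aws ecr get-login-password --region us-east-1 \
#     | docker login --username AWS --password-stdin "$host"
```

Passing the password on stdin also keeps the token out of the command line and out of any echoed command.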
ecr-buildkite-plugin/hooks/environment
Lines 147 to 150 in 838688c
Our builds occasionally fail when ECR fails to authenticate. The retry doesn't seem to work, because there are no "Login failed on attempt" log lines.
Authenticating with AWS ECR :ecr: :docker:
[2021-09-07T21:18:59Z] ^^^ Authenticating with AWS ECR in *** for *** :ecr: :docker:
[2021-09-07T21:19:14Z] Error response from daemon: Get https://***.dkr.ecr.***.amazonaws.com/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
I'm likely missing something, but this plugin doesn't appear to play nice with the docker-compose plugin. I'm basically trying to replace a previous use of the Docker Login plugin with ECR (moved a docker image from Docker Hub to a private AWS ECR).
- ecr#v2.0.0:
    login: true
    account_ids: "<my_aws_account_id>"
    no-include-email: true
    region: us-east-1
- docker-compose#v2.5.1:
    # Intentionally don't include image-repository, as we will
    # manually update the Docker image to use on the ECR as needed.
    config: buildkite/docker-compose.yml
    build: <some_name>
    cache-from: <my_aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/<image-name>:latest
From the logs, I can see that the ECR plugin successfully logs in (so awscli and the required AWS keys are OK). This is indicated by "Login Succeeded" in the "Authenticating with AWS ECR to <my_aws_account_id>" log item.
However, we then hit "pull access denied for latest, repository does not exist or may require 'docker login'".
I would have thought that this plugin replaced the need for docker login, but perhaps that isn't the case?
This plugin requires IAM permissions for the ECR repo(s) being used. It'd be nice to briefly note in the README what it'll need, or link to the relevant part of the AWS docs for ECR.
I posted in buildkite community slack as well but it was not addressed. https://buildkite-community.slack.com/archives/C7KTYPZ5W/p1624281329003200
This is really impacting velocity as CI builds randomly hang forever.
Running plugin ecr environment hook 0s
[2021-06-22T16:12:01Z] $ /var/lib/buildkite-agent/plugins/github-com-buildkite-plugins-ecr-buildkite-plugin-v2-3-0/hooks/environment
Authenticating with AWS ECR
[2021-06-22T16:12:02Z] ^^^ Authenticating with AWS ECR in us-west-2 for 12345 :ecr: :docker:
^ It will then remain there for hours — until the build is stopped manually, or timeout seconds for the step is reached.
Appears to be coming from these lines:
ecr-buildkite-plugin/hooks/environment
Lines 146 to 150 in a7e99dc
It is intermittent and seemingly random. Sometimes it hangs, sometimes it moves along instantly. We are not doing anything on our end that should impact this.
After hanging forever, if the step or build is restarted, it has a chance of succeeding.
For buildkite staff, here's an example of a build that I killed after hanging there for 13 minutes https://buildkite.com/vydia/web/builds/8804#450f8bce-7c43-4bb9-ab2f-cbff1c4ecb6c Then upon rebuild, it has continued along without any issues https://buildkite.com/vydia/web/builds/8805#514ac531-d26a-4c78-9fd8-e32285f6b76f — Both builds are for the same commit and with no changes in configuration, and no difference between them.
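Until the cause is found, one possible stopgap is to bound the login step so a hang fails fast instead of blocking the job. The sketch below assumes GNU coreutils timeout is available on the agent (it is not installed by default on macOS); the function name run_with_timeout is hypothetical and nothing here is part of the plugin.

```shell
# run_with_timeout SECONDS COMMAND [ARGS...]
# Runs COMMAND under GNU coreutils `timeout`, which exits with status 124
# when the time limit is hit. The function name is hypothetical.
run_with_timeout() {
  local seconds=$1 status=0
  shift
  timeout "$seconds" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    echo "Command timed out after ${seconds}s: $*" >&2
  fi
  return "$status"
}

# Example call (needs docker and a password on stdin, so it is left
# commented out; the registry host is a placeholder):
#   run_with_timeout 30 docker login --username AWS --password-stdin "$registry_host"
```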
This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.
These updates have all been created already. Click a checkbox below to force a retry/rebase of any.
.buildkite/pipeline.yml
docker-compose v4.3.0
plugin-linter v3.0.0
shellcheck v1.3.0
docker-compose.yml
buildkite/plugin-tester latest@sha256:476a1024936901889147f53d2a3d8e71e99d76404972d583825514f5608083dc
Any chance of cutting a 1.1.3 release with the region selection stuff from this in there?
Thanks!
Specifying the region. In the docs the tag is region, but in the code it looks for [prefix]_REGISTRY_REGION.
Hi, the region option does not work with the latest release version (v1.1.3).
What works is the undocumented (and deprecated?) registry_region option.
Neither the master documentation nor the v1.1.3 documentation mentions registry_region.
I think the simplest solution would be to release a new version and recommend to use this new tagged version instead.