testflows / testflows-github-hetzner-runners
Autoscaling GitHub Actions Runners Using Hetzner Cloud.
Home Page: https://testflows.com
License: Other
Instead of type-..., labels should have the option to include a prefix, e.g. prefix-type-... .
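A minimal sketch of how an optional label prefix could be matched. The helper name `parse_type_label` and its `prefix` parameter are hypothetical and not part of the project's API; this only illustrates the requested behavior.

```python
from typing import Optional


def parse_type_label(label: str, prefix: str = "") -> Optional[str]:
    """Return the server type named by a runner label, or None if it does not match.

    Hypothetical helper: with prefix="team-a" only "team-a-type-<name>" matches,
    with an empty prefix the plain "type-<name>" form matches.
    """
    expected = f"{prefix}-type-" if prefix else "type-"
    label = label.lower()
    if label.startswith(expected):
        # An empty remainder (e.g. "type-") is not a valid server type.
        return label[len(expected):] or None
    return None


print(parse_type_label("type-cpx51"))
print(parse_type_label("team-a-type-cpx51", prefix="team-a"))
print(parse_type_label("type-cpx51", prefix="team-a"))
```

With a prefix configured, unprefixed labels are ignored, which would let several controllers share one repository without picking up each other's jobs.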
Hi! First of all, nice project! I've tried to deploy the service using your documentation. I can see the VM provisioned correctly, but it's failing to register as a GitHub self-hosted runner. Here's the log from github-hetzner-runners cloud status:
14:39:19 > May 24 12:37:53 github-hetzner-runners github-hetzner-runners[3647]: 12:37:53 Found new potential zombie server github-hetzner-runner-9224036990-25378510349
14:39:19 > May 24 12:38:00 github-hetzner-runners github-hetzner-runners[3647]: 12:38:00 Trying to connect to [email protected]
14:39:19 > May 24 12:38:08 github-hetzner-runners github-hetzner-runners[3647]: 12:38:08 Trying to connect to [email protected]
14:39:19 > May 24 12:38:13 github-hetzner-runners github-hetzner-runners[3647]: 12:38:13 Trying to connect to [email protected]
14:39:19 > May 24 12:38:15 github-hetzner-runners github-hetzner-runners[3647]: 12:38:15 > github-hetzner-runner-9224036990-25378510349
14:39:19 > May 24 12:38:15 github-hetzner-runners github-hetzner-runners[3647]: 12:38:15 Getting registration token for the runner
14:39:19 > May 24 12:38:15 github-hetzner-runners github-hetzner-runners[3647]: 12:38:15 TypeError: cannot unpack non-iterable NoneType object
14:39:19 > May 24 12:38:35 github-hetzner-runners github-hetzner-runners[3647]: 12:38:35 Logging in to GitHub
14:39:19 > May 24 12:38:35 github-hetzner-runners github-hetzner-runners[3647]: 12:38:35 Checking current API calls consumption rate
14:39:19 > May 24 12:38:36 github-hetzner-runners github-hetzner-runners[3647]: 12:38:36 Consumed 4 calls in 60 sec, 4965 calls left, reset in 3317 sec
Version
$ github-hetzner-runners -v
1.7.240516.1143322
Hi, thanks a lot for this project!
I've created an instance of the controller using github-hetzner-runners cloud deploy with the latest version (1.5.231020.1122452) and no configuration beyond the credentials. A job with runs-on: [self-hosted, type-cax11] hangs, and in the logs I find:
Dec 15 16:11:40 github-hetzner-runners github-hetzner-runners[2873]: 16:11:40 APIException: image has incompatible architecture
I can fix this by adding the image-arm-system-ubuntu-22.04 label, but it would be nice if I didn't have to. The README of this project also shows an example using an Ampere server without this label.
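One way the requested behavior could work is to derive the image architecture from the server type whenever no image label is given. The sketch below assumes that Hetzner's Arm64 (Ampere) server types all start with "cax" and that an x86 counterpart label named image-x86-system-ubuntu-22.04 exists; both function names are hypothetical.

```python
from typing import List


def default_image_label(server_type: str) -> str:
    """Pick an image label whose architecture matches the server type.

    Assumption: Arm64 (Ampere) server types start with "cax"; everything else is x86.
    """
    arch = "arm" if server_type.lower().startswith("cax") else "x86"
    return f"image-{arch}-system-ubuntu-22.04"


def with_default_image(labels: List[str]) -> List[str]:
    """Append a matching image label unless the job already specifies one."""
    if any(label.startswith("image-") for label in labels):
        return list(labels)
    types = [label[len("type-"):] for label in labels if label.startswith("type-")]
    if not types:
        return list(labels)
    return list(labels) + [default_image_label(types[0])]


print(with_default_image(["self-hosted", "type-cax11"]))
```

An explicit image label always wins, so existing workflows that already set one would be unaffected.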
Is it possible to run in multiple repositories or an entire organization?
I've tried to start the github-actions-runner process multiple times, once per GitHub repository, but they seem to be stepping on each other's toes when they're in the same HCloud project (this is conjecture).
Running into API limits on the Hetzner end: APIException: limit of 3600 requests per hour reached (https://docs.hetzner.cloud/#rate-limiting).
Sometimes the exception says APIException: limit of 7200 requests per hour reached.
Not sure why there are two different limits. The GitHub API usage is OK.
When using github-hetzner-runners --github-repository acme/repo instead of GITHUB_REPOSITORY=acme/repo github-hetzner-runners, the GitHub Actions runner created by the controller seems to run with GITHUB_REPOSITORY=None, causing the default setup script to run ./config.sh --url https://github.com/None ... .
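A minimal sketch of the pattern that avoids this class of bug: resolve the repository once (command-line flag first, environment variable as fallback) and build the runner's environment from the resolved value rather than from the controller's own environment. The function names here are hypothetical; only the --github-repository flag and the GITHUB_REPOSITORY variable come from the report above.

```python
import argparse
from typing import Dict, List, Mapping


def parse_args(argv: List[str], environ: Mapping[str, str]) -> argparse.Namespace:
    """Parse arguments, using GITHUB_REPOSITORY only as the default for the flag."""
    parser = argparse.ArgumentParser(prog="controller-sketch")
    parser.add_argument(
        "--github-repository",
        default=environ.get("GITHUB_REPOSITORY"),
    )
    return parser.parse_args(argv)


def runner_environment(args: argparse.Namespace) -> Dict[str, str]:
    """Build the environment passed to the runner from the resolved argument.

    Failing early here prevents the literal string "None" from reaching
    the setup script's --url argument.
    """
    if not args.github_repository:
        raise ValueError("GitHub repository is not set")
    return {"GITHUB_REPOSITORY": args.github_repository}


# Both invocation styles end up with the same runner environment.
print(runner_environment(parse_args(["--github-repository", "acme/repo"], {})))
print(runner_environment(parse_args([], {"GITHUB_REPOSITORY": "acme/repo"})))
```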
Add support to provide Hetzner server pricing info in config.py. Instead of randomly killing recyclable servers, optimize by price.
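One possible policy, sketched under assumptions: delete the most expensive recyclable server first. The `RecyclableServer` class, the `pick_server_to_delete` function, and the prices in the usage example are all hypothetical placeholders; real prices would come from the user's configuration.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence


@dataclass(frozen=True)
class RecyclableServer:
    name: str
    server_type: str


def pick_server_to_delete(
    servers: Sequence[RecyclableServer],
    hourly_price: Mapping[str, float],
) -> Optional[RecyclableServer]:
    """Return the recyclable server with the highest hourly price, or None if there are none.

    Server types missing from the price table are treated as costing 0.0,
    so servers with a known price are deleted first.
    """
    if not servers:
        return None
    return max(servers, key=lambda s: hourly_price.get(s.server_type, 0.0))


# Placeholder prices for illustration only, not real Hetzner pricing.
prices = {"cpx51": 2.0, "cax11": 1.0}
servers = [RecyclableServer("a", "cax11"), RecyclableServer("b", "cpx51")]
print(pick_server_to_delete(servers, prices))
```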
Add support to include a UID in server names to allow cost traceability when servers are recycled.
New names could look like:
github-hetzner-runners-<uid>-<workflow_id>-<job_id>
The <uid> needs to be preserved when the server is moved to be recyclable, and it should be propagated to the new server name.
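A rough sketch of the proposed naming scheme. It assumes the UID is a short hex string without hyphens so that names can be split unambiguously from the right; all function names are hypothetical, and the intermediate name a server carries while it is recyclable is not covered.

```python
import uuid
from typing import Tuple

PREFIX = "github-hetzner-runners"


def new_uid() -> str:
    """Generate a short hyphen-free identifier (assumed format)."""
    return uuid.uuid4().hex[:8]


def server_name(uid: str, workflow_id: int, job_id: int) -> str:
    return f"{PREFIX}-{uid}-{workflow_id}-{job_id}"


def parse_server_name(name: str) -> Tuple[str, int, int]:
    """Split a name into (uid, workflow_id, job_id)."""
    head, uid, workflow_id, job_id = name.rsplit("-", 3)
    if head != PREFIX:
        raise ValueError(f"unexpected server name: {name}")
    return uid, int(workflow_id), int(job_id)


def rename_for_job(old_name: str, workflow_id: int, job_id: int) -> str:
    """Carry the UID of a recycled server over to its new name."""
    uid, _, _ = parse_server_name(old_name)
    return server_name(uid, workflow_id, job_id)


name = server_name("ab12cd34", 9224036990, 25378510349)
print(name)
print(rename_for_job(name, 1, 2))
```

Because the UID survives every rename, billing entries for the same physical server can be grouped by it even after the server has run several jobs.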
Hi!
We're using the library now in production, and it has been extremely useful for us! (our config is very simple: --max-runners 40, recycling on).
The only problem we're facing is that when we start, say, 5 or 10 jobs in parallel, 30-50% of them are marked as "failed job".
I looked at the logs, but the only error message I can find is this one:
06:24:20 scale_down ERROR APIException: cannot perform operation because server is locked
I'm now investigating whether it's some kind of instance creation timeout, keeping the page open to see what happens before GitHub declares it a "failed job". Hopefully I'll have an update soon!
Hello! Thanks again for this nice tool.
I have a pretty standard installation of the application (i.e. not many overridden config options, besides the startup script and default image, as well as increasing the time to clean up powered-off servers). I kicked off a bunch of CI jobs on my repository earlier through commits, and now it seems to have accumulated 10 powered-off servers (which seems to be the default max) and it doesn't progress.
I.e. it doesn't spin up new ones (which makes sense as per the default worker limit) but also doesn't reuse the powered off runners.
These are the config values I override:
image: ghcr.io/kraken-build/github-runners:${tag}
command:
  - --startup-x64-script=/opt/startup-x64.sh
  - --startup-arm64-script=/opt/startup-arm64.sh
  - --max-unused-runner-time=3000 # 50 minutes
  - --max-powered-off-time=3000 # 50 minutes
  # NOTE: We can not set a default image per architecture, so this will be invalid for arm servers.
  # See https://github.com/testflows/TestFlows-GitHub-Hetzner-Runners/issues/10
  - --default-image=x86:app:docker-ce
environment:
  - GITHUB_TOKEN=${github_token}
  - GITHUB_REPOSITORY=${repo.name}
  - HETZNER_TOKEN=${repo.hetzner_token}
volumes:
  - ./startup-x64.sh:/opt/startup-x64.sh:ro
  - ./startup-arm64.sh:/opt/startup-arm64.sh:ro
  - /root/.ssh/id_rsa:/root/.ssh/id_rsa:ro
  - /root/.ssh/id_rsa.pub:/root/.ssh/id_rsa.pub:ro
restart: always
Aside from this it's very vanilla, see the Dockerfile:
FROM python:3.10 AS builder
RUN pip install --upgrade pip && \
    # See https://github.com/yaml/pyyaml/issues/736
    echo 'Cython < 3.0' > /tmp/constraint.txt && \
    pip install pex && \
    PIP_CONSTRAINT=/tmp/constraint.txt pex testflows.github.hetzner.runners==1.5.231020.1122452 \
    -c github-hetzner-runners -o /usr/local/bin/github-hetzner-runners
FROM python:3.10
COPY --from=builder /usr/local/bin/github-hetzner-runners /usr/local/bin/github-hetzner-runners
ENTRYPOINT [ "/usr/local/bin/github-hetzner-runners" ]
Screenshot of a currently pending job:
Maybe relevant screenshot from two of the VMs that I think should get reused:
There's been no changes to the hetzner-runners configuration in the last week.
In the same repo where we use Hetzner self-hosted runners, we also use BuildJet and standard GitHub runners, but around 50% of the time the Hetzner runners pick up these jobs even when runs-on: ubuntu-latest is specified.
Can you help us debug / resolve this?
Thank you!
A flag to disable running custom setup/startup scripts when using an app or a snapshot.
Because Hetzner snapshots can take a while to load (5+ min), it is more convenient to run the setup script. But this interferes with existing apps. For example, I have a setup script that adds docker, but when using the Hetzner docker app, it causes issues on launch.
On Ubuntu 24.04 with Python 3.12 you will get the following error when trying to install Python modules using the pip command:
error: externally-managed-environment
× This environment is externally managed
╰─> To install Python packages system-wide, try apt install
python3-xyz, where xyz is the package you are trying to
install.
If you wish to install a non-Debian-packaged Python package,
create a virtual environment using python3 -m venv path/to/venv.
Then use path/to/venv/bin/python and path/to/venv/bin/pip. Make
sure you have python3-full installed.
If you wish to install a non-Debian packaged Python application,
it may be easiest to use pipx install xyz, which will manage a
virtual environment for you. Make sure you have pipx installed.
See /usr/share/doc/python3.11/README.venv for more information.
note: If you believe this is a mistake, please contact your Python installation or OS distribution provider. You can override this, at the risk of breaking your Python installation or OS, by passing --break-system-packages.
hint: See PEP 668 for the detailed specification.
We need to do one of the following:
- use the --break-system-packages flag
- sudo mv /usr/lib/python3.11/EXTERNALLY-MANAGED /usr/lib/python3.11/EXTERNALLY-MANAGED.old
Hi! We'd love to use this, thank you for making it available!
We're trying to use this command:
github-hetzner-runners cloud deploy -c <filename>.yaml
but we get the following error message:
github-hetzner-runners: error: unrecognized arguments: -c
It looks like cloud deploy is not wired up correctly? Also, the README mentions "deploy", which is not available.
github-hetzner-runners -v
1.5.231005.1001309
Thanks!:)
Recycled servers count against the Hetzner D.vCPU limit and prevent new servers from being created.
For example, if the D.vCPU limit is 32 and a job that uses ccx53 follows a job that uses ccx43, the ccx43 instance is going to sit in a powered-off state waiting to be recycled, preventing the ccx53 from being created.
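The arithmetic behind the example can be sketched as follows. The vCPU counts (16 for ccx43, 32 for ccx53) are an assumption to verify against Hetzner's current server type list, and both function names are hypothetical.

```python
from typing import Sequence

# Assumed dedicated vCPU counts per server type; verify against Hetzner's listing.
DEDICATED_VCPUS = {"ccx43": 16, "ccx53": 32}


def can_create(new_type: str, existing_types: Sequence[str], limit: int) -> bool:
    """Check whether a new server fits under the dedicated vCPU limit.

    Powered-off servers waiting to be recycled still count as existing.
    """
    used = sum(DEDICATED_VCPUS[t] for t in existing_types)
    return used + DEDICATED_VCPUS[new_type] <= limit


def vcpus_to_free(new_type: str, existing_types: Sequence[str], limit: int) -> int:
    """Number of vCPUs that must be released before the new server can be created."""
    used = sum(DEDICATED_VCPUS[t] for t in existing_types)
    return max(0, used + DEDICATED_VCPUS[new_type] - limit)


# A powered-off ccx43 blocks a ccx53 when the limit is 32.
print(can_create("ccx53", ["ccx43"], limit=32))
print(vcpus_to_free("ccx53", ["ccx43"], limit=32))
```

A scale-up step could use the second function to decide how many recyclable servers to delete before attempting to create the new one.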
The config dataclasses in testflows use mutable default values, causing an error like the following in Python 3.11+:
File "/root/.pex/installed_wheels/601f4b6502885652ae5175b207ca02309d8e405ba63406ece0addf0dcc9f3919/testflows.github.hetzner.runners-1.5.231020.1122452-py3-none-any.whl/.prefix/bin/github-hetzner-runners", line 38, in <module>
from testflows.github.hetzner.runners.scale_up import scale_up
File "/root/.pex/installed_wheels/601f4b6502885652ae5175b207ca02309d8e405ba63406ece0addf0dcc9f3919/testflows.github.hetzner.runners-1.5.231020.1122452-py3-none-any.whl/testflows/github/hetzner/runners/scale_up.py", line 27, in <module>
from .config import Config, check_image
File "/root/.pex/installed_wheels/601f4b6502885652ae5175b207ca02309d8e405ba63406ece0addf0dcc9f3919/testflows.github.hetzner.runners-1.5.231020.1122452-py3-none-any.whl/testflows/github/hetzner/runners/config/__init__.py", line 15, in <module>
from .config import Config
File "/root/.pex/installed_wheels/601f4b6502885652ae5175b207ca02309d8e405ba63406ece0addf0dcc9f3919/testflows.github.hetzner.runners-1.5.231020.1122452-py3-none-any.whl/testflows/github/hetzner/runners/config/config.py", line 81, in <module>
@dataclass
^^^^^^^^^
File "/usr/local/lib/python3.11/dataclasses.py", line 1230, in dataclass
return wrap(cls)
^^^^^^^^^
File "/usr/local/lib/python3.11/dataclasses.py", line 1220, in wrap
return _process_class(cls, init, repr, eq, order, unsafe_hash,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dataclasses.py", line 958, in _process_class
cls_fields.append(_get_field(cls, name, type, kw_only))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dataclasses.py", line 815, in _get_field
raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'testflows.github.hetzner.runners.config.config.deploy'> for field deploy is not allowed: use default_factory
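The traceback points at the usual fix. Since Python 3.11, dataclasses reject any unhashable default value, and an instance of a regular (non-frozen) dataclass is unhashable, so a nested config object must be supplied through default_factory. A minimal sketch with hypothetical class and field names (not the project's actual config classes):

```python
from dataclasses import dataclass, field


@dataclass
class DeployConfig:
    # Placeholder fields for illustration only.
    server_type: str = "example-type"
    location: str = "example-location"


@dataclass
class Config:
    max_runners: int = 10
    # Writing `deploy: DeployConfig = DeployConfig()` here raises
    # "ValueError: mutable default ... is not allowed" on Python 3.11+.
    # default_factory builds a fresh DeployConfig for every Config instance.
    deploy: DeployConfig = field(default_factory=DeployConfig)


a = Config()
b = Config()
a.deploy.server_type = "changed"
print(b.deploy.server_type)  # unaffected by the change to a.deploy
```

Besides satisfying the check, default_factory also avoids every Config instance sharing one nested object, which is the problem the check exists to catch.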
Hi!
We have a deployed instance and trying to access the logs with:
github-hetzner-runners cloud log -f
But getting the following error message:
Using config file: /home/ubuntu/.github-hetzner-runners/config.yaml
14:49:03 scale_up ERROR Error: TypeError count_present() got an unexpected keyword argument 'server'
14:49:11 api_watch INFO Logging in to GitHub
14:49:11 api_watch INFO Checking current API calls consumption rate
14:49:11 api_watch INFO Consumed 0 calls in 60 sec, 5000 calls left, reset in 3599 sec
14:49:18 scale_up INFO Checking standby runner pool
14:49:18 scale_up ERROR TypeError: count_present() got an unexpected keyword argument 'server'
14:49:18 scale_up ERROR Error: TypeError count_present() got an unexpected keyword argument 'server'
with config.yaml:
config:
  standby_runners:
    - labels:
        - type-cpx51
      count: 2
      replenish_immediately: false
Thanks for looking into it!!
Add support to use AWS servers instead of Hetzner.