testflows / testflows-github-hetzner-runners

Autoscaling GitHub Actions Runners Using Hetzner Cloud.

Home Page: https://testflows.com

License: Other

Python 98.30% Shell 1.70%
actions github hetzner-cloud runners

testflows-github-hetzner-runners's People

Contributors

naulius, vzakaznikov

Forkers

myrotk

testflows-github-hetzner-runners's Issues

TypeError: cannot unpack non-iterable NoneType object

👋 Hi! First of all, nice project! I tried to deploy the service following your documentation. The VM is provisioned correctly, but it fails to register as a GitHub self-hosted runner. Here's the log from github-hetzner-runners cloud status:

14:39:19    > May 24 12:37:53 github-hetzner-runners github-hetzner-runners[3647]: 12:37:53 🍀 Found new potential zombie server github-hetzner-runner-9224036990-25378510349
14:39:19    > May 24 12:38:00 github-hetzner-runners github-hetzner-runners[3647]: 12:38:00 🍀 Trying to connect to [email protected]
14:39:19    > May 24 12:38:08 github-hetzner-runners github-hetzner-runners[3647]: 12:38:08 🍀 Trying to connect to [email protected]
14:39:19    > May 24 12:38:13 github-hetzner-runners github-hetzner-runners[3647]: 12:38:13 🍀 Trying to connect to [email protected]
14:39:19    > May 24 12:38:15 github-hetzner-runners github-hetzner-runners[3647]: 12:38:15    > github-hetzner-runner-9224036990-25378510349
14:39:19    > May 24 12:38:15 github-hetzner-runners github-hetzner-runners[3647]: 12:38:15 🍀 Getting registration token for the runner
14:39:19    > May 24 12:38:15 github-hetzner-runners github-hetzner-runners[3647]: 12:38:15 ❌ TypeError: cannot unpack non-iterable NoneType object
14:39:19    > May 24 12:38:35 github-hetzner-runners github-hetzner-runners[3647]: 12:38:35 🍀 Logging in to GitHub
14:39:19    > May 24 12:38:35 github-hetzner-runners github-hetzner-runners[3647]: 12:38:35 🍀 Checking current API calls consumption rate
14:39:19    > May 24 12:38:36 github-hetzner-runners github-hetzner-runners[3647]: 12:38:36 🍀 Consumed 4 calls in 60 sec, 4965 calls left, reset in 3317 sec

Version

(infrastructure) ➜  infrastructure git:(main) ✗ github-hetzner-runners -v
1.7.240516.1143322
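
For readers unfamiliar with this error class: Python raises it when code tuple-unpacks the result of a call that returned None. The log does not show which call is failing, so the sketch below is only a generic illustration. `get_registration_token` and its return shape are hypothetical stand-ins, not the project's actual code.

```python
from typing import Optional, Tuple


def get_registration_token(ok: bool) -> Optional[Tuple[str, str]]:
    """Hypothetical stand-in: returns (token, expires_at), or None on failure."""
    if ok:
        return ("example-token", "2024-05-24T13:38:15Z")
    return None  # e.g. a request that failed without raising


def unpack_unsafely(ok: bool) -> str:
    # Unpacking None raises:
    # TypeError: cannot unpack non-iterable NoneType object
    token, _expires_at = get_registration_token(ok)
    return token


def unpack_safely(ok: bool) -> str:
    # Check for None first so the failure is reported with a clear message.
    result = get_registration_token(ok)
    if result is None:
        raise RuntimeError("no registration token returned")
    token, _expires_at = result
    return token
```

Checking for None before unpacking turns an opaque TypeError into an error that names the step that failed.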

`APIException: image has incompatible architecture` using `runs-on: [self-hosted, type-cax11]`

Hi, thanks a lot for this project!

I deployed the controller with github-hetzner-runners cloud deploy, using the latest version (1.5.231020.1122452) and no configuration beyond the credentials. A job with runs-on: [self-hosted, type-cax11] hangs, and the logs show:

Dec 15 16:11:40 github-hetzner-runners github-hetzner-runners[2873]: 16:11:40 ❌ APIException: image has incompatible architecture

I can fix this by adding the image-arm-system-ubuntu-22.04 label, but it would be nice if I didn't have to. The README of this project also shows an example using an Ampere server without this label.
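
One way the requested behavior could work is to infer the architecture from the server type label and add a matching image label when none is given. This is a minimal sketch, not the project's implementation: both helper functions are hypothetical, it assumes that server types starting with "cax" are Arm64, and the x86 label name is an assumption formed by analogy with image-arm-system-ubuntu-22.04.

```python
from typing import List


def default_image_label(server_type: str) -> str:
    """Hypothetical helper: choose a default image label by architecture.

    Assumption: server types whose name starts with "cax" are Arm64;
    everything else is treated as x86.
    """
    arch = "arm" if server_type.lower().startswith("cax") else "x86"
    return f"image-{arch}-system-ubuntu-22.04"


def with_image_label(labels: List[str]) -> List[str]:
    """Append a default image label unless the job already specifies one."""
    if any(label.startswith("image-") for label in labels):
        return list(labels)
    types = [label[len("type-"):] for label in labels if label.startswith("type-")]
    if not types:
        return list(labels)
    return list(labels) + [default_image_label(types[0])]
```

With this sketch, `with_image_label(["self-hosted", "type-cax11"])` adds the Arm image label, while a job that already carries an image label is left unchanged.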

Run for multiple repositories or an entire organization

Is it possible to run in multiple repositories or an entire organization?

I've tried starting the github-actions-runner process multiple times, once per GitHub repository, but the instances seem to step on each other's toes when they share the same HCloud project (this is conjecture).

Hetzner API rate limit

Running into API limits on the Hetzner end: APIException: limit of 3600 requests per hour reached - https://docs.hetzner.cloud/#rate-limiting.

Sometimes the exception says APIException: limit of 7200 requests per hour reached.

Not sure why there are two different limits. The GitHub API usage is OK.
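
A limit of 3600 requests per hour averages out to one request per second, so one possible mitigation is to space calls on the client side. The following is a minimal sketch under that assumption. The `Throttle` class is hypothetical and is not part of the project or of the Hetzner client library, and it does not explain why two different limits are reported.

```python
import time
from typing import Callable


class Throttle:
    """Spaces calls evenly so that at most `limit_per_hour` happen per hour."""

    def __init__(
        self,
        limit_per_hour: int,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = 3600.0 / limit_per_hour
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = clock()

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds waited."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self._sleep(delay)
        # Schedule the following call one interval after this one.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay
```

Calling `throttle.wait()` before each API request keeps the request rate under the configured limit. The clock and sleep functions are injectable so the behavior can be checked without real delays.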

`--github-repository` option does not propagate into setup script

When using github-hetzner-runners --github-repository acme/repo instead of GITHUB_REPOSITORY=acme/repo github-hetzner-runners, the GitHub Actions runner created by the controller seems to run with GITHUB_REPOSITORY=None, causing the default setup script to run ./config.sh --url https://github.com/None ....
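
A sketch of the kind of fix this suggests: resolve the repository once, from the option with the environment variable as a fallback, then pass the resolved value explicitly to whatever runs the setup script. Both functions below are hypothetical illustrations and do not reflect how the project actually parses options or launches scripts.

```python
import argparse
from typing import Dict, List, Mapping, Optional


def resolve_repository(argv: List[str], environ: Mapping[str, str]) -> Optional[str]:
    """Prefer --github-repository; fall back to the GITHUB_REPOSITORY variable."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--github-repository", default=environ.get("GITHUB_REPOSITORY")
    )
    args, _unknown = parser.parse_known_args(argv)
    return args.github_repository


def setup_script_env(repository: Optional[str]) -> Dict[str, str]:
    """Build the environment for the setup script from the resolved value.

    Failing early avoids formatting None into a URL such as
    https://github.com/None.
    """
    if not repository:
        raise ValueError("GitHub repository is not set")
    return {"GITHUB_REPOSITORY": repository}
```

The point of the design is that the setup script's environment is built from the resolved value rather than from the controller's own environment, so both ways of supplying the repository behave the same.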

When starting lots of jobs in parallel with github actions strategy matrix, 30-50% of them are not picked up

Hi!

We're now using the library in production, and it has been extremely useful for us! (Our config is very simple: --max-runners 40, recycling on.)
The only problem we're facing is this: when we start, say, 5 or 10 jobs in parallel, 30-50% of them are marked as "failed job".
image

I was trying to look at the logs but the only error message I'm finding is this one:

06:24:20 scale_down ERROR โŒ APIException: cannot perform operation because server is locked

I'm now investigating whether this is some kind of instance creation timeout, keeping the page open to see what happens before GitHub declares a "failed job". I hope to have an update soon!

Runners are not being re-used

Hello! Thanks again for this nice tool.

I have a pretty standard installation of the application (i.e. not many overridden config options besides the startup script and default image, plus a longer cleanup time for powered-off servers). I kicked off a bunch of CI jobs on my repository earlier through commits. It has now accumulated 10 powered-off servers (which seems to be the default maximum) and it doesn't progress.

I.e. it doesn't spin up new ones (which makes sense as per the default worker limit) but also doesn't reuse the powered off runners.

These are the config values I override:

    image: ghcr.io/kraken-build/github-runners:${tag}
    command:
      - --startup-x64-script=/opt/startup-x64.sh
      - --startup-arm64-script=/opt/startup-arm64.sh
      - --max-unused-runner-time=3000  # 50 minutes
      - --max-powered-off-time=3000  # 50 minutes
      # NOTE: We can not set a default image per architecture, so this will be invalid for arm servers.
      #       See https://github.com/testflows/TestFlows-GitHub-Hetzner-Runners/issues/10
      - --default-image=x86:app:docker-ce
    environment:
      - GITHUB_TOKEN=${github_token}
      - GITHUB_REPOSITORY=${repo.name}
      - HETZNER_TOKEN=${repo.hetzner_token}
    volumes:
      - ./startup-x64.sh:/opt/startup-x64.sh:ro
      - ./startup-arm64.sh:/opt/startup-arm64.sh:ro
      - /root/.ssh/id_rsa:/root/.ssh/id_rsa:ro
      - /root/.ssh/id_rsa.pub:/root/.ssh/id_rsa.pub:ro
    restart: always

Aside from this it's very vanilla, see the Dockerfile:

FROM python:3.10 as builder

RUN pip install --upgrade pip && \
    # See https://github.com/yaml/pyyaml/issues/736
    echo 'Cython < 3.0' > /tmp/constraint.txt && \
    pip install pex && \
    PIP_CONSTRAINT=/tmp/constraint.txt pex testflows.github.hetzner.runners==1.5.231020.1122452 \
        -c github-hetzner-runners -o /usr/local/bin/github-hetzner-runners

FROM python:3.10
COPY --from=builder /usr/local/bin/github-hetzner-runners /usr/local/bin/github-hetzner-runners
ENTRYPOINT [ "/usr/local/bin/github-hetzner-runners" ]

Screenshot of a currently pending job:

image

Maybe relevant screenshot from two of the VMs that I think should get reused:

image image

There's been no changes to the hetzner-runners configuration in the last week.

Disable custom scripts for apps and snapshots.

A flag to disable running custom setup/startup scripts when using an app or a snapshot.

Because Hetzner snapshots can take a while to load (5+ min), it is more convenient to run the setup script. But this interferes with existing apps. For example, I have a setup script that adds docker, but when using the Hetzner docker app, it causes issues on launch.

Add support for Ubuntu 24.04 with Python 3.12 to resolve "error: externally-managed-environment".

On Ubuntu 24.04 with Python 3.12, you get the following error when trying to install Python modules with the pip command.

error: externally-managed-environment

× This environment is externally managed
╰─> To install Python packages system-wide, try apt install
    python3-xyz, where xyz is the package you are trying to
    install.

    If you wish to install a non-Debian-packaged Python package,
    create a virtual environment using python3 -m venv path/to/venv.
    Then use path/to/venv/bin/python and path/to/venv/bin/pip. Make
    sure you have python3-full installed.

    If you wish to install a non-Debian packaged Python application,
    it may be easiest to use pipx install xyz, which will manage a
    virtual environment for you. Make sure you have pipx installed.

    See /usr/share/doc/python3.11/README.venv for more information.

note: If you believe this is a mistake, please contact your Python installation or OS distribution provider. You can override this, at the risk of breaking your Python installation or OS, by passing --break-system-packages.
hint: See PEP 668 for the detailed specification.

We need to do one of the following:

  • use the --break-system-packages flag
  • sudo mv /usr/lib/python3.11/EXTERNALLY-MANAGED /usr/lib/python3.11/EXTERNALLY-MANAGED.old
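
The error message itself points to a third route, a virtual environment. As a minimal sketch, a setup step could detect the PEP 668 marker before deciding how to install packages. The helper names below are hypothetical. The marker location follows PEP 668, which places EXTERNALLY-MANAGED in the interpreter's stdlib directory and exempts virtual environments from the check.

```python
import sys
import sysconfig
from pathlib import Path
from typing import Optional


def in_virtualenv() -> bool:
    """True when the interpreter is running inside a virtual environment."""
    return sys.prefix != sys.base_prefix


def is_externally_managed(stdlib_dir: Optional[str] = None) -> bool:
    """True when the PEP 668 marker file exists in the stdlib directory.

    `stdlib_dir` can be passed explicitly to check a different location.
    """
    stdlib = Path(stdlib_dir or sysconfig.get_path("stdlib"))
    return (stdlib / "EXTERNALLY-MANAGED").is_file()


def needs_venv() -> bool:
    """pip refuses system-wide installs only outside a virtual environment."""
    return is_externally_managed() and not in_virtualenv()
```

If `needs_venv()` returns True, creating a virtual environment with python3 -m venv and installing into it avoids both the --break-system-packages flag and renaming the marker file.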

can't pass in config file to `cloud deploy`

Hi! We'd love to use this, thank you for making it available!
We're trying to use this command:
github-hetzner-runners cloud deploy -c <filename>.yaml
but we get the following error message: github-hetzner-runners: error: unrecognized arguments: -c
It looks like cloud deploy is not wired up correctly. Also, the README mentions "deploy", which is not available.

github-hetzner-runners -v
1.5.231005.1001309

Thanks!:)

Removing recycle server instances that use dedicated vCPU

Recycled servers count against the Hetzner D.vCPU limit and prevent new servers from being created.

For example, if the D.vCPU limit is 32 and a job that uses ccx53 follows a job that uses ccx43, the ccx43 instance is going to sit in a powered-off state waiting to be recycled, preventing the ccx53 from being created.
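
The arithmetic behind the example can be sketched as follows. The vCPU counts (16 for ccx43, 32 for ccx53) are taken from Hetzner's published CCX specifications rather than from this report, and both functions are hypothetical illustrations of the requested behavior, not the project's code.

```python
from typing import Iterable, List

# Assumed dedicated vCPU counts per server type (not stated in the report).
DEDICATED_VCPUS = {"ccx43": 16, "ccx53": 32}


def can_create(new_type: str, existing: Iterable[str], limit: int) -> bool:
    """True if the new server fits in the quota next to all existing servers."""
    used = sum(DEDICATED_VCPUS[t] for t in existing)
    return used + DEDICATED_VCPUS[new_type] <= limit


def recyclable_to_delete(
    new_type: str, running: List[str], powered_off: List[str], limit: int
) -> List[str]:
    """Pick powered-off servers to delete so that the new server fits."""
    used = sum(DEDICATED_VCPUS[t] for t in running + powered_off)
    needed = used + DEDICATED_VCPUS[new_type] - limit
    to_delete: List[str] = []
    for server_type in powered_off:
        if needed <= 0:
            break
        to_delete.append(server_type)
        needed -= DEDICATED_VCPUS[server_type]
    return to_delete
```

With a limit of 32, a powered-off ccx43 (16) plus a new ccx53 (32) would need 48 vCPUs, so the ccx43 has to be deleted first, whereas a second ccx43 would still fit.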

mutable default for field deploy is not allowed: use default_factory

The config dataclasses in testflows use mutable default values, causing an error like the following in Python 3.11+:

  File "/root/.pex/installed_wheels/601f4b6502885652ae5175b207ca02309d8e405ba63406ece0addf0dcc9f3919/testflows.github.hetzner.runners-1.5.231020.1122452-py3-none-any.whl/.prefix/bin/github-hetzner-runners", line 38, in <module>
    from testflows.github.hetzner.runners.scale_up import scale_up
  File "/root/.pex/installed_wheels/601f4b6502885652ae5175b207ca02309d8e405ba63406ece0addf0dcc9f3919/testflows.github.hetzner.runners-1.5.231020.1122452-py3-none-any.whl/testflows/github/hetzner/runners/scale_up.py", line 27, in <module>
    from .config import Config, check_image
  File "/root/.pex/installed_wheels/601f4b6502885652ae5175b207ca02309d8e405ba63406ece0addf0dcc9f3919/testflows.github.hetzner.runners-1.5.231020.1122452-py3-none-any.whl/testflows/github/hetzner/runners/config/__init__.py", line 15, in <module>
    from .config import Config
  File "/root/.pex/installed_wheels/601f4b6502885652ae5175b207ca02309d8e405ba63406ece0addf0dcc9f3919/testflows.github.hetzner.runners-1.5.231020.1122452-py3-none-any.whl/testflows/github/hetzner/runners/config/config.py", line 81, in <module>
    @dataclass
     ^^^^^^^^^
  File "/usr/local/lib/python3.11/dataclasses.py", line 1230, in dataclass
    return wrap(cls)
           ^^^^^^^^^
  File "/usr/local/lib/python3.11/dataclasses.py", line 1220, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dataclasses.py", line 958, in _process_class
    cls_fields.append(_get_field(cls, name, type, kw_only))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dataclasses.py", line 815, in _get_field
    raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'testflows.github.hetzner.runners.config.config.deploy'> for field deploy is not allowed: use default_factory
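
Python 3.11 started rejecting any unhashable dataclass default, and a regular dataclass instance is unhashable, which is why a default such as deploy() now fails at class-definition time. The usual fix is the one the message names. Below is a minimal sketch with hypothetical class and field names that do not mirror the project's actual config.

```python
from dataclasses import dataclass, field


@dataclass
class Deploy:
    # Hypothetical fields, for illustration only.
    server_type: str = "cpx11"
    location: str = ""


@dataclass
class Config:
    # Previously (raises ValueError on Python 3.11+):
    #     deploy: Deploy = Deploy()
    # default_factory builds a fresh Deploy for every Config instance.
    deploy: Deploy = field(default_factory=Deploy)
```

Besides satisfying the check, default_factory prevents two Config objects from sharing, and accidentally mutating, the same nested Deploy instance.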

`cloud log -f` error message

Hi!

We have a deployed instance and are trying to access the logs with:
github-hetzner-runners cloud log -f

But we get the following error message:

Using config file: /home/ubuntu/.github-hetzner-runners/config.yaml
14:49:03 scale_up       ERROR    ❗ Error: TypeError count_present() got an unexpected keyword argument 'server'
14:49:11 api_watch      INFO     🍀 Logging in to GitHub
14:49:11 api_watch      INFO     🍀 Checking current API calls consumption rate
14:49:11 api_watch      INFO     🍀 Consumed 0 calls in 60 sec, 5000 calls left, reset in 3599 sec
14:49:18 scale_up       INFO     🍀 Checking standby runner pool
14:49:18 scale_up       ERROR    ❌ TypeError: count_present() got an unexpected keyword argument 'server'
14:49:18 scale_up       ERROR    ❗ Error: TypeError count_present() got an unexpected keyword argument 'server'

with config.yaml:

config:
   standby_runners:
      - labels:
         - type-cpx51
        count: 2
        replenish_immediately: false

Thanks for looking into it!!

AWS Support.

Add support for using AWS servers instead of Hetzner.
