k8s-fah

Run folding@home on Kubernetes.

The folding@home project has added support for research on the coronavirus (2019-nCoV).

This deployment lets you run folding@home on Kubernetes, should you have any spare cluster-power you'd like to donate.

Note: COVID-19 work units are currently being prioritized, but the folding@home client may still select jobs for other diseases.

If/when they add an option to work only on COVID-19, I will update the deployment here to do so (until the pandemic is over).

 

Overview

There are options to run this on CPU, GPU or a combination of both.

To use the GPU deployments, you need a working k8s cluster whose GPU nodes each have one or more NVIDIA GPUs. (AMD GPUs have not been tested.)

The prerequisites are the same as for the k8s-device-plugin:

  • NVIDIA drivers ~= 384.81
  • nvidia-docker version > 2.0 (see how to install it and its prerequisites)
  • docker configured with nvidia as the default runtime.
  • Kubernetes version >= 1.10
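The third prerequisite is set in Docker's daemon configuration. As a rough sketch, the snippet below writes an example file to the current directory, using the runtime settings documented by the k8s-device-plugin project. The file name daemon.json.example is only illustrative; the real configuration lives at /etc/docker/daemon.json on each GPU node, and you would merge these keys into it yourself.

```shell
# Write an example Docker daemon config that makes "nvidia" the default runtime.
# The output file name is illustrative; the real file is /etc/docker/daemon.json.
cat > daemon.json.example <<'EOF'
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
EOF
echo "wrote daemon.json.example"
```

After merging the settings, Docker needs a restart for them to take effect (for example `sudo systemctl restart docker` on systemd hosts).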

 

Installation modes

Only CPU

The default install deploys 2 replicas, limited to using 1 CPU core each.

kubectl apply -f https://raw.githubusercontent.com/richstokes/k8s-fah/master/folding-cpu.yaml

Only GPU (NVIDIA)

The default install deploys 2 replicas, limited to using 1 GPU in each pod.

kubectl apply -f https://raw.githubusercontent.com/richstokes/k8s-fah/master/folding-gpu.yaml

Both CPU & GPU (NVIDIA)

kubectl apply -f https://raw.githubusercontent.com/richstokes/k8s-fah/master/folding-gpu-cpu.yaml

Minikube (CPU only mode)

Runs 6 replicas, limited to 1 CPU core each.

kubectl apply -f https://raw.githubusercontent.com/richstokes/k8s-fah/master/folding-minikube.yaml

I like to give my minikube cluster extra resources with:

minikube config set cpus 6
minikube config set memory 8192
minikube delete && minikube start

 

Tested GPUs:

  • NVIDIA
    • NVIDIA GeForce GTX 1080
    • GeForce RTX 2080
    • Tesla K40m
    • Tesla K80
    • V100
  • AMD
    • ... If you have tested this on AMD GPUs, please open a PR to update this list!

 

Rancher

If you have Rancher, you can easily install by searching for "folding" in your Rancher app catalog.

 

DaemonSet

You can also run this as a DaemonSet (runs one replica per node) with:

kubectl apply -f https://raw.githubusercontent.com/richstokes/k8s-fah/master/folding-daemonset.yaml

The .yaml has a tolerations section that you can uncomment to also run FAHClient on master nodes.

To enable GPU support with the DaemonSet, uncomment the nvidia.com/gpu: "1" lines in folding-daemonset.yaml before applying.
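To illustrate what the two optional settings look like once enabled, here is a minimal sketch of a DaemonSet with both a master-node toleration and a GPU limit. It is not the contents of folding-daemonset.yaml: the name fah-example, the image example.invalid/fah-client:latest, and the output file name are hypothetical placeholders, and the taint key assumes a cluster that still labels its control plane as "master".

```shell
# Write a hypothetical DaemonSet manifest showing a toleration and a GPU limit.
# All names and the image below are placeholders, not taken from the repo.
cat > daemonset-example.yaml <<'EOF'
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fah-example
spec:
  selector:
    matchLabels:
      app: fah-example
  template:
    metadata:
      labels:
        app: fah-example
    spec:
      tolerations:
        # Allows scheduling onto nodes tainted as master.
        # Newer clusters use node-role.kubernetes.io/control-plane instead.
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
      containers:
        - name: fah
          image: example.invalid/fah-client:latest
          resources:
            limits:
              # Requests one GPU per pod via the NVIDIA device plugin.
              nvidia.com/gpu: "1"
EOF
echo "wrote daemonset-example.yaml"
```

For real use, edit the downloaded folding-daemonset.yaml rather than this sketch, then apply it with `kubectl apply -f`.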

 

Customizing

Set the replica count and resource limits according to how much CPU you wish to donate. In my testing, memory usage has been reasonably low (<512Mi).
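As a rough sketch of where those two knobs live, the snippet below writes a minimal Deployment with an explicit replica count and CPU/memory limits. The name fah-cpu-example, the image, the output file name, and the chosen values (4 replicas, 2 cores, 512Mi) are all hypothetical examples, not the values used in this repo's manifests.

```shell
# Write a hypothetical Deployment manifest showing replicas and resource limits.
# Names, image, and numbers are illustrative assumptions only.
cat > deployment-example.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fah-cpu-example
spec:
  # Number of folding pods to run.
  replicas: 4
  selector:
    matchLabels:
      app: fah-cpu-example
  template:
    metadata:
      labels:
        app: fah-cpu-example
    spec:
      containers:
        - name: fah
          image: example.invalid/fah-client:latest
          resources:
            limits:
              # Upper bound on CPU cores and memory per pod.
              cpu: "2"
              memory: 512Mi
EOF
echo "wrote deployment-example.yaml"
```

With these example values the total CPU donated is at most replicas × cpu limit, here 4 × 2 = 8 cores.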

I've also added the framework for a PriorityClass, so that K8s can preempt (evict) folding@home pods when a higher-priority pod needs resources.
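For readers unfamiliar with PriorityClass, a minimal sketch follows. The class name fah-low-priority, the value -1, and the output file name are assumptions for illustration and may differ from what the manifests in this repo define. It also assumes a cluster recent enough to serve the scheduling.k8s.io/v1 API (Kubernetes 1.14 or later).

```shell
# Write a hypothetical low-priority PriorityClass manifest.
# Pods without a priority class default to priority 0 (unless the cluster
# defines a global default), so a negative value ranks folding pods below them.
cat > priorityclass-example.yaml <<'EOF'
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: fah-low-priority
value: -1
globalDefault: false
description: "Low priority for folding@home pods so other workloads can preempt them."
EOF
echo "wrote priorityclass-example.yaml"
```

A pod opts in by setting `priorityClassName: fah-low-priority` in its pod spec; the class itself must exist in the cluster before such pods are created.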

 

config.xml

The most compatible way to edit config.xml is to modify its values and build your own Docker image.

You can instead override it by mounting a ConfigMap in Kubernetes (the scaffolding for this is in the manifests). However, FAHClient seems to want to copy or move this file around, which doesn't work when the file is mounted.

You'll get a bunch of errors from FAHClient if you do this. There may be a better way to manage the config file; PRs welcome!

 

Credits

Special thanks to Bendik for his work on supporting GPUs and general tweaks to the configs.

Contributors

richstokes, skandix, yhaenggi, kaovilai, davidsouther, jsloyer, saipathuri, kbruner