
astro / skyflake


NixOS Hyperconverged Infrastructure on Nomad/NixOS

Home Page: https://astro.github.io/skyflake/

License: MIT License

Nix 100.00%
microvm nixos cluster flakes hyperconverged-infrastructure nomad ceph nix

skyflake's Introduction

Skyflake: Hyperconverged Infrastructure for NixOS

  • No Docker, no Kubernetes
  • Hosts run NixOS, payloads are NixOS in microvm.nix
  • Static hosts, dynamic virtual machines managed by Nomad
  • Deploy machines by pushing your Nix Flake with git push

Running the example cluster

  • Have a bridge virbr0.

  • Provide Internet access.

  • Have 3x 4 GB RAM.

  • Have 3x 20 GB disk.

  • Put your SSH public key into example-server.nix

  • Run MicroVMs in parallel:

    nix run .#example1
    nix run .#example2
    nix run .#example3
  • Log in and check for the IP address.

  • Next, create your user flake:

    {
      outputs = { self, nixpkgs }: {
        nixosConfigurations =
          let
            # Build one NixOS system per hostname.
            mkHost = hostName:
              nixpkgs.lib.nixosSystem {
                modules = [ {
                  system.stateVersion = "22.11";
                  networking = { inherit hostName; };
                  services.openssh = {
                    enable = true;
                    permitRootLogin = "yes";
                  };
                  # Empty root password: only acceptable for a throwaway test machine.
                  users.users.root.password = "";
                } ];
                system = "x86_64-linux";
              };
          in {
            # Each attribute name is a hostname and the branch name to push to.
            skytest1 = mkHost "skytest1";
            skytest2 = mkHost "skytest2";
            skytest3 = mkHost "skytest3";
            skytest4 = mkHost "skytest4";
          };
      };
    }
  • Finally, deploy by pushing to a branch named after each hostname:

    git push [email protected]:example \
      HEAD:skytest1 HEAD:skytest2 \
      HEAD:skytest3 HEAD:skytest4
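
The user flake above takes nixpkgs from whatever the flake registry resolves and spells out each host by hand. As a sketch only, the same set of hosts could be written with a pinned input and a generated attribute set. The nixos-22.11 pin and the hostNames list are assumptions made for illustration, and the SSH and password settings are left out for brevity:

```nix
{
  # Assumption: pin nixpkgs to the release that matches system.stateVersion.
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-22.11";

  outputs = { self, nixpkgs }:
    let
      # Hypothetical helper list; one entry per branch you intend to push.
      hostNames = [ "skytest1" "skytest2" "skytest3" "skytest4" ];

      # Same shape as mkHost in the example flake, minus SSH settings.
      mkHost = hostName:
        nixpkgs.lib.nixosSystem {
          system = "x86_64-linux";
          modules = [ {
            system.stateVersion = "22.11";
            networking = { inherit hostName; };
          } ];
        };
    in {
      # genAttrs builds { skytest1 = mkHost "skytest1"; ... } from the list.
      nixosConfigurations = nixpkgs.lib.genAttrs hostNames mkHost;
    };
}
```

Adding a machine then means adding one name to the list and pushing to a branch of the same name.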

How it works

The central component is a nixosModule that configures servers to be part of a cluster.

Users have a flat hierarchy of flake repositories they can push to. Their SSH sessions are forced into a custom script that permits only git push. Each push triggers a hook that does the following:

  1. Builds the NixOS system
  2. Copies the result into a cluster-shared binary cache
  3. Runs the job on the cluster through Nomad
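
The three steps above can be sketched as a Nix-packaged shell script. This is not skyflake's actual hook: the script name, the cache path (borrowed from the mount unit in the issue log further down), the flake attribute that gets built, and the job file name are all assumptions, and generating the Nomad job description is omitted entirely. It also assumes a Nix with the nix-command and flakes features enabled.

```nix
# Hypothetical sketch of a post-receive style hook; not skyflake's own code.
{ pkgs ? import <nixpkgs> { } }:

pkgs.writeShellScript "skyflake-hook-sketch" ''
  set -euo pipefail

  # Assumed location of the cluster-shared binary cache.
  cache="file:///var/lib/skyflake/binary-cache"

  # git passes one "<old> <new> <ref>" line per pushed ref on stdin.
  while read -r _old _new ref; do
    # The branch name doubles as the hostname.
    host="$(basename "$ref")"

    # 1. Build the NixOS system for this host from the pushed repository.
    #    (Assumed attribute path; skyflake may build something else.)
    out="$(nix build --no-link --print-out-paths \
      "git+file://$PWD?ref=$ref#nixosConfigurations.$host.config.system.build.toplevel")"

    # 2. Copy the result into the shared binary cache.
    nix copy --to "$cache" "$out"

    # 3. Submit the job to Nomad; the job file is a placeholder that
    #    would have to be generated elsewhere.
    nomad job run "job-$host.nomad"
  done
''
```

Keeping the build, copy, and submit steps strictly sequential means a failed build never reaches the cache or the scheduler, since set -e aborts the script at the first error.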

Server configuration options

The nixosModule for the servers that make up the cluster provides the following knobs:

TODO

Deployment customization

Network setup, storage integration, and other MicroVM options must be customized for your environment.

See default-customization.nix
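
As a rough illustration of the kind of settings involved, here is a NixOS module using option names from microvm.nix. All values are placeholders, and whether skyflake expects customization in exactly this shape is an assumption; default-customization.nix remains the reference.

```nix
# Hypothetical customization module; values are placeholders.
{ ... }:
{
  microvm = {
    # Resources per MicroVM.
    mem = 512;   # RAM in MB
    vcpu = 1;

    # One tap interface; id and mac are made-up example values.
    interfaces = [ {
      type = "tap";
      id = "vm-example";
      mac = "02:00:00:00:00:01";
    } ];

    # A persistent disk image mounted inside the guest.
    volumes = [ {
      image = "persist.img";
      mountPoint = "/var";
      size = 1024;   # MB
    } ];
  };
}
```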

skyflake's People

Contributors

astro, spacekitteh, supersandro2000


skyflake's Issues

var-lib-skyflake-binary\x2dcache.mount won't start due to missing `kmod` in environment

Oct 09 16:06:43 cluster-fancynode systemd[1]: Mounting /var/lib/skyflake/binary-cache...
Oct 09 16:06:43 cluster-fancynode mount[15031]: sh: line 1: modprobe: command not found
Oct 09 16:08:13 cluster-fancynode systemd[1]: var-lib-skyflake-binary\x2dcache.mount: Mounting timed out. Terminating.
Oct 09 16:08:13 cluster-fancynode systemd[1]: var-lib-skyflake-binary\x2dcache.mount: Mount process exited, code=killed, status=15/TERM
Oct 09 16:08:13 cluster-fancynode systemd[1]: var-lib-skyflake-binary\x2dcache.mount: Failed with result 'timeout'.
Oct 09 16:08:13 cluster-fancynode systemd[1]: var-lib-skyflake-binary\x2dcache.mount: Unit process 15027 (mount.ceph) remains running after unit stopped.
Oct 09 16:08:13 cluster-fancynode systemd[1]: Failed to mount /var/lib/skyflake/binary-cache.
warning: error(s) occurred while switching to the new configuration
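
The log shows the mount helper failing to find modprobe. One possible workaround, untested here and not taken from the issue, is to load the Ceph kernel module at boot so the helper has less reason to call modprobe at mount time. Whether this also resolves the mount timeout is not established by the log.

```nix
# Hypothetical workaround sketch for a cluster node's NixOS configuration.
{ ... }:
{
  # Load the CephFS kernel client at boot instead of relying on
  # mount.ceph invoking modprobe from the mount unit's environment.
  boot.kernelModules = [ "ceph" ];
}
```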

[Feature request] Use `ceph-volume` to manage OSDs

So, I'm converting my existing Ceph cluster to use Skyflake. One issue that is stopping me from using it to deploy OSDs, however, is that I require full disk encryption for each OSD.

I deployed my OSDs via ceph-volume create, with the --dmcrypt flag. This sets up each OSD to be encrypted via LUKS. At boot, ceph-volume activate grabs the LUKS keys from a Ceph monitor, passes them to dm-crypt, and then proceeds as normal in activating the OSDs.

Existing OSDs can be easily converted to use ceph-volume; so is it possible to change the OSD management logic to use ceph-volume instead? I suspect it would greatly simplify the implementation.

Exchanging Ceph for SeaweedFS

Hello @astro !
Also cc @spacekitteh

I am a member of HacDC, an American hackerspace inspired by the CCC. I saw @astro's NixCon talk about microvm.nix, and I am looking for a way to run the skyflake code in my company's lab, and soon in the hackerspace too if possible.

As the talk ends, @astro says 'One does not simply do a major-version upgrade of Ceph'. This is similar to my experience, and Ceph seems to be a large and cumbersome dependency for this code.

Since I had no luck launching Ceph locally on my 3 nodes, I am choosing to use [SeaweedFS]; the commands to run an HA cluster are super simple and I got them working in less than an hour. My seaweed systemd config is here:
https://base.bingo/code/build/src/branch/mesh/nixos/process.nix#L31
[SeaweedFS]: https://github.com/seaweedfs/seaweedfs

Can you help me rework skyflake to run on the much simpler SeaweedFS, and drop Ceph as a dependency?

I looked around the Nix modules, and my guess is that the host nodes only need shared volumes for the Nix store and the garbage collection roots. Are these volumes also shared inside the MicroVMs, or only used during the original VM builds? There should also be persistent VM storage, but that is a secondary concern.

I plan to spend a few more days on this, though I'm in a hurry to get back to application programming; I could use as much help as possible. Your approach seems much more dependable than Kubernetes, and I think reducing the dependency on Ceph would make your program much more accessible.

Glad for your help!
Callie
