aurae-runtime / auraed

Secure mTLS and gRPC backed runtime daemon. Alternative to systemd. Written in Rust.

Home Page: https://aurae.io/auraed

License: Apache License 2.0

Makefile 7.45% Rust 68.02% Dockerfile 2.88% Shell 21.65%
daemon grpc mtls rust

auraed's Introduction

Aurae Daemon

The Aurae Daemon (auraed) is the main daemon that powers Aurae.

The Aurae Daemon runs as a gRPC server which listens on a Unix domain socket by default.

/var/run/aurae/aurae.sock

Running Auraed

Running as /init is currently under active development.

To run auraed as a standard library server you can run the daemon alongside your current init system.

sudo -E auraed

Additional flags are listed below.

USAGE:
    auraed [OPTIONS]

OPTIONS:
        --ca-crt <CA_CRT>            [default: /etc/aurae/pki/ca.crt]
    -h, --help                       Print help information
    -s, --socket <SOCKET>            [default: /var/run/aurae/aurae.sock]
        --server-crt <SERVER_CRT>    [default: /etc/aurae/pki/_signed.server.crt]
        --server-key <SERVER_KEY>    [default: /etc/aurae/pki/server.key]
    -v, --verbose                    
    -V, --version                    Print version information

Building from source

We suggest using the aurae repository for building all parts of the project.

If you intend to build this repository directly, you can use the Makefile in this repository.

make

or using Cargo directly

cargo clippy
cargo install --debug --path .

auraed's People

Contributors

bpmooch, future-highway, j0shgrant, krisnova, maltej, taniwha3, vincinator, wesen

auraed's Issues

Prevent exit to prevent kernel panic

Via MalteJ, "auraed must not exit if something bad happens. When running as pid 1, we get a kernel panic, when pid 1 exits. We could trigger a reboot instead."

A panic in the daemon will only crash the thread, not the program, as everything is in a thread handled by tokio. As long as tokio's crash/exit is handled, then auraed can be prevented from exiting, hopefully.

A simple loop to restart the daemon may be an appropriate solution:

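#[tokio::main]
async fn main() {
    // The loop never breaks, so main never returns and auraed never exits on its own.
    // daemon() is assumed to run the server and return an exit code when it stops.
    loop {
        let exit_code = daemon().await;
        println!("daemon stopped with exit code: {}", exit_code);
        println!("restarting daemon...");
    }
}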

replacing

async fn main() -> Result<(), Box<dyn std::error::Error>> {

Configurable mount points

The user space initialization mounts devfs, procfs and sysfs. Not sure yet, but we may want to mount storage as well.
A mount config (semantically similar to fstab) should be parsed by the initialization of the rootfs.

  • A configuration should specify what should be mounted where
  • This configuration should allow mount targets to be omitted (e.g. not mounting procfs)
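
As a minimal sketch of what such a configuration could look like, the parser below reads a simplified fstab-like format with three whitespace-separated fields per line. The format, the MountEntry struct, and the parse_mount_config function are assumptions made for illustration and are not part of auraed. Omitting a target is done by deleting or commenting out its line.

```rust
/// One entry of a hypothetical fstab-like mount configuration.
#[derive(Debug, PartialEq)]
pub struct MountEntry {
    pub source: String,
    pub target: String,
    pub fstype: String,
}

/// Parses lines of the form `<source> <target> <fstype>`.
/// Blank lines and lines starting with `#` are skipped, so a mount can be
/// omitted by commenting out or deleting its line.
pub fn parse_mount_config(input: &str) -> Result<Vec<MountEntry>, String> {
    let mut entries = Vec::new();
    for (idx, line) in input.lines().enumerate() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        let fields: Vec<&str> = line.split_whitespace().collect();
        if fields.len() != 3 {
            return Err(format!(
                "line {}: expected 3 fields, found {}",
                idx + 1,
                fields.len()
            ));
        }
        entries.push(MountEntry {
            source: fields[0].to_string(),
            target: fields[1].to_string(),
            fstype: fields[2].to_string(),
        });
    }
    Ok(entries)
}
```

The resulting entries could then be handed to whatever code performs the actual mount calls during rootfs initialization.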

Kernel logging

We should expose kernel logging via the API, so it can be picked up by a client. As we will also need the stdout and stderr streams of all processes auraed spawns, it would probably be wise to use the same logging pipeline for kernel logs as well.

We can use klogctl to collect kernel logs.
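
Kernel log lines read this way carry a priority prefix in angle brackets, such as `<6>`. The sketch below only shows how one such line could be split into a severity level and a message before it enters a shared logging pipeline. The KernelRecord type and parse_kernel_line function are hypothetical names, and the actual klogctl call is left out.

```rust
/// A kernel log record split into its severity level and message text.
#[derive(Debug, PartialEq)]
pub struct KernelRecord {
    pub level: u8,
    pub message: String,
}

/// Parses one line with a leading `<N>` priority prefix.
/// The low three bits of N are the severity (0 = emergency .. 7 = debug).
/// Returns None when the prefix is missing or not a number.
pub fn parse_kernel_line(line: &str) -> Option<KernelRecord> {
    let rest = line.strip_prefix('<')?;
    let (prio, message) = rest.split_once('>')?;
    let prio: u32 = prio.parse().ok()?;
    Some(KernelRecord {
        level: (prio & 7) as u8,
        message: message.to_string(),
    })
}
```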

Waiting for network link to become available

When setting a network device up (via netlink) the function returns before the network device is actually in state up.
Assigning routes to a network device (via netlink) requires the network device to be up.

I think the caller should be able to rely on .await of the async function init/network.rs:set_link_up returning only once the link is actually up, so that follow-up network configuration steps can safely assume the link is up after the call.

In addition, I would like to provide access to the netdevice state during network configuration, so that config functions relying on a link state can implement safety checks.
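
One possible shape for this is a polling helper that returns only once the link reports up or a timeout expires. The sketch below is synchronous and uses only the standard library; LinkState, parse_operstate, and wait_for_link_up are hypothetical names, and reading the state from /sys/class/net/<iface>/operstate is an assumption (auraed itself talks netlink). In async code the sleep would be replaced by the runtime's timer.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Simplified link state, as an assumption for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum LinkState {
    Up,
    Down,
    Other,
}

/// Maps the contents of /sys/class/net/<iface>/operstate to a LinkState.
pub fn parse_operstate(raw: &str) -> LinkState {
    match raw.trim() {
        "up" => LinkState::Up,
        "down" => LinkState::Down,
        _ => LinkState::Other,
    }
}

/// Polls `read_state` until it reports Up or `timeout` has elapsed.
/// On timeout, the last observed state is returned as the error.
pub fn wait_for_link_up<F>(
    mut read_state: F,
    timeout: Duration,
    interval: Duration,
) -> Result<(), LinkState>
where
    F: FnMut() -> LinkState,
{
    let deadline = Instant::now() + timeout;
    loop {
        let state = read_state();
        if state == LinkState::Up {
            return Ok(());
        }
        if Instant::now() >= deadline {
            return Err(state);
        }
        sleep(interval);
    }
}
```

Passing the state reader as a closure also gives later configuration steps a way to re-check the link state before they act.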

Graceful shutdown

We need to implement a graceful shutdown flow.
This flow must run when auraed receives a SIGTERM signal, when the power button is pressed (if running as pid 1), or when auraed receives a reboot or shutdown request via gRPC.

  1. An event has to be sent to all gRPC clients to inform them about the imminent shutdown.
  2. No new workloads (processes, containers, VMs, ...) may be scheduled.
  3. All threads, processes, containers, VMs, and MicroVMs must be shut down gracefully (e.g. sending SIGTERM to processes, waiting for x seconds, and then sending SIGKILL if they have not shut down; sending ACPI shutdown to VMs, then powering off after a timeout).
  4. The gRPC API needs to be stopped.
  5. The API socket needs to be deleted.
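
The ordering of the steps above can be sketched as a small state machine. This is only an illustration of the sequence; ShutdownPhase and its methods are hypothetical names. Treating new workloads as rejected from the first shutdown phase onward is an assumption, since the list only requires it from step 2.

```rust
/// Phases of a hypothetical graceful shutdown, in the order they run.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ShutdownPhase {
    Running,
    NotifyClients,
    StopScheduling,
    StopWorkloads,
    StopApi,
    RemoveSocket,
    Done,
}

impl ShutdownPhase {
    /// Returns the phase that follows this one; Done is terminal.
    pub fn next(self) -> ShutdownPhase {
        match self {
            ShutdownPhase::Running => ShutdownPhase::NotifyClients,
            ShutdownPhase::NotifyClients => ShutdownPhase::StopScheduling,
            ShutdownPhase::StopScheduling => ShutdownPhase::StopWorkloads,
            ShutdownPhase::StopWorkloads => ShutdownPhase::StopApi,
            ShutdownPhase::StopApi => ShutdownPhase::RemoveSocket,
            ShutdownPhase::RemoveSocket | ShutdownPhase::Done => ShutdownPhase::Done,
        }
    }

    /// New workloads are only accepted while the daemon is running normally.
    pub fn accepts_new_workloads(self) -> bool {
        self == ShutdownPhase::Running
    }
}

/// Lists the shutdown phases between Running and Done, in order.
pub fn shutdown_sequence() -> Vec<ShutdownPhase> {
    let mut phases = Vec::new();
    let mut phase = ShutdownPhase::Running.next();
    while phase != ShutdownPhase::Done {
        phases.push(phase);
        phase = phase.next();
    }
    phases
}
```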

Objects and Code Generation

The Aurae project is based around the concept of objects.

Given that an RPC message is the lowest common definition for our data structures, we will need to be able to generate a substantial amount of code and boilerplate for each object.

Developers (and maybe one day consumers, clients, and end-users) should have an easy way of creating and expressing new and generic objects in the codebase.

For each object we define as an RPC message we will need to do the following.

  • Establish a database table, and corresponding schema.
  • Establish a source of truth for Rust structures
  • Establish client code
  • Establish server code
  • Establish Authz (authorization) style traits which can be implemented to bring authz to each object and corresponding functions
  • Establish AuraeScript definitions with corresponding getters and setters such that the objects can quickly be expressed in AuraeScript

network interface configuration

When running as pid 1, auraed has to take care of network interface configuration.
By using netlink and the Linux kernel we can configure static IPv4 and IPv6 addresses and SLAAC IPv6.
To support DHCP provided IPv4 and IPv6 addresses, we need to integrate a DHCP client.

Also, we have to think about how to expose the runtime API. Currently auraed creates a Unix socket. This socket is not accessible from outside of the machine. Can we change this to an IP based socket? Or make it configurable?

Detect Power Button Devices for all supported systems

Implement a method to get a list of power button devices, so that events from all power buttons of the system can be handled by auraed.

  • multiple power buttons are possible (depends on hardware)
  • detect the reboot button and handle it with a reboot instead
  • Linux-specific:
    • devices are listed in /proc/bus/input/devices
    • the integer X in /dev/input/eventX varies between systems, which is another reason detection is required

Note the Linux-specific listener implementation in #31. BSD, OSX, or other systems may handle input devices differently (I don't know yet).
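
For the Linux case, a minimal sketch of the detection could parse the text of /proc/bus/input/devices and collect the event handlers of devices named "Power Button". The function name find_power_button_devices is hypothetical, and matching on the device name is an assumption made for brevity; a more robust check would inspect the key capabilities each device reports.

```rust
/// Returns the /dev/input/eventX paths of all devices whose name contains
/// "Power Button", given the contents of /proc/bus/input/devices.
pub fn find_power_button_devices(proc_devices: &str) -> Vec<String> {
    let mut paths = Vec::new();
    // Device records are separated by blank lines.
    for block in proc_devices.split("\n\n") {
        let is_power_button = block
            .lines()
            .any(|l| l.starts_with("N: Name=") && l.contains("Power Button"));
        if !is_power_button {
            continue;
        }
        for line in block.lines() {
            // The handlers line lists names such as "kbd event0".
            if let Some(handlers) = line.strip_prefix("H: Handlers=") {
                for handler in handlers.split_whitespace() {
                    if handler.starts_with("event") {
                        paths.push(format!("/dev/input/{}", handler));
                    }
                }
            }
        }
    }
    paths
}
```

Because every matching record is collected, systems with several power buttons yield several paths.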

Unwrap

As we are still in the sandbox phase of building Aurae, we are using unwrap calls in the code. We should replace these with safer and more idiomatic Rust error handling.

Additionally we should build a linting system that prevents code like this from entering the project.
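
As a small sketch of both points, the function below propagates a parse error with `?` instead of calling unwrap, and carries Clippy's `unwrap_used` lint as a deny attribute. ConfigError and parse_port are hypothetical names used only for this example. To enforce the lint across a whole crate, the attribute would instead be written as `#![deny(clippy::unwrap_used)]` at the top of main.rs or lib.rs.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical error type for this example.
#[derive(Debug)]
pub enum ConfigError {
    InvalidPort(ParseIntError),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::InvalidPort(e) => write!(f, "invalid port: {}", e),
        }
    }
}

impl std::error::Error for ConfigError {}

// Lets `?` convert a ParseIntError into a ConfigError automatically.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::InvalidPort(e)
    }
}

// With this attribute, `cargo clippy` rejects any `.unwrap()` in the function.
#[deny(clippy::unwrap_used)]
pub fn parse_port(raw: &str) -> Result<u16, ConfigError> {
    // Before: raw.trim().parse::<u16>().unwrap()
    let port = raw.trim().parse::<u16>()?;
    Ok(port)
}
```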

Avoid kernel panic in case auraed stops

When auraed is started as pid 1, it has no parent process in user space. It is then also called the init process.
If the init process stops, the kernel does not know what to do, so it panics.

I think we shouldn't let the kernel panic, and should instead handle these two cases:

  1. a regular stop of auraed
  2. an unhandled Rust panic, i.e. auraed crashes

A simple solution is to just shut down the system via e.g.

 syscall_reboot(libc::LINUX_REBOOT_CMD_POWER_OFF);
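
For the second case, one option is a panic hook, which runs whenever a panic occurs. The sketch below uses only the standard library and records the request in a flag instead of powering off; request_power_off, install_panic_handler, and POWER_OFF_REQUESTED are hypothetical names, and a real init would issue the reboot syscall at the marked place.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

/// Set once a power-off has been requested (stand-in for the real action).
pub static POWER_OFF_REQUESTED: AtomicBool = AtomicBool::new(false);

/// Hypothetical stand-in: a real init process would power off or reboot here.
pub fn request_power_off() {
    POWER_OFF_REQUESTED.store(true, Ordering::SeqCst);
}

/// Installs a process-wide hook that runs on every panic, before unwinding.
pub fn install_panic_handler() {
    panic::set_hook(Box::new(|info| {
        eprintln!("auraed panicked: {}", info);
        request_power_off();
    }));
}
```

The first case, a regular stop, can call the same power-off path at the end of main.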

Trigger reboot/shutdown from grpc

It should be possible to trigger a system reboot or a shutdown via the gRPC API.
If auraed has a process ID > 1, this action should result in a graceful shutdown and exit of auraed instead.
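
The rule can be sketched as a pure function from the process ID and the request to the resulting action. PowerRequest, PowerAction, and resolve_power_action are hypothetical names; in the daemon the pid would come from std::process::id().

```rust
/// What the gRPC client asked for.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PowerRequest {
    Reboot,
    Shutdown,
}

/// What the daemon should actually do.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PowerAction {
    RebootSystem,
    PowerOffSystem,
    ExitDaemon,
}

/// Only pid 1 may reboot or power off the machine; any other process
/// shuts itself down gracefully and exits instead.
pub fn resolve_power_action(pid: u32, request: PowerRequest) -> PowerAction {
    if pid != 1 {
        return PowerAction::ExitDaemon;
    }
    match request {
        PowerRequest::Reboot => PowerAction::RebootSystem,
        PowerRequest::Shutdown => PowerAction::PowerOffSystem,
    }
}
```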

See also #36

Using thiserror for custom error type

The thiserror crate is a common library used for creating custom library errors.

Code similar to the below can be placed in the lib.rs file. I added some comments to help explain the crate's usage, but the docs page gives great and more advanced examples. While the docs do say a struct can be used instead of an enum, I prefer an enum.

use std::io;

// Define our own result type where the error is our custom error type
// Usage example: fn i_may_fail() -> crate::Result<DidntFailType> { ... }
type Result<T> = std::result::Result<T, Error>;

// Custom error made using thiserror crate
#[derive(thiserror::Error, Debug)]
pub enum Error {
    // If we want to pass through all io errors the from attribute will make this easier by
    // automatically implementing .into() (impl of from gives impl of into for free)
    #[error("an io related error occurred: {0}")]
    Io(#[from] io::Error),
    // This is our first custom type, where the error is just a String.
    // We can't use from because CustomErrorVariant2 also takes just a String
    #[error("CustomErrorVariant1: {0}")]
    CustomErrorVariant1(String),
    #[error("CustomErrorVariant2: {0}")]
    CustomErrorVariant2(String),
    // I'm not sure how I feel about a catch-all variant, but it is an option too.
    // transparent isn't exclusive to the catch-all variant, we could have used it on Io too, for example
    #[error(transparent)]
    Other(#[from] anyhow::Error),
}

Hypervisor Implementation Detail

We need to decide what approach we want to take to create and manage virtual machines from the daemon.

https://github.com/aurae-runtime/api/pull/1 calls out the following interface for managing VMs.

  rpc RegisterVirtualMachine(RegisterVirtualMachineRequest) returns (RegisterVirtualMachineResponse) {}

  rpc StartVirtualMachine(StartVirtualMachineRequest) returns (StartVirtualMachineResponse) {}

  rpc StopVirtualMachine(StopVirtualMachineRequest) returns (StopVirtualMachineResponse) {}

  rpc DestroyVirtualMachine(DestroyVirtualMachineRequest) returns (DestroyVirtualMachineResponse) {}

Which can be simplified to the following 4 pieces of functionality for virtual machines.

  • Register()
  • Start()
  • Stop()
  • Destroy()

Generate README markdown from rustdoc

We need to start taking advantage of rustdoc in the .proto files.

Ideally we can generate the documentation with a Make target or similar and have it write directly to the /stdlib directory.
