
maestro's Introduction

Hey there 👋

Welcome to my corner of GitHub!
I'm squerez, just a regular tech enthusiast who's been diving into this area for about 5 years now.

About me

I've always been captivated by tech and programming.
When I first joined GitHub, just building a simple Python script to print messages in text boxes was quite the challenge for me. Fast-forward to today: I've made some progress, but hey, there's always more to explore and learn.

My contributions

So far, I've dipped my toes into several projects, mainly personal, hitting these milestones:

  • I've committed 172 times to a bunch of different repos;
  • Opened 10 issues and pushed 43 pull requests;
  • My personal "pet" projects have scored around 4 stars, spread across 20 repositories;
  • And contributed to 1 public repository.

Languages I use

My playground involves various languages, but these are my go-tos:

Python HTML Scala PowerShell Java Shell HCL Other

Tech I'm learning

I'm on a quest to level up my skills in:

  • Rust;
  • Go;
  • Ansible;
  • Flux.

When I'm not here, you might find me wandering in a forest of repositories, where I've gotten so lost, I've set up campsites in unfinished projects, in each a sign saying, ⚠️ under construction (forever).

You know, just your typical getaway from the world of 'done'.

Anyways, if you're still curious, check out my commit snake animation right here in my profile.
It's a cool visual of my coding journey, constantly changing with each contribution.

github contribution grid snake animation

My creative collection

Behold, my digital creations:

  • maestro - a Python-based, metadata-driven data engineering framework. Currently in a testing phase, evolving as I learn more in the DE field;
  • shamir - a simple OTP API designed for team use, born from my curious mind exploring API design concepts;
  • rustsnake - a customizable snake game crafted in Rust, where I've been playing with both the language and game development;
  • init.lua - my tangled web of Neovim configurations, ever-evolving to suit my whims as a lazy dev.

Feel free to explore and see what I've been up to.


Thanks for stopping by and checking out my arts & crafts. If you've got questions, ideas, or want to collaborate, drop a line via an issue on this repo.

Until next time! 👋


maestro's Issues

Refactor - CLI

The CLI, the main user interface, should allow a user to:

  • Start and stop tasks
  • Get the status of tasks
  • See the state of machines (i.e. the workers)
  • Start the manager
  • Start the worker
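
A rough sketch of what this command surface could look like with Python's argparse (command and flag names here are placeholders, not the final interface):

```python
# cli.py - hypothetical sketch of the maestro CLI surface
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="maestro")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run", help="start a task")
    run.add_argument("task_spec", help="path to the task definition")

    stop = sub.add_parser("stop", help="stop a running task")
    stop.add_argument("task_id")

    status = sub.add_parser("status", help="get the status of tasks")
    status.add_argument("task_id", nargs="?", help="omit to list all tasks")

    sub.add_parser("nodes", help="show the state of worker machines")

    manager = sub.add_parser("manager", help="start the manager")
    manager.add_argument("--workers", nargs="+", default=[], help="worker addresses")

    worker = sub.add_parser("worker", help="start the worker")
    worker.add_argument("--port", type=int, default=5556)

    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)  # dispatching to the manager/worker code would happen here
```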


Refactor - job

The job is an aggregation of tasks: it has one or more tasks that typically form a larger logical grouping to perform a set of functions. In Kubernetes, a job is its own resource type (Job); something similar may be implemented here in the future.

A job should specify, at a high level, details that apply to all the tasks it defines:

  • Each task that makes up the job
    
  • How many instances of each task should run
    
  • The type of the job (should it be running continuously or will it run to completion and stop?)
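
A possible shape for this as a Python dataclass, assuming a Task class like the one sketched in the task issue below (field names and the JobType split are assumptions, not the final design):

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class JobType(Enum):
    SERVICE = "service"  # should run continuously
    BATCH = "batch"      # runs to completion and stops


@dataclass
class Job:
    name: str
    job_type: JobType
    task_count: int = 1                                 # how many instances of each task should run
    tasks: List["Task"] = field(default_factory=list)   # each task that makes up the job
```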
    

Refactor - task

Implement a new class called Task - a task is the smallest unit of work in an orchestration system.

A task should specify the following:

  • The amount of memory, CPU, and disk it needs to run effectively (need to refine this);
    
  • What the orchestrator should do in case of failures, typically called a restart policy;
    
  • The name of the container image used to run the task (need to refine this)
    

Task definitions may specify additional details, but these are the core requirements.
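
A rough sketch of those core requirements as a Python dataclass (field names, units, and defaults are assumptions that still need refining, as noted above):

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    image: str                 # container image used to run the task (to refine)
    memory: int = 256          # MiB the task needs to run effectively (to refine)
    cpu: float = 0.5           # CPU cores
    disk: int = 1              # GiB
    restart_policy: str = ""   # what the orchestrator should do on failure:
                               # "", "always", "unless-stopped", or "on-failure"
```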


The first thing we want to think about is the states a task will go through during its life.

First, a user submits a task to the system. At this point, the task has been enqueued but is waiting to be scheduled. Let’s call this initial state Pending.

Once the system has figured out where to run the task, we can say it has been moved into a state of Scheduled. The scheduled state means the system has determined there is a machine that can run the task, but it is in the process of sending the task to the selected machine or the selected machine is in the process of starting the task.

Next, if the selected machine successfully starts the task, it moves into the Running state.

Upon a task completing its work successfully, or being stopped by a user, the task moves into a state of Completed.

If at any point the task crashes or stops working as expected, the task then moves into a state of Failed.
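
As a sketch, these states could be modeled as an enum together with a map of the legal transitions implied by the lifecycle above (the exact structure is an assumption):

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"        # submitted and enqueued, waiting to be scheduled
    SCHEDULED = "scheduled"    # a machine was selected; task is being sent or started
    RUNNING = "running"        # the selected machine started the task successfully
    COMPLETED = "completed"    # finished its work or was stopped by a user
    FAILED = "failed"          # crashed or stopped working as expected


# legal transitions implied by the lifecycle described above
VALID_TRANSITIONS = {
    State.PENDING: [State.SCHEDULED],
    State.SCHEDULED: [State.RUNNING, State.FAILED],
    State.RUNNING: [State.COMPLETED, State.FAILED],
    State.COMPLETED: [],
    State.FAILED: [],
}


def can_transition(src: State, dst: State) -> bool:
    return dst in VALID_TRANSITIONS[src]
```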


In order to run our tasks as containers, they need a configuration. For a task in our orchestration system, we’ll describe its configuration using the Config class. This class encapsulates all the necessary bits of information about a task’s configuration:

  • The Name field will be used to identify a task in our orchestration system, and it will perform double duty as the name of the running container.
  • The Image field, as you probably guessed, holds the name of the image the container will run. Remember, an image can be thought of as a package: it contains the collection of files and instructions necessary to run a program.
  • The Memory and Disk fields will serve two purposes. The scheduler will use them to find a node in the cluster capable of running a task. They will also be used to tell the Docker daemon the amount of resources a task requires.
  • The Env field allows a user to specify environment variables that will get passed into the container.
  • Finally, the RestartPolicy field tells the Docker daemon what to do in the event a container dies unexpectedly. This field is one of the mechanisms that provides resilience in our orchestration system. The acceptable values are an empty string, always, unless-stopped, or on-failure. Setting this field to always will, as its name implies, restart a container if it stops. Setting it to unless-stopped will restart a container unless it has been stopped (e.g. by docker stop). Setting it to on-failure will restart the container if it exits due to an error (i.e. a non-zero exit code).
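
Putting those fields together, a sketch of the Config class as a Python dataclass could look like this (types and defaults are assumptions):

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Config:
    name: str                  # identifies the task; doubles as the running container's name
    image: str                 # image the container will run
    memory: int = 256          # MiB; used by the scheduler and passed to the Docker daemon
    disk: int = 1              # GiB; same dual purpose as memory
    env: Dict[str, str] = field(default_factory=dict)  # environment variables for the container
    restart_policy: str = ""   # "", "always", "unless-stopped", or "on-failure"
```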

Refactor - worker

The worker provides the muscles of an orchestrator.

It is responsible for running the tasks assigned to it by the manager. If a task fails for any reason, it must attempt to restart the task. The worker also makes metrics about its tasks and its overall machine health available for the manager to poll.

The worker is responsible for the following:

  • Running tasks as Docker containers.
    
  • Accepting tasks to run from a manager.
    
  • Providing relevant statistics to the manager for the purpose of scheduling tasks.
    
  • Keeping track of its tasks and their state.
    


Like the manager, it too has an API, though it serves a different purpose. The primary user of this API is the manager. The API provides the means for the manager to send tasks to the worker, to tell the worker to stop tasks, and to retrieve metrics about the worker’s state. Next, the worker has a task runtime, which in our case will be Docker. Like the manager, the worker also keeps track of the work it is responsible for, which is done in the Task Storage layer. Finally, the worker provides metrics about its own state, which it makes available via its API.
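
A skeletal sketch of the worker's shape, matching the components above (method names and internals are assumptions, not the existing code):

```python
from queue import Queue
from typing import Dict


class Worker:
    """Runs tasks as Docker containers and exposes state/metrics for the manager to poll."""

    def __init__(self, name: str):
        self.name = name
        self.queue: Queue = Queue()              # tasks accepted from the manager, in order
        self.db: Dict[str, "Task"] = {}          # task storage layer: task id -> Task
        self.task_count = 0

    def add_task(self, task: "Task") -> None:
        """Accept a task to run from the manager (called by the worker API)."""
        self.queue.put(task)

    def run_task(self) -> None:
        """Pop the next task off the queue and start or stop it via the Docker runtime."""
        raise NotImplementedError

    def stop_task(self, task_id: str) -> None:
        raise NotImplementedError

    def collect_stats(self) -> dict:
        """Metrics about tasks and overall machine health, made available via the API."""
        return {"name": self.name, "task_count": self.task_count}
```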

Refactor - manager

The manager is the brain of an orchestrator and the main entry point for users.

In order to run jobs in the orchestration system, users submit their jobs to the manager. The manager, using the scheduler, then finds a machine where the job’s tasks can run. The manager also periodically collects metrics from each of its workers, which are used in the scheduling process.

The manager should do the following:

  • Accept requests from users to start and stop tasks.
    
  • Schedule tasks onto worker machines.
    
  • Keep track of tasks, their states, and the machine on which they run.
    


We will also need to implement the API, which is the primary mechanism for interacting with maestro.
Users submit jobs and request jobs be stopped via the API. A user can also query the API to get information about job and worker status.

We will also need to implement some kind of storage. The manager must keep track of all the jobs in the system in order to make good scheduling decisions, as well as to provide answers to user queries about job and worker statuses. The manager also needs to keep track of worker metrics, such as the number of jobs a worker is currently running, how much memory it has available, how much load the CPU is under, and how much disk space is free. This data, like the data in the job storage layer, is used for scheduling.
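
A skeletal sketch of the manager's shape, covering the API, storage, and metrics responsibilities above (names and internals are assumptions):

```python
from typing import Dict, List


class Manager:
    """Accepts jobs from users, schedules tasks onto workers, and tracks their state."""

    def __init__(self, workers: List[str]):
        self.workers = workers                   # addresses of known workers
        self.pending: List["Task"] = []          # tasks accepted via the API, not yet scheduled
        self.task_db: Dict[str, "Task"] = {}     # job/task storage layer
        self.worker_stats: Dict[str, dict] = {}  # metrics polled from each worker

    def submit(self, task: "Task") -> None:
        """API entry point: accept a user request to start a task."""
        self.pending.append(task)

    def select_worker(self, task: "Task") -> str:
        """Ask the scheduler for the best worker for this task."""
        raise NotImplementedError

    def update_worker_stats(self) -> None:
        """Periodically poll each worker's metrics (running jobs, memory, CPU load, disk)."""
        raise NotImplementedError
```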

Refactor - scheduler

The scheduler decides what machine can best host the tasks defined in the job.
We will need to define the decision-making process:

  • The decision-making process can be as simple as selecting a node from a set of machines in a round-robin fashion, or as complex as the EPVM scheduler (used as part of Google’s Borg scheduler), which calculates a score based on a number of variables and then selects a node with the "best" score.

The scheduler should perform these functions:

  • Determine a set of candidate machines on which a task could run.
    
  • Score the candidate machines from best to worst.
    
  • Pick the machine with the best score.
    

A scheduler contains three main phases that represent the order in which it moves through the process of scheduling tasks onto workers: feasibility, scoring, and picking.

  • **Feasibility**: This phase assesses whether it’s even possible to schedule a task onto a worker. There will be cases where a task cannot be scheduled onto any worker; there will also be cases where a task can be scheduled, but only onto a subset of workers. We can think of this phase as similar to choosing which car to buy. My budget is $10,000, but depending on which car lot I go to, all the cars on the lot could cost more than $10,000, or there may only be a subset of cars that fit into my price range.
    
  • **Scoring**: This phase takes the workers identified by the feasibility phase and gives each one a score. This stage is the most important and can be accomplished any number of ways. For example, to continue our car purchase analogy, I might give a score to each of three cars that fit within my budget based on variables like fuel efficiency, color, and safety rating.
    
  • **Picking**: This phase is the simplest. From the list of scores, the scheduler picks the best one. This will be either the highest or lowest score.
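
Put together, a minimal round-robin scheduler is one possible implementation of the three phases; the interface below is an assumption, reusing the Worker and Task sketches from the issues above:

```python
from typing import Dict, List, Optional


class RoundRobinScheduler:
    """Simplest possible scheduler: feasibility, scoring, and picking via round-robin."""

    def __init__(self):
        self.last_worker = -1

    def select_candidate_nodes(self, task: "Task", workers: List["Worker"]) -> List["Worker"]:
        # Feasibility: keep only workers that could run the task at all.
        # A real check would compare the task's memory/disk needs to each worker's free resources.
        return [w for w in workers]

    def score(self, task: "Task", candidates: List["Worker"]) -> Dict[str, float]:
        # Scoring: the next worker in the rotation gets the best (lowest) score.
        if not candidates:
            return {}
        self.last_worker = (self.last_worker + 1) % len(candidates)
        return {
            w.name: (0.1 if i == self.last_worker else 1.0)
            for i, w in enumerate(candidates)
        }

    def pick(self, scores: Dict[str, float], candidates: List["Worker"]) -> Optional["Worker"]:
        # Picking: choose the candidate with the lowest score, if any candidates exist.
        return min(candidates, key=lambda w: scores[w.name]) if candidates else None
```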
    

