This project simulates a classic job scheduler in a distributed system. The architecture consists of one master machine and several worker machines.
The master listens for job requests, parses the tasks that make up each job, and schedules those tasks onto the workers using the chosen algorithm. The master is also responsible for resolving dependencies among the tasks in a job and for keeping track of the jobs running in parallel on the worker machines.
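Dependency resolution of this kind is typically handled with a topological sort: a task becomes schedulable only once all of its dependencies have completed. A minimal sketch using Kahn's algorithm (the task representation here is an assumption for illustration, not the project's actual data structure):

```python
from collections import deque

def ready_order(tasks):
    """Return task IDs in an order that respects dependencies.

    `tasks` maps each task ID to the set of task IDs it depends on.
    """
    indegree = {t: len(deps) for t, deps in tasks.items()}
    dependents = {t: [] for t in tasks}
    for t, deps in tasks.items():
        for d in deps:
            dependents[d].append(t)
    # Start with tasks that have no unmet dependencies.
    queue = deque(t for t, n in indegree.items() if n == 0)
    order = []
    while queue:
        t = queue.popleft()
        order.append(t)
        for nxt in dependents[t]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:   # all of nxt's dependencies are done
                queue.append(nxt)
    return order

# Example: t3 depends on both t1 and t2.
print(ready_order({"t1": set(), "t2": set(), "t3": {"t1", "t2"}}))
```

In the real master, tasks from independent jobs can of course be dispatched concurrently; the ordering constraint only applies within a job.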
Three scheduling algorithms were implemented: random, round-robin, and least-loaded (worst fit).
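As a rough sketch of what the three policies look like (the worker list and the free-slot map are assumptions for illustration, not the project's actual interfaces):

```python
import itertools
import random

def pick_random(workers, rng=random):
    # Random: choose any worker uniformly at random.
    return rng.choice(list(workers))

def make_round_robin(workers):
    # Round-robin: cycle through the workers in a fixed order.
    cycle = itertools.cycle(workers)
    return lambda: next(cycle)

def pick_least_loaded(free_slots):
    # Least-loaded (worst fit): the worker with the most free slots.
    return max(free_slots, key=free_slots.get)

rr = make_round_robin(["w1", "w2"])
print(rr(), rr(), rr())                              # w1 w2 w1
print(pick_least_loaded({"w1": 2, "w2": 5, "w3": 1}))  # w2
```

Worst fit deliberately picks the *emptiest* worker, which tends to balance load across the cluster rather than pack tasks onto a single machine.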
The worker machines listen for tasks from the master, run them, and update the master whenever a task completes.
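A minimal, thread-based sketch of the worker's run-and-report cycle; the task format and the callback-style update are simplifications for illustration, since the actual worker communicates with the master over the network:

```python
import threading
import time

def run_task(task, notify_master):
    """Simulate executing one task, then report completion to the master."""
    time.sleep(task["duration"])     # stand-in for doing the real work
    notify_master(task["task_id"])   # completion update back to the master

def worker_loop(inbox, notify_master):
    # Run each incoming task on its own thread so tasks execute in parallel.
    for task in inbox:
        threading.Thread(target=run_task, args=(task, notify_master)).start()

completed = []
worker_loop([{"task_id": "t1", "duration": 0.01}], completed.append)
time.sleep(0.1)
print(completed)  # ["t1"]
```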
- cd into the `local_files` directory and run `start-all.sh` to set up the master and the worker files.
- Run the following command: `python3 requests.py 5`. This sends five job requests to the master. To change the number of jobs, simply change the number passed as an argument to `requests.py`.
- To stop all the processes, run `end-all.sh`.
The files required for submission are present in the `src` folder:
- master.py
- worker.py
- analysis.py
- BD_0210_0416_1879_2057_report.pdf
- To change the number of workers in the cluster, modify the `start-all.sh` script.
- To change the port numbers on which the workers listen, modify `config.json` and the `start-all.sh` script.
- Each worker has its own log, named `worker<worker_id>.log`.
- The scheduling algorithm can be changed by modifying the `start-all.sh` script.
- To get more insight from the log files, run `python3 analysis.py`. To view the plots, navigate to the `Visualisation` folder.
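For illustration, the per-worker port configuration kept in `config.json` might resemble the following; the key names (`workers`, `worker_id`, `port`, `slots`) are assumptions here, not the project's actual schema:

```python
import json

# Hypothetical config.json contents: one entry per worker.
example = {
    "workers": [
        {"worker_id": 1, "port": 4000, "slots": 5},
        {"worker_id": 2, "port": 4001, "slots": 3},
    ]
}

with open("config.json", "w") as f:
    json.dump(example, f, indent=2)

# A worker (or the master) can then look up its listening port by ID.
with open("config.json") as f:
    config = json.load(f)
ports = [w["port"] for w in config["workers"]]
print(ports)  # [4000, 4001]
```

Whatever the real schema is, `start-all.sh` and `config.json` must stay in sync, which is why the note above says to modify both when changing ports.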