This project forked from netflix/genie

Distributed Big Data Orchestration Service

Home Page: https://netflix.github.io/genie

License: Apache License 2.0

Shell 0.35% JavaScript 1.30% Python 0.35% Java 71.90% Groovy 23.22% CSS 1.98% HTML 0.03% PLpgSQL 0.80% Dockerfile 0.05% Vim Snippet 0.01%

Genie

Introduction

Genie is a federated Big Data orchestration and execution engine developed by Netflix.

Genie’s value is best described in terms of the problem it solves.

Big Data infrastructure is complex and ever-evolving.

Data consumers (Data Scientists or other applications) need to jump over a lot of hurdles in order to run a simple query:

  • Find, download, install and configure a number of binaries, libraries and tools
  • Point to the correct cluster, using valid configuration and reasonable parameters, some of which are very obscure
  • Manually monitor the query and retrieve its output

What works today may not work tomorrow. The cluster may have moved, the binaries may no longer be compatible, etc.

Multiply this overhead by the number of data consumers, and it adds up to a lot of wasted time (and grief!).

Data infrastructure providers face a different set of problems:

  • Users require a lot of help configuring their working setup, which is not easy to debug remotely
  • Infrastructure upgrades and expansion require careful coordination with all users

Genie is designed to sit at the boundary of these two worlds, and simplify the lives of people on either side.

A data scientist can “rub the magic lamp” and just say “Genie, run query ‘Q’ using engine SparkSQL against production data”. Genie takes care of all the nitty-gritty details. It dynamically assembles the necessary binaries and configurations, executes the job, monitors it, notifies the user of its completion, and makes the output data available for immediate and future use.

Providers of Big Data infrastructure work with Genie by making resources available for use (clusters, binaries, etc.) and plugging in the magic logic that the user doesn’t need to worry about: Which cluster should a given query be routed to? Which version of Spark should a given query be executed with? Is this user allowed to access this data? Moreover, every job’s details are recorded for later audit or debugging.
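The routing question above ("which cluster should a given query be routed to?") can be illustrated with a small sketch. This is not Genie's implementation or API: the `Cluster`, `JobRequest` and `select_cluster` names, the tag strings, and the "first cluster whose tags cover the request" rule are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Cluster:
    """Hypothetical registered cluster, described by free-form tags."""
    name: str
    tags: FrozenSet[str]


@dataclass(frozen=True)
class JobRequest:
    """Hypothetical job request: who runs what, and on what kind of cluster."""
    user: str
    query: str
    required_tags: FrozenSet[str]


def select_cluster(request: JobRequest, clusters: List[Cluster]) -> Optional[Cluster]:
    """Return the first cluster whose tags include every required tag, else None."""
    for cluster in clusters:
        if request.required_tags <= cluster.tags:  # subset check
            return cluster
    return None


# Hypothetical inventory maintained by the infrastructure provider.
clusters = [
    Cluster("spark-test", frozenset({"engine:sparksql", "env:test"})),
    Cluster("spark-prod", frozenset({"engine:sparksql", "env:prod"})),
]

# The data consumer only states intent; they never name a cluster.
request = JobRequest(
    user="data-scientist",
    query="SELECT 1",
    required_tags=frozenset({"engine:sparksql", "env:prod"}),
)

chosen = select_cluster(request, clusters)
print(chosen.name if chosen else "no matching cluster")
```

In this sketch, moving production to a new cluster only requires the provider to update the inventory. The consumer's request stays the same, which is the decoupling the paragraph above describes.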

Genie is designed from the ground up to be very flexible and customizable. For more details, visit the official documentation.

Builds

Genie builds are run on Travis CI.

Branch          Build         Coverage (coveralls.io)
master (4.2.x)  Build Status  Coverage Status
4.1.x           Build Status  Coverage Status
4.0.x           Build Status  Coverage Status

Project structure

genie-app

Self-contained Genie service server.

genie-agent-app

Self-contained Genie CLI job executor.

genie-client

The Genie client interacts with the service via the REST API.
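As a rough idea of what a client sends to the service, the sketch below only builds and prints a JSON job request; it does not contact a server. The field names (`commandArgs`, `clusterCriterias`, `commandCriteria`) are modelled on the Genie v3 job request as an assumption and should be checked against the API documentation. The `build_job_request` helper and the tag values are hypothetical.

```python
import json
from typing import Dict, Iterable


def build_job_request(
    name: str,
    user: str,
    command_args: str,
    cluster_tags: Iterable[str],
    command_tags: Iterable[str],
    version: str = "1.0",
) -> Dict[str, object]:
    """Assemble a job request body (field names are assumptions, see lead-in)."""
    return {
        "name": name,
        "user": user,
        "version": version,
        # Arguments handed to whichever command the service resolves.
        "commandArgs": command_args,
        # Tags describing an acceptable cluster, not a specific cluster name.
        "clusterCriterias": [{"tags": sorted(cluster_tags)}],
        # Tags describing the command (engine) to run.
        "commandCriteria": sorted(command_tags),
    }


payload = build_job_request(
    name="example-query",
    user="data-scientist",
    command_args="-e 'SELECT 1'",
    cluster_tags={"sched:adhoc", "type:yarn"},   # hypothetical tags
    command_tags={"type:sparksql"},              # hypothetical tag
)

# A real client would send this body to the service's jobs endpoint.
print(json.dumps(payload, indent=2))
```

The request names criteria, not a concrete cluster or binary, so the service stays free to decide where and how the job runs.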

genie-web

The main server library. It can be re-wrapped to inject and override server components.

genie-agent

The main agent library. It can be re-wrapped to inject and override components.

genie-common, genie-common-internal, genie-common-external

Internal component libraries shared by the server, agent, and client modules.

genie-proto

Protobuf message and gRPC service definitions shared by the server and agent. This is not a public API meant for use by other clients.

genie-docs, genie-demo

Documentation and demo application.

genie-test, genie-test-web

Testing classes and utilities shared by other modules.

genie-ui

JavaScript UI to search and visualize jobs, clusters, commands.

genie-swagger

Auto-configuration of Swagger via Springfox. Add it to the final deployment artifact of the server to enable it.

Artifacts

Genie publishes to Maven Central and Docker Hub.

Refer to the demo section of the documentation for examples, and to the setup section for more detailed instructions on setting up Genie.

Python Client

The Genie Python client is hosted in a different repository.

Further info

For a detailed explanation of Genie architecture, use cases, API documentation, demos, deployment and customization guides, and more, visit the Genie documentation.

Contact

To contact Genie developers with questions and suggestions, please use GitHub Issues.

Contributors

tgianos, mprimi, amitsharmaak, irontablee, sriramkrishnan, cabhishek, ajoymajumdar, rpalcolea, stephen-mw, sumitnetflix, bhou2, enicloom, rspieldenner, chali, xiao-chen, skwslide, jkschneider, dtrebbien, nvhoang, kruegerb-rv, charsmith, sghill, piaozhexiu, jmnarloch, rgbkrk, mikegrima, natadzen, rmeshenberg, z1000
