🧙 A modern replacement for Airflow.

Documentation   🌪️   Get a 5 min overview   🌊   Play with live tool   🔥   Get instant help

Give your data team magical powers

Integrate and synchronize data from 3rd party sources

Build real-time and batch pipelines to transform data using Python, SQL, and R

Run, monitor, and orchestrate thousands of pipelines without losing sleep


1️⃣ 🏗️

Build

Have you met anyone who said they loved developing in Airflow?
That’s why we designed an easy developer experience that you’ll enjoy.

Easy developer experience
Start developing locally with a single command or launch a dev environment in your cloud using Terraform.

Language of choice
Write code in Python, SQL, or R in the same data pipeline for ultimate flexibility.

Engineering best practices built-in
Each step in your pipeline is a standalone file containing modular code that’s reusable and testable with data validations. No more DAGs with spaghetti code.

↓

2️⃣ 🔮

Preview

Stop wasting time waiting around for your DAGs to finish testing.
Get instant feedback from your code each time you run it.

Interactive code
Immediately see results from your code’s output with an interactive notebook UI.

Data is a first-class citizen
Each block of code in your pipeline produces data that can be versioned, partitioned, and cataloged for future use.

Collaborate on cloud
Develop collaboratively on cloud resources, version control with Git, and test pipelines without waiting for an available shared staging environment.

↓

3️⃣ 🚀

Launch

Don’t have a large team dedicated to Airflow?
Mage makes it easy for a single developer or small team to scale up and manage thousands of pipelines.

Fast deploy
Deploy Mage to AWS, GCP, or Azure with only 2 commands using maintained Terraform templates.

Scaling made simple
Transform very large datasets directly in your data warehouse or through a native integration with Spark.

Observability
Operationalize your pipelines with built-in monitoring, alerting, and observability through an intuitive UI.

🧙 Intro

Mage is an open-source data pipeline tool for transforming and integrating data.

  1. Install
  2. Demo
  3. Tutorials
  4. Documentation
  5. Features
  6. Core design principles
  7. Core abstractions
  8. Contributing

🏃‍♀️ Install

The recommended way to install the latest version of Mage is through Docker with the following command:

docker pull mageai/mageai:latest

You can also install Mage using pip or conda, though this may cause dependency issues without the proper environment.

pip install mage-ai
conda install -c conda-forge mage-ai

Looking for help? The fastest way to get started is by checking out our documentation here.

Looking for quick examples? Open a demo project right in your browser or check out our guides.

🎮 Demo

Live demo

Build and run a data pipeline with our demo app.

WARNING

The live demo is public to everyone, so please don’t save anything sensitive (e.g. passwords or secrets).

Demo video (5 min)

Mage quick start demo

Click the image to play video


👩‍🏫 Tutorials

Fire mage


🔮 Features

🎶 Orchestration: Schedule and manage data pipelines with observability.
📓 Notebook: Interactive Python, SQL, & R editor for coding data pipelines.
🏗️ Data integrations: Synchronize data from 3rd party sources to your internal destinations.
🚰 Streaming pipelines: Ingest and transform real-time data.
❎ dbt: Build, run, and manage your dbt models with Mage.

A sample data pipeline defined across 3 files ➝

  1. Load data ➝
    import pandas as pd

    # Read the source CSV into a DataFrame.
    @data_loader
    def load_csv_from_file():
        return pd.read_csv('default_repo/titanic.csv')
  2. Transform data ➝
    # Keep only the columns needed downstream.
    @transformer
    def select_columns_from_df(df, *args):
        return df[['Age', 'Fare', 'Survived']]
  3. Export data ➝
    # Write the transformed DataFrame back to disk.
    @data_exporter
    def export_titanic_data_to_disk(df) -> None:
        df.to_csv('default_repo/titanic_transformed.csv')
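
The three blocks above rely on Mage's decorators and runtime. As a rough illustration of the same load, transform, export flow outside of Mage, the sketch below chains three plain Python functions using only the standard library. The function names, the in-memory sample rows, and the `check_columns` validation are hypothetical and are not part of Mage's API.

```python
import csv
import io

# Hypothetical sample standing in for default_repo/titanic.csv.
SAMPLE_CSV = """PassengerId,Age,Fare,Survived,Name
1,22,7.25,0,Braund
2,38,71.28,1,Cumings
"""

KEEP = ["Age", "Fare", "Survived"]


def load_rows(text):
    """Load step: parse CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def select_columns(rows, columns=KEEP):
    """Transform step: keep only the requested columns."""
    return [{name: row[name] for name in columns} for row in rows]


def check_columns(rows, columns=KEEP):
    """Validation step: every row must have exactly the expected columns."""
    return all(list(row) == list(columns) for row in rows)


def export_rows(rows, columns=KEEP):
    """Export step: serialize rows back to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def run_pipeline(text):
    """Run load, transform, validate, and export in order."""
    rows = select_columns(load_rows(text))
    if not check_columns(rows):
        raise ValueError("unexpected columns after transform")
    return export_rows(rows)


if __name__ == "__main__":
    print(run_pipeline(SAMPLE_CSV))
```

Because each step is a small function with explicit inputs and outputs, it can be tested on its own, which is the property the one-file-per-block layout is aiming for.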

What the data pipeline looks like in the UI ➝

data pipeline overview

New? We recommend reading about blocks and learning from a hands-on tutorial.

Ask us questions on Slack


🏔️ Core design principles

Every user experience and technical design decision adheres to these principles.

💻 Easy developer experience: Open-source engine that comes with a custom notebook UI for building data pipelines.
🚢 Engineering best practices built-in: Build and deploy data pipelines using modular code. No more writing throwaway code or trying to turn notebooks into scripts.
💳 Data is a first-class citizen: Designed from the ground up specifically for running data-intensive workflows.
🪐 Scaling is made simple: Analyze and process large data quickly for rapid iteration.

Core abstractions

These are the fundamental concepts that Mage uses to operate.

Project: Like a repository on GitHub; this is where you write all your code.
Pipeline: Contains references to all the blocks of code you want to run, charts for visualizing data, and organizes the dependency between each block of code.
Block: A file with code that can be executed independently or within a pipeline.
Data product: Every block produces data after it's been executed. These are called data products in Mage.
Trigger: A set of instructions that determine when or how a pipeline should run.
Run: Stores information about when it was started, its status, when it was completed, any runtime variables used in the execution of the pipeline or block, etc.
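
To make the relationship between these abstractions more concrete, here is a minimal sketch of blocks, a pipeline that orders them by dependency, and the data product each block yields. The `Block` and `Pipeline` classes below are hypothetical stand-ins written for this illustration; they are not Mage's internal classes or API. The sketch assumes Python 3.9+ for `graphlib`.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Callable, Dict, List


@dataclass
class Block:
    """A unit of code that can run on its own or inside a pipeline."""
    name: str
    func: Callable[..., object]
    upstream: List[str] = field(default_factory=list)


@dataclass
class Pipeline:
    """Holds references to blocks and the dependencies between them."""
    name: str
    blocks: Dict[str, Block] = field(default_factory=dict)

    def add(self, block: Block) -> None:
        self.blocks[block.name] = block

    def run(self) -> Dict[str, object]:
        """Execute blocks in dependency order; return each block's data product."""
        # Map each block to the set of blocks it depends on.
        graph = {b.name: set(b.upstream) for b in self.blocks.values()}
        products: Dict[str, object] = {}
        for name in TopologicalSorter(graph).static_order():
            block = self.blocks[name]
            # Feed the upstream data products into the block as arguments.
            inputs = [products[up] for up in block.upstream]
            products[name] = block.func(*inputs)
        return products


pipeline = Pipeline("example")
pipeline.add(Block("load", lambda: [1, 2, 3]))
pipeline.add(Block("double", lambda xs: [x * 2 for x in xs], upstream=["load"]))
pipeline.add(Block("total", lambda xs: sum(xs), upstream=["double"]))

if __name__ == "__main__":
    print(pipeline.run())
```

In this sketch, calling `pipeline.run()` plays the role of a single run, and the returned dictionary maps each block name to its data product. A trigger would be whatever decides when `run()` gets called, for example a scheduler.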

🙋‍♀️ Contributing and developing

Add features and instantly improve the experience for everyone.

Check out the contributing guide to set up your development environment and start building.


👨‍👩‍👧‍👦 Community

Individually, we’re a mage.

🧙 Mage

Magic is indistinguishable from advanced technology. A mage is someone who uses magic (aka advanced technology). Together, we’re Magers!

🧙‍♂️🧙 Magers (/ˈmājər/)

A group of mages who help each other realize their full potential! Let’s hang out and chat together ➝

Hang out on Slack

For real-time news, fun memes, data engineering topics, and more, join us on ➝

Twitter
LinkedIn
GitHub
Slack

🤔 Frequently Asked Questions (FAQs)

Check out our FAQ page to find answers to some of our most asked questions.


🪪 License

See the LICENSE file for licensing information.

Water mage casting spell


Mage's Projects

assets

Media assets used in repository documentation.

dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

dbt-mysql

dbt-mysql contains all of the code enabling dbt to work with MySQL and MariaDB

delta-rs

A native Rust library for Delta Lake, with bindings into Python and Ruby.

docker

Dockerfile and Docker compose templates

etl-demo

Mage ELT demo for pulling data from an API, performing transformations, and writing to a local DuckDB database.

machine_learning

The definitive end-to-end machine learning (ML lifecycle) guide and tutorial for data engineers.

mage-ai

🧙 Build, run, and manage data pipelines for integrating and transforming data.

mage-ai-cdk

AWS Cloud Development Kit (CDK) scripts for deploying Mage to AWS

mage-zoomcamp

This repository will contain all of the resources for the Mage component of the Data Engineering Zoomcamp: https://github.com/DataTalksClub/data-engineering-zoomcamp/tree/main

mage_demo_project

Demo project containing data integration pipelines and batch transformation pipelines.

magic-devcontainer

A demo instance of mage for pulling sample data from a public Google pub/sub topic and transforming with dbt.

platform_template

Mage project platform template for using multiple projects and other non-Mage projects in 1 Mage ultra project.

reaflow

🎯 React library for building workflow editors, flow charts and diagrams. Maintained by @goodcodeus.
