
dagster-example's Introduction

What it is

This example pipes data from MySQL to S3. MySQL and S3 are only placeholders: any source and sink can be substituted as long as they match the interface.
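To make the idea of a swappable source and sink concrete, here is a minimal sketch. The names `Source`, `Sink`, `read`, `write` and `pipe` are hypothetical and are not taken from src/main.py; the in-memory classes stand in for MySQL and S3.

```python
from typing import Iterable, List, Protocol

Row = dict  # one record, e.g. {"id": 1, "name": "a"}


class Source(Protocol):
    """Hypothetical interface: anything that can yield rows."""

    def read(self) -> Iterable[Row]: ...


class Sink(Protocol):
    """Hypothetical interface: anything that can accept rows."""

    def write(self, rows: Iterable[Row]) -> None: ...


class ListSource:
    """In-memory stand-in for a MySQL table."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self._rows = list(rows)

    def read(self) -> Iterable[Row]:
        return iter(self._rows)


class ListSink:
    """In-memory stand-in for an S3 bucket."""

    def __init__(self) -> None:
        self.rows: List[Row] = []

    def write(self, rows: Iterable[Row]) -> None:
        self.rows.extend(rows)


def pipe(source: Source, sink: Sink) -> None:
    # The pipeline depends only on the two interfaces, not on MySQL or S3.
    sink.write(source.read())
```

Any class with a matching `read` or `write` method could be passed to `pipe` without changing the pipeline itself.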

There are 2 ways to pipe a table: fully or in batches. Piping in batches is required for large tables.
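The difference between the two modes can be sketched as follows. This is an illustration under assumptions, not the repository's code: `sqlite3` stands in for MySQL so the snippet runs without a server, and the `users` table, its `id` and `name` columns, and the function names are hypothetical. The batch query pages by primary key, which is one common way to read a table in chunks.

```python
import sqlite3


def read_full(conn: sqlite3.Connection) -> list:
    # Full mode: the whole table is loaded into memory at once.
    return conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()


def read_batch(conn: sqlite3.Connection, last_id: int, batch_size: int) -> list:
    # Batch mode: only rows after the last seen id, capped at batch_size.
    return conn.execute(
        "SELECT id, name FROM users WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, batch_size),
    ).fetchall()


# Hypothetical sample table with 7 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(i, f"user{i}") for i in range(1, 8)],
)

first_batch = read_batch(conn, last_id=0, batch_size=3)
```

Passing the `id` of the last row of one batch as `last_id` for the next call walks through the table without ever holding more than `batch_size` rows.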

Piping in batches has the following problem: the first op, which outputs the batches, must finish before the next ops can start. If the table is large, all of its batches are copied out of the original table into local storage before anything is written to the sink. This is not scalable.

The solution is to extract one batch per run, then rerun the pipeline to extract the next batch, and so on.

To do so, the pipeline must:

  • remember where it stopped last time
  • use a Dagster sensor to trigger the next run
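The two requirements above can be sketched in plain Python. This is a simplified model, not the repository's implementation: the list `ROWS` stands in for the source table, a JSON file holds the position, and the `while` loop plays the role of the sensor that keeps launching runs. All names here are hypothetical. In Dagster itself, a sensor's evaluation context exposes a cursor (`context.cursor` and `context.update_cursor`) that could hold the same value.

```python
import json
import tempfile
from pathlib import Path

ROWS = [{"id": i} for i in range(1, 8)]  # stand-in for the source table
BATCH_SIZE = 3


def load_cursor(path: Path) -> int:
    # Where the previous run stopped; 0 means nothing has been piped yet.
    return json.loads(path.read_text())["last_id"] if path.exists() else 0


def save_cursor(path: Path, last_id: int) -> None:
    path.write_text(json.dumps({"last_id": last_id}))


def run_once(path: Path, sink: list) -> bool:
    """One pipeline run: pipe a single batch, then record the progress.

    Returns False when there is nothing left to pipe.
    """
    last_id = load_cursor(path)
    batch = [row for row in ROWS if row["id"] > last_id][:BATCH_SIZE]
    if not batch:
        return False
    sink.extend(batch)
    save_cursor(path, batch[-1]["id"])
    return True


state = Path(tempfile.mkdtemp()) / "cursor.json"
sink: list = []
runs = 0
while run_once(state, sink):  # the sensor's job: trigger runs until done
    runs += 1
```

Because the position is stored outside the run, each run holds only one batch locally, and a restart resumes from the saved position instead of starting over.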

Code

First you must set up the environment by running:

. ./setup.sh

The entry point is src/main.py. Read through the code first. To run it, you'll need to create a conf.yaml file with your configuration.

Then run the Dagster Daemon:

dagster-daemon run &

If you omit &, the daemon stays in the foreground. This is useful for learning Dagster.

Then run the UI:

dagit -w workspace.yaml

Browse to the UI and enable the sensor.

Contributors

danielnuriyev
