singer-runner's Introduction

singer-runner

Singer Runner manages tap and target processes, as well as metrics, state, and configuration.

Features

  • Run a tap or target.
  • Pass run options via CLI parameters or a JSON/YAML config file.
  • Use the local file system or S3 for piping the Singer stream, storing state, and storing metrics.
  • Extend or customize metric storage, piping, and state storage by inheriting from the base classes.

Usage

Requires Python 3; tested with Python 3.7.

Install

pip install singer-runner

Run

$ singer-runner 
Usage: singer-runner [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  run-tap
  run-target

Concepts

  • Pipes
    • Pipes move a stream of Singer messages from tap to target. A pipe could be as simple as a local file, a file in S3, or Kafka.
  • State Storage
  • Metrics Storage
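
To make the pipe concept concrete, here is a minimal sketch of a file-backed pipe that carries Singer messages from a writer to a reader. The class `SketchFilePipe`, its `write`/`read` methods, and the helper `demo_round_trip` are hypothetical names chosen for illustration; they are not singer-runner's actual classes or method signatures.

```python
import json
import os
import tempfile
from typing import Iterable, Iterator, List


class SketchFilePipe:
    """Hypothetical file-backed pipe; not singer-runner's actual class."""

    def __init__(self, path: str) -> None:
        self.path = path

    def write(self, messages: Iterable[dict]) -> None:
        # Tap side: append one JSON-encoded Singer message per line.
        with open(self.path, "a", encoding="utf-8") as f:
            for message in messages:
                f.write(json.dumps(message) + "\n")

    def read(self) -> Iterator[dict]:
        # Target side: yield messages back in the order they were written.
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    yield json.loads(line)


def demo_round_trip() -> List[dict]:
    # Illustrative messages only; the stream and fields are made up.
    messages = [
        {"type": "SCHEMA", "stream": "users", "key_properties": ["id"],
         "schema": {"properties": {"id": {"type": "integer"}}}},
        {"type": "RECORD", "stream": "users", "record": {"id": 1}},
        {"type": "STATE", "value": {"users": 1}},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        pipe = SketchFilePipe(os.path.join(tmp, "messages.txt"))
        pipe.write(messages)
        return list(pipe.read())
```

An S3- or Kafka-backed pipe would presumably expose the same two operations over a different transport, which is what makes the storage swappable.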

Programmatic Usage

Singer Runner can be used within any Python application. The primary functions are in singer_runner.runner, including:

  • run_tap runs a tap
  • run_target runs a target

Classes in singer_runner.metrics, singer_runner.pipes, and singer_runner.state can be passed as arguments, along with catalog/config.

singer-runner's People

Contributors

awm33, robmoore

singer-runner's Issues

Additional modules required

I followed the instructions in the readme to install from this repo but had to install these modules as well:

  • click
  • dataclasses

After doing so I was able to run singer-runner successfully.

CLI: Exit with error code on failure in tap or target

Currently the exit code is always 0, so scripts that rely on it to decide whether to stop or continue never stop on failure.

This appears to be specific to the CLI. The main functions do return success / failure. The desired behavior is either to exit 1 explicitly or to raise an exception, which should cause an exit 1.
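
A minimal sketch of the explicit-exit option, assuming the runner function reports success as a truthy value and failure as a falsy one. `fake_run_tap`, `exit_code_for`, and `cli_entry` are hypothetical stand-ins for illustration, not singer-runner functions.

```python
import sys


def fake_run_tap(should_succeed: bool) -> bool:
    # Hypothetical stand-in for a runner function that reports success or failure.
    return should_succeed


def exit_code_for(succeeded: bool) -> int:
    # Map the runner's result to a conventional process exit code.
    return 0 if succeeded else 1


def cli_entry(should_succeed: bool) -> None:
    # Hypothetical CLI wrapper: sys.exit raises SystemExit with the given code,
    # so a calling shell script sees a non-zero status on failure.
    sys.exit(exit_code_for(fake_run_tap(should_succeed)))
```

With this shape, a shell script using `set -e` or checking `$?` would stop when the tap or target fails.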

Webhook Server

To support singer-io/getting-started#50, a web server must accept HTTP requests and pass them on as WEBHOOK messages.

Proposal for how that could work:

  • A singer-runner webhook-server command would start an HTTP server.
  • POST /webhooks/:pipe-name/:token
    • pipe-name maps to a specific pipe instance, e.g. mailchimp. When a request succeeds, a WEBHOOK message is placed on that pipe.
    • token maps to a single token passed in the config or environment. Perhaps a function or class could be passed in to allow for more complex auth.
  • run_tap would need to be changed to allow for input pipes.
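
The routing and token check in this proposal could be sketched roughly as below. Everything here is an assumption made for illustration: the function `handle_webhook`, the use of in-memory lists as pipes, the status codes, and in particular the shape of the WEBHOOK message, which this proposal does not define.

```python
import hmac
import json
from typing import Dict, List, Tuple


def handle_webhook(path: str, body: bytes, pipes: Dict[str, List[str]],
                   expected_token: str) -> Tuple[int, str]:
    """Route POST /webhooks/<pipe-name>/<token>; return (status, reason)."""
    parts = path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "webhooks":
        return 404, "unknown route"
    _, pipe_name, token = parts
    # Constant-time comparison avoids leaking the token through timing.
    if not hmac.compare_digest(token.encode(), expected_token.encode()):
        return 403, "bad token"
    if pipe_name not in pipes:
        return 404, "unknown pipe"
    try:
        payload = json.loads(body)
    except ValueError:
        return 400, "body is not valid JSON"
    # Assumed message shape; the real WEBHOOK format is not specified here.
    pipes[pipe_name].append(json.dumps({"type": "WEBHOOK", "body": payload}))
    return 200, "ok"
```

An HTTP server's POST handler would call this function with the request path and body and translate the returned status into a response. Replacing the `expected_token` argument with a callable would be one way to allow the more complex auth mentioned above.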

Unwanted datatype conversion

I am using singer-runner to run a tap and target combination programmatically. However, when it is run that way, a type conversion happens somewhere along the way.

tap_command = './tap-quickbooks-reports/tap_quickbooks_reports/__init__.py'
p = StdInOutPipe()

run_tap(
    logger=LOGGER,
    tap_command=tap_command,
    tap_config=tap_config,
    pipe=p,
)

I've tried to use the StdInOutPipe as well as the FilePipe.

When I run my tap from the CLI, it works as desired and writes messages with the expected types. When I run it through the code above, a float value is written as a string (with both pipes). If I modify my Singer schema it works, but I need a float value in the destination. I have not yet been able to figure out where or why the conversion happens. I tested my code again, and the problem seems to be introduced in the singer-runner code (though I might be using it wrong).

The output of the resulting field looks like this:
"subt_nat_amount": "38.72"

Any idea why this happens?
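
One way to narrow this down, offered as a debugging sketch and not as a diagnosis: capture the Singer messages from the CLI run and from the programmatic run, then list the RECORD fields whose values are strings that parse as numbers. The helper `find_numeric_strings` is hypothetical, and the `reports` stream name in the usage note is made up.

```python
import json
from typing import Iterable, List, Tuple


def find_numeric_strings(lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return (stream, field, value) for string fields that parse as floats."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        # Only RECORD messages carry field values; skip SCHEMA and STATE.
        if message.get("type") != "RECORD":
            continue
        for field, value in message.get("record", {}).items():
            if isinstance(value, str):
                try:
                    float(value)
                except ValueError:
                    continue
                hits.append((message.get("stream", ""), field, value))
    return hits
```

Running it over both captures shows whether the string is already present in the tap's output or only appears after the messages pass through the pipe. For a line such as `{"type": "RECORD", "stream": "reports", "record": {"subt_nat_amount": "38.72"}}`, the helper reports the `subt_nat_amount` field.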

File path timestamps / templating

For parts of the config / filepaths, allow timestamps and run IDs to be passed through.

E.g.

   - s3://my-data-bucket/singer/messages-{timestamp}.txt
   - s3://my-data-bucket/singer/messages-{run_id}.txt

The run ID should be a CLI option that defaults to a UUID if not provided.
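
A minimal sketch of how such templating could work, using Python's built-in `str.format`. The function name `render_path`, its parameters, and the compact UTC timestamp format are assumptions for illustration; this request does not specify any of them.

```python
import uuid
from datetime import datetime, timezone
from typing import Optional


def render_path(template: str, run_id: Optional[str] = None,
                now: Optional[datetime] = None) -> str:
    """Fill {timestamp} and {run_id} placeholders in a path template."""
    # Default the run ID to a random UUID when the caller does not supply one.
    run_id = run_id or str(uuid.uuid4())
    # `now` is expected to be a UTC datetime; default to the current UTC time.
    now = now or datetime.now(timezone.utc)
    # Assumed timestamp format: compact UTC, safe for file names and S3 keys.
    timestamp = now.strftime("%Y%m%dT%H%M%SZ")
    return template.format(timestamp=timestamp, run_id=run_id)
```

A template may use either placeholder, both, or neither. Because `str.format` treats every brace as a placeholder, literal braces in a path would need to be doubled (`{{` and `}}`).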
