
foundry-transforms-api-csv's Introduction

Install

pipenv install git+ssh://github.com/blakehawkins/foundry-transforms-api#egg=transformsbase
pipenv install git+ssh://github.com/blakehawkins/foundry-transforms-api-csv#egg=transforms

Usage

from transforms.api import Input, Output, transform_df, TRANSFORMS_CSV_MAP

# Mock out shrinkwrap/catalog by mapping dataset names to local CSV files

TRANSFORMS_CSV_MAP["out"] = "out.csv"
TRANSFORMS_CSV_MAP["in"] = "in.csv"  # Just contains `id\n1\n2\n`

out = Output("out")

# Define xform
@transform_df(out, in_=Input("in"))
def myxform(in_):
    return in_

# Run it manually
myxform()

# Contents are also written to out.csv
assert out.get().count() == 2
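
The example above assumes that in.csv already exists in the working directory. A minimal sketch for creating that fixture with the standard library is shown here; the helper names `write_input_fixture` and `count_data_rows` are hypothetical and are not part of transforms-api-csv.

```python
import csv
from pathlib import Path


def write_input_fixture(path="in.csv"):
    """Write the fixture described above: a header `id` followed by rows 1 and 2."""
    fixture = Path(path)
    fixture.write_text("id\n1\n2\n", encoding="utf-8")
    return fixture


def count_data_rows(path):
    """Count the rows after the header, as a plain-Python sanity check."""
    with open(path, newline="", encoding="utf-8") as handle:
        return sum(1 for _ in csv.DictReader(handle))


if __name__ == "__main__":
    fixture = write_input_fixture()
    print(count_data_rows(fixture))
```

Running this once before the transform gives it a two-row input to read, which is what the `count() == 2` assertion relies on.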

If you want a transform to work with both CSV files and Foundry, keep the transform in its own module and write a separate pipeline file (mypipeline.py below) that calls into it.

foundryxform.py:

from transforms.api import transform_df, Input, Output

import pyspark.sql.functions as F


@transform_df(
    Output("ri.foundry.main.dataset.6e9e9ed9-1278-4fb9-a6cd-cde6fdb2e344"),
    thing1=Input("ri.foundry.main.dataset.9b9a2914-1e63-4433-96bd-7a7beb49f9f2"),
    thing2=Input("ri.foundry.main.dataset.8b6d914c-dd36-4bb7-86f7-f86e31f6d52a")
)
def foundryxform(thing1, thing2):
    # ...

mypipeline.py:

from foundryxform import foundryxform
from transforms.api import TRANSFORMS_CSV_MAP

TRANSFORMS_CSV_MAP["ri.foundry.main.dataset.9b9a2914-1e63-4433-96bd-7a7beb49f9f2"] = "testinput.csv"
TRANSFORMS_CSV_MAP["ri.foundry.main.dataset.8b6d914c-dd36-4bb7-86f7-f86e31f6d52a"] = "testinput2.csv"

foundryxform()

Since mypipeline.py is not run by Foundry infrastructure, this file can be committed to the same repo as the official templates and run in parallel with them.
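
If a pipeline file maps several datasets, it may help to check that every CSV fixture exists before running the transform. The sketch below assumes that TRANSFORMS_CSV_MAP behaves like a plain dict, as the assignments above suggest; `register_input_fixtures` is a hypothetical helper and not part of transforms-api-csv.

```python
from pathlib import Path


def register_input_fixtures(csv_map, fixtures, base_dir="."):
    """Point dataset identifiers at local CSV files, failing early if any is missing.

    csv_map:  a dict-like mapping (assumed: TRANSFORMS_CSV_MAP is one).
    fixtures: dict of dataset identifier -> CSV file name.
    """
    resolved = {
        dataset: Path(base_dir) / filename
        for dataset, filename in fixtures.items()
    }

    # Check everything first so the map is never left half-updated.
    missing = [str(path) for path in resolved.values() if not path.is_file()]
    if missing:
        raise FileNotFoundError("missing CSV fixtures: " + ", ".join(missing))

    for dataset, path in resolved.items():
        csv_map[dataset] = str(path)
    return csv_map
```

In mypipeline.py this could be called with TRANSFORMS_CSV_MAP and a dict of the two input dataset identifiers before invoking foundryxform(), so that a misspelled file name fails with a clear error up front.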

Warning/note

Palantir maintains an open source fork of Spark, so you may see inconsistencies between results from transforms-api-csv and results from Foundry. transforms-api-csv is not supported software.
