csurfer / pypette
Ridiculously simple flow controller for building complex pipelines
Home Page: https://csurfer.github.io/pypette
License: MIT License
Seems to happen whenever I create a Python file that has from pypette import Job in it and run it.
@csurfer thanks for your code. After reading the README docs, I found it's a good project.
However, I think it can be better. For example, the job-dependency syntax is a little complex. I have tried Airflow, and the way it describes job dependencies is interesting; in Airflow, you can define dependencies like this:
job1 << job2  # job1 depends on job2
job3 >> job4  # job4 depends on job3
Maybe this could be implemented in this project.
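As a minimal sketch of how such operators could work, here is a hypothetical DepJob wrapper (pypette's actual Job class does not define these operators; the class and its attributes are assumptions for illustration):

```python
class DepJob:
    """Hypothetical job wrapper supporting Airflow-style dependency operators."""

    def __init__(self, name):
        self.name = name
        self.upstream = []  # jobs this job depends on

    def __lshift__(self, other):
        # job1 << job2  =>  job1 depends on job2
        self.upstream.append(other)
        return other

    def __rshift__(self, other):
        # job3 >> job4  =>  job4 depends on job3
        other.upstream.append(self)
        return other


job1, job2 = DepJob('job1'), DepJob('job2')
job3, job4 = DepJob('job3'), DepJob('job4')

job1 << job2
job3 >> job4

print([j.name for j in job1.upstream])  # ['job2']
print([j.name for j in job4.upstream])  # ['job3']
```

Returning the right-hand operand from both methods is what lets Airflow chain expressions like a >> b >> c; the same trick would apply here.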
In my opinion, pipelines should fail when a Job's function raises an exception. For example, I imagine a Pipe with [Job(build_a_thing), Job(test_a_thing), Job(deploy_a_thing)]. If the test fails, we should never get to the deploy step. That's not how it works now, though:
from pypette import Pipe, Job

def func1():
    1 / 0

def func2():
    print('in func2')

pipe = Pipe('Simple pipe')
pipe.add_jobs([
    Job(func1),
    Job(func2),
])
pipe.run()
In this case, func2 will be executed even though func1 explodes. If the two run in parallel, then func2 should be allowed to finish (that is, don't say "oh, func1 died! time to kill all the other Jobs!"). But given this scenario:
def func3():
    print('in func3')

second_pipe = Pipe('Simple pipe')
second_pipe.add_jobs([
    Job(func1),
    Job(func2),
], run_in_parallel=True)
second_pipe.add_jobs([Job(func3)])
second_pipe.run()
I don't think func3 should ever be called because one of the Jobs in the first step of the pipeline failed.
Hi, one thing I want to see in this awesome lib is the ability to pass data or job results across the pipeline.
Wikipedia's definition of a pipeline is:
In computing, a pipeline is a set of data processing elements connected in series, where the output of one element is the input of the next one.
Maybe I could help with this feature if you agree...
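A minimal sketch of that definition, assuming a hypothetical ChainedPipe class (pypette does not currently pass return values between jobs):

```python
def double(x):
    return x * 2

def increment(x):
    return x + 1

class ChainedPipe:
    """Hypothetical pipe that feeds each job's output into the next job."""

    def __init__(self, funcs):
        self.funcs = funcs

    def run(self, value):
        for func in self.funcs:
            value = func(value)  # output of one element is input of the next
        return value

pipe = ChainedPipe([double, increment])
print(pipe.run(5))  # 11
```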
Pypette Jobs currently insist that their function be an actual FunctionType object. However, bound methods are perfectly good callables that work just fine if you lie to a Job about what they are:
from pypette import Job, Pipe

class Foo:
    def method(self):
        print('In method')

f = Foo()

p1 = Pipe('Method pipe is sad')
p1.add_jobs([Job(f.method)])  # Fails: AssertionError: Python function expected

p2 = Pipe('Function pipe is happy')
p2.add_jobs([Job(lambda: f.method())])
p2.run()
The lambda workaround is a problem, though, because .graph() displays just <lambda> instead of the method's name. Please broaden Job's typechecking to accept methods, too. Thanks!