Coder Social home page Coder Social logo

automi's Introduction

Automi

Composable Stream Processing over Go Channels!

GoDoc Build Status

Automi abstracts away (not too far away) the gnarly details of using Go channels to create pipelined and staged processes. It exposes higher-level API to compose and integrate stream of data over Go channels for processing. This is still alpha work. The API is still evolving and changing rapidly with each commit (beware). Nevertheless, the core concepts are have been bolted onto the API. The following example shows how Automi could be used to compose a multi-stage pipeline to process stream of data from a csv file.

Example

The following abbreviated code snippet shows how the Automi API could be used to apply multiple operators to data items as they are streamed. [See full example here.](example https://github.com/vladimirvivien/automi/blob/master/examples/ex0/procdata.go)

type scientist struct {
	FirstName string
	LastName  string
	Title     string
	BornYear  int
}
// Example of stream processing with multi operators applied
// src - CsvSource to load data file, emits each record as []string
// out - CsvSink to write result file, expects each entry as []string
func main() {
	in := src.New().WithFile("./data.txt")
	out := snk.New().WithFile("./result.txt")

	stream := stream.New().From(in)
	stream.Map(func(cs []string) scientist {
		yr, _ := strconv.Atoi(cs[3])
		return scientist{
			FirstName: cs[1],
			LastName:  cs[0],
			Title:     cs[2],
			BornYear:  yr,
		}
	})
	stream.Filter(func(cs scientist) bool {
		if cs.BornYear > 1930 {
			return true
		}
		return false
	})
	stream.Map(func(cs scientist) []string {
		return []string{cs.FirstName, cs.LastName, cs.Title}
	})
	stream.To(out)

	<-stream.Open() // wait for completion
}

The previous code sample creates a new stream to process data ingested from a csv file using several. In the code, each method call on the stream (From(), Filter(), Map(),, and To()) represents a stage in the pipeline as illustrated in the following:

From(source) -> Map(item) -> Filter(item) -> Map(item) -> To(sink)

The stream operators that are applied to the stream as follows:

  • stream.From() - reads stream from CsvSource
  • stream.Map() - maps []string to to scientist type
  • stream.Filter() - filters out scientist.BornYear > 1938
  • stream.Map() - maps scientist value to []string
  • stram.To() - writes sream values to a CsvSink
  • stream.Open() - opens and executes stream operator and wait for completion.

The code implements stream processing based on the pipeline patterns. What is clearly absent, however, is the low level channel communication code to coordinate and synchronize goroutines. The programmer is provided a clean surface to express business code without the noisy channel infrastructure code. Underneath the cover however, Automi is using patterns similar to the pipeline patterns to create safe and concurrent structures to execute the processing of the data stream.

Roadmap

The API is still taking shape into something that, hopefully will be enjoyable and practical code to create stream processors. The project is a moving target right now, however the code will focus stabilizing the core API, additional operators, and sources/sinks connectors. In the near future, there's plan to add functionalities to support execution windows to control stream growth and pressure on reductive operators.

Operators

The focus of the project will be to continue to implement pre-built operators to help stream processor creators.

  • Transformation Filter, Maps, Join
  • Accumulation Reduce, Aggregation, Grouping
  • Etc

Features

Automi will strive to implement more features to make it more useful as the project matures including some of the followings.

  • Continuous streams Better control of non-ending streams
  • Concurrency Control of concurrently running operators
  • Timout and Cancellation Policies Establish constraints for running processes
  • Metrics Expose metrics of running processes

Core Sources/Sinks

The followings are potential sources and sink components that will be part of the core API.

  • io.Reader Stream from io.Reader sources
  • io.Writer Write stream data to io.Writer sinks
  • Csv: Stream from/to value-separated files
  • Socket Stream from/to network sockets
  • Http: Source from http end-point, stream from/to http sinks
  • Database: Source from DB tables, stream to DB sinks

External Source/Sink Ideas

The following is a list of sources and sinks ideas that should be implemented as externals projects to avoid unwanted dependencies.

  • Messaging Kafka, NATs, Etc
  • Distributed FS HDFS, S3, etc
  • Distributed DB Cassandra, Mongo, etc
  • Logging Flume, statsd, syslog, etc
  • Whatever source/sink users find useful
  • Etc

automi's People

Contributors

vladimirvivien avatar

Watchers

James Cloos avatar netgod avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.