This project forked from chrislusf/glow

Glow is an easy-to-use distributed computation system written in Go, similar to Hadoop MapReduce, Spark, Flink, Samza, etc. The project is at an early stage and not yet feature-rich, but it should be reliable for the most common cases.

glow

Examples are in this repo: https://github.com/chrislusf/glow_examples

Purpose

Glow provides a library to easily run computations in parallel threads on one machine, or distributed across clusters of machines.

Installation

go get github.com/chrislusf/glow
go get github.com/chrislusf/glow/flow

One minute tutorial

Simple Start

Here is a simple full example:

package main

import (
	"flag"
	"strings"

	"github.com/chrislusf/glow/flow"
)

func main() {
	flag.Parse()

	// Read /etc/passwd, spread across 3 goroutines.
	flow.New().TextFile(
		"/etc/passwd", 3,
	).Filter(func(line string) bool {
		// Skip comment lines.
		return !strings.HasPrefix(line, "#")
	}).Map(func(line string, ch chan string) {
		// Emit every colon-separated token of the line.
		for _, token := range strings.Split(line, ":") {
			ch <- token
		}
	}).Map(func(key string) int {
		// Turn each token into a count of 1.
		return 1
	}).Reduce(func(x int, y int) int {
		// Sum the counts.
		return x + y
	}).Map(func(x int) {
		println("count:", x)
	}).Run()
}

Try it.

./word_count

It reads the input text file, '/etc/passwd', in 3 goroutines, runs the filter/map/map steps, then reduces the results to one number in one goroutine (not exactly one goroutine, but let's skip the details for now) and prints it out.

This is useful already: it saves a lot of idiomatic but repetitive code for channels, sync wait groups, etc.
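For comparison, here is a rough sketch of the same count written by hand with only the standard library. It is not how Glow works internally. The function name countTokens and the sample lines are made up for illustration, and the input is an in-memory slice instead of '/etc/passwd' so the result does not depend on the machine.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// countTokens mirrors the Glow pipeline by hand: it drops lines starting
// with "#", splits the rest on ":", and counts the tokens using a fixed
// number of worker goroutines.
func countTokens(lines []string, workers int) int {
	in := make(chan string)
	partial := make(chan int)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			n := 0
			for line := range in {
				if strings.HasPrefix(line, "#") {
					continue // the Filter step
				}
				n += len(strings.Split(line, ":")) // the two Map steps
			}
			partial <- n
		}()
	}

	// Feed the input, then close the channel so the workers can finish.
	go func() {
		for _, line := range lines {
			in <- line
		}
		close(in)
	}()

	// Close the results channel once every worker has reported.
	go func() {
		wg.Wait()
		close(partial)
	}()

	total := 0
	for n := range partial { // the Reduce step
		total += n
	}
	return total
}

func main() {
	lines := []string{
		"# a comment line",
		"root:x:0:0:root:/root:/bin/sh",
		"nobody:x:65534:65534:nobody:/:/sbin/nologin",
	}
	fmt.Println("count:", countTokens(lines, 3)) // prints "count: 14"
}
```

Most of this is plumbing: two channels, a wait group, and two helper goroutines just to close the channels at the right time. The Glow version keeps only the four small functions that describe the actual work.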

However, there is one more thing!

Scale it out

To set up a Glow cluster, you do not need experts in ZooKeeper, HDFS, Mesos, YARN, etc. Just build or download one binary file.

Set up the cluster

  # fetch and install via go, or just download it from somewhere
  go get github.com/chrislusf/glow
  # start a master on one computer
  glow master
  # run one or more agents on computers
  glow agent --dir . --max.executors=16 --memory=2048 --master="localhost:8930" --port 8931

The Glow master and Glow agent are lightweight: they take about 6.5 MB and 5.5 MB of memory respectively in my environment. I recommend setting up agents on any server you can find, so you can tap into the computing power whenever you need it.

Start the driver program

To leap from one computer to clusters of computers, add this line to the import list:

	_ "github.com/chrislusf/glow/driver"

This will "steroidize" the code to run in cluster mode!

./word_count -glow -glow.leader="localhost:8930"
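A blank import pulls in a package only for its side effects: the package's init functions run before main. One plausible way for such a package to add the -glow and -glow.leader options is to register them in init, so that the flag.Parse() call already present in main picks them up. The sketch below imitates that in a single file using the standard flag package. It is an assumption about the mechanism, not Glow's actual driver code: driverOptions, registerDriverFlags, parseDriverArgs, and the default leader address are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
)

// driverOptions is a hypothetical stand-in for the settings a driver
// package might keep; the field names are illustrative only.
type driverOptions struct {
	enabled bool
	leader  string
}

// registerDriverFlags plays the role of an imported package's init function:
// it adds flags to the given FlagSet before the program parses them.
func registerDriverFlags(fs *flag.FlagSet, opts *driverOptions) {
	fs.BoolVar(&opts.enabled, "glow", false, "run in cluster mode")
	// The default address here is an assumption made for this sketch.
	fs.StringVar(&opts.leader, "glow.leader", "localhost:8930", "master address")
}

// parseDriverArgs registers the flags on a fresh FlagSet and parses args.
func parseDriverArgs(args []string) (driverOptions, error) {
	var opts driverOptions
	fs := flag.NewFlagSet("word_count", flag.ContinueOnError)
	registerDriverFlags(fs, &opts)
	err := fs.Parse(args)
	return opts, err
}

func main() {
	opts, err := parseDriverArgs([]string{"-glow", "-glow.leader=localhost:8930"})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("cluster mode: %v, leader: %s\n", opts.enabled, opts.leader)
}
```

Without the -glow flag the sketch leaves cluster mode off, which matches the behavior described above: the same binary runs standalone unless it is told to act as a driver.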

The word_count program becomes a driver program: it divides the execution into a directed acyclic graph (DAG) and sends tasks to agents.

[Figure: Glow Hello World Execution Plan]
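To make the DAG idea concrete, here is a minimal sketch that models the word_count steps as nodes with dependencies and orders them so that each step runs after its inputs (a standard topological sort). It is not Glow's planner: the step type, the scheduleOrder function, and the step names are hypothetical, and a real plan would also group steps into tasks and assign them to agents.

```go
package main

import "fmt"

// step is a hypothetical node in an execution plan: a name plus the names
// of the steps whose output it consumes.
type step struct {
	name   string
	inputs []string
}

// scheduleOrder returns the step names in an order where every step comes
// after all of its inputs (Kahn's topological sort). It reports an error
// if the plan contains a cycle or refers to an unknown step.
func scheduleOrder(plan []step) ([]string, error) {
	pending := make(map[string]int, len(plan)) // unfinished inputs per step
	consumers := make(map[string][]string)     // step -> steps that read from it
	for _, s := range plan {
		pending[s.name] = len(s.inputs)
	}
	for _, s := range plan {
		for _, in := range s.inputs {
			if _, ok := pending[in]; !ok {
				return nil, fmt.Errorf("step %q reads from unknown step %q", s.name, in)
			}
			consumers[in] = append(consumers[in], s.name)
		}
	}

	var ready []string
	for _, s := range plan { // iterate the slice to keep the order deterministic
		if pending[s.name] == 0 {
			ready = append(ready, s.name)
		}
	}

	var order []string
	for len(ready) > 0 {
		name := ready[0]
		ready = ready[1:]
		order = append(order, name)
		for _, next := range consumers[name] {
			pending[next]--
			if pending[next] == 0 {
				ready = append(ready, next)
			}
		}
	}
	if len(order) != len(plan) {
		return nil, fmt.Errorf("plan has a cycle")
	}
	return order, nil
}

func main() {
	// The steps are listed out of order on purpose.
	plan := []step{
		{name: "print", inputs: []string{"reduce"}},
		{name: "reduce", inputs: []string{"count"}},
		{name: "count", inputs: []string{"split"}},
		{name: "split", inputs: []string{"filter"}},
		{name: "filter", inputs: []string{"read"}},
		{name: "read"},
	}
	order, err := scheduleOrder(plan)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(order) // prints "[read filter split count reduce print]"
}
```

The word_count plan is a simple chain, but the same ordering works for plans where one step feeds several others or merges several inputs.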

Read More

  1. Glow Introduction Slides: https://raw.githubusercontent.com/chrislusf/glow/master/etc/GlowIntroduction.pdf
  2. Wiki page: https://github.com/chrislusf/glow/wiki
  3. Mailing list: https://groups.google.com/forum/#!forum/glow-user-discussion
  4. Examples: https://github.com/chrislusf/glow_examples/tree/master/word_count

Contribution

Start using it! Report or fix any issues you see, and add any features you want.

Fork it, code it, and send pull requests. It is better to first discuss the feature you want on the mailing list: https://groups.google.com/forum/#!forum/glow-user-discussion

License

http://www.apache.org/licenses/LICENSE-2.0
