Coder Social home page Coder Social logo

incubator-samza's Introduction

What is Samza?

Apache Incubator Samza is a distributed stream processing framework. It uses Apache Kafka for messaging, and Apache Hadoop YARN to provide fault tolerance, processor isolation, security, and resource management.

  • Simpe API: Unlike most low-level messaging system APIs, Samza provides a very simple call-back based "process message" API that should be familiar to anyone that's used Map/Reduce.
  • Managed state: Samza manages snapshotting and restoration of a stream processor's state. Samza will restore a stream processor's state to a snapshot consistent with the processor's last read messages when the processor is restarted.
  • Fault tolerance: Samza will work with YARN to restart your stream processor if there is a machine or processor failure.
  • Durability: Samza uses Kafka to guarantee that messages will be processed in the order they were written to a partition, and that no messages will ever be lost.
  • Scalability: Samza is partitioned and distributed at every level. Kafka provides ordered, partitioned, re-playable, fault-tolerant streams. YARN provides a distributed environment for Samza containers to run in.
  • Pluggable: Though Samza works out of the box with Kafka and YARN, Samza provides a pluggable API that lets you run Samza with other messaging systems and execution environments.
  • Processor isolation: Samza works with Apache YARN, which supports processor security through Hadoop's security model, and resource isolation through Linux CGroups.

Check out Hello Samza to try Samza. Read the Background page to learn more about Samza.

Building Samza

To build Samza, run:

./gradlew clean build

Scala and YARN

Samza builds with Scala 2.9.2 and YARN 2.0.5-alpha, by default. Use the -PscalaVersion and -PyarnVersion switches to change versions. Samza supports building Scala with 2.8.1 or 2.9.2, and building YARN with 2.0.3-alpha, 2.0.4-alpha, and 2.0.5-alpha.

./gradlew -PscalaVersion=2.8.1 -PyarnVersion=2.0.3-alpha clean build

YARN protocols are backwards incompatible, so you must pick the version that matches your YARN grid.

Testing Samza

To run all tests:

./gradlew clean test

To run a single test:

./gradlew clean :samza-test:test -Dtest.single=TestStatefulTask

Maven

Samza uses Kafka, which is not managed by Maven. To use Kafka as though it were a Maven artifact, Samza installs Kafka into a local repository using the mvn install command. You must have Maven installed to build Samza.

Developers

To get eclipse projects, run:

./gradlew eclipse

For IntelliJ, run:

./gradlew idea

Pardon our Dust

Apache Samza is currently undergoing incubation at the Apache Software Foundation.

incubator-samza's People

Watchers

 avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.