gulshankrsingh97 / sparkiris

This project is forked from misano9699/sparkiris

SparkIris

An Apache Spark machine learning example that predicts flower species using the classic Iris dataset from the UCI Machine Learning Repository. The dataset contains three iris species with 50 samples each.
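
As a quick sanity check of that class balance, the sketch below counts rows per species. It assumes the UCI file layout of four comma-separated measurements followed by the species label, and it runs on a small inline sample rather than the real file. `count_species` is a hypothetical helper name, not part of this project.

```shell
#!/bin/sh
# Count rows per species in Iris-style CSV read from stdin.
# Assumed layout: sepal_length,sepal_width,petal_length,petal_width,species
count_species() {
  awk -F',' 'NF == 5 { n[$5]++ } END { for (s in n) print s, n[s] }' | sort
}

# Tiny inline sample; the full dataset should report 50 for each species.
sample='5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
4.9,3.0,1.4,0.2,Iris-setosa'

printf '%s\n' "$sample" | count_species
```

To check the real dataset, feed the downloaded file to the same function on stdin instead of the inline sample.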

How to run on a standalone cluster

The code runs on a standalone Apache Spark cluster.

The steps to take are:

Build the jar
Copy the jar and dataset to a location on the Apache Spark cluster
Submit the jar to Apache Spark

For more information on installing a standalone Spark cluster, see https://spark.apache.org/docs/latest/spark-standalone.html

For details on submitting the application, see https://spark.apache.org/docs/latest/submitting-applications.html
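
The three steps above might look like the following on the command line. This is only a sketch under assumptions this README does not state: that the project builds with Maven, that files are copied with scp, that `spark-submit` is on the PATH of the cluster host, and that the host name, dataset file name, and remote directory in the variables are placeholders. With `DRY_RUN=1` (the default here) the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of build -> copy -> submit. All values below are placeholders (assumptions).
JAR_NAME="spark-iris-1.0-SNAPSHOT.jar"      # jar name as used elsewhere in this README
JAR="target/${JAR_NAME}"                    # assumes Maven's default output directory
DATASET="iris.data"                         # hypothetical local dataset file name
CLUSTER_HOST="spark-master.example.com"     # hypothetical host
REMOTE_DIR="/tmp/data"                      # hypothetical directory on the cluster
MASTER_URL="spark://${CLUSTER_HOST}:7077"   # 7077 is the default standalone master port
DRY_RUN="${DRY_RUN:-1}"

# Print the command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# 1. Build the jar (assumes a Maven project).
run mvn package

# 2. Copy the jar and dataset to a location on the cluster.
run scp "$JAR" "$DATASET" "${CLUSTER_HOST}:${REMOTE_DIR}/"

# 3. Submit the jar to Spark from the cluster host.
run ssh "$CLUSTER_HOST" spark-submit \
  --class nl.craftsmen.spark.iris.SparkIris \
  --master "$MASTER_URL" \
  "${REMOTE_DIR}/${JAR_NAME}"
```

Set `DRY_RUN=0` only after replacing the placeholders with values that match your own cluster.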

Example

I used a Spark Docker image from Semantive, which can be found on GitHub.

Usage:

Clone the docker-spark repository to your local machine
Adapt docker-compose.yml to your liking (for example, number of workers, number of cores, memory allocated to the workers)
Run docker-compose up, which will start the Spark cluster
Copy both the Iris dataset and the SparkIris jar to the data directory of the docker-spark repository
Connect to the master container with a bash shell (docker exec -i -t dockerspark_master_1 /bin/bash)
In the bash shell submit the application to Spark (./bin/spark-submit --class nl.craftsmen.spark.iris.SparkIris --master local[1] /tmp/data/spark-iris-1.0-SNAPSHOT.jar)
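
The usage steps above can also be sketched as one non-interactive script. Several details are assumptions: the clone lives in a directory named `docker-spark`, the dataset file is called `iris.data`, and `docker-compose up` is given `-d` so the script can continue after the cluster starts. The clone step is left out because this README does not give the repository URL. As written the script is a dry run that only prints each command.

```shell
#!/bin/sh
# Sketch of the Docker workflow. Placeholders are assumptions, not project facts.
REPO_DIR="docker-spark"                 # assumed name of the cloned directory
JAR="spark-iris-1.0-SNAPSHOT.jar"
DATASET="iris.data"                     # hypothetical dataset file name
CONTAINER="dockerspark_master_1"        # container name used in this README
MASTER="local[1]"                       # as in this README: one local worker thread
DRY_RUN="${DRY_RUN:-1}"

# Print the command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# Start the Spark cluster in the background.
run docker-compose -f "$REPO_DIR/docker-compose.yml" up -d

# Copy the dataset and the jar into the repository's data directory.
run cp "$JAR" "$DATASET" "$REPO_DIR/data/"

# Submit the application without opening an interactive shell first.
run docker exec "$CONTAINER" ./bin/spark-submit \
  --class nl.craftsmen.spark.iris.SparkIris \
  --master "$MASTER" \
  "/tmp/data/$JAR"
```

Note that `local[1]` runs the job inside the master container on a single thread rather than distributing it to the workers. Pointing `--master` at the cluster's `spark://` URL would use the workers, but the host name to use depends on your docker-compose.yml.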
