awesome-apachespark-collections's Introduction

awesome-ApacheSpark-collections or Awesome Spark

A bookkeeping log of Apache Spark web searches.
Also a curated list of awesome Apache Spark packages and resources.

Other GitHub awesome links

Online Free Clusters

Notebooks and IDEs

  • Apache Zeppelin - Web-based notebook that enables interactive data analytics with pluggable backends, integrated plotting, and extensive Spark support out-of-the-box.
  • Spark Notebook - Scalable and stable Scala and Spark focused notebook bridging the gap between JVM and Data Scientists (incl. extendable, typesafe and reactive charts).
  • sparkmagic - Jupyter magics and kernels for working interactively with remote Spark clusters through Livy, in Jupyter notebooks.

Books on Apache Spark

Blogs

Must Read list

Introduction

Spark + Hadoop

Spark Internals

SparkSQL

Streaming

Spark on GPU / DeepLearning

Tips & Tricks

  • http://blog.smaato.com/tuning-spark-streaming-applications/

Spark Packages

Videos on Apache Spark

Channels

Playlists

GitHub Projects - Ever Growing List!

Setup

  1. https://github.com/clearstorydata-cookbooks/apache_spark
  2. https://github.com/gwik/spark-cookbook
  3. https://github.com/azavea/ansible-spark
  4. https://github.com/tzolov/apache-spark-build-pipeline
  5. https://github.com/aur-atomica-net/apache-spark
  6. https://github.com/GELOG/docker-ubuntu-spark
  7. https://github.com/kbastani/spark-neo4j

Spark Internals

  1. https://github.com/JerryLead/SparkInternals

Spark Learning/Workshop

  1. https://github.com/Mageswaran1989/aja
  2. https://github.com/deanwampler/spark-workshop
  3. https://github.com/ceteri/spark-exercises
  4. https://github.com/lenards/explore-spark
  5. https://github.com/seglo/learning-spark
  6. https://github.com/ceteri/intro_spark
  7. https://github.com/HadoopTW/CS100.1x
  8. https://github.com/EvanZ/myvagrant
  9. https://github.com/zfz/spark-cs100.1x
  10. https://github.com/StephenHarrington/spark
  11. https://github.com/gudiseva/Spark
  12. https://github.com/hoangtamvo/spark
  13. https://github.com/okaram/spark
  14. https://github.com/linshiu/spark
  15. https://github.com/jingjinggu/Apache_Spark
  16. https://github.com/aur-atomica-net/apache-spark
  17. https://github.com/dhesse/SparkTalk
  18. https://github.com/adamliesko/bigdata-spark
  19. https://github.com/skrusche63/spark-connect
  20. https://github.com/spirom/LearningSpark

Spark

  1. https://github.com/hohonuuli/sparknotebook
  2. https://github.com/googlegenomics/spark-examples
  3. https://github.com/sujee81/SparkApps
  4. https://github.com/praveensripati/spark-examples
  5. https://github.com/jdutton/spark-playground
  6. https://github.com/arjones/spark-news
  7. https://github.com/felixcheung/spark-notebook-examples
  8. https://github.com/manku-timma/spark
  9. https://github.com/joseratts/Spark
  10. https://github.com/giocode/SparkTutorial
  11. https://github.com/eenov8/apacheSpark
  12. https://github.com/yu-iskw/spark-dataframe-introduction
  13. https://github.com/rajanpupa/ApacheSparkExample
  14. https://github.com/XD-DENG/Spark-practice

Streaming

  1. https://github.com/prabeesh/SparkTwitterAnalysis
  2. https://github.com/cotdp/spark-example-clickstream-social
  3. https://github.com/ippontech/metrics-spark-receiver
  4. https://github.com/aleph-w/ApacheSparkLearning

SQL

  1. https://github.com/rnamboodiri/spark-cassandra-integrations
  2. https://github.com/choi258/Spark_apache

MLlib

  1. https://github.com/OndraFiedler/spark-recommender
  2. https://github.com/marklit/recommend
  3. https://github.com/staple/spark-agd
  4. https://github.com/tizfa/sparkboost
  5. https://github.com/rahmanusta/Spark-Bayes
  6. https://github.com/spacedotworks/decisiontree_ApacheSpark

Spark Machine Learning

  1. https://github.com/PredictionIO/PredictionIO
  2. https://github.com/BaiGang/spark_multiboost
  3. https://github.com/alitouka/spark_dbscan
  4. https://github.com/amplab/keystone
  5. https://github.com/krasserm/akka-analytics

Spark Streaming

  1. https://github.com/miguno/kafka-storm-starter
  2. https://github.com/killrweather/killrweather
  3. https://github.com/NFLabs/ambari
  4. https://github.com/rustyrazorblade/killranalytics

Spark + Visualization

  1. https://github.com/FRosner/spawncamping-dds

Spark + WebServer

  1. https://github.com/calrissian/spark-jetty-server

Spark + REST

  1. https://github.com/spark-jobserver/spark-jobserver

Spark + Cassandra

  1. https://github.com/datastax/spark-cassandra-connector

Spark + NoSQL datastore

  1. https://github.com/Stratio/deep-spark
  2. https://github.com/RussellSpitzer/spark-cassandra-csv
  3. https://github.com/haosdent/spark-hbase

Spark + Elasticsearch

  1. https://github.com/skrusche63/spark-elastic
  2. https://github.com/mhausenblas/elsa
  3. https://github.com/SHSE/spark-es

Spark + Azure + PowerBI

  1. https://github.com/granturing/spark-power-bi

Spark + Genomics

  1. https://github.com/bigdatagenomics/adam

Spark + Ruby

  1. https://github.com/ondra-m/ruby-spark

Useful Add-ons

  1. https://github.com/amplab/spark-indexedrdd
  2. https://github.com/mrsqueeze/spark-hash
  3. https://github.com/simplymeasured/phoenix-spark
  4. https://github.com/calrissian/spark-jetty-server
  5. https://github.com/cloudera/spark-timeseries
  6. https://github.com/skrusche63/spark-weblog

Tools

  1. https://github.com/andypetrella/spark-notebook
  2. https://github.com/ibm-et/spark-kernel
  3. https://github.com/mraad/SparkProject
  4. https://github.com/saurfang/sbt-spark-submit

awesome-apachespark-collections's People

Contributors

mageswaran1989, xd-deng
