mjboothaus / data-algorithms-with-spark

This project forked from mahmoudparsian/data-algorithms-with-spark

O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian

Shell 9.24% Python 53.62% Scala 37.15%

data-algorithms-with-spark's Introduction

Data Algorithms with Spark by Mahmoud Parsian

"... This book will be a great resource for
both readers looking to implement existing
algorithms in a scalable fashion and readers
who are developing new, custom algorithms
using Spark. ..."

Dr. Matei Zaharia
Original Creator of Apache Spark

Foreword by Dr. Matei Zaharia (Original Creator of Apache Spark)

Goal of this book: to help readers write efficient, simple PySpark code for data algorithms using Spark




Software:

Spark                 Python          Scala         Java
Apache Spark 3.2.0    Python 3.7.2    Scala 2.13    Java 8
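
As a quick sanity check before running the book's examples, the versions listed above can be compared against a local environment. The following is a minimal sketch, not part of the repository: the names `LISTED`, `parse_version`, `at_least`, and `report` are hypothetical, and treating "listed version or newer" as acceptable is an assumption rather than a stated requirement of the book.

```python
import re
import sys

# Versions taken from the Software table above. Everything else in this
# sketch (names, the "listed or newer" rule) is an illustrative assumption.
LISTED = {"spark": (3, 2, 0), "python": (3, 7, 2)}


def parse_version(text):
    """Turn a dotted version string such as '3.2.0' into a 3-tuple of ints."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)  # keep only the leading digits
        if match is None:
            break
        parts.append(int(match.group()))
    parts = parts[:3]
    parts += [0] * (3 - len(parts))  # pad '3.2' to (3, 2, 0)
    return tuple(parts)


def at_least(found, listed):
    """Return True when the found version is the listed one or newer."""
    return parse_version(found) >= listed


def report():
    """Build one status line for Python and one for PySpark."""
    python_found = "{}.{}.{}".format(*sys.version_info[:3])
    lines = [
        "Python {} (listed 3.7.2): ok={}".format(
            python_found, at_least(python_found, LISTED["python"])
        )
    ]
    try:
        import pyspark  # optional: only present if PySpark is installed
    except ImportError:
        lines.append("PySpark is not installed in this environment")
    else:
        lines.append(
            "PySpark {} (listed 3.2.0): ok={}".format(
                pyspark.__version__,
                at_least(pyspark.__version__, LISTED["spark"]),
            )
        )
    return lines


if __name__ == "__main__":
    for line in report():
        print(line)
```

The script only inspects Python and PySpark; the Scala and Java versions would need to be checked separately with their own tooling.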

Table of Contents

Chapter       Title                                      Bonus Chapters
Chapter 1     Introduction to Data Algorithms
Chapter 2     Transformations in Action
Chapter 3     Mapper Transformations
Chapter 4     Reductions in Spark
Chapter 5     Partitioning Data
Chapter 6     Graph Algorithms
Chapter 7     Interacting with External Data Sources
Chapter 8     Ranking Algorithms
Chapter 9     Fundamental Data Design Patterns
Chapter 10    Common Data Design Patterns
Chapter 11    Join Design Patterns
Chapter 12    Feature Engineering in PySpark
