This is the code repository for Modern Scala Projects, published by Packt.
Leverage the power of Scala to build data-driven, high-performance projects
Scala, together with the Spark Framework, forms a rich and powerful data processing ecosystem. Modern Scala Projects is a journey into the depths of this ecosystem. The machine learning (ML) projects presented in this book enable you to create practical, robust data analytics solutions, with an emphasis on automating data workflows with the Spark ML pipeline API. This book draws on a careful selection of Scala’s functional libraries and other constructs to help readers roll out their own scalable data processing frameworks. The projects in this book enable data practitioners across all industries to gain insights into data that will help organizations gain a strategic and competitive advantage.
Modern Scala Projects focuses on the application of supervised learning ML techniques that classify data and make predictions. You'll begin by working on a project to predict a class of flower by implementing a simple machine learning model. Next, you'll create a cancer diagnosis classification pipeline, followed by projects delving into stock price prediction, spam filtering, fraud detection, and a recommendation engine.
By the end of this book, you will be able to build efficient data science projects that fulfil your software requirements.
This book covers the following exciting features:
Create pipelines to extract data, analytics, and visualizations
Automate your process pipeline with jobs that are reproducible
Extract intelligent data efficiently from large, disparate datasets
Automate the extraction, transformation, and loading of data
Develop tools that collate, model, and analyze data
Maintain the integrity of data as data flows become more complex
Develop tools that predict outcomes based on “pattern discovery”
Build fast and accurate machine learning models in Scala
If you feel this book is for you, get your copy today!
All of the code is organized into folders. For example, Chapter02.
The code will look like the following:
val dataFrame = spark.createDataFrame(result5).toDF(featureVector, speciesLabel)
Following is what you need for this book: Modern Scala Projects is for Scala developers who would like to gain some hands-on experience with some interesting real-world projects. Prior programming experience with Scala is necessary.
With the following software and hardware, you can run all the code files in the book (Chapters 1-7).
Chapter | Software required | Hardware required |
---|---|---|
1,2,3,4,5,6,7 | JDK 8, Scala 2.11.12, SBT 1.04, Spark 2.3.3, IntelliJ Community Edition 2018.1.5/6 with Scala Plugin, Scala IDE 4.7+, HDP Sandbox 2.6.5, a suitable SSH client, Oracle VirtualBox 5.1/2 | At least 16 GB of RAM |
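As a rough illustration of how the Scala and Spark versions in the table might be pinned in an sbt build, here is a minimal `build.sbt` sketch. It is not taken from the book's chapters: the project name is hypothetical, and the choice of Spark modules (`spark-core`, `spark-sql`, `spark-mllib`) is an assumption based on the book's focus on the Spark ML pipeline API. The sbt version itself is normally set in `project/build.properties` and is left out here.

```scala
// build.sbt: a minimal sketch, not the build file shipped with the book.

// Hypothetical project name, chosen only for this example.
name := "modern-scala-projects-sketch"

// Scala version from the software table above.
scalaVersion := "2.11.12"

// Spark version from the software table above.
val sparkVersion = "2.3.3"

// Assumed module selection: core, SQL (DataFrames), and MLlib (ML pipelines).
// The %% operator appends the Scala binary version (_2.11) to each artifact name.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % sparkVersion,
  "org.apache.spark" %% "spark-sql"   % sparkVersion,
  "org.apache.spark" %% "spark-mllib" % sparkVersion
)
```

Individual chapters may need additional dependencies; check the build file in each chapter folder before relying on this sketch.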
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. Click here to download it.
Ilango Gurusamy holds an MS degree in computer science from California State University. He has led Java projects at Northrop Grumman and AT&T, among others, and later moved into Scala and functional programming. His current interests are IoT, navigational applications, and all things Scala related. A strategic thinker, speaker, and writer, he also loves yoga, skydiving, cars, dogs, and fishing. You can learn more about his achievements on his blog, titled scalanirvana. His LinkedIn username is ilangogurusamy.