Building Spark Applications by Jonathan Dinu (O'Reilly)
13+ Hours of Video Instruction
Overview
Building Spark Applications LiveLessons provides data scientists and developers with a practical introduction to the Apache Spark framework using Python, R, and SQL. Additionally, it covers best practices for developing scalable Spark applications for predictive analytics in the context of a data scientist’s standard workflow.
Description
In this video training, Jonathan starts with a brief history of Spark and shows you how to get started programming in a Spark environment on a laptop. Taking an application- and code-first approach, he then covers the various APIs in Python, R, and SQL to show how Spark makes large-scale data analysis much more accessible through languages familiar to data scientists and analysts alike. With the basics covered, the videos move into a real-world case study that shows you how to explore data, process text, and build models with Spark. Throughout the process, Jonathan exposes the internals of the Spark framework to show you how to write better application code, optimize performance, and set up a cluster to fully leverage the distributed nature of Spark. After watching these videos, data scientists and developers will feel confident building an end-to-end application with Spark to perform machine learning and data analysis at scale!