Apache Spark in 24 Hours, Sams Teach Yourself

Author : Jeffrey Aven
Publisher : Sams Publishing
Call Number : 005.74/AVE/a/c.i

Description

This book’s straightforward, step-by-step approach shows you how to deploy, program, optimize, manage, integrate, and extend Spark, now and for years to come. You’ll discover how to create powerful solutions encompassing cloud computing, real-time stream processing, machine learning, and more. Every lesson builds on what you’ve already learned, giving you a rock-solid foundation for real-world success. Whether you are a data analyst, data engineer, data scientist, or data steward, learning Spark will help you to advance your career or embark on a new career in the booming area of Big Data.

Learn how to:

- Discover what Apache Spark does and how it fits into the Big Data landscape
- Deploy and run Spark locally or in the cloud
- Interact with Spark from the shell
- Make the most of the Spark Cluster Architecture
- Develop Spark applications with Scala and functional Python
- Program with the Spark API, including transformations and actions
- Apply practical data engineering/analysis approaches designed for Spark
- Use Resilient Distributed Datasets (RDDs) for caching, persistence, and output
- Optimize Spark solution performance
- Use Spark with SQL (via Spark SQL) and with NoSQL (via Cassandra)
- Leverage cutting-edge functional programming techniques
- Extend Spark with streaming, R, and Sparkling Water
- Start building Spark-based machine learning and graph-processing applications
- Explore advanced messaging technologies, including Kafka
- Preview and prepare for Spark’s next generation of innovations