Saturday, 19 September 2015

Spark DataFrames: Simple and Fast Analysis of Structured Data

This session will provide a technical overview of Spark’s DataFrame API. First, we’ll review the DataFrame API and show how to create DataFrames from a variety of data sources such as Hive, relational databases, or structured file formats like Avro. We’ll then give example user programs that operate on DataFrames and point out...

Saturday, 5 September 2015

Advanced Spark

A good one from Spark Summit...

New Features in Machine Learning Pipelines in Spark 1.4

Spark 1.2 introduced Machine Learning (ML) Pipelines to facilitate the creation, tuning, and inspection of practical ML workflows. Spark’s latest release, Spark 1.4, significantly extends the ML library. In this post, we highlight several new features in the...