Saturday, 19 September 2015
Apache Spark 1.5 presented by Databricks co-founder Patrick Wendell
Spark 1.5 ships Spark's Project Tungsten initiative, a cross-cutting performance update that uses binary memory management and code generation to dramatically reduce the latency of most Spark jobs. This release also includes several updates to Spark's DataFrame API and SQL optimizer, along with new machine learning algorithms and feature transformers, and several new features in Spark's native streaming engine.
Spark DataFrames: Simple and Fast Analysis of Structured Data
This session will provide a technical overview of Spark’s DataFrame API. First, we’ll review the DataFrame API and show how to create DataFrames from a variety of data sources such as Hive, relational databases, or structured file formats like Avro. We’ll then give example user programs that operate on DataFrames and point out...