Saturday, 5 September 2015

Scalable Collaborative Filtering with Spark MLlib

Recommendation systems are among the most popular applications of machine learning. The idea is to predict whether a customer would like a certain item: a product, a movie, or a song. Scale is a key concern for recommendation systems, since computational complexity increases with the size of a company’s customer base. In this blog post, we discuss how Spark MLlib enables building recommendation ...[More]

Spark MLlib - Use Case

In this chapter, we will use MLlib to make movie recommendations tailored to you. We will work with 10 million ratings from 72,000 users on 10,000 movies, collected...[More]

Apache Spark - MLlib Introduction

In one of our earlier posts we mentioned that we use Scalding (among others) for writing MR jobs. Scala/Scalding simplifies the implementation of many MR patterns and makes it easy to implement quite complex jobs like machine learning algorithms. MapReduce is a mature and widely used framework and a good choice for processing large amounts of data, but it is less well suited to fast iterative algorithms and processing. This is a use case...[More]

Friday, 4 September 2015

Apache Spark on MapR with MLlib

Demo: Apache Spark on MapR with MLlib

Editor's Note: In this demo we are using Spark and PySpark to process and analyze the data set, calculate aggregate statistics about the user base in a PySpark script, persist all of that back into MapR-DB for use in Spark and Tableau, and finally use MLlib to build ...[more]
