movie-recommendation-engine

A content-based recommendation system for a Big Data setting that recommends movies and TV shows to users.

Movie recommendations for users

Objective
Data
Technologies
Algorithms
Implementation
Results
References

OBJECTIVE

The main objective of the project is to design a full-fledged custom movie-recommendation engine for users. The other key objectives are:

Design a content-based recommendation system that provides movie recommendations to users based on movie genres
Implement a collaborative-filtering approach to recommend movies to users

DATA

To achieve the core objectives using multiple approaches, two different data sources were used.

MovieLens data: 27 million movie ratings provided by users
Movies metadata: 24 features capturing various details about each film

TECHNOLOGIES

Python - Spark (PySpark), scikit-learn, NLTK, pandas, matplotlib, seaborn

ALGORITHMS

Collaborative filtering using the ALS algorithm
Content-based filtering using k-means clustering

IMPLEMENTATION

Collaborative filtering using ALS algorithm:

Collaborative filtering recommends items that a user might like by leveraging the ratings of similar users. The underlying assumption is that if user A and user B respond similarly to one movie (in our case, give it similar ratings), they are more likely to respond similarly to another movie X than two randomly chosen users would be.
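
The similarity assumption can be illustrated with a small sketch. This is not the project's method (the project uses model-based ALS, described next); it only shows how "similar response" between two users might be measured. The ratings, movie titles, and the function name cosine_similarity are hypothetical.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {movie: rating}); not taken from MovieLens.
ratings = {
    "A": {"Heat": 5.0, "Alien": 4.0, "Up": 1.0},
    "B": {"Heat": 4.5, "Alien": 4.0, "Up": 1.5},
    "C": {"Heat": 1.0, "Alien": 2.0, "Up": 5.0},
}


def cosine_similarity(u, v):
    """Cosine similarity computed over the movies both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[m] * v[m] for m in shared)
    norm_u = sqrt(sum(u[m] ** 2 for m in shared))
    norm_v = sqrt(sum(v[m] ** 2 for m in shared))
    return dot / (norm_u * norm_v)


# A and B rate the same movies alike, so they score as more similar than A and C.
sim_ab = cosine_similarity(ratings["A"], ratings["B"])
sim_ac = cosine_similarity(ratings["A"], ratings["C"])
```

Under the collaborative-filtering assumption, B's rating of an unseen movie would be a better guide to A's taste than C's rating.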

Employed the model-based approach to collaborative filtering on the MovieLens dataset.
Implemented Alternating Least Squares (ALS) with Spark. ALS is a matrix factorization technique for collaborative filtering. Its objective function uses L2 regularization, and it minimizes the loss by alternately holding the user factors or the movie factors fixed and solving a least-squares problem for the other, rather than by gradient descent.
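The dataset contains user IDs, movie IDs, and ratings, which together form a sparse user-movie rating matrix. ALS approximates this matrix as the product of a user-factor matrix and a movie-factor matrix, each with d columns.

Here, d is the number of latent features learned for each user and each movie. With ALS, we minimize the error between the observed ratings and the ratings reconstructed from the two factor matrices.

The error is the sum of squared differences between observed and predicted ratings, plus a regularization penalty on the factors.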
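
The factorization and its error can be written in the standard regularized form sketched here. The notation is an assumption rather than a copy of the original figures: n users, m movies, K the set of observed (user, movie) pairs, and lambda playing the role of regParam. Spark's implementation additionally scales the penalty by the number of ratings per user or movie, which is omitted here for brevity.

```latex
R \approx U V^{\top}, \qquad U \in \mathbb{R}^{n \times d}, \quad V \in \mathbb{R}^{m \times d}

\min_{U, V} \; \sum_{(u, i) \in \mathcal{K}} \left( r_{ui} - \mathbf{u}_u^{\top} \mathbf{v}_i \right)^2
  + \lambda \left( \sum_{u} \lVert \mathbf{u}_u \rVert_2^2 + \sum_{i} \lVert \mathbf{v}_i \rVert_2^2 \right)
```

With V held fixed the objective is quadratic in each row of U (a ridge regression), and vice versa, which is why the two sides can be solved alternately in closed form.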

We train the ALS model by tuning the following hyper-parameters:

rank: The number of latent factors generated in the matrix factorization
regParam: The L2 regularization parameter used in the ALS algorithm
maxIter: The maximum number of iterations the algorithm is run
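
To make the three hyper-parameters concrete, here is a minimal pure-Python sketch of alternating least squares. It is an illustration under stated assumptions, not the project's code: the project uses Spark's ALS, whereas the functions solve, update_factors, als, and predict and the toy ratings below are hypothetical, and the penalty is plain L2 without Spark's per-user scaling.

```python
import random


def solve(a, b):
    """Solve the small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def update_factors(fixed, observed, rank, reg_param):
    """Ridge-regress each row's factor vector against the fixed side.

    observed[k] lists (index_into_fixed, rating) pairs for row k.
    """
    updated = []
    for pairs in observed:
        # Normal equations: (F^T F + reg_param * I) x = F^T r
        a = [[reg_param if i == j else 0.0 for j in range(rank)] for i in range(rank)]
        b = [0.0] * rank
        for idx, rating in pairs:
            vec = fixed[idx]
            for i in range(rank):
                b[i] += rating * vec[i]
                for j in range(rank):
                    a[i][j] += vec[i] * vec[j]
        updated.append(solve(a, b))
    return updated


def als(ratings, n_users, n_items, rank=2, reg_param=0.05, max_iter=10, seed=0):
    """Factorize (user, item, rating) triples into user and item factor lists."""
    rng = random.Random(seed)
    users = [[rng.random() for _ in range(rank)] for _ in range(n_users)]
    items = [[rng.random() for _ in range(rank)] for _ in range(n_items)]
    by_user = [[] for _ in range(n_users)]
    by_item = [[] for _ in range(n_items)]
    for u, i, r in ratings:
        by_user[u].append((i, r))
        by_item[i].append((u, r))
    for _ in range(max_iter):
        users = update_factors(items, by_user, rank, reg_param)  # items fixed
        items = update_factors(users, by_item, rank, reg_param)  # users fixed
    return users, items


def predict(users, items, u, i):
    """Predicted rating is the dot product of the two factor vectors."""
    return sum(x * y for x, y in zip(users[u], items[i]))


# Hypothetical toy data: (user_index, item_index, rating) triples.
toy = [
    (0, 0, 5.0), (0, 1, 5.0), (0, 2, 1.0),
    (1, 0, 5.0), (1, 1, 5.0), (1, 3, 1.0),
    (2, 0, 1.0), (2, 2, 5.0), (2, 3, 5.0),
    (3, 1, 1.0), (3, 2, 5.0), (3, 3, 5.0),
]
users, items = als(toy, n_users=4, n_items=4, rank=2, reg_param=0.05, max_iter=10)
held_out = predict(users, items, 0, 3)  # user 0 never rated item 3
```

Here rank sets the length of each factor vector, reg_param is added to the diagonal of every least-squares system, and max_iter counts full user-then-item sweeps.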

After tuning the parameters and running ALS with cross-validation, we obtained an optimal RMSE of 0.8037 with 30 latent factors, a regParam value of 0.05, and 10 iterations.
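
For reference, RMSE (root mean squared error) is the square root of the average squared gap between actual and predicted ratings, so lower is better. A minimal sketch follows; the function name rmse and the sample ratings are illustrative assumptions, not the project's evaluation code or data.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between two equal-length rating lists."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


# Hypothetical ratings on a 1-5 scale.
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 2.5]
score = rmse(actual, predicted)  # sqrt(1.5 / 4)
```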

Below are the resulting movie predictions made by the tuned ALS model on the test data

Refer to this link for code - Collaborative filtering using ALS

Content-based filtering using k-means clustering:

Used the movies-metadata file with 45k instances and 24 features. To capture content-based information for a given movie, the 'Overview' feature, which describes the genre and plot of the film, was used
Each description, a paragraph averaging 50-70 words, was cleaned by removing extra whitespace and stopwords
TF-IDF scores are then computed from the cleaned text, producing the corresponding TF-IDF matrix
The TF-IDF vectors are used to group similar movies (content with similar scores) into clusters with k-means
These clusters are used to provide recommendations to users
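
The steps above can be sketched end to end in dependency-free Python. This is an illustration under stated assumptions, not the project's code: the project lists scikit-learn and NLTK, whereas the stopword list, the titles and overviews, and the functions tokenize, tfidf_matrix, kmeans, and recommend here are hypothetical stand-ins.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are far longer).
STOPWORDS = {"a", "an", "the", "of", "and", "in", "to", "is", "on", "with"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf_matrix(docs):
    """Return (vocabulary, rows); each row is an L2-normalized TF-IDF vector."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}  # smoothed IDF
    rows = []
    for toks in tokenized:
        tf = Counter(toks)
        row = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in row)) or 1.0
        rows.append([x / norm for x in row])
    return vocab, rows


def kmeans(rows, init_indices, max_iter=20):
    """Lloyd's algorithm with Euclidean distance; centroids start at given rows."""
    centroids = [rows[i][:] for i in init_indices]
    labels = [-1] * len(rows)
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(row, centroids[c])))
            for row in rows
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [rows[i] for i, lab in enumerate(labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def recommend(index, labels, titles):
    """Other titles that share a cluster with the movie at the given index."""
    return [t for i, t in enumerate(titles) if labels[i] == labels[index] and i != index]


# Hypothetical titles and overviews; not taken from the movies-metadata file.
titles = ["Space One", "Space Two", "Love One", "Love Two"]
overviews = [
    "An astronaut crew explores a distant planet in space.",
    "A space station crew fights to survive on a hostile planet.",
    "A baker and a florist fall in love in Paris.",
    "Two strangers fall in love during a summer in Paris.",
]
vocab, rows = tfidf_matrix(overviews)
labels = kmeans(rows, init_indices=[0, 2])
similar_to_first = recommend(0, labels, titles)
```

The initial centroids are fixed here so the result is reproducible; a real run would normally use randomized or k-means++ initialization and a number of clusters chosen for the data.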

Below is a sample output of movie recommendations provided by the k-means clustering

Refer to this link for code: Context-based filtering using k-means clustering

RESULTS

The movie recommendation system shows promise. Movie recommendations were reasonably accurate for specific users (the tuned ALS model reached an RMSE of 0.8037), and movie titles were successfully segmented into clusters based on their overview content. In future work, I plan to extend the project to build recommender systems for TV shows.

tigeryi1998 / collaborative-filtering-using-als-for-movie-recommendation
