Recommender Systems using Deep Learning in PyTorch from scratch

Here is the GitHub repository for this article.

Problem Definition

Given a past record of movies seen by a user, we will build a recommender system that helps the user discover movies of interest. Specifically, given <userID, itemID> occurrence pairs, we need to generate a ranked list of movies for each user. We model this as a binary classification problem: we learn a function that predicts whether a particular user will like a particular movie. Our model will learn this mapping.

Dataset

We use the MovieLens 100K dataset, which has 100,000 ratings from about 1,000 users on about 1,700 movies. We assume that all unobserved entries in the user-item interaction matrix are negative samples (a strong assumption, but easy to implement). For every item a user has interacted with, we randomly sample 4 items that the user has not interacted with. These negative samples never include an item the user has interacted with, though they may not all be unique because of random sampling.

Evaluation

For each user, we hold out one interacted item as the test item, randomly sample 100 items that the user has not interacted with, and rank the test item among these 100 items. This evaluation methodology is also known as the leave-one-out strategy and is the same as the one used in the reference paper.

Metrics

We use Hit Ratio (HR) and Normalized Discounted Cumulative Gain (NDCG) to evaluate the performance of our recommender system (RS). Our model gives a confidence score between 0 and 1 for each item in the test set for a given user. The items are sorted by score in decreasing order, and the top-k items form the recommendation list. If the test item (there is only one per user) is present in this list, HR is one for this user; otherwise it is zero. The real strength of an RS lies in giving a ranked list of top-k items that a user is most likely to interact with.

In many use cases for recommender systems, recommending the same list of most popular items to all users gives a tough-to-beat baseline. In the GitHub repository, you will also find the code for implementing an item popularity model from scratch.

The exact model definition can be found in the repository. The output of the sigmoid neuron can be interpreted as the probability that the user will interact with an item.
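The negative sampling and leave-one-out metrics described above can be sketched in plain Python. This is a minimal illustration and not the code from the repository: the function names (`sample_negatives`, `rank_items`, `hit_ratio_at_k`, `ndcg_at_k`), the dictionary-based data layout, and the fixed random seed are all assumptions made for the example. The NDCG here is the single-test-item form, where the ideal DCG is 1, so the score reduces to 1 / log2(rank + 1) for a 1-based rank.

```python
import math
import random


def sample_negatives(user_items, num_items, num_negatives=4, seed=0):
    """Build (user, item, label) triples: each positive plus sampled negatives.

    user_items maps a user ID to the set of item IDs the user interacted with.
    Assumes every user has at least one item they have NOT interacted with,
    otherwise the rejection loop below would never terminate.
    """
    rng = random.Random(seed)
    samples = []
    for user, positives in user_items.items():
        for item in positives:
            samples.append((user, item, 1))
            for _ in range(num_negatives):
                # Redraw until the item is not a positive for this user.
                # Negatives may repeat, as noted in the text.
                negative = rng.randrange(num_items)
                while negative in positives:
                    negative = rng.randrange(num_items)
                samples.append((user, negative, 0))
    return samples


def rank_items(scores):
    """Return item IDs sorted by descending model score."""
    return sorted(scores, key=scores.get, reverse=True)


def hit_ratio_at_k(ranked_items, test_item, k):
    """1.0 if the held-out test item is in the top-k list, else 0.0."""
    return 1.0 if test_item in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, test_item, k):
    """NDCG for a single relevant item: 1 / log2(position + 2), 0-based."""
    top_k = ranked_items[:k]
    if test_item not in top_k:
        return 0.0
    position = top_k.index(test_item)
    return 1.0 / math.log2(position + 2)


if __name__ == "__main__":
    # Toy data: two users, ten items (IDs 0..9).
    user_items = {0: {1, 3}, 1: {2}}
    train = sample_negatives(user_items, num_items=10)
    print(len(train))  # 3 positives * (1 + 4) = 15 samples

    # Hypothetical model scores for one user's candidate items.
    ranked = rank_items({7: 0.9, 3: 0.8, 5: 0.1})
    print(hit_ratio_at_k(ranked, test_item=3, k=2))
    print(ndcg_at_k(ranked, test_item=3, k=2))
```

Averaging `hit_ratio_at_k` and `ndcg_at_k` over all users gives the reported HR and NDCG. HR only checks whether the test item made the list, while NDCG also rewards placing it closer to the top.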
