Evaluation Metrics for Recommender Systems

(To reduce training time, this data was downsampled to include only ratings from users who have rated over 1000 movies, and ratings below 3 stars were removed.)

[Figure: Example of user movie ratings]

Models

Three different recommender systems are tested and compared:

Random recommender (recommends 10 random movies to each user)
Popularity recommender (recommends the top 10 most popular movies to each user)
Collaborative filter (matrix factorization approach using SVD)

Let’s dive into the metrics and diagnostic plots and compare these models!

Long Tail Plot

I like to start off every recommender project by looking at the Long Tail Plot.

[...]

An example will best illustrate how personalization is calculated.

[Figure: Example list of recommended items for 3 different users]

First, the recommended items for each user are represented as binary indicator variables (1: the item was recommended to the user; 0: the item was not recommended to the user). Then, the cosine similarity matrix is calculated across all users' recommendation vectors. Finally, the average of the upper triangle of the cosine matrix is calculated. Personalization is 1 minus this average cosine similarity.

A high personalization score indicates that users' recommendations differ from one another, meaning the model is offering a personalized experience to each user.

Intra-list Similarity

Intra-list similarity is the average cosine similarity of all items in a list of recommendations. This calculation is also best illustrated with an example.

[Figure: Example recommendations of movie ids for 3 different users]

These movie genre features are used to calculate the cosine similarity between all the items recommended to a user.

[Figure: Matrix of features for all movies recommended to user 1]

Intra-list similarity can be calculated for each user and then averaged over all users in the test set to get an estimate of intra-list similarity for the model.

If a recommender system is recommending lists of very similar items to single users (for example, a user receives only recommendations of romance movies), then the intra-list similarity will be high.

Using the right training data

There are a few things that can be done to the training data that could quickly improve a recommender system.

Remove popular items from the training data.
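
The personalization and intra-list similarity calculations described earlier can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used to produce the results in this post: the function names, the item ids "A" to "G", and the three-genre feature vectors are all hypothetical. It also assumes at least two users, and at least two items in every recommendation list, so that the upper triangle of each similarity matrix is non-empty.

```python
import numpy as np


def cosine_similarity_matrix(vectors: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of `vectors`."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid dividing by zero for all-zero rows
    unit = vectors / norms
    return unit @ unit.T


def mean_upper_triangle(sim: np.ndarray) -> float:
    """Average of the entries strictly above the diagonal."""
    rows, cols = np.triu_indices(sim.shape[0], k=1)
    return float(sim[rows, cols].mean())


def personalization(recommendations: list[list[str]]) -> float:
    """1 minus the average cosine similarity between users' lists."""
    # Build one binary indicator row per user, one column per item.
    items = sorted({item for recs in recommendations for item in recs})
    column = {item: j for j, item in enumerate(items)}
    indicators = np.zeros((len(recommendations), len(items)))
    for i, recs in enumerate(recommendations):
        for item in recs:
            indicators[i, column[item]] = 1.0
    return 1.0 - mean_upper_triangle(cosine_similarity_matrix(indicators))


def intra_list_similarity(
    recommendations: list[list[str]],
    item_features: dict[str, list[float]],
) -> float:
    """Per-user average item-to-item similarity, averaged over users."""
    per_user = []
    for recs in recommendations:
        features = np.array([item_features[item] for item in recs], dtype=float)
        per_user.append(mean_upper_triangle(cosine_similarity_matrix(features)))
    return float(np.mean(per_user))


# Hypothetical recommendation lists for 3 users.
example_recs = [["A", "B", "C"], ["A", "B", "D"], ["E", "F", "G"]]

# Hypothetical genre features: [action, comedy, romance].
example_genres = {
    "A": [1, 0, 0], "B": [1, 0, 0], "C": [0, 1, 0], "D": [1, 1, 0],
    "E": [0, 0, 1], "F": [0, 0, 1], "G": [0, 0, 1],
}

print("personalization:", personalization(example_recs))
print("intra-list similarity:", intra_list_similarity(example_recs, example_genres))
```

In this toy data, users 1 and 2 share two of their three items and user 3 shares none, so the only non-zero pairwise similarity is 2/3 and personalization works out to 1 - (2/3)/3 = 7/9. User 3 receives only romance movies, so that user's intra-list similarity is 1, which pulls the model-level average up.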
