Recommending items using item similarity score in Python/Machine Learning


Category: Machine Learning Tags: Python, Python 3


Introduction

    In the previous article we saw how to find similar items and store the results for repeated use. That is preferable to finding similar users, whose similarity scores vary too much to be pre-calculated and stored. Now that we know how to compute item similarity scores, this article shows how to use them to recommend items to a user. Before going further, I recommend reading the previous article.

Implementation

    Let’s go back to the previous article on finding item similarity, where we implemented a method, calculateSimilarItems, which pre-computes the similarity scores between items and stores them in an object. Here we are going to use that data. The online_music data and the item similarity scores we created in the last article are given below:

online_music = {
    'Donald':{'Taylor Swift':3.5,'Rihanna':3.0,'Justin Bieber':4.0},
    'Chandler':{'Taylor Swift':3.0,'Rihanna':3.5,'Justin Bieber':4.5},
    'Ruby':{'Rihanna':5.0,'Justin Bieber':2.0,'Demi Lovato':3.5, 'MJ':3.0},
    'Zoya':{'Taylor Swift': 3.0, 'Rihanna':2.0, 'Justin Bieber':4.0,'Demi Lovato':3.0},
    'Sam': {'Rihanna':3.0, 'Justin Bieber':3.5, 'MJ':4.0},
    'Robert': {'Rihanna':1.0,'Justin Bieber':2.5,'Demi Lovato':2.5}
}
item_similarity_score={
'Demi Lovato': [(1.0, 'Taylor Swift'), (0.6666666666666666, 'MJ'), (0.3567891723253309, 'Justin Bieber'), (0.2989350844248255, 'Rihanna')], 
'Rihanna': [(0.4494897427831781, 'Taylor Swift'), (0.3090169943749474, 'MJ'), (0.2989350844248255,'Demi Lovato'), (0.19292728076790167, 'Justin Bieber')], 
'Justin Bieber': [(0.4721359549995794, 'MJ'), (0.3567891723253309, 'Demi Lovato'), (0.3483314773547883, 'Taylor Swift'), (0.19292728076790167, 'Rihanna')], 
'Taylor Swift': [(1.0, 'Demi Lovato'), (0.4494897427831781, 'Rihanna'), (0.3483314773547883, 'Justin Bieber'), (0, 'MJ')], 
'MJ': [(0.6666666666666666, 'Demi Lovato'), (0.4721359549995794, 'Justin Bieber'), (0.3090169943749474, 'Rihanna'), (0, 'Taylor Swift')]
}

The first object is the online_music data used throughout this series of articles; the second is the item similarity scores generated in the previous article. There I generated only the two most similar singers for each singer; here I have listed all possible similarity scores per singer.

Fig 1: Calculation of Weighted Rating using Item Similarity Score

Now look at the calculations in the table above. The rows are the singers Donald has rated, together with the ratings he gave; the columns are the singers he has not rated. Each cell holds the item similarity score for that row and column pair. For each column we multiply every rating by the corresponding similarity score and sum the products, and we also sum the similarity scores themselves. Dividing the first sum by the second gives the weighted score.
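The table calculation can also be reproduced by hand in a few lines. The following is only a minimal sketch: the names donald_ratings, similarity and weighted_rating are made up for illustration, and the numbers are copied from the online_music and item_similarity_score data given earlier.

```python
# Donald's ratings (the rows of the table), copied from online_music.
donald_ratings = {'Taylor Swift': 3.5, 'Rihanna': 3.0, 'Justin Bieber': 4.0}

# similarity[rated_singer][unrated_singer], copied from item_similarity_score.
# The nested-dict layout is an assumption made only for this illustration.
similarity = {
    'Taylor Swift':  {'Demi Lovato': 1.0,                'MJ': 0},
    'Rihanna':       {'Demi Lovato': 0.2989350844248255, 'MJ': 0.3090169943749474},
    'Justin Bieber': {'Demi Lovato': 0.3567891723253309, 'MJ': 0.4721359549995794},
}

def weighted_rating(candidate):
    """Sum of (rating * similarity) divided by the sum of similarities."""
    weighted_sum = sum(rating * similarity[singer][candidate]
                       for singer, rating in donald_ratings.items())
    similarity_sum = sum(similarity[singer][candidate]
                         for singer in donald_ratings)
    return weighted_sum / similarity_sum

for candidate in ('MJ', 'Demi Lovato'):
    print(candidate, round(weighted_rating(candidate), 4))
```

For MJ this works out to (3.5 × 0 + 3.0 × 0.3090 + 4.0 × 0.4721) / (0 + 0.3090 + 0.4721), roughly 3.6044, and for Demi Lovato to roughly 3.5175.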

In the table we can clearly see that MJ has a higher final score than Demi Lovato, so we recommend him first. Now let’s see the code below:

def getRecommendations(music_data, itemSimilarityScores, user):
    userRatings = music_data[user]
    totalScore = {}       # sum of similarity scores per unrated item
    totalScoresProd = {}  # sum of (similarity score * rating) per unrated item
    for (item, rating) in userRatings.items():    # loop over the items rated by the user (rows)
        for (simScore, item2) in itemSimilarityScores[item]:
            # only consider items the user has not rated (columns)
            if item2 in userRatings:
                continue

            totalScore.setdefault(item2, 0)
            totalScore[item2] += simScore

            totalScoresProd.setdefault(item2, 0)
            totalScoresProd[item2] += simScore * rating

    # weighted rating = sum(score * rating) / sum(score); items whose similarity
    # scores sum to zero are skipped to avoid a ZeroDivisionError
    rankings = [(totalScoresProd[item] / totalScore[item], item)
                for item in totalScore if totalScore[item] != 0]
    rankings.sort(reverse=True)   # highest weighted rating first
    return rankings

Now execute the above method:

# generate the similarity scores with calculateSimilarItems from the previous article
similarItems = calculateSimilarItems(online_music, 10) 
print(getRecommendations(online_music, similarItems, 'Donald'))

[(3.6044091050000144, 'MJ'), (3.5174709308221592, 'Demi Lovato')]

This is the same result as the table calculation.

Conclusion

    The results above may vary depending on how many similarity scores are considered per item. Above we requested the 10 most similar singers for every singer; taking more or fewer could change the results, although on a larger dataset the effect should be small. From these results, MJ has the higher score and will be recommended ahead of Demi Lovato.
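To see how the number of similarity scores can affect the ranking, the lists can be truncated before scoring. This is only a rough sketch under stated assumptions: recommend_with_top_n and its n parameter are hypothetical names, not part of the code from either article, and only the similarity lists for the three singers Donald rated are copied from the data above.

```python
donald_ratings = {'Taylor Swift': 3.5, 'Rihanna': 3.0, 'Justin Bieber': 4.0}

# Similarity lists (most similar first) for the singers Donald rated,
# copied from item_similarity_score.
similarity_lists = {
    'Taylor Swift': [(1.0, 'Demi Lovato'), (0.4494897427831781, 'Rihanna'),
                     (0.3483314773547883, 'Justin Bieber'), (0, 'MJ')],
    'Rihanna': [(0.4494897427831781, 'Taylor Swift'), (0.3090169943749474, 'MJ'),
                (0.2989350844248255, 'Demi Lovato'), (0.19292728076790167, 'Justin Bieber')],
    'Justin Bieber': [(0.4721359549995794, 'MJ'), (0.3567891723253309, 'Demi Lovato'),
                      (0.3483314773547883, 'Taylor Swift'), (0.19292728076790167, 'Rihanna')],
}

def recommend_with_top_n(ratings, sim_lists, n):
    """Weighted ratings using only the n most similar items per rated item."""
    sim_sum = {}
    weighted_sum = {}
    for item, rating in ratings.items():
        for sim, other in sim_lists[item][:n]:   # keep the top n neighbours only
            if other in ratings:                 # skip items already rated
                continue
            sim_sum[other] = sim_sum.get(other, 0) + sim
            weighted_sum[other] = weighted_sum.get(other, 0) + sim * rating
    # skip candidates whose similarity scores sum to zero
    return sorted(((weighted_sum[i] / sim_sum[i], i)
                   for i in sim_sum if sim_sum[i] != 0), reverse=True)

for n in (2, 4):
    print(n, recommend_with_top_n(donald_ratings, similarity_lists, n))
```

On this small dataset the effect is visible: with all four scores MJ ranks first, while with only the two most similar singers per item Demi Lovato moves ahead of MJ, because Rihanna's low similarity to Demi Lovato no longer pulls her score down.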


Last modified on 11 October 2018
Nikhil Joshi