Spotify Recommendation Engine

Find New Songs on Spotify With Machine Learning in Python



I started working on this project after I finished the Spotify Wrapped project. It is a machine learning recommendation engine: if you have ever seen the Enhance button under the title of a Spotify playlist, this is a lite version of that.



My Spotify Recommendation Engine uses song metadata to recommend new songs. My training set consisted of songs that I had streamed more than 25 times, just over 1,500 songs in total. The metadata for each song covers the following musical attributes: danceability, energy, key, loudness, mode, acousticness, instrumentalness, liveness, valence, tempo, and time_signature.


I used over 55,000 songs for my test data. These songs came from all of the playlists owned by Spotify (the user account). I used the script getSpotifyDummyData.py to loop through all of Spotify's playlists and add each song's metadata to a DataFrame.
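
The collection loop described above might look roughly like the sketch below. It is not the contents of getSpotifyDummyData.py. It assumes `sp` is an authenticated `spotipy.Spotify` client (the write-up does not say which client library was used), and the helper names `iter_pages`, `collect_playlist_track_ids`, and `fetch_audio_features` are made up for illustration.

```python
def iter_pages(sp, page):
    """Yield every item from a paged Spotify response, following 'next' links."""
    while page:
        yield from page["items"]
        page = sp.next(page)  # spotipy returns None when there is no next page


def collect_playlist_track_ids(sp, user="spotify"):
    """Gather the unique track IDs found in all of a user's public playlists."""
    track_ids = set()
    for playlist in iter_pages(sp, sp.user_playlists(user)):
        if not playlist:
            continue  # skip empty playlist entries
        for item in iter_pages(sp, sp.playlist_items(playlist["id"])):
            track = (item or {}).get("track")
            if track and track.get("id"):
                track_ids.add(track["id"])
    return sorted(track_ids)


def fetch_audio_features(sp, track_ids, batch_size=100):
    """Fetch audio features in batches (the endpoint accepts up to 100 IDs)."""
    rows = []
    for start in range(0, len(track_ids), batch_size):
        batch = track_ids[start:start + batch_size]
        # Tracks without features come back as None, so drop those entries.
        rows.extend(f for f in sp.audio_features(batch) if f)
    return rows
```

The resulting list of dictionaries can be passed straight to `pandas.DataFrame(rows)` to get one row per song.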


Once I had my target data, I tagged all of my top songs with a 1 and all of the songs from Spotify's playlists with a 0. Some of my top songs also appear in Spotify's playlists, and leaving them in both groups would give the model an artificially high accuracy. To avoid that, I checked all the URI values and removed any matches from the test set.
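
A minimal sketch of the tagging and de-duplication step, assuming both song sets are pandas DataFrames. The column names `uri` and `liked` and the function name `build_labeled_dataset` are illustrative assumptions, not names taken from the project.

```python
import pandas as pd


def build_labeled_dataset(top_songs: pd.DataFrame,
                          playlist_songs: pd.DataFrame) -> pd.DataFrame:
    """Label top songs 1 and playlist songs 0, dropping playlist rows
    whose URI also appears among the top songs."""
    top = top_songs.drop_duplicates(subset="uri").assign(liked=1)

    # True for playlist songs that are also one of the top streamed songs.
    overlap = playlist_songs["uri"].isin(top["uri"])
    others = (
        playlist_songs.loc[~overlap]
        .drop_duplicates(subset="uri")
        .assign(liked=0)
    )
    return pd.concat([top, others], ignore_index=True)
```

Matching on the URI rather than on title or artist keeps the comparison exact, since each track has one URI.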


I used SMOTE (Synthetic Minority Over-sampling Technique) to oversample the 1s, because there was a 1:36 ratio of my top streamed songs to songs that were not top streamed.
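
To show what SMOTE does, here is a simplified NumPy illustration of its core idea: each synthetic row is placed at a random point on the line between a minority-class row and one of its nearest minority-class neighbors. This is a teaching sketch, not the library implementation and not necessarily the code used in this project. The function name `smote_like_oversample` and its parameters are hypothetical.

```python
import numpy as np


def smote_like_oversample(X_minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority rows
    and their k nearest minority-class neighbors (simplified SMOTE idea)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X_minority, dtype=float)
    n = len(X)
    if n < 2:
        raise ValueError("need at least two minority rows to interpolate")
    k = min(k, n - 1)

    # Pairwise squared Euclidean distances between minority rows.
    sq = (X ** 2).sum(axis=1)
    dists = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(dists, np.inf)  # a row is never its own neighbor
    neighbors = np.argsort(dists, axis=1)[:, :k]

    base = rng.integers(0, n, size=n_new)                     # starting rows
    picked = neighbors[base, rng.integers(0, k, size=n_new)]  # one neighbor each
    gap = rng.random((n_new, 1))                              # position on the segment
    return X[base] + gap * (X[picked] - X[base])
```

In practice the imbalanced-learn package provides this as `imblearn.over_sampling.SMOTE`, used through `fit_resample(X, y)`. Oversampling is usually applied only to the rows the model is fitted on, so that evaluation is done on real songs.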


I used a decision tree classification model to predict songs I would like based on my top streamed songs. After training and testing the model, I used predict_proba to get the predicted probabilities for each song. predict_proba returns a two-element array for each tested instance. The first value is the probability that the instance belongs to the negative class (a song I would not like), and the second value is the probability that it belongs to the positive class (a song I would like). I opted for a threshold of 0.50 on the positive-class probability so I could find some new songs I would hopefully like.
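
The predict_proba and threshold step can be sketched with scikit-learn as follows. The data here is synthetic and stands in for the real audio features, and `max_depth=3` is an assumed setting, since the write-up does not list the model's hyperparameters.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for two audio features (e.g. danceability, energy).
X_liked = rng.normal(loc=0.8, scale=0.05, size=(50, 2))  # label 1
X_other = rng.normal(loc=0.2, scale=0.05, size=(50, 2))  # label 0
X = np.vstack([X_liked, X_other])
y = np.array([1] * 50 + [0] * 50)

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Two unseen candidate songs.
candidates = np.array([[0.75, 0.70], [0.25, 0.30]])
proba = clf.predict_proba(candidates)  # shape: (n_songs, 2)

# Columns follow clf.classes_, so look up the positive class explicitly.
liked_col = list(clf.classes_).index(1)
p_like = proba[:, liked_col]

THRESHOLD = 0.50
recommended = p_like >= THRESHOLD  # boolean mask of songs to recommend
```

A fully grown decision tree tends to output probabilities of exactly 0 or 1, because its leaves are pure. Limiting the depth gives more graded probabilities, which makes the choice of threshold more meaningful.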


Once I had the URIs of songs that I would like, I used the Spotify API to create a playlist and add the songs by their URI. Take a look and a listen below!
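
The playlist step could be sketched as below, again assuming `sp` is an authenticated `spotipy.Spotify` client with a playlist-modify scope. The function names and the choice of a private playlist are illustrative assumptions. The Spotify API accepts at most 100 tracks per add request, so the URIs are sent in batches.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def create_recommendation_playlist(sp, user_id, name, track_uris):
    """Create a playlist and fill it with the recommended track URIs."""
    playlist = sp.user_playlist_create(user_id, name, public=False)
    for batch in chunked(list(track_uris), 100):
        # Each call appends up to 100 tracks to the end of the playlist.
        sp.playlist_add_items(playlist["id"], batch)
    return playlist["id"]
```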



The entire code is below.