
Certificate of Completion

THIS ACKNOWLEDGES THAT
Beverly Warner
HAS COMPLETED THE SUMMER 2025 DATA SCIENCE BOOT CAMP

Roman Holowinsky, PhD
DIRECTOR

JULY 08, 2025
DATE

Vibes-based music recommendation: predicting the contents of Spotify playlists by clustering and emotional content

TEAM
Cheyenne Wakeland-Hart, Arlene Lormestoire, Beverly Warner, Daniel Bragg

Our team developed a music recommendation system that predicts which songs a user is likely to add to a given playlist. We used the Spotify Million Playlist Dataset, a collection of playlists created by Spotify users. Our model encodes each song on a given playlist as a vector reflecting the playlists on which that song appears. We use k-means clustering to group these vectors around five focal points (reflecting existing structure we discovered in the dataset). We then search for songs to add that minimize the cosine distance (that is, maximize the cosine similarity) to existing playlists across the cluster centers. Finally, we developed a refined model that incorporates statistics on the emotional content of songs: we scraped lyrics from Genius.com using the Python library BeautifulSoup4, then used a natural language processing model trained on Google's GoEmotions dataset to extract an emotion vector for each song.
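The clustering and ranking steps described above can be illustrated with a minimal sketch. It is not the team's implementation. It assumes a binary song-by-playlist encoding, a toy dataset with two clusters instead of five, and a scoring rule (highest cosine similarity to any cluster center) chosen only for illustration. The function names song_vectors, kmeans and recommend are hypothetical, and a real pipeline would more likely use NumPy or scikit-learn than the standard library.

```python
import math
import random

def song_vectors(playlists):
    """Encode each song as a binary vector: entry j is 1.0 if the song is on playlist j."""
    songs = sorted({song for playlist in playlists for song in playlist})
    return {s: [1.0 if s in p else 0.0 for p in playlists] for s in songs}

def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm with Euclidean distance; returns k cluster centers."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster ends up empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers

def recommend(playlist, vectors, k=2, top_n=3):
    """Rank songs not on the playlist by their best cosine similarity to a cluster center.

    Assumption: taking the maximum over centers is one possible aggregation;
    the source does not specify how scores are combined across centers.
    """
    seeds = [vectors[s] for s in playlist]
    centers = kmeans(seeds, min(k, len(seeds)))
    candidates = [s for s in vectors if s not in playlist]

    def score(song):
        return max(cosine_similarity(vectors[song], c) for c in centers)

    return sorted(candidates, key=score, reverse=True)[:top_n]

# Toy data (hypothetical): four playlists over seven songs.
playlists = [["a", "b", "c"], ["a", "b", "d"], ["e", "f"], ["e", "f", "g"]]
vectors = song_vectors(playlists)
print(recommend(["a", "b"], vectors, k=2, top_n=2))
```

In this toy example, songs "c" and "d" share playlists with "a" and "b", so they rank ahead of "e", "f" and "g", which never co-occur with the seed songs.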