
Certificate of Completion
THIS ACKNOWLEDGES THAT
Nadia Khoury
HAS COMPLETED THE SUMMER 2025 DATA SCIENCE BOOT CAMP
Roman Holowinsky, PhD
DIRECTOR
JULY 08, 2025
DATE

TEAM
Spotify Playlist Continuation and Genre Inference
Nick Geis, Jonathan Bloom, Nadia Khoury

This project explores algorithmic playlist continuation and genre inference using the Spotify Million Playlist Dataset and the AcousticBrainz database. We develop three models that predict subsequent tracks by exploiting co-occurrence patterns: a random baseline, a graph-based KNN approach using an implicit distance metric, and an SVD-based model that reduces the dimensionality of a space of more than 2 million tracks. We quantify performance using four metrics: bulls-eye (exact match), vibe (cosine similarity), artist match, and album match.
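To illustrate the co-occurrence idea and the four metrics, here is a minimal sketch in plain Python. It is not the team's implementation: the function names, the toy playlists, the summed-count scoring rule, and the choice of which vectors the vibe metric compares are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(playlists):
    """Count how often each pair of tracks shares a playlist."""
    cooc = defaultdict(Counter)
    for playlist in playlists:
        # Deduplicate so a repeated track is counted once per playlist.
        for a, b in combinations(sorted(set(playlist)), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc


def continue_playlist(seed_tracks, cooc, n=3):
    """Rank candidates by summed co-occurrence with the seeds (assumed rule)."""
    seeds = set(seed_tracks)
    scores = Counter()
    for seed in seeds:
        for track, count in cooc.get(seed, {}).items():
            if track not in seeds:
                scores[track] += count
    # Highest score first; ties broken by track ID for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [track for track, _ in ranked[:n]]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def score_prediction(predicted, actual, info):
    """Compute the four metrics for one predicted track against the true one."""
    p, a = info[predicted], info[actual]
    return {
        "bulls_eye": predicted == actual,
        # Assumption: vibe compares per-track feature vectors.
        "vibe": cosine_similarity(p["vector"], a["vector"]),
        "artist_match": p["artist"] == a["artist"],
        "album_match": p["album"] == a["album"],
    }


# Hypothetical toy data; real track IDs and metadata would come from the dataset.
playlists = [["t1", "t2", "t3"], ["t1", "t2", "t4"], ["t2", "t3", "t4"], ["t1", "t3"]]
track_info = {
    "t3": {"artist": "A", "album": "X", "vector": [1.0, 0.0]},
    "t4": {"artist": "A", "album": "Y", "vector": [0.0, 1.0]},
}

cooc = build_cooccurrence(playlists)
top = continue_playlist(["t1", "t2"], cooc, n=2)
scores = score_prediction("t3", "t4", track_info)
print(top)
print(scores)
```

On the toy data, a prediction of `t3` against a true next track of `t4` misses on bulls-eye and album but still earns an artist match, which shows why the softer metrics are useful alongside exact match.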
For genre prediction, we identify tracks present in both datasets and extract their sonic and genre features from AcousticBrainz. Using UMAP, we project the 2.2-million-track space into lower dimensions, then apply KNN to propagate genre labels from the labeled subset to unlabeled tracks. This lets us make and test predictions of track and artist genre from the playlists' co-occurrence information.
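The label-propagation step can be sketched as a majority vote among the nearest labeled neighbors. This is a simplified illustration under stated assumptions, not the project's code: the 2-D coordinates below are hypothetical stand-ins for a UMAP projection, and the function name, Euclidean distance, and `k=3` are illustrative choices.

```python
import math
from collections import Counter


def knn_propagate(labeled, unlabeled, k=3):
    """Assign each unlabeled point the majority genre of its k nearest labeled points.

    labeled:   dict of track_id -> (coords, genre)
    unlabeled: dict of track_id -> coords
    """
    predictions = {}
    for track_id, point in unlabeled.items():
        # Sort labeled tracks by Euclidean distance to the query and keep k.
        neighbors = sorted(
            labeled.values(), key=lambda item: math.dist(point, item[0])
        )[:k]
        votes = Counter(genre for _, genre in neighbors)
        predictions[track_id] = votes.most_common(1)[0][0]
    return predictions


# Hypothetical low-dimensional coordinates standing in for a UMAP embedding.
labeled = {
    "a": ((0.0, 0.0), "rock"),
    "b": ((0.1, 0.2), "rock"),
    "c": ((0.2, 0.1), "rock"),
    "d": ((5.0, 5.0), "jazz"),
    "e": ((5.1, 4.9), "jazz"),
    "f": ((4.8, 5.2), "jazz"),
}
unlabeled = {"u1": (0.1, 0.1), "u2": (5.0, 5.1)}

predictions = knn_propagate(labeled, unlabeled, k=3)
print(predictions)
```

Holding out some labeled tracks and treating them as unlabeled gives a simple way to test such predictions against known genres.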
