Team Lime: Spotify Podcast Recommender
Thursday, February 16, 2023
Congratulations to Team Lime on winning The Erdős Institute’s Fall 2022 Data Science Boot Camp with their project: Spotify Podcast Recommender!
Composed of Music Theory Ph.D. Candidate Aditya Chander (Yale University), Economics Postdoc Ritika Khurana (University of Delaware), Sociology Ph.D. Student Yuchen Luo (New York University), and recent Linguistics Ph.D. and Market Researcher Taylor Mahler (The Ohio State University), Team Lime used Spotify’s podcast dataset to build a podcast recommendation system. Their model takes one of two inputs, either the name of a podcast episode or a description of a podcast or podcast episode of interest, and suggests similar podcast episodes to the user. The relevance and relatability of the suggestions were confirmed by measuring the similarity between podcasts within user-tagged categories. With more time, the team would also like to add a user feedback option so that the model can be continuously retrained for improved recommendations. Team Lime further suggests that the system’s applications are not limited to maintaining user engagement: advertisers could also use it to increase revenue by targeting connected podcasts to advertise diverse products, avoiding repetitive advertising to the same listeners. Ultimately, of the two models they tried, the pre-trained transformer model kept between-category similarity scores lower than within-category scores for 88.3% of the ordered category pairs, compared to 75.1% for the other model. They therefore selected the pre-trained transformer model for their recommendation app.
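The similarity-based recommendation and the category-pair check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Team Lime’s actual implementation: it assumes each episode has already been converted into an embedding vector (for example by a pre-trained transformer), it uses cosine similarity as the similarity score, and it reads the "ordered category pairs" criterion as follows: for each ordered pair of distinct categories (A, B), the mean similarity between episodes of A and episodes of B should be lower than the mean similarity among A’s own episodes. The function names, toy vectors, and category labels are all hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity between the rows of `embeddings`."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return unit @ unit.T


def recommend(query_index, embeddings, top_k=3):
    """Indices of the `top_k` episodes most similar to the query episode."""
    sims = cosine_similarity_matrix(embeddings)[query_index].copy()
    sims[query_index] = -np.inf  # never recommend the query itself
    return [int(i) for i in np.argsort(-sims)[:top_k]]


def ordered_pair_score(embeddings, categories):
    """Fraction of ordered category pairs (a, b), a != b, whose mean
    between-category similarity is lower than a's mean within-category
    similarity. Assumes every category has at least two episodes."""
    sims = cosine_similarity_matrix(embeddings)
    categories = np.asarray(categories)
    labels = sorted(set(categories.tolist()))
    hits, total = 0, 0
    for a in labels:
        in_a = categories == a
        block = sims[np.ix_(in_a, in_a)]
        # Drop the diagonal: an episode's similarity to itself is always 1.
        off_diagonal = ~np.eye(block.shape[0], dtype=bool)
        within = block[off_diagonal].mean()
        for b in labels:
            if a == b:
                continue
            in_b = categories == b
            between = sims[np.ix_(in_a, in_b)].mean()
            hits += int(between < within)
            total += 1
    return hits / total


# Hypothetical toy data: four episodes, two user-tagged categories.
toy_embeddings = np.array([
    [1.0, 0.1],
    [0.9, 0.2],
    [0.1, 1.0],
    [0.2, 0.9],
])
toy_categories = ["comedy", "comedy", "science", "science"]

print(recommend(0, toy_embeddings, top_k=1))
print(ordered_pair_score(toy_embeddings, toy_categories))
```

On this toy data, the nearest neighbour of episode 0 is episode 1, and both ordered category pairs satisfy the criterion, so the score is 1.0. Under this reading, a real evaluation would report the same fraction over all ordered pairs of user-tagged categories, which is the kind of figure quoted above (88.3% versus 75.1%).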
When discussing how the team settled on this dataset and specific project, Ritika explained that she wanted to try something different. Aditya, who has a music background, also wanted to expand his horizons. Since he had worked with Spotify’s API in the past and had some familiarity with Natural Language Processing (NLP), this project was a natural extension of both their interests. Taylor and Yuchen joined the group later, both drawn to the NLP aspect of the project. Taylor’s Ph.D. is in linguistics, but she had not previously worked with NLP. Yuchen’s experience with NLP had been more theoretical, as part of her sociology studies, and she was excited to apply that knowledge to something practical that a company would like.
At the end of the project, they were excited to have a finished product. For Yuchen, the most rewarding part was “when we see the finished project and we realize, wait, it actually works, that I think the recommended episodes make a lot of intuitive sense.” For Ritika, “learning new skills, and definitely, at the end, when we realized that we won the project,” was great, “but for me, the biggest or most rewarding part was that this was my first Python project.” Taylor found that “to sit down and think about what this would mean for an actual business and actual users, because I have very limited experience outside of academia, [to] realize that it actually has business value, I think was rewarding.” Aditya agreed that it was exciting to have a product at the end: “From that perspective, knowing that (what Yuchen said) we had a product at the end of it, it wasn’t just a series of insights that maybe would have led to something else, we had a concrete deliverable app.”
The team noted that with more time and computational resources, they envision adding more features to the model and improving their app. For instance, they would like to continually retrain the model on user feedback about the generated recommendations, and to include episode descriptions both in the results and in the modeling process. Following the completion of their project, though Aditya still mostly listens to music, he now listens to more podcasts and has used their app for recommendations. Taylor plans to try it to help her husband find a new podcast now that his favorite one has ended; she mostly listens to interviews or podcasts on topics she’s interested in learning about. Ritika likes to listen to Hidden Brain and Trained, whose topics vary widely depending on the speaker, from science to philosophy. Yuchen enjoys podcasts about anime and book summaries since she doesn’t have much time to read outside of work.
Team Lime attributes much of their success to organization and clear delegation of tasks. They highly recommend holding weekly meetings to keep each other accountable and making clear to-do lists after each meeting so that everyone knows their task(s). Furthermore, though it is good to consider small details, it is important not to lose sight of the big picture or the end goals and deliverables of the project. Two other factors in their success that they highlight are pair programming and great advice from their project mentor, Gleb Zhelezov.
Congratulations again to Team Lime as well as all of the other teams who completed a Fall 2022 Data Science Boot Camp project!
Marcos Ortiz, Kristina Knowles, Fatih Catpinar, Diptanil Roy
This team will work on the corporate NLP project from Aware. We will replicate an internal RAG-building effort at Aware, a Columbus software startup that uses AI in the digital workplace to help customers reduce operational costs and drive insights from their internal data (such as those derived from HR surveys, Slack conversations, or email).
See the post from Lindsay Warrenburg in Slack for further details.