
Certificate of Completion


THIS ACKNOWLEDGES THAT

Ethan Zell

HAS COMPLETED THE SPRING 2022 DATA SCIENCE BOOT CAMP


Roman Holowinsky, PhD

DIRECTOR

JUNE 08, 2022

DATE

TEAM

Supermassive Black Hole

Anna Brosowsky, Sayantan Khan, Nancy Wang, Ethan Zell, Yili Zhang


We built a movie-finder app: the user enters whatever details they remember about a movie, along with optional genre and release-year filters, and the app predicts which movie they are thinking of. To solve this NLP problem, our tool uses an embed-and-rerank model. We precomputed vector representations of the plot information for the approximately 34,000 movies in our dataset.

Our model’s first step is to vectorize the user’s query and run a fast comparison to find the 100 closest plot vectors. It then reranks these 100 plots with a more thorough comparison, using a neural network that semantically compares the plot fragments with the original query. Finally, we output the 10 movies at the top of this new ranking.
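The two-stage pipeline described above can be sketched in a few lines of Python. This is only an illustration of the embed-and-rerank pattern, not our actual code: the four movies are invented, `embed` is a bag-of-words stand-in for the real plot vectorizer, and `rerank_score` is a word-overlap stand-in for the neural network. Applying the genre and year filters before the first stage is also an assumption made for this sketch.

```python
import math
from collections import Counter

# Hypothetical toy dataset: (title, year, genre, plot). The real dataset has ~34,000 movies.
MOVIES = [
    ("Space Rescue", 1998, "sci-fi",
     "an astronaut is stranded on a distant planet and must survive alone"),
    ("Harbor Nights", 1975, "drama",
     "a fisherman and his daughter struggle to keep their boat"),
    ("Red Planet Farm", 2015, "sci-fi",
     "a stranded astronaut grows potatoes on mars to survive until rescue"),
    ("Clockwork Heist", 2003, "crime",
     "a crew of thieves plans a robbery of a casino vault"),
]


def embed(text):
    """Stand-in embedding (assumption): sparse bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank_score(query, plot):
    """Stand-in for the neural reranker (assumption): share of distinct
    query words that also occur in the plot."""
    q_words = set(query.lower().split())
    p_words = set(plot.lower().split())
    return len(q_words & p_words) / len(q_words) if q_words else 0.0


# Plot vectors are computed once, ahead of time, not per query.
PLOT_VECTORS = [embed(movie[3]) for movie in MOVIES]


def find_movies(query, genre=None, year_range=None, n_candidates=100, n_results=10):
    """Stage 1: fast vector comparison. Stage 2: slower rerank of the shortlist."""
    query_vec = embed(query)
    scored = []
    for movie, plot_vec in zip(MOVIES, PLOT_VECTORS):
        _, year, movie_genre, _ = movie
        if genre is not None and movie_genre != genre:
            continue
        if year_range is not None and not (year_range[0] <= year <= year_range[1]):
            continue
        scored.append((cosine(query_vec, plot_vec), movie))
    # Keep only the closest plots for the more expensive second stage.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    shortlist = [movie for _, movie in scored[:n_candidates]]
    reranked = sorted(shortlist, key=lambda m: rerank_score(query, m[3]), reverse=True)
    return [movie[0] for movie in reranked[:n_results]]
```

Calling `find_movies("astronaut grows potatoes on mars")` ranks the toy title "Red Planet Farm" first. The defaults of 100 candidates and 10 results mirror the numbers in the description; only the shortlist ever reaches the slower second-stage scorer, which is the point of splitting the work into two stages.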

github URL