Certificate of Completion

THIS ACKNOWLEDGES THAT

Jiuqin Wei

HAS COMPLETED THE SPRING 2024 DATA SCIENCE BOOT CAMP

Roman Holowinsky, PhD

DIRECTOR

MAY 01, 2024

DATE

TEAM

Data Science - Economists

Muhammad Usman Taj, Jiuqin Wei, Di Kang, Fang Li, Estefania Padilla Gonzalez

In this study, we examine a dataset of movie information to identify groups of similar movies based on their profitability. We used K-Means clustering, a widely used unsupervised machine learning method. The dataset includes attributes such as each movie's title, release year, revenue, budget, and genres. After preprocessing the data to handle missing values and ensure data compatibility, we applied the clustering technique. Our results indicate that the number of votes is the most influential factor in a movie's profitability, with popularity and runtime also contributing notably.
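
As a rough illustration of the clustering step described above, here is a minimal sketch of K-Means (Lloyd's algorithm) in plain Python. Everything specific in it is an assumption made for illustration and is not the team's data or code: the six movie records are made up, the choice of features (profit = revenue - budget, plus vote count), the value k = 2, and the helper names `standardize` and `kmeans` are all hypothetical.

```python
import math
import random

# Hypothetical records: (title, budget, revenue, vote_count), money in $ millions.
movies = [
    ("A", 10, 12, 150), ("B", 12, 15, 200), ("C", 8, 9, 120),
    ("D", 150, 600, 9000), ("E", 200, 800, 12000), ("F", 180, 700, 10000),
]

# Assumed features: profit (revenue - budget) and number of votes.
features = [[rev - bud, votes] for _, bud, rev, votes in movies]

def standardize(rows):
    """Rescale each column to zero mean and unit variance (z-scores)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids

labels, _ = kmeans(standardize(features), k=2)

# Summarize each cluster by its mean profit in the original units.
for j in sorted(set(labels)):
    profits = [f[0] for f, lab in zip(features, labels) if lab == j]
    titles = [m[0] for m, lab in zip(movies, labels) if lab == j]
    print(f"cluster {j}: {titles}, mean profit = {sum(profits) / len(profits):.1f}")
```

The features are standardized first because K-Means relies on Euclidean distance, so a column with a large numeric range (vote counts here) would otherwise dominate the result. In practice a library implementation such as scikit-learn's `KMeans` would likely be used instead of hand-written code.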
