TEAM
Data Science - Economists
Muhammad Usman Taj, Jiuqin Wei, Di Kang, Fang Li, Estefania Padilla Gonzalez
In this study, we examine a dataset of movie information to identify groups of similar movies based on their profitability. We use K-Means clustering, a widely used unsupervised machine learning method. The dataset includes each movie's title, release year, revenue, budget, and genres. After preprocessing the data to address missing values and ensure data compatibility, we applied the clustering technique. Our results indicate that the number of votes is the most influential factor in a movie's profitability, with popularity and runtime also contributing notably.
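To make the clustering step concrete, here is a minimal sketch of K-Means (Lloyd's algorithm) in plain Python. It is not the team's implementation, which is not shown here. The records are made-up illustrative values, not rows from the study's dataset. Defining profit as revenue minus budget, using two features (profit and vote count), and setting `k=2` are all assumptions, as are the helper names `standardize` and `kmeans`.

```python
import random

# Made-up illustrative records, NOT taken from the study's dataset:
# (budget, revenue, vote_count), with money in millions of dollars.
movies = [
    (10.0, 12.0, 150),
    (15.0, 14.0, 200),
    (8.0, 9.5, 120),
    (150.0, 700.0, 9000),
    (200.0, 850.0, 11000),
    (120.0, 600.0, 8000),
]

# Assumption: profit is defined as revenue minus budget.
features = [(revenue - budget, votes) for budget, revenue, votes in movies]


def standardize(rows):
    """Scale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        (sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Hypothetical helper: returns (labels, centroids) via Lloyd's algorithm."""
    rng = random.Random(seed)
    # Initialise centroids as copies of k randomly chosen points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


labels, centroids = kmeans(standardize(features), k=2)
for (budget, revenue, votes), label in zip(movies, labels):
    print(f"profit={revenue - budget:7.1f}  votes={votes:6d}  cluster={label}")
```

Standardizing first matters because K-Means relies on Euclidean distance: without it, the feature with the largest numeric range (here, vote count) would dominate the assignment step regardless of how informative it is.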