Karan Srivastava, Benjamin Sheller, Anya Michaelsen, Alejandra Castillo
Our project aims to help food industry stakeholders, such as online recipe repositories and platforms that generate food or restaurant recommendations, understand how a recipe's ingredients relate to its cuisine and how different cuisines relate to one another. To study this, our team used both classification and clustering models. Our best classification model, logistic regression, achieved a predictive accuracy of 77.9%, with linear SVC a close second at 76.8%. The logistic regression model also offers insight into the top ingredients for each cuisine. Two clustering algorithms, k-means and hierarchical clustering, revealed natural groupings of cuisines and indicated which cuisines were most similar within those groups.