Project Database
View Team Project Submissions for various cohorts and programs below:
38 results were found.
FALL 2023
TEAM
The Silent Emergency - Predicting Preterm Birth

Data Science Boot Camp
Katherine Grillaert, Divya Joshi, Alexander Sutherland, Kristina Zvolanek, Noah Rahman
Preterm birth is a primary cause of infant mortality and morbidity in the United States, affecting approximately 1 in 10 births. The rates are notably higher among Black women (14.6%), compared to White (9.4%) and Hispanic women (10.1%). Despite its prevalence, predicting preterm birth remains challenging due to its multifaceted etiology rooted in environmental, biological, genetic, and behavioral interactions. Our project harnesses machine learning techniques to predict preterm birth using electronic health records. This data intersects with social determinants of health, reflecting some of the interactions contributing to preterm birth. Recognizing that under-representation in healthcare research perpetuates racial and ethnic health disparities, we take care to use diverse data to ensure equitable model performance across underrepresented populations.
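One way to check the equitable-performance goal described above is to compute a metric separately for each demographic group and compare the results. The sketch below is illustrative only: the labels, predictions, group names, and the helper `recall_by_group` are hypothetical and are not taken from the team's code or data.

```python
from collections import defaultdict

def recall_by_group(y_true, y_pred, groups):
    """Return recall (true-positive rate) per group.

    y_true, y_pred: sequences of 0/1 labels (1 = preterm birth).
    groups: sequence of group identifiers, one per record.
    Groups with no positive cases are omitted from the result.
    """
    true_pos = defaultdict(int)
    positives = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] += 1
            if pred == 1:
                true_pos[group] += 1
    return {g: true_pos[g] / positives[g] for g in positives}

# Toy, made-up records purely for illustration.
y_true = [1, 1, 0, 1, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 0, 1]
groups = ["A", "A", "A", "B", "B", "B", "C", "C"]

per_group = recall_by_group(y_true, y_pred, groups)
# Largest difference in recall between any two groups.
gap = max(per_group.values()) - min(per_group.values())
```

A large gap between groups would suggest the model misses preterm births more often in some populations than in others, which is the kind of disparity the abstract aims to avoid.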



FALL 2023
TEAM
Will my flight be late?

Data Science Boot Camp
Simon Guichandut, Ketan Sand, Tim Hallatt
Flight delays are not only bothersome but also widespread, causing over 200,000 hours of combined delay annually in just 20 of the busiest airports in the United States. This results in a staggering $32.9 billion annual economic loss for the US. The ability to understand the contributing factors and predict delays is crucial for better preparation and minimizing the impact. To address this issue, we used 12 years of data from the US Bureau of Transportation Statistics (https://www.transtats.bts.gov/HomeDrillChart.asp). The dataset was refined to focus on flights between the 20 busiest airports, operated by the top 8 airline carriers in the US. We trained a random forest model to predict both the likelihood of a delay and its duration. A user-friendly website (https://willmyflightbelate.streamlit.app/) was developed to enhance the overall experience.
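As a rough illustration of predicting both the likelihood and the duration of a delay with random forests, the sketch below uses scikit-learn on synthetic data. The features, the 15-minute delay threshold, the made-up data-generating rule, and the two-stage split (classifier, then regressor on delayed flights only) are all assumptions for the example, not the team's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in features (hypothetical): scheduled hour, day-of-week index, carrier id.
n = 500
X = np.column_stack([
    rng.integers(0, 24, n),
    rng.integers(0, 7, n),
    rng.integers(0, 8, n),
])
# Made-up rule: later flights are delayed more often and for longer.
delay_minutes = np.clip(rng.normal(2.0 * X[:, 0] - 15, 10.0), 0, None)
is_delayed = (delay_minutes > 15).astype(int)  # assumed 15-minute threshold

# Stage 1: classifier for the probability of a delay.
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X, is_delayed)

# Stage 2: regressor for the delay duration, fit on delayed flights only.
delayed = is_delayed == 1
reg = RandomForestRegressor(n_estimators=50, random_state=0)
reg.fit(X[delayed], delay_minutes[delayed])

flight = np.array([[18, 4, 2]])  # hour 18, day-of-week index 4, carrier 2
p_delay = clf.predict_proba(flight)[0, 1]
expected_minutes = reg.predict(flight)[0]
```

Here `p_delay` is the estimated probability that the flight is delayed, and `expected_minutes` is the estimated duration given that a delay occurs.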
FALL 2023
TEAM
Mu 'n I: Direction Detection

Data Science Boot Camp
Christopher Stith, Katja Vassilev, Benjamin Riley, Lukas Scheiwiller, Chinmaya Kausik
The goal of this project is to determine the direction of incoming neutrinos using detection data from the IceCube Neutrino Observatory posted on Kaggle. The IceCube detector indirectly observes high-energy neutrinos from incoming cosmic radiation. IceCube wants a data-driven estimate of the direction to feed into its software, which then calculates the precise direction. We first fit several linear regression models, including some implemented in TensorFlow, and then trained a convolutional and a fully connected neural network in PyTorch. These networks were trained using features provided by IceCube and additional features used in the regression.
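Direction estimates of this kind are commonly scored by the angle between the predicted and true directions. The sketch below assumes each direction is given as an (azimuth, zenith) pair in radians; the function names are hypothetical and this is not the team's evaluation code.

```python
import math

def to_unit_vector(azimuth, zenith):
    """Convert (azimuth, zenith) in radians to a 3D unit vector."""
    return (
        math.cos(azimuth) * math.sin(zenith),
        math.sin(azimuth) * math.sin(zenith),
        math.cos(zenith),
    )

def angular_error(az_true, zen_true, az_pred, zen_pred):
    """Angle in radians between the true and predicted directions."""
    a = to_unit_vector(az_true, zen_true)
    b = to_unit_vector(az_pred, zen_pred)
    dot = sum(x * y for x, y in zip(a, b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot)))

def mean_angular_error(truths, preds):
    """Average angular error over paired (azimuth, zenith) tuples."""
    errors = [angular_error(*t, *p) for t, p in zip(truths, preds)]
    return sum(errors) / len(errors)
```

An error of 0 means a perfect prediction, and an error of pi means the prediction points in exactly the opposite direction.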



FALL 2023
TEAM
BrewSavvy

Data Science Boot Camp
Timothy Alland, Brandon Butler, Phuc Nguyen, Aidan Lorenz
We built a beer recommender app that recommends beers to a user based on a list of beers that the user likes. The underlying model uses matrix factorization trained on a data set of ~1.5 million reviews with ~65,000 different beers and ~33,000 users.
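To make the matrix factorization idea concrete, here is a minimal sketch that learns user and item factors with stochastic gradient descent on a tiny made-up ratings set. The hyperparameters, the toy data, and the function names are assumptions for illustration. It also recommends to a user already in the training data, which is a simplification of an app that takes a list of liked beers as input.

```python
import random

def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item factor vectors."""
    return sum(p * q for p, q in zip(P[u], Q[i]))

def train_mf(ratings, n_users, n_items, n_factors=2, lr=0.01, reg=0.02,
             epochs=200, seed=0):
    """Fit factor matrices P (users) and Q (items) on (user, item, rating) triples."""
    rnd = random.Random(seed)
    P = [[rnd.uniform(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rnd.uniform(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for k in range(n_factors):
                pu, qi = P[u][k], Q[i][k]
                # Gradient step with L2 regularization.
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(P, Q, ratings):
    """Root-mean-square error of predictions on the given ratings."""
    return (sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
            / len(ratings)) ** 0.5

def recommend(P, Q, user, rated_items, top_n=2):
    """Rank items the user has not rated by predicted rating."""
    candidates = [i for i in range(len(Q)) if i not in rated_items]
    return sorted(candidates, key=lambda i: predict(P, Q, user, i),
                  reverse=True)[:top_n]

# Toy (user, beer, rating) triples, invented for this example.
ratings = [
    (0, 0, 5), (0, 1, 4), (0, 3, 1),
    (1, 0, 4), (1, 2, 5), (1, 4, 2),
    (2, 1, 2), (2, 3, 5), (2, 4, 4),
    (3, 0, 1), (3, 3, 4), (3, 4, 5),
]
P0, Q0 = train_mf(ratings, n_users=4, n_items=5, epochs=0)  # untrained baseline
P, Q = train_mf(ratings, n_users=4, n_items=5)
initial_rmse = rmse(P0, Q0, ratings)
final_rmse = rmse(P, Q, ratings)
recs = recommend(P, Q, user=0, rated_items={0, 1, 3})
```

Comparing `final_rmse` with `initial_rmse` shows how much the learned factors improve on the random starting point, and `recs` lists the unrated beers for user 0 in order of predicted rating.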



FALL 2023
TEAM
Biomedical Categorization

Data Science Boot Camp
Shayne Plourde, Gary Hu, Michelle Lobb, Donna Chen
Diabetes is a major issue in the world, impacting 8.5% of adults and killing 1.5 million people in 2019 according to the World Health Organization. Diabetes is a chronic disease that affects how the body regulates blood glucose levels. Over time, having raised blood glucose levels may lead to serious damage to the nerves and blood vessels, leading to further complications.
The goal of this project is to better understand the relationship between lifestyle factors and diabetes and subsequently predict whether an individual has diabetes or not, based on a survey questionnaire.
FALL 2023
TEAM
DDTs: Dementia Detection Tool

Data Science Boot Camp
Himanshu Khanchandani, Clark Butler, Cisil Karaguzel, Selman Ipek, Shreya Shukla
Alzheimer’s disease (AD) is one of the most common types of dementia and frequently affects the elderly. Electroencephalography (EEG) is a non-invasive technique that measures brain activity using external electrodes and may help improve the diagnosis of AD. In this project we use the power spectrum of EEG signals to build a robust machine learning classifier that predicts whether a patient has Alzheimer's or is healthy. We vastly improve upon existing models in the literature by using modified versions of the features used there.
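Power-spectrum features of the kind described above are often summarized as relative power in standard frequency bands. The sketch below computes them with a naive discrete Fourier transform. The band edges, the sampling rate, and the synthetic test signal are assumptions for illustration, and the team's modified features are not reproduced here.

```python
import cmath
import math

# Hypothetical band edges in Hz; conventions vary across studies.
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 25), "gamma": (25, 45)}

def power_spectrum(signal, fs):
    """Return (frequencies, power) for non-negative frequencies via a naive DFT."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power

def relative_band_power(signal, fs, bands=BANDS):
    """Fraction of in-range spectral power that falls in each band."""
    freqs, power = power_spectrum(signal, fs)
    lo_all = min(lo for lo, _ in bands.values())
    hi_all = max(hi for _, hi in bands.values())
    total = sum(p for f, p in zip(freqs, power) if lo_all <= f < hi_all)
    return {
        name: sum(p for f, p in zip(freqs, power) if lo <= f < hi) / total
        for name, (lo, hi) in bands.items()
    }

# Synthetic one-second "recording": a pure 10 Hz sine sampled at 128 Hz.
fs = 128
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
features = relative_band_power(signal, fs)
```

For this synthetic signal nearly all power lands in the alpha band. With real recordings, a feature vector like `features` (one per channel) could serve as input to a classifier.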