Project Database

View Team Project Submissions for various cohorts and programs below:

38 results were found.

FALL 2023

TEAM

The Silent Emergency - Predicting Preterm Birth

Data Science Boot Camp

Katherine Grillaert, Divya Joshi, Alexander Sutherland, Kristina Zvolanek, Noah Rahman

Preterm birth is a primary cause of infant mortality and morbidity in the United States, affecting approximately 1 in 10 births. The rates are notably higher among Black women (14.6%), compared to White (9.4%) and Hispanic women (10.1%). Despite its prevalence, predicting preterm birth remains challenging due to its multifaceted etiology rooted in environmental, biological, genetic, and behavioral interactions. Our project harnesses machine learning techniques to predict preterm birth using electronic health records. This data intersects with social determinants of health, reflecting some of the interactions contributing to preterm birth. Recognizing that under-representation in healthcare research perpetuates racial and ethnic health disparities, we take care to use diverse data to ensure equitable model performance across underrepresented populations.
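One way to check for equitable performance is to compute the same metric separately for each demographic group and compare the values. The following is a minimal sketch of that idea, not the team's actual evaluation: the function name `recall_by_group`, the toy labels, and the group names "A" and "B" are all hypothetical.

```python
from collections import defaultdict

def recall_by_group(y_true, y_pred, groups):
    """Return recall (true positive rate) for each group label."""
    positives = defaultdict(int)  # actual preterm births per group
    hits = defaultdict(int)       # correctly predicted preterm births per group
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] += 1
            if pred == 1:
                hits[group] += 1
    return {g: hits[g] / positives[g] for g in positives}

# Toy data: 1 = preterm, 0 = term. Groups are hypothetical placeholders.
y_true = [1, 1, 0, 1, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

recalls = recall_by_group(y_true, y_pred, groups)
# A large gap suggests the model misses more cases in one group than another.
recall_gap = max(recalls.values()) - min(recalls.values())
```

Recall is used here because a missed preterm birth is usually the costlier error; other metrics could be compared per group in the same way.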

github URL

FALL 2023

TEAM

Will my flight be late?

Data Science Boot Camp

Simon Guichandut, Ketan Sand, Tim Hallatt

Flight delays are not only bothersome but also widespread, causing over 200,000 hours of combined delay annually at just 20 of the busiest airports in the United States. This results in a staggering $32.9 billion annual economic loss for the US. The ability to understand the contributing factors and predict delays is crucial for better preparation and for minimizing the impact. To address this issue, we used 12 years of data from the US Bureau of Transportation Statistics (https://www.transtats.bts.gov/HomeDrillChart.asp). The dataset was refined to focus on flights between the 20 busiest airports, operated by the top 8 airline carriers in the US. We trained a random forest model to predict both the likelihood of a delay and its duration. A user-friendly website (https://willmyflightbelate.streamlit.app/) was developed to make the predictions accessible.
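To illustrate how a random forest turns many weak trees into a delay likelihood, here is a deliberately tiny sketch using only the standard library. It is a simplification, not the team's model: each tree is a depth-one stump, the two features (scheduled departure hour and inbound delay in minutes) and all data values are invented for illustration, and the duration prediction is omitted.

```python
import random
from collections import Counter

def fit_stump(rows, labels, feature):
    """Find the threshold on one feature that misclassifies the fewest rows."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return feature, threshold, left_label, right_label

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each using one randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        feature = rng.randrange(len(rows[0]))
        forest.append(fit_stump(sample_rows, sample_labels, feature))
    return forest

def delay_probability(forest, row):
    """Fraction of trees that vote 'delayed' (label 1) for this flight."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return sum(votes) / len(votes)

# Hypothetical rows: (scheduled departure hour, inbound delay in minutes)
rows = [(6, 0), (7, 5), (8, 0), (9, 10), (10, 5),
        (17, 40), (18, 55), (19, 35), (20, 60), (21, 45)]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # 1 = delayed

forest = fit_forest(rows, labels)
early_flight = delay_probability(forest, (5, 0))
late_flight = delay_probability(forest, (22, 70))
```

In practice a library implementation with full-depth trees would be the natural choice; the sketch only shows the bootstrap, random feature selection, and vote-averaging steps.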

github URL

FALL 2023

TEAM

Mu 'n I: Direction Detection

Data Science Boot Camp

Christopher Stith, Katja Vassilev, Benjamin Riley, Lukas Scheiwiller, Chinmaya Kausik

The goal of this project is to determine the direction of incoming neutrinos detected by the IceCube Neutrino Observatory, using data posted on Kaggle. The IceCube detector indirectly observes high-energy neutrinos from incoming cosmic radiation. IceCube wants a data-driven estimate of the direction to feed into its software, which calculates the precise direction. We used several linear regression models, including ones built in TensorFlow, before training convolutional and fully connected neural networks in PyTorch. These networks were trained using features provided by IceCube and additional features used in the regression.
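Direction estimates of this kind are commonly scored by the angle between the predicted and true directions. The sketch below assumes each direction is given as an azimuth and a zenith angle in radians; that representation and the function names are assumptions for illustration, not details taken from the project.

```python
import math

def to_unit_vector(azimuth, zenith):
    """Convert spherical angles (radians) to a 3D unit vector."""
    return (
        math.cos(azimuth) * math.sin(zenith),
        math.sin(azimuth) * math.sin(zenith),
        math.cos(zenith),
    )

def angular_error(az_true, zen_true, az_pred, zen_pred):
    """Angle in radians between the true and predicted directions."""
    a = to_unit_vector(az_true, zen_true)
    b = to_unit_vector(az_pred, zen_pred)
    dot = sum(x * y for x, y in zip(a, b))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding just outside [-1, 1]
    return math.acos(dot)

# Two horizontal directions a quarter turn apart
error = angular_error(0.0, math.pi / 2, math.pi / 2, math.pi / 2)
```

Working with unit vectors rather than raw angles avoids the wrap-around at 0 and 2π in azimuth, which is one reason regression targets are often expressed this way.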

github URL

FALL 2023

TEAM

BrewSavvy

Data Science Boot Camp

Timothy Alland, Brandon Butler, Phuc Nguyen, Aidan Lorenz

We built a beer recommender app that recommends beers to a user based on a list of beers that the user likes. The underlying model uses matrix factorization trained on a data set of ~1.5 million reviews with ~65,000 different beers and ~33,000 users.
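Matrix factorization represents each user and each beer as a short vector, so that their dot product approximates the rating. The following is a minimal sketch trained by stochastic gradient descent on a toy rating list; the hyperparameters, the toy data, and the function names are assumptions, and the team's actual training setup may differ.

```python
import random

def factorize(ratings, n_users, n_items, k=2, lr=0.02, reg=0.01, epochs=500, seed=0):
    """Learn user factors P and item factors Q from (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item factors."""
    return sum(p * q for p, q in zip(P[u], Q[i]))

def mse(P, Q, ratings):
    """Mean squared error over the observed ratings."""
    return sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings)

# Toy data: users 0-1 prefer beers 0-1, users 2-3 prefer beers 2-3
ratings = [(0, 0, 5), (0, 2, 1), (0, 3, 1), (1, 0, 5), (1, 1, 5), (1, 2, 1),
           (2, 0, 1), (2, 2, 5), (2, 3, 5), (3, 1, 1), (3, 2, 5), (3, 3, 5)]

P0, Q0 = factorize(ratings, 4, 4, epochs=0)  # untrained factors for comparison
P, Q = factorize(ratings, 4, 4)
mse_before = mse(P0, Q0, ratings)
mse_after = mse(P, Q, ratings)
```

To recommend, one would score a user's unrated beers with `predict` and return the highest-scoring ones. Handling a new user who supplies only a list of liked beers requires an extra step, such as fitting a fresh user vector against the fixed beer factors.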

github URL

FALL 2023

TEAM

Biomedical Categorization

Data Science Boot Camp

Shayne Plourde, Gary Hu, Michelle Lobb, Donna Chen

Diabetes is a major global health issue, affecting 8.5% of adults and killing 1.5 million people in 2019, according to the World Health Organization. Diabetes is a chronic disease that affects how the body regulates blood glucose levels. Over time, raised blood glucose levels may seriously damage the nerves and blood vessels, leading to further complications.

The goal of this project is to better understand the relationship between lifestyle factors and diabetes, and then to predict whether an individual has diabetes based on responses to a survey questionnaire.

github URL

FALL 2023

TEAM

DDTs: Dementia Detection Tool

Data Science Boot Camp

Himanshu Khanchandani, Clark Butler, Cisil Karaguzel, Selman Ipek, Shreya Shukla

Alzheimer’s disease (AD) is one of the most common types of dementia and frequently affects the elderly. Electroencephalography (EEG) is a non-invasive technique that measures brain activity using external electrodes and may help improve the diagnosis of AD. In this project we use the power spectrum of EEG signals to build a robust machine learning classifier that predicts whether a patient has Alzheimer’s or is healthy. We vastly improve upon existing models in the literature by using modified versions of the features used there.
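A typical power-spectrum feature is the share of signal power that falls in a frequency band. The sketch below computes it with a plain discrete Fourier transform on a synthetic sine wave; the 8 to 13 Hz band is the conventional alpha range, chosen here for illustration, and is not necessarily one of the modified features the team used.

```python
import cmath
import math

def power_spectrum(signal, fs):
    """Return (frequencies in Hz, power) for the non-negative DFT bins."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power

def relative_band_power(freqs, power, low, high):
    """Share of total (non-DC) power in the band [low, high) Hz."""
    total = sum(p for f, p in zip(freqs, power) if f > 0)
    band = sum(p for f, p in zip(freqs, power) if low <= f < high)
    return band / total

# Synthetic stand-in for one EEG channel: a 10 Hz sine, 2 seconds at 64 Hz
fs = 64
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(2 * fs)]

freqs, power = power_spectrum(signal, fs)
alpha_share = relative_band_power(freqs, power, 8, 13)
```

Band shares computed per channel could then serve as inputs to a classifier. Real recordings would normally use a windowed estimate (for example Welch's method) rather than a single raw DFT.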

github URL