Ohio State 2023 Report
Spring 2023 Cohort
108 grad students, postdocs, and faculty affiliated with OSU registered out of 550 participants total.
All registrants had access to the following programs and services:
- Career Exploration Seminars
  - Invitations to Industry (Weekly)
  - Invitations to Entrepreneurship (Monthly)
- Data Science Boot Camp
  - 1 month intensive with mentored projects (4 classes per week)
- Mini-courses
  - Python Prep
  - Data Visualization
  - Software Engineering for Data Scientists
  - Quantum Computing
  - Mixed Methods in Business Intelligence
  - UX Research
- Interview Prep (Monthly)
  - Foundations: Application Materials, Informational Interviews, Behavioral Interviews
  - Technical: Whiteboarding, Paired Coding, Case Studies
- 1-on-1 Career Coaching (On-Demand)
  - Resume Review
  - Mock Interviews
  - Navigating/Negotiating Offers
- Employer & Alumni Career Connections
  - Internal and LinkedIn Job Board
  - Introductions to our Hiring Partners
  - Direct Introductions to our Alumni Members
May 2023 Data Science Boot Camp
Out of 34 teams, 3 teams with participants affiliated with OSU landed in the Top 5 Finals of our Spring 2023 Data Science Boot Camp with one of the teams winning First Place overall:
- Team 28, "Correcting Racial Bias in Measurement of Blood Oxygen Saturation", FIRST PLACE
- Team 37, "Protein Function Prediction", TOP 5
- Team 39, "Airbnb", TOP 5
SPRING 2023
TEAM
Correcting Racial Bias in Measurement of Blood Oxygen Saturation
Data Science Boot Camp
Rohan Myers, Saad Khalid, Woojeong Kim, Brooks Miner, Jaychandran Padayasi
Fingertip pulse oximeters are the current standard for estimating blood oxygen saturation without a blood draw, both at home and in healthcare settings. However, pulse oximeters overestimate oxygen saturation, often resulting in ‘hidden hypoxemia’: a patient has hypoxemia (dangerously low oxygen saturation), but the oximeter returns a healthy oxygen value. Unfortunately, because oximeters rely on light-based technology, this overestimation is exacerbated for patients with darker skin tones. As a result, Black patients experience hidden hypoxemia at twice the rate of white patients. By combining pulse oximeter readings (SpO2) with additional patient data, we develop improved methods for estimating arterial blood oxygen saturation (SaO2) and identifying hidden hypoxemia. Our models' predictions are more accurate than pulse oximeter readings alone, and they remove the systematic racial inequity inherent in current medical practice, which relies solely on those readings.
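As a rough illustration of the quantity described above, the sketch below flags hidden hypoxemia in paired SpO2/SaO2 readings and compares its rate across patient groups. The two thresholds, the group labels, and the toy readings are assumptions made for this example; they are not taken from the team's project, and the team's actual models are not shown here.

```python
from collections import defaultdict

# Illustrative thresholds (assumptions, not taken from the team's project):
# arterial saturation below 88% counts as hypoxemia, and an oximeter
# reading of 92% or higher counts as a "healthy" value.
SAO2_HYPOXEMIA_BELOW = 88.0
SPO2_HEALTHY_AT_LEAST = 92.0


def is_hidden_hypoxemia(spo2, sao2):
    """True when the oximeter looks healthy but arterial saturation is low."""
    return sao2 < SAO2_HYPOXEMIA_BELOW and spo2 >= SPO2_HEALTHY_AT_LEAST


def hidden_hypoxemia_rate_by_group(records):
    """Return {group: fraction of paired readings that are hidden hypoxemia}."""
    counts = defaultdict(lambda: [0, 0])  # group -> [hidden cases, total]
    for group, spo2, sao2 in records:
        counts[group][0] += int(is_hidden_hypoxemia(spo2, sao2))
        counts[group][1] += 1
    return {g: hidden / total for g, (hidden, total) in counts.items()}


# Made-up paired readings: (group label, SpO2, SaO2)
toy_records = [
    ("A", 95.0, 86.0),  # oximeter reads healthy, arterial value is low
    ("A", 97.0, 96.0),
    ("B", 93.0, 92.0),
    ("B", 94.0, 95.0),
]
rates = hidden_hypoxemia_rate_by_group(toy_records)
```

Comparing these per-group rates before and after replacing SpO2 with a model's SaO2 estimate is one simple way to check whether a correction narrows the gap between groups.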
SPRING 2023
TEAM
Protein Function Prediction
Data Science Boot Camp
Eamon Byrne, Dustin Nguyen, Ness Mayker, Salma Abdelbaky
Predict the biological function of proteins based upon their amino acid sequence and other publicly available data.
This project would double as a team submission to the Fifth Critical Assessment of Functional Annotation (CAFA 5) competition, which is similar to the Critical Assessment of Structure Prediction (CASP 14) competition in which AlphaFold2 gained prominence (in 2020).
Here is the Kaggle competition (submissions are open for the next 3 months): https://www.kaggle.com/competitions/cafa-5-protein-function-prediction/
(1st prize is $15,000... lol)
SPRING 2023
TEAM
Airbnb
Data Science Boot Camp
Zheyu Ni, Muhammad Reza Averly, Ricky Oropeza, Shubhrika Ahuja, Praveen Shahani
How can hosts make money in LA with Airbnb? Being a host has never been easy: hosts need to provide a quality experience for guests while keeping an eye on the profit margin. We aim to ease that burden by streamlining the decision-making process. We explore and analyze tens of thousands of listings in LA for insights. We then combine structural modeling and machine learning to customize pricing for new listings based on property location, features, ownership, and substitution between nearby listings, with the goal of maximizing the host’s profit. The structural model helps capture supply and demand dynamics, and machine learning helps capture the consumer consideration set (hotspot market). We also use machine learning for price prediction to make better use of the rich features. Furthermore, we recommend areas with the best rate of return for potential hosts and suggest amenities that could make a listing more popular on Airbnb.
Fall 2023 Cohort
109 grad students, postdocs, and faculty affiliated with OSU registered out of 506 participants total.
All registrants had access to the following programs and services:
- Career Exploration Seminars
  - Invitations to Industry (Weekly)
  - Invitations to Entrepreneurship (Monthly)
- Data Science Boot Camp
  - 12 weeks with mentored projects (1 class per week)
- Mini-courses
  - Python Prep
  - Data Visualization
  - Software Engineering for Data Scientists
  - Quantum Computing
  - Deep Learning
  - UX Research
- Interview Prep (Monthly)
  - Foundations: Application Materials, Informational Interviews, Behavioral Interviews
  - Technical: Whiteboarding, Paired Coding, Case Studies
- 1-on-1 Career Coaching (On-Demand)
  - Resume Review
  - Mock Interviews
  - Navigating/Negotiating Offers
- Employer & Alumni Career Connections
  - Internal and LinkedIn Job Board
  - Introductions to our Hiring Partners
  - Direct Introductions to our Alumni Members
Fall 2023 Data Science Boot Camp
Out of 37 teams, 3 teams with participants affiliated with OSU landed in the Top 5 Finals, while two more were awarded "with distinction":
- Team 7, "Funk", TOP 5
- Team 20, "AI-generated Image Detection", TOP 5
- Team 25, "The Silent Emergency - Predicting Preterm Birth", WITH DISTINCTION
- Team 27, "Somm", TOP 5
- Team 45, "Biomedical Categorization", WITH DISTINCTION
FALL 2023
TEAM
Funk
Data Science Boot Camp
Aydin Ozbek, Dane Miyata, Kristina Knowles, Mario Gomez, Kashish Mehta
Most existing music recommendation systems rely on listeners to provide seed tracks, and then utilize a variety of different approaches to recommend additional tracks in either a playlist-like listening session or as sequential track recommendations based on user feedback.
We built a playlist recommendation engine that takes a different approach, allowing listeners to generate a novel playlist based on a semantic string, such as the title of the desired playlist, a specific mood (happy, relaxed), an atmosphere (tropical vibe), or a function (party music, focus). Using a publicly available dataset of existing playlists (https://www.aicrowd.com/challenges/spotify-million-playlist-dataset-challenge), we combine a semantic similarity vector model with a matrix factorization model to allow users to quickly and easily generate playlists to fit any occasion.
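One plausible way to combine a semantic similarity model with a matrix factorization model is sketched below; the source does not describe the team's actual pipeline, so this is only an illustration. It assumes that the query string and each existing playlist title have already been embedded as vectors, and that a factorization of the playlist-track matrix has already produced latent factors for playlists and tracks. All names (`recommend_tracks`, `title_vec`, `factors`) and all numbers are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    norm = math.sqrt(dot(u, u) * dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def recommend_tracks(query_vec, playlists, track_factors, k_playlists=2, n_tracks=2):
    """Rank tracks for a text query (hypothetical helper).

    1. Find the existing playlists whose title embeddings are closest
       to the query embedding (semantic similarity step).
    2. Average their latent factors, weighted by similarity, to get a
       pseudo-playlist vector (bridge between the two models).
    3. Score every track by its dot product with that vector
       (matrix factorization step).
    """
    sims = [(cosine(query_vec, p["title_vec"]), p) for p in playlists]
    top = sorted(sims, key=lambda pair: pair[0], reverse=True)[:k_playlists]
    weights = [max(sim, 0.0) for sim, _ in top]
    total = sum(weights) or 1.0
    dim = len(top[0][1]["factors"])
    blended = [
        sum(w * p["factors"][i] for w, (_, p) in zip(weights, top)) / total
        for i in range(dim)
    ]
    scores = {track: dot(blended, f) for track, f in track_factors.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n_tracks]


# Toy 2-dimensional embeddings and latent factors (made up for illustration).
playlists = [
    {"title": "beach party", "title_vec": [1.0, 0.0], "factors": [1.0, 0.0]},
    {"title": "deep focus", "title_vec": [0.0, 1.0], "factors": [0.0, 1.0]},
    {"title": "summer vibes", "title_vec": [0.9, 0.1], "factors": [0.8, 0.2]},
]
track_factors = {
    "track_upbeat": [1.0, 0.0],
    "track_ambient": [0.0, 1.0],
    "track_mixed": [0.5, 0.5],
}
query_vec = [1.0, 0.0]  # stand-in for an embedded query such as "tropical vibe"
result = recommend_tracks(query_vec, playlists, track_factors)
```

The design choice in this sketch is that the text query never touches the factorization directly: semantic similarity only selects and weights existing playlists, and the latent factors learned from listening data do the track ranking.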
FALL 2023
TEAM
AI-generated Image Detection
Data Science Boot Camp
Amanda Pan, Hasan Saad, Alina Al Beaini, Cemile Kurkoglu
AI-generated images have become increasingly realistic, enabling a variety of malicious uses. We plan to develop a model for detecting AI-generated images, ideally addressing some of the current difficulties: generalization to different methods of image generation, robustness to image resizing and compression, and interpretability of results.
FALL 2023
TEAM
The Silent Emergency - Predicting Preterm Birth
Data Science Boot Camp
Katherine Grillaert, Divya Joshi, Alexander Sutherland, Kristina Zvolanek, Noah Rahman
Preterm birth is a primary cause of infant mortality and morbidity in the United States, affecting approximately 1 in 10 births. The rates are notably higher among Black women (14.6%), compared to White (9.4%) and Hispanic women (10.1%). Despite its prevalence, predicting preterm birth remains challenging due to its multifaceted etiology rooted in environmental, biological, genetic, and behavioral interactions. Our project harnesses machine learning techniques to predict preterm birth using electronic health records. This data intersects with social determinants of health, reflecting some of the interactions contributing to preterm birth. Recognizing that under-representation in healthcare research perpetuates racial and ethnic health disparities, we take care to use diverse data to ensure equitable model performance across underrepresented populations.
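Checking for equitable model performance can be as simple as computing the same metric separately for each group and comparing the results. The sketch below does this for recall, the share of actual preterm births that a classifier flags. The choice of recall, the group labels, and the toy labels and predictions are assumptions for illustration; the source does not say which metrics the team used.

```python
from collections import defaultdict


def recall_by_group(y_true, y_pred, groups):
    """Recall per group: flagged true positives / all actual positives.

    Groups with no actual positives are left out of the result,
    because recall is undefined for them.
    """
    hits = defaultdict(int)       # true positives per group
    positives = defaultdict(int)  # actual positives per group
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] += 1
            hits[group] += int(pred == 1)
    return {g: hits[g] / positives[g] for g in positives}


# Made-up labels (1 = preterm birth), predictions, and group membership.
y_true = [1, 1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
groups = ["A", "A", "A", "B", "B", "B"]

per_group = recall_by_group(y_true, y_pred, groups)
recall_gap = max(per_group.values()) - min(per_group.values())
```

A large `recall_gap` would indicate that the model misses preterm births more often in one group than in another, even if its overall recall looks acceptable.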