
Certificate of Completion
THIS ACKNOWLEDGES THAT
Mustafain Ali
HAS COMPLETED THE FALL 2025 DATA SCIENCE BOOT CAMP
Roman Holowinsky, PhD
DIRECTOR
NOVEMBER 13, 2025
DATE

TEAM
Predicting Antibiotic Resistance: Challenges, Findings, and Lessons Learned
Haejun Oh, Dominique Hughes, Tinghao Huang, Chiara Mattamira, Mustafain Ali

Goal: Explore whether patient and microbiological data can be used to predict antibiotic resistance.
Dataset: https://datadryad.org/dataset/doi:10.5061/d
Motivation: Infections that could once be easily cured by simple antibiotics are becoming harder to treat due to emerging antibiotic resistance in bacteria. In clinical practice, physicians prescribe antibiotics based on prior experience and established guidelines while awaiting laboratory test results that confirm the effectiveness of those antibiotics.
Methods: We preprocess the dataset by removing duplicate entries, one-hot encoding categorical variables, and keeping unique patient IDs. We then train six models (dummy classifier, logistic regression, random forest, XGBoost, SVM, and KNN), evaluate each on accuracy, F1-score, precision, recall, and false negative rate, and choose the best-performing one.
Real-world impact: Our project supports clinicians in making data-informed antibiotic prescriptions while awaiting lab results.
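
The evaluation metrics named under Methods can be illustrated with a minimal sketch. This is not the team's code: the function name `binary_metrics` is hypothetical, labels are assumed to be encoded as 1 (resistant, the positive class) and 0 (susceptible), and returning 0.0 when a denominator is zero is an illustrative convention. The false negative rate is the fraction of truly resistant cases predicted as susceptible, which is the costly error when a clinician relies on the prediction.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, F1, and false negative rate.

    Assumption: 1 = resistant (positive class), 0 = susceptible.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num: float, den: float) -> float:
        # Illustrative convention: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(y_true)),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        # Share of resistant cases that the model missed (equals 1 - recall).
        "false_negative_rate": ratio(fn, fn + tp),
    }
```

Calling `binary_metrics(y_true, y_pred)` once per trained model on the same held-out labels would give one dictionary per model, which could then be compared side by side when choosing the best-performing one.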
