Certificate of Completion
THIS ACKNOWLEDGES THAT
Karla Paulette Flores Silva
HAS COMPLETED THE MAY-SUMMER 2024 DATA SCIENCE BOOT CAMP
Roman Holowinsky, PhD
DIRECTOR
JUNE 10, 2024
DATE
TEAM
Cancer Survivability
Dilruba Sofia, Funmilola Mary Taiwo, Enayon Sunday Taiwo, Samuel Ogunfuye, Karla Paulette Flores Silva, Ray Lee
Unfortunately, each of us has a 1 in 4 chance of developing cancer. Although advances in treatment technologies have increased the survival rate of cancer patients, cancer still kills many people. Breast cancer is the second most diagnosed cancer and the most fatal in women. The goal of this project is to develop models that accurately classify breast cancer patient outcomes as either "alive" or "dead", based on demographic and clinical data available at the time of diagnosis.
Data: The Cancer Genome Atlas Breast Cancer (TCGA-BRCA) project, accessed through the National Cancer Institute GDC Data Portal.
Method: We extract the patients' clinical information and engineer features as necessary. We then apply several classification algorithms (random forest, AdaBoost, SVC, logistic regression, K-nearest neighbors, and MLP) to predict patients' vital status, using a decision tree as our baseline model.
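The comparison described in the method could be sketched roughly as follows with scikit-learn. This is a minimal illustration under stated assumptions, not the team's actual pipeline: the data is a synthetic stand-in generated with make_classification (not TCGA-BRCA records), and the feature count, class balance, hyperparameters, scaling step, scoring metric, and the helper names scaled and compare_models are all hypothetical choices made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in for the engineered clinical features.
# Class 1 plays the role of the minority "dead" label; this is not real patient data.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5,
    weights=[0.8, 0.2], random_state=0,
)


def scaled(model):
    """Hypothetical helper: standardize features before a scale-sensitive model."""
    return make_pipeline(StandardScaler(), model)


# The algorithms named in the method, with illustrative (untuned) settings.
models = {
    "decision tree (baseline)": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "SVC": scaled(SVC()),
    "logistic regression": scaled(LogisticRegression(max_iter=1000)),
    "K-nearest neighbors": scaled(KNeighborsClassifier(n_neighbors=5)),
    "MLP": scaled(MLPClassifier(max_iter=500, random_state=0)),
}


def compare_models(models, X, y, cv=5):
    """Return the mean cross-validated balanced accuracy for each model."""
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="balanced_accuracy").mean()
        for name, model in models.items()
    }


scores = compare_models(models, X, y)
baseline = scores["decision tree (baseline)"]

# Report every model alongside its difference from the decision tree baseline.
for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name:26s} {score:.3f} ({score - baseline:+.3f} vs baseline)")
```

Balanced accuracy is used here only because an alive/dead label is likely to be imbalanced, in which case plain accuracy can look good for a model that always predicts the majority class; the metric the team actually reported is not stated in the source.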