Data Science Boot Camp
Fall 2023
Sep 7, 2023 - Dec 7, 2023
Registration Deadlines
Sep 24, 2023 - Academics from Member Institutions/Departments
Sep 23, 2023 - Academics from Non-Member Institutions paying the $500 cohort membership fee
Sep 23, 2023 - Academics from Non-Member Institutions applying for Corporate Sponsored Fellowships
Registration Link
Overview
The Erdős Institute's signature Data Science Boot Camp has been running since May 2018 thanks to the generous support of our sponsors, members, and partners. Due to its popularity, we now offer the boot camp online twice per year in two formats: a one-month intensive boot camp each May and a semester-long version each Fall.
Instructional Team
Matthew Osborne, PhD
Head of Boot Camps
Office Hours: By appointment only
Email:
Preferred Contact: Slack
Don't hesitate to contact me with any questions or concerns.
Steven Gubkin, PhD
Lead Teaching Assistant
Office Hours: Tu: 11am - 12pm ET
Email:
Preferred Contact: Slack
Please feel free to message me on Slack with any questions! I will also be running a “math hour” every Wednesday from 11am - 12pm ET which will explore the mathematical underpinnings of the techniques covered in the previous lecture.
Alec Clott, PhD
Head of Data Science Projects
Office Hours: Wed: 12pm - 12:30pm ET, and by appointment
Email:
Preferred Contact: Slack
Participants are welcome to reach out to me via Slack or email. I normally work standard Eastern Time hours (9am - 5pm), but I can always find time to meet via Zoom after work. Let me know how I can help!
Objectives
The goal of our Data Science Boot Camp is to give you the skills and mentorship necessary to produce a portfolio-worthy data science/machine learning project, while also providing valuable career development support and connecting you with potential employers.
Project Examples
TEAM
Correcting Racial Bias in Measurement of Blood Oxygen Saturation
Rohan Myers, Saad Khalid, Woojeong Kim, Brooks Miner, Jaychandran Padayasi

Fingertip pulse oximeters are the current standard for estimating blood oxygen saturation without a blood draw, both at home and in healthcare settings. However, pulse oximeters overestimate oxygen saturation, often resulting in ‘hidden hypoxemia’: a patient has hypoxemia (dangerously low oxygen saturation), but the oximeter returns a healthy oxygen value. This overestimation is worse for patients with darker skin tones because oximeters rely on light-based technology. As a result, Black patients experience hidden hypoxemia at twice the rate of white patients. By combining pulse oximeter readings (SpO2) with additional patient data, we develop improved methods for estimating arterial blood oxygen saturation (SaO2) and identifying hidden hypoxemia. The predictions of our models are more accurate than pulse oximeter readings alone, and they remove the systematic racial inequity inherent in the current medical practice of relying on oximeter readings alone.
First Steps/Prerequisites
Participants should have a base-level familiarity with Python. Participants should also be familiar with some basic math concepts. Finally, you will also need to have your laptop or desktop computer set up for the course.
If you are new to Python, need a quick math refresher, or if you need help setting up your computer, then please follow the link below.

Slack Channel: #slack-channel
Program Content
You will find all of the course content below in our GitHub repository. If you see a 404 Error when trying to open this repository, first check that you are signed into your GitHub account and then check with our community manager that you have been added to our repositories. Because our repositories are private, you must first be added before you can access them.
Every lecture in the "lectures" folder of the repository comes with a pre-recorded lecture video, which you can find below. Note that these videos are not presented in the order in which they should be viewed. For the suggested viewing order, read the README document for the lectures at https://github.com/TheErdosInstitute/code-2023/tree/main/lectures.
Live Lecture Notebook Schedule
---------------------------------
This schedule will be filled in closer to the start of the Fall boot camp.
Textbook/Notes
Project/Homework Instructions
Erdős Project Instructions (Fall 2023)
The group project is a time to put everything you’ve learned to the test! You will work with your team to produce a portfolio-worthy project that you can use as a talking point with future employers.
General Information
In order to get an Erdős certificate, you must complete a data science project from start to finish.
Project Topics
Your project can be anything you would like, as long as you use Python. We want your project to be something you’re passionate about and can really dig into. We understand that open-ended projects can be difficult, so we’ve provided a few resources:
Project Database (Past Project Examples)
Project Help
Several Project Mentors are available for project help! Feel free to chat with them via Slack (#project-help) for advice.
Project Expectations
The goal is to complete a data science project that could be presented in a job interview.
Requirements (see more details below)
An annotated GitHub repository
An executive summary of your project results and implications
A 5-minute pre-recorded PowerPoint presentation detailing the project process from start to finish
Timeline
The tasks for each week should be submitted to your Project Mentor before your weekly check-in. Depending on your project, some of the items listed below are only a rough guideline. Consult your Project Mentor or Alec if you are unsure.
Schedule
Lecture 1: Introduction
September 7, 2023 at 9:30:00 PM
Problem Session 1
September 11, 2023 at 3:00:00 PM
Office Hours 1
September 12, 2023 at 3:00:00 PM
Study Group
September 12, 2023 at 3:00:00 PM
Math Hour 1
September 13, 2023 at 3:00:00 PM
Lecture 2: Data Collection
September 14, 2023 at 9:30:00 PM
Problem Session 2
September 18, 2023 at 3:00:00 PM
Office Hours 2
September 19, 2023 at 3:00:00 PM
Math Hour 2
September 20, 2023 at 3:00:00 PM
Lecture 3: Supervised Learning and Regression I
September 21, 2023 at 9:30:00 PM
Problem Session 3
September 25, 2023 at 3:00:00 PM
Office Hours 3
September 26, 2023 at 3:00:00 PM
Math Hour 3
September 27, 2023 at 3:00:00 PM
Lecture 4: Regression II
September 28, 2023 at 9:30:00 PM
Problem Session 4
October 2, 2023 at 3:00:00 PM
Office Hours 4
October 3, 2023 at 3:00:00 PM
Math Hour 4
October 4, 2023 at 3:00:00 PM
Lecture 5: Regression III & Time Series I
October 5, 2023 at 9:30:00 PM
Problem Session 5
October 9, 2023 at 3:00:00 PM
Office Hours 5
October 10, 2023 at 3:00:00 PM
Math Hour 5
October 11, 2023 at 3:00:00 PM
Lecture 6: Time Series II
October 12, 2023 at 9:30:00 PM
Problem Session 6
October 16, 2023 at 3:00:00 PM
Office Hours 6
October 17, 2023 at 3:00:00 PM
Math Hour 6
October 18, 2023 at 3:00:00 PM
Lecture 7: Time Series II
October 19, 2023 at 9:30:00 PM
Problem Session 7
October 23, 2023 at 3:00:00 PM
Office Hours 7
October 24, 2023 at 3:00:00 PM
Math Hour 7
October 25, 2023 at 3:00:00 PM
Lecture 8: Classification I
October 26, 2023 at 9:30:00 PM
Problem Session 8
October 30, 2023 at 3:00:00 PM
Office Hours 8
October 31, 2023 at 3:00:00 PM
Math Hour 8
November 1, 2023 at 3:00:00 PM
Lecture 9: Classification II
November 2, 2023 at 9:30:00 PM
Problem Session 9
November 6, 2023 at 4:00:00 PM
Office Hours 9
November 7, 2023 at 4:00:00 PM
Math Hour 9
November 8, 2023 at 4:00:00 PM
Lecture 10: Classification III
November 9, 2023 at 10:30:00 PM
Problem Session 10
November 13, 2023 at 4:00:00 PM
Office Hours 10
November 14, 2023 at 4:00:00 PM
Math Hour 10
November 15, 2023 at 2:36:00 PM
Lecture 11: Ensemble Learning
November 16, 2023 at 10:30:00 PM
Problem Session 11
November 27, 2023 at 4:00:00 PM
Office Hours 11
November 28, 2023 at 4:00:00 PM
Math Hour 11
November 29, 2023 at 4:00:00 PM
Lecture 12: Neural Networks
November 30, 2023 at 10:30:00 PM
Office Hours 12
December 5, 2023 at 4:00:00 PM
Math Hour 12
December 6, 2023 at 2:36:00 PM
Please check your registration email for the program schedule and Zoom links.
Project/Homework Deadlines
Sep 23, 2023
3:59 AM
Watch 3 Previous Top Projects
Consult the project database and watch at least 3 previous top projects from Erdős alumni.
Oct 6, 2023
8:30 PM
Project Pitch Hour
An opportunity to meet other Erdős Fellows, form teams, and propose topics.
Oct 12, 2023
3:59 AM
Submit Team Proposal or Idea to Project Formation Page
If you want to propose a project, or have an idea for a project, submit it by this date.
Oct 14, 2023
3:59 AM
Finalized Teams with Preliminary Project Ideas
Teams need to be finalized by this point. If you proposed or created a project, you must have others in your group. If you did not propose or create a project, you must join an open group.
Oct 21, 2023
3:59 AM
Data gathering and defining stakeholders + KPIs
Find the dataset you will be working with. Describe the dataset and the problem you are looking to solve (1 page max). List the stakeholders of the project and company key performance indicators (KPIs) (bullet points).
Oct 28, 2023
3:59 AM
Data cleaning + preprocessing
Look for missing values and duplicates. Perform basic data manipulation and preliminary feature engineering.
Nov 4, 2023
3:59 AM
Exploratory data analysis + visualizations [Checkpoint]
Examine the distributions of your variables, look for outliers, and compute descriptive statistics.
Nov 10, 2023
4:59 PM
Written proposal of modeling approach [Checkpoint]
Test linearity assumptions. Apply dimensionality reduction if necessary. Describe your planned modeling approach, based on the exploratory data analysis from the last two weeks (< 1 page, bullet points).
Nov 16, 2023
4:59 AM
Machine learning models or equivalent [Checkpoint]
Results with visualizations and/or metrics. List of successes and pitfalls.
Dec 2, 2023
4:59 AM
Final project due
Please read the submission instructions on the link below.
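As a rough illustration of the data cleaning and exploratory data analysis checkpoints listed above, here is a minimal sketch that uses only the Python standard library. The toy records, the column names, and the helper functions (count_missing, drop_duplicates, iqr_outliers) are hypothetical and exist only for this example. It is not the boot camp's prescribed workflow, and most teams would likely use a data-frame library on their real dataset instead.

```python
import statistics

# Hypothetical toy records; a real project would load its own dataset.
records = [
    {"id": 1, "age": 34, "income": 52000},
    {"id": 2, "age": None, "income": 61000},   # missing age
    {"id": 3, "age": 29, "income": 58000},
    {"id": 3, "age": 29, "income": 58000},     # exact duplicate of the previous row
    {"id": 4, "age": 41, "income": 950000},    # suspiciously large income
    {"id": 5, "age": 38, "income": 49000},
    {"id": 6, "age": 45, "income": 55000},
]


def count_missing(rows):
    """Count None values per column."""
    counts = {}
    for row in rows:
        for key, value in row.items():
            counts[key] = counts.get(key, 0) + (value is None)
    return counts


def drop_duplicates(rows):
    """Keep the first occurrence of each identical row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


missing = count_missing(records)
clean = drop_duplicates(records)
incomes = [row["income"] for row in clean]

# Descriptive statistics for one numeric column.
summary = {
    "mean": statistics.mean(incomes),
    "median": statistics.median(incomes),
    "stdev": statistics.stdev(incomes),
}
outliers = iqr_outliers(incomes)

print("Missing values per column:", missing)
print("Rows after dropping duplicates:", len(clean))
print("Income summary:", summary)
print("Income outliers:", outliers)
```

Tukey's rule is just one common convention for flagging outliers. Whether a flagged value is an error or a genuine observation is a judgment call worth discussing with your Project Mentor.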