Data Science Boot Camp
Spring 2026
Jan 26, 2026 - May 1, 2026
To access the program content, you must first create an account and member profile and be logged in.
Registration Deadlines
Jan 21, 2026
All Erdős Spring 2026 Career Launch Cohort or Alumni Club members who are not participating in another Launch bootcamp
Category
Launch, Core Program, Boot Camp, Projects, Certificates
Overview
In this bootcamp, we will develop the skills needed to complete a data science project from start to finish. This includes defining a problem in quantitative terms, identifying key performance indicators (KPIs), acquiring and cleaning data, exploring patterns and trends, and transforming raw data into meaningful variables. We will then build models for prediction and inference, focusing primarily on supervised learning methods for regression and classification.

Click here to be invited to the Slack organization: The Erdős Institute
Click here to access the Slack cohort channel: #slack-cohort-channel
Click here to access the Slack program channel: #slack-program-channel
Click here to download the Events & Deadlines .ics calendar file
Organizers, Instructors, and Advisors
Steven Gubkin, PhD
Lead Instructor
Office Hours: By appt. only
Email:
Preferred Contact: Slack
Please feel free to message me on Slack with any questions!
Alec Clott, PhD
Head of Data Science Projects
Office Hours: By appt. only
Email:
Preferred Contact: Slack
Participants are welcome to reach out to me via Slack or email. I normally work standard hours (9am-5pm EST), but I can always find time to meet via Zoom after work too. Let me know how I can help!
Objectives
The goal of our Data Science Boot Camp is to provide you with the skills and mentorship necessary to produce a portfolio-worthy data science/machine learning project, while also providing valuable career development support and connecting you with potential employers.
Project Examples
TEAM 33
Tuning Up Music Highway
James O'Quinn, Yang Mo, John Hurtado Cadavid, Ruixuan Ding, Chilambwe Natasha Wapamenshi

Known as the most dangerous highway in Tennessee, Music Highway, the stretch of Interstate 40 between Memphis and Nashville, could use a serious tuning up. This project investigates the effectiveness and cost-efficiency of potential physical safety interventions along its Madison and Henderson County segments, with the goal of reducing crash severity. We used a data-driven geospatial modeling approach to assess whether adding specific safety features to targeted segments predicts statistically significant changes in crash injury outcomes.
TEAM 29
Who Regulates the Regulators?
Jared Able, Joshua Jackson, Zachary Brennan, Alexandria Wheeler, Nicholas Geiser

With recent major cuts to government regulatory agencies in the US, we investigate whether those cuts are justified. In particular, we analyze the efficacy of RGGI, a state-level cap-and-trade program designed to regulate CO2 emissions from power plants. Using synthetic controls, we answer the counterfactual question: "How would CO2 emissions look in a world where RGGI was never enacted?"
First Steps/Prerequisites
- Basic familiarity with Python
- Differential calculus. Ideally you also know some multivariate differential calculus and linear algebra.
- Basic statistics and probability
Program Content
Course materials are available on GitHub.
Textbook/Notes
Note: our video player does not support playback speed options. A third-party browser extension can change the playback speed; for example, video-speed-controller works for Chrome. If you would prefer to avoid a browser extension, you can also change the playback speed manually in the JavaScript console: Speed up any HTML5 video player!
(Prerecorded) L9E1: Moving Average (MA(q)) Models
Lecture 09: Time Series II Control (prerecorded)
A moving average model expresses the value of a time series as a linear combination of the current and lagged error terms, which are assumed to be independent and normally distributed.
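As a rough illustration of the MA(q) definition, the sketch below simulates a series using only the Python standard library. The function name `simulate_ma`, its parameters, and the coefficient values are hypothetical choices made for this example and are not taken from the course materials.

```python
import random


def simulate_ma(theta, n, mu=0.0, sigma=1.0, seed=0):
    """Simulate n values of an MA(q) process, where q = len(theta).

    X_t = mu + eps_t + theta[0]*eps_{t-1} + ... + theta[q-1]*eps_{t-q},
    with eps_t drawn independently from Normal(0, sigma^2).
    """
    rng = random.Random(seed)
    q = len(theta)
    # Draw q extra errors so the first observation has a full set of lags.
    eps = [rng.gauss(0.0, sigma) for _ in range(n + q)]
    series = []
    for t in range(q, n + q):
        lagged = sum(theta[j] * eps[t - 1 - j] for j in range(q))
        series.append(mu + eps[t] + lagged)
    return series, eps


# Illustrative MA(2) with made-up coefficients.
ma_series, ma_errors = simulate_ma([0.5, -0.3], n=50)
```

Each simulated value depends only on the current error and the previous q errors, so values more than q steps apart share no error terms.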
Data Source Websites
Data Collection (prerecorded)
We cover a plethora of data source websites you can use.
Web Scraping with BeautifulSoup
Data Collection (prerecorded)
We give a brief introduction to web scraping with BeautifulSoup.
(Prerecorded) L9E1: Autoregressive (AR(p)) Models
Lecture 09: Time Series II Control (prerecorded)
An autoregressive model expresses the value of a time series as a linear combination of its own lagged values plus an error term.
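To make the AR(p) recurrence concrete, here is a minimal simulation sketch using only the Python standard library. The function name `simulate_ar`, the zero start-up values, and the coefficients are assumptions made for this example and are not part of the course materials.

```python
import random


def simulate_ar(phi, n, sigma=1.0, seed=0):
    """Simulate n values of an AR(p) process, where p = len(phi).

    X_t = phi[0]*X_{t-1} + ... + phi[p-1]*X_{t-p} + eps_t,
    with eps_t drawn independently from Normal(0, sigma^2).
    """
    rng = random.Random(seed)
    p = len(phi)
    x = [0.0] * p  # assumed start-up values for the p lags before time 0
    eps = []
    for _ in range(n):
        eps.append(rng.gauss(0.0, sigma))
        lagged = sum(phi[j] * x[-1 - j] for j in range(p))
        x.append(lagged + eps[-1])
    return x[p:], eps


# Illustrative AR(2) with made-up coefficients.
ar_series, ar_errors = simulate_ar([0.6, -0.2], n=50)
```

Unlike the moving average case, each value feeds back into later values, so a single error term influences the whole future of the series.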
Python and APIs
Data Collection (prerecorded)
How can we use Python to collect data from APIs?
Project/Homework Instructions
Schedule
Phase 1 - Instruction and Project Completion: Feb 02 - Mar 20, 2026
Project Review & Judging: Mar 23 - Mar 26, 2026
Phase 2 - Intense Interview Prep & Career Connections: Mar 27 - May 1, 2026
Lecture 00: Orientation / Computer Setup Day
Jan 29, 2026 at 06:30 PM UTC
Math Hour 00
Feb 2, 2026 at 03:00 PM UTC
Problem Session 00
Feb 2, 2026 at 06:30 PM UTC
Lecture 01: Supervised Learning
Feb 3, 2026 at 06:30 PM UTC
Math Hour 01
Feb 4, 2026 at 03:00 PM UTC
Problem Session 01
Feb 4, 2026 at 06:30 PM UTC
Lecture 02: Model Evaluation
Feb 5, 2026 at 06:30 PM UTC
Project Pitch Hour
Feb 6, 2026 at 09:00 PM UTC
Math Hour 02
Feb 9, 2026 at 03:00 PM UTC
Problem Session 02
Feb 9, 2026 at 06:30 PM UTC
Lecture 03: Complexity Control
Feb 10, 2026 at 06:30 PM UTC
Math Hour 03
Feb 11, 2026 at 03:00 PM UTC
Problem Session 03
Feb 11, 2026 at 06:30 PM UTC
Lecture 04: Linear Regression
Feb 12, 2026 at 06:30 PM UTC
Math Hour 04
Feb 16, 2026 at 03:00 PM UTC
Problem Session 04
Feb 16, 2026 at 06:30 PM UTC
Lecture 05: Generalized Linear Models and Generalized Additive Models
Feb 17, 2026 at 06:30 PM UTC
Math Hour 05
Feb 18, 2026 at 03:00 PM UTC
Problem Session 05
Feb 18, 2026 at 06:30 PM UTC
Lecture 06: Inference I
Feb 19, 2026 at 06:30 PM UTC
Math Hour 06
Feb 23, 2026 at 03:00 PM UTC
Problem Session 06
Feb 23, 2026 at 06:30 PM UTC
Lecture 07: Inference II
Feb 24, 2026 at 06:30 PM UTC
Math Hour 07
Feb 25, 2026 at 03:00 PM UTC
Problem Session 07
Feb 25, 2026 at 06:30 PM UTC
Lecture 08: Time Series I
Feb 26, 2026 at 06:30 PM UTC
Math Hour 08
Mar 2, 2026 at 03:00 PM UTC
Problem Session 08
Mar 2, 2026 at 06:30 PM UTC
Lecture 09: Time Series II
Mar 3, 2026 at 06:30 PM UTC
Math Hour 09
Mar 4, 2026 at 03:00 PM UTC
Problem Session 09
Mar 4, 2026 at 06:30 PM UTC
Lecture 10: Ensemble Learning I
Mar 5, 2026 at 06:30 PM UTC
Math Hour 10
Mar 9, 2026 at 02:00 PM UTC
Problem Session 10
Mar 9, 2026 at 05:30 PM UTC
Lecture 11: Ensemble Learning II
Mar 10, 2026 at 05:30 PM UTC
Math Hour 11
Mar 11, 2026 at 02:00 PM UTC
Problem Session 11
Mar 11, 2026 at 05:30 PM UTC
Lecture 12: Introduction to Neural Networks
Mar 12, 2026 at 05:30 PM UTC
Project/Homework Deadlines
Jan 31, 2026
04:59 AM UTC
Last chance to switch bootcamps
Email Amalya Lehmann at amalya@erdosinstitute.org if you would like to switch to a different bootcamp.
Feb 5, 2026
04:59 AM UTC
Watch video about Project Formation
This should help answer any questions you may have going into project formation.
Feb 5, 2026
04:59 AM UTC
Watch 3 Previous Top Projects
Consult the project database and watch at least 3 previous top projects from Erdős alumni.
Feb 6, 2026
09:00 PM UTC
Project Pitch Hour
An opportunity to meet other Erdős Fellows, form teams, and propose topics.
Feb 12, 2026
04:59 AM UTC
Last day to defer enrollment to a future cohort
Contact Amalya Lehmann (amalya@erdosinstitute.org) if you would like to unenroll from this cohort and defer to a future cohort.
Feb 12, 2026
04:59 AM UTC
Finalized Teams with Preliminary Project Ideas
Teams need to be finalized by this point. If you proposed or created a project, you must have others in your group. If you did not propose or create a project, you must join an open group.
Feb 14, 2026
04:59 AM UTC
Data gathering and defining stakeholders + KPIs
Find the dataset you will be working with. Describe the dataset and the problem you are looking to solve (1 page max). List the stakeholders of the project and company key performance indicators (KPIs) (bullet points).
Feb 21, 2026
04:59 AM UTC
Data cleaning + preprocessing + EDA
Look for missing values and duplicates. Perform basic data manipulation and preliminary feature engineering, then carry out exploratory data analysis.
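The first check in this milestone, looking for missing values and duplicates, could be sketched as follows using only the Python standard library. The helper `summarize_quality` and the sample records are made up for illustration. With pandas, `DataFrame.isna()` and `DataFrame.duplicated()` cover similar ground.

```python
from collections import Counter


def summarize_quality(rows):
    """Count missing values per column and rows that repeat an earlier row.

    A value is treated as missing if it is None or an empty string
    (an assumption; real datasets may encode missingness differently).
    """
    missing = Counter()
    for row in rows:
        for column, value in row.items():
            if value is None or value == "":
                missing[column] += 1
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(sorted(row.items()))  # order-independent row signature
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return dict(missing), duplicates


# Made-up records, purely for illustration.
sample_rows = [
    {"id": 1, "city": "Memphis", "value": 3.5},
    {"id": 2, "city": "", "value": None},
    {"id": 1, "city": "Memphis", "value": 3.5},
]
missing_counts, duplicate_count = summarize_quality(sample_rows)
```

Running a summary like this before any modeling makes it easier to decide whether to drop, impute, or investigate the affected rows.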
Feb 28, 2026
04:59 AM UTC
Written proposal of modeling approach [Checkpoint]
Describe your planned modeling approach, based on the exploratory data analysis from the last two weeks (< 1 page, bullet points).
Mar 7, 2026
04:59 AM UTC
Modeling and Preliminary Results
Present results with visualizations and/or metrics, and list successes and pitfalls.
Mar 14, 2026
03:59 AM UTC
Clean your repository and start working on final presentation
Clean up your repository so that an outsider can easily follow your work. Convert notebooks into scripts where possible. Confirm that the whole pipeline from data ingestion all the way to prediction or inference works without fuss.
Mar 21, 2026
03:59 AM UTC
Final Project Deadline
Submit your final project by this time.



