Data Science Boot Camp
May-Summer 2024
May 6, 2024 - Jun 5, 2024
Registration Deadlines
May 7, 2024 - All interested participants
Category
Launch, Core Program, Boot Camp, Projects, Certificates
Overview
The Erdős Institute's signature Data Science Boot Camp has been running since May 2018 thanks to the generous support of our sponsors, members, and partners. Due to its popularity, we now offer the boot camp online three times per year in two formats: a one-month intensive boot camp each May and a semester-long version each Spring and Fall.
Organizers, Instructors, and Advisors
Steven Gubkin, PhD
Lead Instructor
Office Hours: Monday through Friday, 12pm - 1pm ET, and by appointment
Preferred Contact: Slack
Please feel free to message me on Slack with any questions!
Alec Clott, PhD
Head of Data Science Projects
Office Hours: By appointment only
Preferred Contact: Slack
Participants are welcome to reach out to me via Slack or email. I normally work 9am-5pm ET, but I can always find time to meet over Zoom after work too. Let me know how I can help!
Objectives
The goal of our Data Science Boot Camp is to provide you with the skills and mentorship necessary to produce a portfolio-worthy data science/machine learning project, while also providing valuable career development support and connecting you with potential employers.
Project Examples
TEAM 34
Aware NLP Project III
Mohammad Nooranidoost, Baian Liu, Craig Franze, Mustafa Anıl Tokmak, Himanshu Raj, Peter Williams
This project investigates and evaluates different retrieval methodologies for use in RAG (Retrieval-Augmented Generation) systems. In particular, it examines retrieval quality for information downloaded from employee subreddits. We compared advanced retrieval methods that use clustering, multi-vector indexing, and multi-querying against a naive retrieval baseline.
First Steps/Prerequisites
Program Content
Textbook/Notes
Project/Homework Instructions
Schedule
May 6, 2024 at 03:00 PM UTC - Problem Solving Session 1
May 6, 2024 at 04:00 PM UTC - Office Hours
May 6, 2024 at 07:00 PM UTC - Lecture 1: Introduction
May 6, 2024 at 08:30 PM UTC - Extra Help with Setting Up
May 7, 2024 at 03:00 PM UTC - Problem Solving Session 2
May 7, 2024 at 04:00 PM UTC - Office Hours
May 7, 2024 at 07:00 PM UTC - Lecture 2: Data Collection
May 8, 2024 at 03:00 PM UTC - Problem Solving Session 3
May 8, 2024 at 04:00 PM UTC - Office Hours
May 8, 2024 at 07:00 PM UTC - Lecture 3: Regression I
May 8, 2024 at 11:00 PM UTC - Alternate Problem Session 3
May 9, 2024 at 03:00 PM UTC - Problem Solving Session 4
May 9, 2024 at 04:00 PM UTC - Office Hours
May 9, 2024 at 07:00 PM UTC - Lecture 4: Regression II
May 9, 2024 at 11:00 PM UTC - Alternate Problem Session 4
May 10, 2024 at 04:00 PM UTC - Office Hours
May 10, 2024 at 08:00 PM UTC - Project Pitch Hour
May 13, 2024 at 03:00 PM UTC - Problem Solving Session 5
May 13, 2024 at 04:00 PM UTC - Office Hours
May 13, 2024 at 07:00 PM UTC - Lecture 5: Regression III
May 13, 2024 at 11:00 PM UTC - Alternate Problem Session 5
May 14, 2024 at 03:00 PM UTC - Problem Solving Session 6
May 14, 2024 at 04:00 PM UTC - Office Hours
May 14, 2024 at 07:00 PM UTC - Lecture 6: Time Series I
May 14, 2024 at 11:00 PM UTC - Alternate Problem Session 6
May 15, 2024 at 03:00 PM UTC - Problem Solving Session 7
May 15, 2024 at 04:00 PM UTC - Office Hours
May 15, 2024 at 07:00 PM UTC - Lecture 7: Time Series II
May 15, 2024 at 11:00 PM UTC - Alternate Problem Session 7
May 16, 2024 at 03:00 PM UTC - Problem Solving Session 8
May 16, 2024 at 04:00 PM UTC - Office Hours
May 16, 2024 at 07:00 PM UTC - Lecture 8: Classification I
May 16, 2024 at 11:00 PM UTC - Alternate Problem Session 8
May 17, 2024 at 04:00 PM UTC - Office Hours
May 20, 2024 at 03:00 PM UTC - Problem Solving Session 9
May 20, 2024 at 04:00 PM UTC - Office Hours
May 20, 2024 at 07:00 PM UTC - Lecture 9: Classification II
May 20, 2024 at 11:00 PM UTC - Alternate Problem Session 9
May 21, 2024 at 03:00 PM UTC - Problem Solving Session 10
May 21, 2024 at 04:00 PM UTC - Office Hours
May 21, 2024 at 07:00 PM UTC - Lecture 10: Ensemble Learning I
May 21, 2024 at 11:00 PM UTC - Alternate Problem Session 10
May 22, 2024 at 03:00 PM UTC - Problem Solving Session 11
May 22, 2024 at 04:00 PM UTC - Office Hours
May 22, 2024 at 07:00 PM UTC - Lecture 11: Ensemble Learning II
May 23, 2024 at 03:00 PM UTC - Problem Solving Session 12
May 23, 2024 at 04:00 PM UTC - Office Hours
May 23, 2024 at 07:00 PM UTC - Lecture 12: Neural Networks
May 23, 2024 at 11:00 PM UTC - Alternate Problem Session 11
May 24, 2024 at 04:00 PM UTC - Office Hours
May 27, 2024 at 04:00 PM UTC - Office Hours
May 28, 2024 at 04:00 PM UTC - Office Hours
May 29, 2024 at 04:00 PM UTC - Office Hours
May 30, 2024 at 04:00 PM UTC - Office Hours
May 31, 2024 at 04:00 PM UTC - Office Hours
Jun 5, 2024 at 04:00 PM UTC - Erdős May 2024 Final Project Showcase
Please check your registration email for the program schedule and Zoom links.
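All times on this page are listed in UTC, while office hours are described in ET. The following is a minimal sketch of converting a listed time to Eastern Time. It assumes a fixed UTC-4 offset, which holds because every date in this program falls within US daylight saving time; the function name `utc_listing_to_eastern` is illustrative and not part of any program tooling.

```python
from datetime import datetime, timedelta, timezone

# Assumption: all program dates (May 6 - Jun 5, 2024) fall in US daylight
# saving time, so Eastern Time is treated as a fixed UTC-4 offset.
EDT = timezone(timedelta(hours=-4), "EDT")
LISTING_FORMAT = "%b %d, %Y at %I:%M %p"


def utc_listing_to_eastern(listing: str) -> str:
    """Convert a listing such as 'May 6, 2024 at 03:00 PM UTC' to Eastern Time."""
    naive = datetime.strptime(listing.removesuffix(" UTC"), LISTING_FORMAT)
    eastern = naive.replace(tzinfo=timezone.utc).astimezone(EDT)
    return eastern.strftime(LISTING_FORMAT) + " EDT"


print(utc_listing_to_eastern("May 6, 2024 at 03:00 PM UTC"))  # May 06, 2024 at 11:00 AM EDT
print(utc_listing_to_eastern("May 9, 2024 at 03:59 AM UTC"))  # May 08, 2024 at 11:59 PM EDT
```

Note that a 03:59 AM UTC deadline corresponds to 11:59 PM Eastern on the previous calendar day.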
Project/Homework Deadlines
May 9, 2024
03:59 AM UTC
Watch 5 Previous Distinguished Projects
Click the "only show projects with distinction or higher" checkbox, watch five previous projects, and explore their GitHub repositories.
May 10, 2024
08:00 PM UTC
Project Pitch Hour
An opportunity to meet other Erdős Fellows, form teams, and propose topics.
May 14, 2024
03:59 AM UTC
Submit Team Proposal to Project Formation Page
If you want to propose a project, or have an idea for a project, submit it by this date.
May 15, 2024
03:59 AM UTC
Finalized Teams with Preliminary Project Idea
Teams need to be finalized by this point. If you proposed or created a project, you must have others in your group. If you did not propose or create a project, you must join an open group.
May 17, 2024
02:06 PM UTC
Data gathering and defining stakeholders + KPIs
Find the dataset you will be working with. Describe the dataset and the problem you are looking to solve (1 page max). List the stakeholders of the project and company key performance indicators (KPIs) (bullet points).
May 18, 2024
03:59 AM UTC
Data cleaning + preprocessing
Look for missing values and duplicates. Perform basic data manipulation and preliminary feature engineering.
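As a rough illustration of the first step, here is a dependency-free sketch that counts missing values per column and drops exact duplicate rows. The records, column names, and helper names are hypothetical; in practice a team would more likely use a library such as pandas (`DataFrame.isna` and `DataFrame.drop_duplicates`).

```python
# Hypothetical records standing in for a project dataset.
rows = [
    {"id": 1, "age": 34, "city": "Columbus"},
    {"id": 2, "age": None, "city": "Akron"},
    {"id": 1, "age": 34, "city": "Columbus"},  # exact duplicate of the first row
    {"id": 3, "age": 29, "city": ""},
]


def count_missing(rows):
    """Count missing values (None or empty string) per column."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            is_missing = value is None or value == ""
            counts[column] = counts.get(column, 0) + int(is_missing)
    return counts


def drop_duplicates(rows):
    """Keep the first occurrence of each fully identical row."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the row
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


print(count_missing(rows))         # {'id': 0, 'age': 1, 'city': 1}
print(len(drop_duplicates(rows)))  # 3
```

What counts as "missing" (here, `None` or an empty string) is itself an assumption and depends on how the dataset encodes absent values.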
May 25, 2024
03:59 AM UTC
Written proposal of modeling approach [Checkpoint]
Test linearity assumptions. Apply dimensionality reduction (if necessary). Describe your planned modeling approach, based on the exploratory data analysis from the last two weeks (< 1 page, bullet points).
May 25, 2024
03:59 AM UTC
Exploratory data analysis + visualizations [Checkpoint]
Examine the distributions of variables, look for outliers, and compute descriptive statistics.
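One possible starting point for this checkpoint, sketched with only the Python standard library: compute descriptive statistics for a numeric variable and flag outliers. The 1.5 x IQR rule used here is a common convention, not a program requirement, and the sample values and function names are hypothetical.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


def iqr_outliers(values, k=1.5):
    """Flag values more than k * IQR below Q1 or above Q3."""
    summary = describe(values)
    iqr = summary["q3"] - summary["q1"]
    low = summary["q1"] - k * iqr
    high = summary["q3"] + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical measurements; 40 sits far from the rest.
sample = [2, 4, 4, 5, 6, 7, 8, 9, 40]
print(describe(sample))
print(iqr_outliers(sample))  # [40]
```

Flagged values are candidates for inspection rather than automatic removal; whether an outlier is an error or a real signal depends on the dataset.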
Jun 1, 2024
03:59 AM UTC
Machine learning models or equivalent [Checkpoint]
Results with visualizations and/or metrics. List of successes and pitfalls.
Jun 2, 2024
03:59 AM UTC
Final project due
Please read the submission instructions on the link below.