Data Science Boot Camp

Spring 2023

May 9, 2023 - Jun 8, 2023

To access the program content, you must first create an account and member profile and be logged in.

Matt Osborne Office Hour

Next Event

Registration Deadlines

Mar 16, 2023 - Academics from Member Institutions/Departments

Mar 16, 2023 - Academics from Non-Member Institutions paying the $500 membership fee

Jan 16, 2023 - Academics from Non-Member Institutions applying for Corporate Sponsored Fellowships

Category

Launch

Overview

The Erdős Institute's signature Data Science Boot Camp has been running since May 2018 thanks to the generous support of our sponsors, members, and partners. Due to its popularity, we now offer our boot camp online twice per year in two different formats: a month-long intensive boot camp each May and a semester-long version each Fall.

Slack

Click here to be invited to the Slack organization: The Erdős Institute

Click here to access the Slack cohort channel: #slack-cohort-channel

Click here to access the Slack program channel: #slack-program-channel

Click here to download the Events & Deadlines .ics calendar file

Organizers, Instructors, and Advisors

Objectives

The goal of our Data Science Boot Camp is to provide you with the skills and mentorship necessary to produce a portfolio-worthy data science/machine learning project, while also providing you with valuable career development support and connecting you with potential employers.

Those who successfully complete a team project will receive a digital certificate of completion with a sharable URL.

Project Examples

TEAM 16

Predicting Lead Contamination in NY School Drinking Water

Ranadeep Roy, Cami Goray, Hana Lang

github URL

Lead is a toxic metal, and in children especially, lead exposure can have severe health consequences -- even small amounts of lead have the potential to affect memory, behavior, and learning ability. Despite this, numerous schools across New York State have at least one drinking water outlet with lead levels testing above 5 ppb. In this project, we aim to predict the presence of lead contamination in school drinking water, and better understand the role of demographic, socioeconomic, infrastructural, and geographic features in elevated lead levels.

TEAM 33

Tuning Up Music Highway

James O'Quinn, Yang Mo, John Hurtado Cadavid, Ruixuan Ding, Chilambwe Natasha Wapamenshi

github URL

Known as the most dangerous highway in Tennessee, Music Highway, the stretch of Interstate 40 between Memphis and Nashville, could use a serious tuning up. This project investigates the effectiveness and cost-efficiency of potential physical safety interventions along its Madison and Henderson County segments, with the goal of reducing crash severity. We used a data-driven geospatial modeling approach to assess whether adding specific safety features to targeted segments predicts statistically significant changes in crash injury outcomes.

First Steps/Prerequisites

You should have a base-level familiarity with Python and with some basic math concepts. You will also need to have your laptop or desktop computer set up for the course. If you are new to Python, need a quick math refresher, or need help setting up your computer, please follow the link below.

Program Content

Course materials are available on GitHub through the following link:

Request Access to GitHub

Program Content

Textbook/Notes

Note: our video player does not support playback speed options. A third-party browser extension can be used to change the playback speed; for example, this one works for Chrome: video-speed-controller. If you would prefer to avoid a browser extension, you can also change the playback speed manually in the JavaScript console: Speed up any HTML5 video player!

Introduction to Convolutional Neural Networks

Neural Networks

We introduce the basic theory behind convolutional neural networks, a class of neural networks designed for grid-based data such as images.

Slides
Code

Loading Pre-Trained Models

Neural Networks

We show how to load a model that you saved after training, so that a network that took a long time to train does not have to be retrained.

Slides
Code

PCA I

Dimension Reduction

We introduce the theory behind principal components analysis and demonstrate how it is implemented in Python.

Slides
Code

tSNE

Dimension Reduction

Another dimension reduction technique, used primarily for data visualization.

Slides
Code

Introduction to Convolutional Neural Networks II

Neural Networks

The basics of fitting a convolutional neural network in Keras.

Slides
Code

Future Directions

Neural Networks

Directions you may pursue if you want to learn more about the theory and implementation of neural networks.

Slides
Code

PCA II

Dimension Reduction

We review explained variance: how much of the data's total variance each principal component accounts for.

Slides
Code
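
As a rough illustration of the explained-variance idea, the sketch below computes the explained variance ratio for a small two-dimensional dataset using only the Python standard library. The toy data and the helper name `explained_variance_ratio_2d` are made up for this example and are not taken from the course materials. The closed-form eigenvalue formula used here only works for a 2x2 covariance matrix; a general implementation would rely on a linear algebra library.

```python
import math


def explained_variance_ratio_2d(points):
    """Return the share of total variance captured by each principal
    component of a 2-D dataset, largest component first."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n

    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)

    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    trace = sxx + syy
    det = sxx * syy - sxy ** 2
    gap = math.sqrt(max(trace ** 2 - 4 * det, 0.0))
    eigenvalues = [(trace + gap) / 2, (trace - gap) / 2]

    # Each eigenvalue is the variance along one principal direction.
    return [value / trace for value in eigenvalues]


# Hypothetical, strongly correlated toy data.
data = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9), (5.0, 5.1)]
ratios = explained_variance_ratio_2d(data)
print(ratios)
```

Because the two toy features move almost in lockstep, nearly all of the variance falls on the first component, which is the situation in which dropping the second component loses little information.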

What is Clustering?

Clustering

We take a moment to define clustering problems.

Slides
Code

Introduction to Recurrent Neural Networks I

Neural Networks

A brief review of the theory of basic recurrent neural networks.

Slides
Code

Introduction

Unsupervised Learning

We introduce the field of unsupervised learning.

Slides
Code

PCA III

Dimension Reduction

We end our three-part series on PCA by demonstrating how you can interpret the results of a PCA fit.

Slides
Code

k-Means Clustering

Clustering

Our first clustering algorithm.

Slides
Code
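
For readers who want a feel for k-means before opening the course notebooks, here is a minimal sketch of the standard assign-then-update loop (often called Lloyd's algorithm) in plain Python. The function name `k_means`, its parameters, and the toy points are assumptions made for this illustration; they do not come from the course code, and a real project would more likely use an established library implementation.

```python
import math
import random


def k_means(points, k, n_iter=100, seed=0):
    """Cluster points (tuples of floats) into k groups."""
    rng = random.Random(seed)
    # Pick k of the data points at random as starting centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centroids


# Hypothetical toy data: two well-separated groups of points.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (50.0, 50.0), (50.0, 52.0), (53.0, 50.0), (52.0, 53.0),
]
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

Which group receives label 0 depends on the random initialization, so only the grouping itself is meaningful, not the label values.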

Project/Homework Instructions

Project/Team Formation
Project Submission
Projects README

Schedule

Click on any date for more details

Orientation & Setup

Phase 1: Instruction and Project Completion

Project Review & Judging

Phase 2: Intense Interview Prep & Career Connections

Matt Osborne Office Hour

May 3, 2023 at 08:00 PM UTC

Data Science Boot Camp AM Problem Session 1

May 9, 2023 at 02:00 PM UTC

Data Science Boot Camp AM Problem Session 2

May 10, 2023 at 02:00 PM UTC

Data Science Boot Camp AM Problem Session 3

May 11, 2023 at 02:00 PM UTC

Matt Office Hour

May 12, 2023 at 03:00 PM UTC

Data Science Boot Camp AM Problem Session 4

May 15, 2023 at 02:00 PM UTC

Data Science Boot Camp AM Problem Session 5

May 16, 2023 at 02:00 PM UTC

Data Science Boot Camp AM Problem Session 6

May 17, 2023 at 02:00 PM UTC

Data Science Boot Camp AM Problem Session 7

May 18, 2023 at 02:00 PM UTC

Matt Office Hour

May 19, 2023 at 03:00 PM UTC

Data Science Boot Camp PM Problem Session 8

May 22, 2023 at 08:00 PM UTC

Data Science Boot Camp PM Problem Session 9

May 23, 2023 at 08:00 PM UTC

Data Science Boot Camp PM Problem Session 10

May 24, 2023 at 08:00 PM UTC

Data Science Boot Camp PM Problem Session 11

May 25, 2023 at 08:00 PM UTC

Matt Office Hour

May 26, 2023 at 07:00 PM UTC

Matt Office Hour

May 31, 2023 at 06:00 PM UTC

Erdős Final Project Showcase and Commencement

Jun 7, 2023 at 04:00 PM UTC

Matt Osborne Office Hour

May 5, 2023 at 03:00 PM UTC

Data Science Boot Camp PM Problem Session 1

May 9, 2023 at 08:00 PM UTC

Data Science Boot Camp PM Problem Session 2

May 10, 2023 at 08:00 PM UTC

Data Science Boot Camp PM Problem Session 3

May 11, 2023 at 08:00 PM UTC

Matt Office Hour

May 12, 2023 at 07:00 PM UTC

Data Science Boot Camp PM Problem Session 4

May 15, 2023 at 08:00 PM UTC

Data Science Boot Camp PM Problem Session 5

May 16, 2023 at 08:00 PM UTC

Data Science Boot Camp PM Problem Session 6

May 17, 2023 at 08:00 PM UTC

Data Science Boot Camp PM Problem Session 7

May 18, 2023 at 08:00 PM UTC

Matt Office Hour

May 19, 2023 at 07:00 PM UTC

Data Science Boot Camp Lecture 9

May 22, 2023 at 09:30 PM UTC

Data Science Boot Camp Lecture 10

May 23, 2023 at 09:30 PM UTC

Data Science Boot Camp Lecture 11

May 24, 2023 at 09:30 PM UTC

Data Science Boot Camp Lecture 12

May 25, 2023 at 09:30 PM UTC

Matt Office Hour

May 29, 2023 at 08:00 PM UTC

Matt Office Hour

Jun 1, 2023 at 02:00 PM UTC

Data Science Boot Camp Lecture 1

May 8, 2023 at 09:30 PM UTC

Data Science Boot Camp Lecture 2

May 9, 2023 at 09:30 PM UTC

Data Science Boot Camp Lecture 3

May 10, 2023 at 09:30 PM UTC

Data Science Boot Camp Lecture 4

May 11, 2023 at 09:30 PM UTC

Project Pitch Day (Live on Zoom)

May 12, 2023 at 08:30 PM UTC

Data Science Boot Camp Lecture 5

May 15, 2023 at 09:30 PM UTC

Data Science Boot Camp Lecture 6

May 16, 2023 at 09:30 PM UTC

Data Science Boot Camp Lecture 7

May 17, 2023 at 09:30 PM UTC

Data Science Boot Camp Lecture 8

May 18, 2023 at 09:30 PM UTC

Data Science Boot Camp AM Problem Session 8

May 22, 2023 at 02:00 PM UTC

Data Science Boot Camp AM Problem Session 9

May 23, 2023 at 02:00 PM UTC

Data Science Boot Camp AM Problem Session 10

May 24, 2023 at 02:00 PM UTC

Data Science Boot Camp AM Problem Session 11

May 25, 2023 at 02:00 PM UTC

Matt Office Hour

May 26, 2023 at 03:00 PM UTC

Matt Office Hour

May 31, 2023 at 02:00 PM UTC

Matt Office Hour

Jun 1, 2023 at 08:00 PM UTC

Project/Homework Deadlines

May 12, 2023

08:30 PM UTC

Project Pitch Day (Live on Zoom)

An opportunity to meet with other Erdős Fellows, form teams, and propose topics.

May 13, 2023

03:59 AM UTC

Submit Team Proposal to Project Formation Page

If you want to propose a project, or have an idea for a project, submit it by this date.

May 15, 2023

03:59 AM UTC

Finalized Teams with Preliminary Project Idea

Teams need to be finalized by this point. If you proposed or created a project, you must have others in your group. If you did not propose or create a project, you must join an open group.

May 20, 2023

03:59 AM UTC

Data gathering and defining stakeholders + KPIs

Find the dataset you will be working with. Describe the dataset and the problem you are looking to solve (1 page max). List the stakeholders of the project and company key performance indicators (KPIs) (bullet points).

May 20, 2023

03:59 AM UTC

Data cleaning + preprocessing

Look for missing values and duplicates. Basic data manipulation & preliminary feature engineering.
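
A minimal sketch of what this cleaning step might look like, assuming the data has already been read into a list of dictionaries. The helper name `clean_records`, the column names, and the sample rows are all hypothetical; most teams would probably do the equivalent with a dataframe library instead.

```python
def clean_records(records, required_fields):
    """Drop rows with a missing required value, then drop exact duplicates."""
    seen = set()
    cleaned = []
    for row in records:
        # Treat None and empty or whitespace-only strings as missing.
        has_missing = any(
            row.get(field) is None or str(row.get(field)).strip() == ""
            for field in required_fields
        )
        if has_missing:
            continue
        # Hashable fingerprint of the row, independent of key order.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


# Hypothetical sample rows for illustration only.
rows = [
    {"id": 1, "city": "Columbus", "value": 3.2},
    {"id": 2, "city": "", "value": 1.5},           # missing city
    {"id": 1, "city": "Columbus", "value": 3.2},   # exact duplicate
    {"id": 3, "city": "Dayton", "value": None},    # missing value
    {"id": 4, "city": "Akron", "value": 0.7},
]
cleaned = clean_records(rows, ["city", "value"])
print([row["id"] for row in cleaned])
```

Dropping incomplete rows is only one possible policy. Depending on the dataset, imputing missing values or flagging them may be more appropriate, and that choice is worth recording in the project write-up.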

May 27, 2023

03:59 AM UTC

Exploratory data analysis + visualizations [Checkpoint]

Distributions of variables, looking for outliers, etc. Descriptive statistics.
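
As one possible starting point for this checkpoint, the sketch below computes a few descriptive statistics and flags outliers with the common 1.5 x IQR rule, using only Python's built-in `statistics` module. The function name `describe`, the choice of the IQR rule, and the sample values are assumptions for illustration; other outlier definitions may suit a given dataset better.

```python
import statistics


def describe(values):
    """Summarize a numeric column and flag outliers with the 1.5 * IQR rule."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "median": median,
        "q1": q1,
        "q3": q3,
        "outliers": [v for v in values if v < low or v > high],
    }


# Hypothetical measurements with one suspiciously large value.
values = [2, 3, 3, 4, 4, 5, 5, 6, 40]
summary = describe(values)
print(summary)
```

A flagged value is not necessarily an error. It is a prompt to look at the underlying record before deciding whether to keep, correct, or drop it.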

May 27, 2023

03:59 AM UTC

Written proposal of modeling approach [Checkpoint]

Test linearity assumptions. Apply dimensionality reduction (if necessary). Describe your planned modeling approach, based on the exploratory data analysis from the last two weeks (< 1 page, bullet points).

Jun 2, 2023

03:59 AM UTC

Machine learning models or equivalent [Checkpoint]

Results with visualizations and/or metrics. List of successes and pitfalls.

Jun 3, 2023

04:00 PM UTC

Final project due

Please read the submission instructions on the link below.

©2017-2025 by The Erdős Institute.