Certificate of Completion
THIS ACKNOWLEDGES THAT
Alyce Stagliano
HAS COMPLETED THE FALL 2024 DATA SCIENCE BOOT CAMP
Roman Holowinsky, PhD
DIRECTOR
December 11, 2024
DATE
TEAM
Predicting Voter Turnout Using Census Data and Polling Site Accessibility
Avram Steiner, Alyce Stagliano, Charles Kimball
This project uses demographic and geographic data to predict voter turnout and to identify which variables indicate poor turnout.
We restricted our analysis to the city of Chicago and pulled demographic data from the US Census Bureau, voting data from the Illinois Board of Elections, geographical precinct data from the City of Chicago, polling station data from the Center for Public Integrity, and transit times from the Google Maps API. Our baseline model was a simple average of voter turnout across the precincts in our training set. The vast majority of our time and effort went into sourcing, cleaning, and engineering data. For our analysis, we compared linear and logistic regressions, as well as an XGBoostRegressor ensemble method. Although the logistic and linear regression models perform similarly (and both outperform XGBoost), the logistic model is "philosophically" better. In particular, it reflects the nature