Certificate of Completion
THIS ACKNOWLEDGES THAT
Irati Hurtado
HAS COMPLETED THE SPRING 2022 DATA SCIENCE BOOT CAMP
Roman Holowinsky, PhD
DIRECTOR
JUNE 08, 2022
DATE
TEAM
Mariner
Irati Hurtado, Konstantinos Karatapanis, Sammy Sbiti
In this project we develop a model that assigns prespecified annotations to student essays. For this categorical classification problem
we used a pretrained neural network for word embeddings (the Longformer) and, on top of it, trained a two-layer
neural network via supervised learning. Our network was trained by minimizing the categorical cross-entropy, a very popular choice for this kind of task.
Next, to smooth the prediction output, we weighted the token-level probability distributions so that predictions are optimized across the essay at the sentence level. The model also has an app version, so users can interact with it and annotate their own essays.
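
The two quantitative steps described above, the categorical cross-entropy loss and the sentence-level weighting of token probabilities, can be illustrated with a minimal sketch. The abstract does not specify the weighting scheme, so the weighted average used here (uniform by default) is an assumption, and the function names and the example probabilities are hypothetical rather than taken from the team's code.

```python
import math

def categorical_cross_entropy(probs, label):
    """Loss for one token: negative log of the probability assigned to the true class."""
    return -math.log(probs[label])

def sentence_distribution(token_probs, weights=None):
    """Weighted average of the per-token class distributions in one sentence.

    Assumption: the weights are uniform unless given; the source does not
    state how the token distributions were actually weighted.
    """
    if weights is None:
        weights = [1.0] * len(token_probs)
    total = sum(weights)
    num_classes = len(token_probs[0])
    return [
        sum(w * p[c] for w, p in zip(weights, token_probs)) / total
        for c in range(num_classes)
    ]

def predict_sentence(token_probs, weights=None):
    """Pick the single class with the highest averaged probability for the sentence."""
    dist = sentence_distribution(token_probs, weights)
    return max(range(len(dist)), key=dist.__getitem__)

# Hypothetical sentence with three tokens and three annotation classes.
tokens = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],  # this token alone would be labeled as class 1
    [0.7, 0.2, 0.1],
]
sentence_label = predict_sentence(tokens)
```

Labeling each token by its own most likely class would give a different annotation to the middle token, whereas averaging over the sentence yields one consistent label for all three tokens. Passing non-uniform `weights` shifts the result toward the tokens that are weighted more heavily.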