
Certificate of Completion


THIS ACKNOWLEDGES THAT

Craig Franze

HAS COMPLETED THE MAY-SUMMER 2024 DATA SCIENCE BOOT CAMP

Roman Holowinsky, PhD
DIRECTOR

JUNE 10, 2024
DATE

TEAM

Topic recognition on NYT articles

Ravi Tripathi, Touseef Haider, Ping Wan, Schinella D'Souza, Alessandro Malusà, Craig Franze


The project proposes to study the metadata of New York Times articles to detect the most relevant topics and to build a recommendation system based on topic similarity.

We plan to do the following:
1) Apply methods like Latent Dirichlet Allocation (LDA) and Bidirectional Encoder Representations from Transformers (BERT) to identify the most relevant topics from a corpus of about 42,000 articles published over the last year
2) Draw insightful visuals to highlight topic and word distribution as well as popular trends
3) Use neural networks to assign meaningful labels to topics
4) Create a recommender system based on topic similarity
