

Topic recognition on NYT articles

Ravi Tripathi, Touseef Haider, Ping Wan, Schinella D'Souza, Alessandro Malusà, Craig Franze


The project proposes to study the metadata of New York Times articles to detect the most relevant topics and to build a recommendation system based on topic similarity.

We plan to do the following:
1) Apply methods such as Latent Dirichlet Allocation (LDA) and Bidirectional Encoder Representations from Transformers (BERT) to identify the most relevant topics in a corpus of about 42,000 articles published over the last year
2) Build visualizations that highlight topic and word distributions as well as popular trends
3) Use neural networks to assign meaningful labels to topics
4) Create a recommender system based on topic similarity
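
As a rough illustration of step 4, the sketch below ranks articles by the cosine similarity of their topic distributions. It is only one possible approach, not the project's actual implementation: the article ids, the three-topic vectors, and the helper names `cosine_similarity` and `recommend` are all hypothetical, and in practice the vectors would come from a topic model such as LDA.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def recommend(article_id, topic_vectors, k=2):
    """Return the k articles whose topic mix is closest to article_id."""
    query = topic_vectors[article_id]
    scored = [
        (other_id, cosine_similarity(query, vec))
        for other_id, vec in topic_vectors.items()
        if other_id != article_id  # never recommend the article itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical per-article topic distributions (made-up values, 3 topics).
topic_vectors = {
    "politics_a": [0.80, 0.15, 0.05],
    "politics_b": [0.70, 0.20, 0.10],
    "sports_a": [0.05, 0.90, 0.05],
    "science_a": [0.10, 0.10, 0.80],
}

recommendations = recommend("politics_a", topic_vectors, k=2)
for other_id, score in recommendations:
    print(other_id, round(score, 3))
```

Cosine similarity is used here because it compares the shape of two topic mixes rather than their magnitude; other measures, such as Jensen-Shannon divergence, would be equally reasonable for probability distributions.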

github URL