Certificate of Completion
THIS ACKNOWLEDGES THAT
Xiangwei Peng
HAS COMPLETED THE FALL 2024 DATA SCIENCE BOOT CAMP
Roman Holowinsky, PhD
DIRECTOR
December 11, 2024
DATE
TEAM
Analyzing the Impact of News Topics on Stock Prices
Xiangwei Peng, Xiaokang Wang
Stock prices are influenced by numerous factors. We focus on using daily news to model the abnormal return of a stock, that is, the part of its return that is not explained by market information. More precisely, we build an automated pipeline for:
- Ingesting stock prices, news, and factors;
- Preprocessing both stock and news data;
- Classifying the news and predicting the future price.
We use the Fama-French five-factor model to obtain the abnormal return. We annotate the news using a soft-voting classifier and cluster the topics using the Hierarchical Dirichlet Process (HDP). Finally, we regress the abnormal return on the normalized daily topic counts, using XGBoost as the model.
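As a rough illustration of how the regression target and features described above could be built, here is a minimal sketch. It is not the project's actual code. It assumes that the abnormal return is the residual of an ordinary least squares fit of the stock's excess return on the five factors (with an intercept), and that "normalized" means each day's topic counts are divided by that day's total. The function names, the synthetic data, and the toy topic labels are all hypothetical.

```python
from collections import Counter, defaultdict

import numpy as np


def abnormal_returns(excess_ret, factors):
    """Return (residuals, coefficients) of an OLS fit with intercept.

    excess_ret: shape (T,), stock return minus the risk-free rate.
    factors:    shape (T, 5), assumed columns Mkt-RF, SMB, HML, RMW, CMA.
    The residual is used here as the abnormal return (an assumption).
    """
    excess_ret = np.asarray(excess_ret, dtype=float)
    factors = np.asarray(factors, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    X = np.column_stack([np.ones(len(factors)), factors])
    coef, *_ = np.linalg.lstsq(X, excess_ret, rcond=None)
    return excess_ret - X @ coef, coef


def normalized_topic_counts(dated_topics, n_topics):
    """Map each day to the fraction of that day's articles in each topic.

    dated_topics: iterable of (day, topic_id) pairs, one per article.
    """
    per_day = defaultdict(Counter)
    for day, topic in dated_topics:
        per_day[day][topic] += 1
    features = {}
    for day, counts in sorted(per_day.items()):
        total = sum(counts.values())
        features[day] = [counts[k] / total for k in range(n_topics)]
    return features


# Synthetic factor data, for illustration only (not real market data).
rng = np.random.default_rng(0)
factors = rng.normal(scale=0.01, size=(250, 5))
betas = np.array([1.0, 0.5, -0.2, 0.1, 0.3])
excess_ret = factors @ betas + rng.normal(scale=0.005, size=250)
resid, coef = abnormal_returns(excess_ret, factors)

# Toy topic assignments: (day, topic id) for each article.
articles = [("2024-01-02", 0), ("2024-01-02", 0),
            ("2024-01-02", 2), ("2024-01-03", 1)]
topic_features = normalized_topic_counts(articles, n_topics=3)
```

In a setup like this, the rows of `topic_features` would be aligned by date with `resid` and passed to a regressor such as XGBoost. Whether the factor loadings should be estimated in-sample, as here, or on a separate estimation window is a design choice the summary above does not specify.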
Here are the datasets:
- Headlines: https://www.kaggle.com/datasets/rmisra/news-category-dataset
- News: https://components.one/datasets/all-the-news-articles-dataset/
- Factors: https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html
- Stock price: Yahoo Finance