

Consumer Sentiment Analysis

Gilyoung Cheong, Dohoon Kim, Vinicius Ambrosi


This project uses natural language processing (NLP) and machine learning (ML) techniques to build predictive models that rate comments written by consumers. Our goal is a model that takes an online comment from a consumer and outputs a rating from 1 star to 5 stars. A baseline model can be built by vectorizing online comments with a sentence transformer and training an ML classification model whose input features for a comment are the components of the corresponding vector. However, the performance of such a model often suffers from outlier inputs and from imbalanced rating data in training. This project aims to resolve these issues by
1. fine-tuning the vectorization process so that it fits the input review data better and
2. developing a loss function and architecture that account for the class imbalance in the training data.
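
To make the imbalance issue concrete, here is a minimal sketch of one common remedy: weighting each rating class inversely to its frequency and using those weights in a cross-entropy loss. This is not necessarily the loss function developed in this project. The function names `balanced_class_weights` and `weighted_cross_entropy` and the toy ratings are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare ratings get large weights, frequent ratings get small ones.
    Only classes that actually occur in `labels` receive a weight.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean negative log-likelihood of the true ratings.

    probs:   list of dicts mapping rating -> predicted probability
    labels:  list of true ratings, aligned with probs
    weights: dict mapping rating -> class weight
    """
    total = 0.0
    weight_sum = 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        total += -w * math.log(p[y])
        weight_sum += w
    return total / weight_sum


# Hypothetical toy data: ratings skewed toward 5 stars (no 2-star reviews).
ratings = [5, 5, 5, 5, 5, 5, 4, 4, 1, 3]
weights = balanced_class_weights(ratings)

# A model that predicts every rating with equal probability.
uniform = [{r: 0.2 for r in range(1, 6)} for _ in ratings]
loss = weighted_cross_entropy(uniform, ratings, weights)
```

With these weights, a misclassified 1-star review contributes six times as much to the loss as a misclassified 5-star review, so a classifier can no longer score well by predicting the majority rating everywhere.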

This is a continuation of my previous data science project with Vinicius Ambrosi, Dohoon Kim, and Hannah Lloyd.

github URL