Certificate of Completion
THIS ACKNOWLEDGES THAT
Scott Auerbach
HAS COMPLETED THE SPRING 2024 DATA SCIENCE BOOT CAMP
Roman Holowinsky, PhD
DIRECTOR
MAY 01, 2024
DATE
TEAM
Nuclear Localization Signal (NLS) Prediction - NLSeer
Scott Auerbach, Ukamaka Nnyaba, Ming Zhang, Yingyi Guo, Hemaa Selvakumar, Cisil Karaguzel
The purpose of the project is to build a prediction tool that estimates the probability that a protein's sequence contains a nuclear localization signal (NLS), based on the significance of each amino acid. NLSs are segments of a protein sequence that direct the protein to the nucleus; they have been implicated in human diseases and play an important role in many biological pathways. We used datasets of whole protein sequences with and without NLSs and trained both classifiers and neural networks to predict whether a protein contains an NLS. Using a random forest classifier, we built a Flask web app that predicts whether a given protein is likely to have an NLS and, if so, estimates the likelihood that each amino acid contributes to it.
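The per-residue idea described above can be illustrated with a minimal Python sketch. This is not the NLSeer implementation. The function names, the 7-residue window, and the toy scoring rule are all assumptions made for illustration. The toy score (the fraction of lysine and arginine in a window) stands in for a trained model's predicted probability and relies only on the fact that classical NLSs are rich in basic residues. A composition featurizer is included as one plausible way to turn a whole sequence into fixed-length input for a classifier such as a random forest.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition_features(seq):
    """Return the fraction of each standard amino acid in seq (20 values)."""
    seq = seq.upper()
    counts = Counter(seq)
    total = len(seq) or 1  # avoid division by zero on an empty sequence
    return [counts[aa] / total for aa in AMINO_ACIDS]


def toy_window_score(window):
    """Hypothetical stand-in for a trained model's probability:
    the fraction of basic residues (K, R) in the window."""
    return sum(1 for aa in window if aa in "KR") / len(window)


def per_residue_scores(seq, window=7, score_fn=toy_window_score):
    """For each position, average the scores of all windows covering it."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return []
    window = min(window, n)  # short sequences get a single window
    totals = [0.0] * n
    covers = [0] * n
    for start in range(n - window + 1):
        score = score_fn(seq[start:start + window])
        for i in range(start, start + window):
            totals[i] += score
            covers[i] += 1
    return [t / c for t, c in zip(totals, covers)]
```

Calling `per_residue_scores` on a sequence that embeds the well-known SV40 large T antigen NLS (`PKKKRKV`) in glycine padding gives the highest scores inside the embedded motif. In a real pipeline, `score_fn` would be replaced by a call to the trained classifier.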