TEAM

Predicting Protein Solubility from Sequence Data

Sicheng Zhao, Zikang Jia

Build a machine learning model that predicts the solubility of proteins based solely on their amino acid sequences. Protein solubility is crucial in biotechnology and pharmaceutical applications.

Approach:
Encode each protein sequence numerically, either by one-hot encoding its residues or by extracting biochemical features from the sequence (e.g., hydrophobicity, charge).
Train machine learning models on these encodings to predict solubility, then compare feature sets and tune model parameters to optimize performance.
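
The two steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the team's actual pipeline: the function names (one_hot, features, train_logistic, predict_proba) are hypothetical, hydrophobicity is taken from the Kyte-Doolittle hydropathy scale, charge is approximated as +1 for K/R and -1 for D/E with histidine treated as neutral, and the four toy sequences and their labels are invented purely to make the example run. A simple logistic regression stands in for whatever models the project actually trained.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

# Kyte-Doolittle hydropathy scale (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumption: crude side-chain charge near neutral pH; histidine counted as 0.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def one_hot(seq):
    """Return one 20-element indicator row per residue (length varies with seq)."""
    return [[1 if aa == res else 0 for aa in AMINO_ACIDS] for res in seq]


def features(seq):
    """Fixed-length vector: 20 composition fractions, mean hydropathy, charge per residue."""
    n = len(seq)
    composition = [seq.count(aa) / n for aa in AMINO_ACIDS]
    mean_hydropathy = sum(KYTE_DOOLITTLE[r] for r in seq) / n
    charge_per_residue = sum(CHARGE.get(r, 0) for r in seq) / n
    return composition + [mean_hydropathy, charge_per_residue]


def predict_proba(w, b, x):
    """Logistic model: probability that the sequence is labeled soluble."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression with plain stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi  # gradient of log loss w.r.t. z
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


# Invented toy data (1 = soluble, 0 = insoluble); NOT real measurements.
TOY_DATA = [
    ("MKDEEKRSDEQKKDE", 1),
    ("MSEEDKKRTGEDNQK", 1),
    ("MLLIVVAFLLIVGAL", 0),
    ("MFVILLAVIWLLFAV", 0),
]

X = [features(seq) for seq, _ in TOY_DATA]
y = [label for _, label in TOY_DATA]
w, b = train_logistic(X, y)

for seq, label in TOY_DATA:
    print(seq, label, round(predict_proba(w, b, features(seq)), 3))
```

The fixed-length feature vector is used for training here because one-hot rows grow with sequence length; a one-hot input would need padding or truncation to a common length before it could feed a model like this one. Comparing feature sets, as the approach suggests, would amount to swapping the features function and re-evaluating on held-out data.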

github URL

©2017-2025 by The Erdős Institute.