THPLM: a sequence-based deep learning framework for protein stability changes prediction upon point variations using pretrained protein language model

  • Gong, Jianting
  • Jiang, Lili
  • Chen, Yongbing
  • Zhang, Yixiang
  • Li, Xue
  • Ma, Zhiqiang
  • Fu, Zhiguo
  • He, Fei
  • Sun, Pingping
  • Ren, Zilin
  • Tian, Mingyao
Bioinformatics 39(11):p btad646, November 2023. | DOI: 10.1093/bioinformatics/btad646

Abstract

Motivation:

Quantitative determination of protein thermodynamic stability is a critical step in protein and drug design. Reliable prediction of the stability changes caused by point variations supports progress in these fields. Over the past decades, dozens of structure-based and sequence-based methods have been proposed and have shown good prediction performance. Despite this progress, it remains necessary to explore how wild-type and variant protein representations can capture the stability change in the context of the whole sequence. With the development of learning-based structure prediction, protein language models (PLMs) have been shown to predict protein structure accurately. Because PLMs capture atomic-level structural information, they can help explain how single-point variations cause functional changes.

Results:

Here, we propose THPLM, a sequence-based deep learning model for stability change prediction that uses Meta's ESM-2. With ESM-2 and a simple convolutional neural network, THPLM achieves performance comparable to or better than most existing methods, both sequence-based and structure-based. Furthermore, the experimental results indicate that the sequence representations generated by the PLM can effectively improve protein function prediction.
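To make the data flow concrete, the following is a minimal sketch and not the THPLM implementation. It assumes the model input is the difference between variant and wild-type representations, which the abstract does not state. The function `toy_embed` is a hypothetical, hash-based stand-in for ESM-2, and `toy_stability_score` applies one untrained 1D convolution, so its output carries no physical meaning. All names and the kernel values are illustrative.

```python
import hashlib
from typing import List, Sequence


def apply_point_variation(seq: str, variation: str) -> str:
    """Apply a variation such as 'K2R': wild-type residue, 1-based position, new residue."""
    wild_res, pos, new_res = variation[0], int(variation[1:-1]), variation[-1]
    if not 1 <= pos <= len(seq) or seq[pos - 1] != wild_res:
        raise ValueError(f"{variation} does not match the wild-type sequence")
    return seq[:pos - 1] + new_res + seq[pos:]


def toy_embed(seq: str, dim: int = 8) -> List[float]:
    """Hypothetical stand-in for a PLM: deterministic, mean-pooled pseudo-embedding."""
    vec = [0.0] * dim
    for i, residue in enumerate(seq):
        digest = hashlib.sha256(f"{i}:{residue}".encode()).digest()
        for d in range(dim):
            vec[d] += digest[d] / 255.0
    return [v / len(seq) for v in vec]


def conv1d(x: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Valid (no padding) 1D convolution, written as cross-correlation."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k)) for i in range(len(x) - k + 1)]


def toy_stability_score(wild: str, variation: str,
                        kernel: Sequence[float] = (0.5, -1.0, 0.5)) -> float:
    """Assumed pipeline: embed both sequences, take the difference, convolve, pool."""
    variant = apply_point_variation(wild, variation)
    diff = [v - w for v, w in zip(toy_embed(variant), toy_embed(wild))]
    activated = [max(0.0, f) for f in conv1d(diff, kernel)]  # ReLU
    return sum(activated) / len(activated)  # untrained weights: illustrative only


if __name__ == "__main__":
    print(apply_point_variation("MKTAYIAK", "K2R"))
    print(toy_stability_score("MKTAYIAK", "K2R"))
```

In a real setting, `toy_embed` would be replaced by embeddings from a pretrained PLM and the single fixed kernel by a trained convolutional network fitted to measured stability changes.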

Availability and implementation:

The source code of THPLM and the testing data are available at https://github.com/FPPGroup/THPLM.

Copyright © Oxford University Press 2023.