Congratulations on the Acceptance of a Paper by Dr. Hedi Chen in Advanced Science

A research article by Dr. Hedi Chen, a PhD graduate of our research group, entitled “Automatically Defining Protein Words for Diverse Functional Predictions Based on Attention Analysis of a Protein Language Model,” was recently accepted for publication in Advanced Science. This work marks an important starting point for the systematic research our group has pursued in recent years on protein language models and protein function prediction.

This work defines “protein words” as an alternative to “motifs” for studying proteins and for functional prediction applications. We first developed an unsupervised tool, which we term Protein Wordwise, that parses analyte protein sequences into protein words by applying a community detection algorithm to attention matrices from a protein language model (PLM). We then developed a supervised sequence-function prediction model, Word2Function, which maps protein words to GO terms through feature importance analysis. We compared the prediction performance of our protein word-based toolkit with a motif-based method (PROSITE) on multiple protein function datasets. We also assembled a functionally diverse data resource, which we term PWNet, to support evaluation of protein words for predicting functional residues across 10 tasks (e.g., diverse biomolecular binding, catalysis, and ion-channel activity). Our toolkit outperforms PROSITE on all examined datasets and tasks. By abandoning domains and instead using attention matrices from a PLM for automatic, systematic, and annotation-agnostic parsing of proteins, our toolkit both outperforms currently available tools for functional annotation at the residue and whole-protein levels and suggests innovative forms of protein analysis well suited to the post-AlphaFold era of biochemistry.
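
To give a rough feel for the idea of parsing a sequence by community detection on an attention matrix, here is a minimal Python sketch. It is an illustration under stated assumptions, not the Protein Wordwise implementation: the attention matrix is a hand-made toy rather than PLM output, weighted label propagation stands in for whatever community detection algorithm the paper uses, and the function names (`symmetrize`, `label_propagation`, `split_into_words`), the `threshold` parameter, and the choice to read contiguous runs of one community as a "word" are all hypothetical.

```python
from itertools import groupby


def symmetrize(attn):
    """Average attn[i][j] and attn[j][i] so the matrix can be read as an undirected graph."""
    n = len(attn)
    return [[(attn[i][j] + attn[j][i]) / 2 for j in range(n)] for i in range(n)]


def label_propagation(weights, threshold=0.1, max_iters=100):
    """Weighted label propagation (a simple community detection method).

    Each residue repeatedly adopts the label with the largest total edge
    weight among its neighbours; ties go to the smallest label so the
    result is deterministic. Edges below `threshold` are ignored.
    """
    n = len(weights)
    labels = list(range(n))  # start with one community per residue
    for _ in range(max_iters):
        changed = False
        for i in range(n):
            score = {}
            for j in range(n):
                if j != i and weights[i][j] >= threshold:
                    score[labels[j]] = score.get(labels[j], 0.0) + weights[i][j]
            if not score:
                continue  # isolated residue keeps its own label
            best = min(score, key=lambda lab: (-score[lab], lab))
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels


def split_into_words(sequence, labels):
    """Assumption: a 'word' is a contiguous run of residues sharing one community label."""
    words, pos = [], 0
    for _, run in groupby(labels):
        length = len(list(run))
        words.append(sequence[pos:pos + length])
        pos += length
    return words


# Toy example: residues 0-2 attend strongly to each other, as do residues 3-5.
sequence = "MKTAYI"
toy_attention = [[0.3 if (i < 3) == (j < 3) else 0.02 for j in range(6)]
                 for i in range(6)]
labels = label_propagation(symmetrize(toy_attention))
print(split_into_words(sequence, labels))  # ['MKT', 'AYI']
```

In a real setting the matrix would come from one or more attention heads of a PLM, and the choice of layers, heads, and clustering algorithm would follow the paper rather than this sketch.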

This achievement reflects our group’s sustained efforts in protein language modeling, bioinformatics methodology, and protein function prediction, and it lays a solid foundation for a series of follow-up studies. We warmly congratulate Dr. Hedi Chen on this accomplishment and sincerely thank all group members and collaborators who contributed to and supported this work.

For details: https://doi.org/10.1002/advs.202521970