pLM4ACP: a model for predicting anticancer peptides based on machine learning and protein language models


Abstract: Cancer is a serious global health problem and one of the leading causes of human death. Conventional cancer treatments often risk impairing vital organ function. Anticancer peptides (ACPs) are considered among the most promising therapeutic agents against common human cancers because of their small size, high specificity, and low toxicity. However, ACP identification is still largely confined to laboratory experiments, which are expensive and time-consuming. To address this problem, this study proposes pLM4ACP, a model for predicting ACPs based on machine learning and protein language models. The model uses the protein language model ProtT5 to extract features from peptide sequences and feeds the extracted features into a support vector machine (SVM) classifier, which is then optimized and evaluated. On the independent test set, the model achieved an accuracy (ACC) of 0.763, an F1-score of 0.767, a Matthews correlation coefficient (MCC) of 0.527, and an area under the curve (AUC) of 0.827, outperforming existing models. By building an efficient ACP prediction model on top of protein language models, this study advances the application of artificial intelligence in biomedicine and supports the development of precision medicine and computational biology.
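The four metrics reported in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' evaluation code: the labels and scores below are made-up toy values, the 0.5 decision threshold is an assumption, and the function names are hypothetical. The formulas themselves are the standard definitions of ACC, F1-score, MCC, and ROC AUC for a binary classifier.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 marks an ACP."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, tn, fp, fn):
    """Harmonic mean of precision and recall; 0.0 if undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 if any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(P * N), which is
    fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration only, not taken from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

counts = confusion_counts(y_true, y_pred)
print("ACC:", accuracy(*counts))
print("F1 :", f1_score(*counts))
print("MCC:", mcc(*counts))
print("AUC:", roc_auc(y_true, scores))
```

ACC, F1-score, and MCC depend on the chosen decision threshold, while AUC is computed from the raw classifier scores and is threshold-independent, which is one reason the four values are usually reported together.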

Authors: LIU Yitong; CHEN Wenxin; LI Juanjuan; CHI Xue; MA Xiang; TANG Yanqiong; LI Hong (School of Life and Health Sciences, Hainan University, Haikou 570228, Hainan, China)

Affiliation: School of Life and Health Sciences, Hainan University

Source: Chinese Journal of Biotechnology (《生物工程学报》, PKU Core Journal), 2025, No. 8, pp. 3252-3261 (10 pages)

Funding: National Natural Science Foundation of China (32460244); Hainan Provincial Natural Science Foundation (322RC589).

Keywords: anticancer peptides; prediction model; protein language models; machine learning; bioinformatics

Classification numbers: R730.5 [Medicine and Health: Oncology]; Q811.4 [Biology: Bioengineering]

Corresponding author: LI Hong, E-mail: lihongbio@hainanu.edu.cn

