Abstract

This paper proposes a statistical learning model for aligning peptide signals across replicate liquid chromatography-mass spectrometry (LC-MS) experiments. The model combines time-difference calibration with peak shape similarity to address the low matching accuracy and low coverage of peptide alignment in replicate data. A statistical model of the time difference was built first; the results showed that the time feature alone could not completely eliminate alignment errors. A peak shape similarity feature was therefore introduced, under the hypothesis that the same peptide produces similar LC peak shapes across replicate experiments. Using a selected training dataset, a new peptide signal alignment algorithm based on LC peak shape similarity was developed and then evaluated on test peptide sequences. The improved algorithm reached an accuracy of 98.3%. When the model was applied to align all LC-MS/MS peptides in two experimental datasets, the coverage reached 91.0%. Combining the peak shape similarity feature with the time feature improves both the accuracy and the coverage of peptide signal alignment across replicate LC-MS experiments.
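The idea of pairing a time-difference constraint with a peak shape comparison can be illustrated with a minimal sketch. The abstract does not state how peak shape similarity is measured or how the time model is parameterised, so everything below is an assumption made for illustration: cosine similarity stands in for the similarity measure, a fixed time window stands in for the statistical time-difference model, and the names `resample_peak`, `peak_shape_similarity`, `match_peaks`, `max_time_diff`, and `min_similarity` are hypothetical.

```python
import numpy as np


def resample_peak(intensity, n_points=50):
    """Linearly resample an LC peak intensity profile to a fixed length."""
    intensity = np.asarray(intensity, dtype=float)
    src = np.linspace(0.0, 1.0, len(intensity))
    dst = np.linspace(0.0, 1.0, n_points)
    return np.interp(dst, src, intensity)


def peak_shape_similarity(peak_a, peak_b, n_points=50):
    """Cosine similarity of two resampled peak profiles (assumed measure)."""
    a = resample_peak(peak_a, n_points)
    b = resample_peak(peak_b, n_points)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def match_peaks(run_a, run_b, max_time_diff, min_similarity):
    """Pair peaks from two replicate runs.

    Each run is a list of (time, intensity_profile) tuples. A candidate
    in run_b must first fall inside the time window; among those, the
    candidate with the highest shape similarity is kept, provided it
    reaches min_similarity. Returns a list of (index_a, index_b, similarity).
    """
    matches = []
    for i, (t_a, shape_a) in enumerate(run_a):
        best = None
        for j, (t_b, shape_b) in enumerate(run_b):
            if abs(t_a - t_b) > max_time_diff:
                continue  # rejected by the time feature alone
            sim = peak_shape_similarity(shape_a, shape_b)
            if sim >= min_similarity and (best is None or sim > best[1]):
                best = (j, sim)
        if best is not None:
            matches.append((i, best[0], best[1]))
    return matches
```

In this sketch the time window only narrows the candidate set, and the shape comparison decides between candidates that the time feature cannot separate. Cosine similarity is insensitive to overall intensity scale, so the same peak recorded at different abundances in two runs can still match.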
    
    
                Authors
                    崔健
                    董晓睿
                    商凯
                    陈强
                    祁鑫
                    崔浩
                CUI Jian; DONG Xiao-rui; SHANG Kai; CHEN Qiang; QI Xin; CUI Hao (Department of Information Technology, Shengli College, China University of Petroleum, Dongying 257016, China; College of Computer and Communication Engineering, China University of Petroleum, Qingdao 266580, China)
     
    
    
                Source
                
                    《分析测试学报》
                        
                                CAS
                                CSCD
                                PKU Core Journal
                        
                    
                        2018, Issue 12, pp. 1457-1462 (6 pages)
                    
                
                    Journal of Instrumental Analysis
     
            
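                Funding
                    Scientific Research Development Program of Shandong Provincial Higher Education Institutions (J18KB071)
                    National Undergraduate Innovation and Entrepreneurship Training Program for Local Universities (201713386028)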
            
    
                Keywords
                    liquid chromatography-mass spectrometry (LC-MS)
                    alignment
                    peak shape similarity
                    statistical learning model
                
     
    
    
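                About the author
Corresponding author: CUI Jian, Ph.D., lecturer. Research interests: data mining of protein LC-MS experimental data, and peptide signal matching and identification. E-mail: jian.cui@slcupc.edu.cn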