Abstract
Partial least squares (PLS) is the chemometric algorithm most widely used with infrared and near-infrared spectroscopic analysis. However, current PLS implementations are usually single-threaded, so modeling becomes very slow when many calibration models must be built, when the number of samples, wavelength points, and principal components is large, or when spectral preprocessing and wavelength selection must be optimized repeatedly. To speed up modeling substantially, this paper proposes a parallel computing strategy that uses the graphics processing unit (GPU), with its massively parallel architecture, as the computing device, and implements a GPU-parallel PLS modeling algorithm, CUPLS, on top of the CUBLAS library. In a performance comparison on a near-infrared (NIR) spectroscopy dataset, CUPLS achieved a speedup of nearly 42 times over the conventional single-threaded PLS implementation, greatly improving the efficiency of chemometric modeling. The approach can also be adapted to accelerate other chemometric algorithms.
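The abstract does not say which PLS variant CUPLS parallelizes or which CUBLAS routines it calls. As a rough illustration only, the sketch below assumes single-response PLS (PLS1) fitted with the NIPALS iteration and runs on the CPU in plain Python. The helper names (`dot`, `gemv`, `gemv_t`, `ger_sub`, `pls1_fit`, `pls1_predict`) are hypothetical and are chosen to mirror the BLAS operations (dot product, matrix-vector product, rank-1 update) that dominate each iteration. These are the kind of operations a GPU port could hand to cuBLAS routines such as `cublasDdot`, `cublasDgemv`, and `cublasDger`.

```python
import math


def dot(x, y):
    """Inner product of two vectors (BLAS level 1: dot)."""
    return sum(a * b for a, b in zip(x, y))


def gemv(A, x):
    """Matrix-vector product A x (BLAS level 2: gemv)."""
    return [dot(row, x) for row in A]


def gemv_t(A, x):
    """Transposed matrix-vector product A^T x (BLAS level 2: gemv, transposed)."""
    n_cols = len(A[0])
    return [sum(A[i][j] * x[i] for i in range(len(A))) for j in range(n_cols)]


def ger_sub(A, x, y):
    """In-place rank-1 update A <- A - x y^T (BLAS level 2: ger with alpha = -1)."""
    for i, xi in enumerate(x):
        row = A[i]
        for j, yj in enumerate(y):
            row[j] -= xi * yj


def pls1_fit(X, y, n_components):
    """Fit PLS1 with NIPALS. X (n x p) and y (length n) are assumed mean-centered."""
    X = [row[:] for row in X]  # work on copies; both are deflated below
    y = y[:]
    W, P, q = [], [], []
    for _ in range(n_components):
        w = gemv_t(X, y)                      # weight direction: X^T y
        norm = math.sqrt(dot(w, w))
        w = [v / norm for v in w]             # normalize the weight vector
        t = gemv(X, w)                        # scores: X w
        tt = dot(t, t)
        p = [v / tt for v in gemv_t(X, t)]    # X loadings: X^T t / (t^T t)
        qa = dot(y, t) / tt                   # y loading: y^T t / (t^T t)
        ger_sub(X, t, p)                      # deflate X by t p^T
        y = [yi - qa * ti for yi, ti in zip(y, t)]  # deflate y
        W.append(w)
        P.append(p)
        q.append(qa)
    return W, P, q


def pls1_predict(model, x):
    """Predict the centered response for one centered spectrum x."""
    W, P, q = model
    x = x[:]
    y_hat = 0.0
    for w, p, qa in zip(W, P, q):
        t = dot(x, w)                         # score of x on this component
        y_hat += qa * t
        x = [xi - t * pi for xi, pi in zip(x, p)]  # deflate x as in fitting
    return y_hat
```

A model is fitted with `pls1_fit(X, y, n_components)` and applied to a centered spectrum with `pls1_predict(model, x)`. In this formulation each component costs a few passes over the full n × p spectral matrix, which is why large sample counts, many wavelength points, and many components make single-threaded modeling slow and why these matrix-vector steps are natural candidates for GPU offloading.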
Source
Fenxi Ceshi Xuebao (分析测试学报)
CAS
CSCD
PKU Core (Peking University core journal list)
2012, No. 7, pp. 771-778 (8 pages)
Journal of Instrumental Analysis
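Funding
National Natural Science Foundation of China (61105004)
Guangxi Natural Science Foundation (2012GXNSFAA053230)
Program for Excellent Talents in Guangxi Higher Education Institutions (Gui Jiao Ren [2011] No. 40)
Open Fund of the Guangxi Key Laboratory of Trusted Software (kx201121)
Innovation Project of Guangxi Graduate Education (2010105950812M22, 2011105950811M24)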
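About the author
Corresponding author: Yang Huihua (杨辉华), Ph.D., Professor. Research interests: intelligent information processing, machine learning and data mining, and near-infrared spectroscopy and its applications. Tel: 0773-2291069, E-mail: yanghuihua@tsinghua.edu.cn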