Partial least squares (PLS) regression is an important linear regression method that efficiently addresses the multiple correlation problem by combining principal component analysis and multiple regression. In this paper, we present a quantum partial least squares (QPLS) regression algorithm. To address the high time complexity of PLS regression, we design a quantum eigenvector search method that speeds up the construction of principal components and regression parameters. We also give a density matrix product method that avoids repeated access to quantum random access memory (QRAM) while building the residual matrices. The time and space complexities of QPLS regression are logarithmic in the independent variable dimension n, the dependent variable dimension w, and the number of variables m. The algorithm achieves exponential speed-ups over PLS regression in n, m, and w. In addition, QPLS regression inspires us to explore more potential quantum machine learning applications in future work.
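For orientation, the classical step that an eigenvector search would accelerate can be sketched as follows. This is a minimal classical sketch, not the quantum algorithm from the abstract. It assumes the standard PLS formulation in which the first weight vector is the dominant eigenvector of X^T Y Y^T X for column-centered X and Y, and it uses plain power iteration. All function names and the demo data are hypothetical.

```python
import math

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(u * v for u, v in zip(row, col)) for col in bt] for row in a]

def center(a):
    # Subtract each column mean (PLS is usually run on centered data).
    means = [sum(col) / len(a) for col in zip(*a)]
    return [[v - mu for v, mu in zip(row, means)] for row in a]

def dominant_eigenvector(s, iters=200):
    # Power iteration on a symmetric positive semidefinite matrix.
    # Caveat: the all-ones start must not be orthogonal to the answer.
    v = [1.0] * len(s)
    for _ in range(iters):
        v = [sum(a * b for a, b in zip(row, v)) for row in s]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
    return v

def first_pls_component(x, y):
    x, y = center(x), center(y)
    xty = matmul(transpose(x), y)              # n x w cross-product matrix
    s = matmul(xty, transpose(xty))            # n x n matrix X^T Y Y^T X
    w = dominant_eigenvector(s)                # weight vector
    t = [sum(a * b for a, b in zip(row, w)) for row in x]   # scores t = X w
    tt = sum(c * c for c in t)
    p = [sum(col[i] * t[i] for i in range(len(t))) / tt for col in zip(*x)]
    # Residual matrix after removing the first component: X - t p^T
    residual = [[row[j] - t[i] * p[j] for j in range(len(p))]
                for i, row in enumerate(x)]
    return w, t, p, residual

# Hypothetical demo data: m = 5 samples, n = 2 predictors, w = 1 response.
X_DEMO = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]]
Y_DEMO = [[1.0], [2.0], [3.0], [4.0], [5.0]]
w, t, p, residual = first_pls_component(X_DEMO, Y_DEMO)
print(w)
```

Further components would repeat the same step on the residual matrix. The classical cost of forming and iterating on the cross-product matrices grows polynomially with n, w, and m, which is the cost the abstract reports reducing to logarithmic.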
Pinus densiflora is a pine species native to the Korean peninsula, and seed orchards have supplied the material needed for afforestation in South Korea. Climate variables affecting seed production have not been identified. The purpose of this study was to determine the climate variables that influence annual seed production in two seed orchards using multiple linear regression (MLR), elastic net regression (ENR), and partial least squares regression (PLSR) models. The PLSR model included 12 climatic variables from 2003 to 2020 and explained 74.3% of the total variation in seed production. It showed better predictive performance (R² = 0.662) than the ENR (0.516) and MLR (0.366) models. Among the 12 climatic variables, July temperature two years prior to seed production and July precipitation after one year had the strongest influence on seed production. The time periods indicated by the two variables corresponded to pollen cone initiation and female gametophyte development. The results will be helpful for developing seed collection plans, selecting new orchard sites with favorable climatic conditions, and investigating the relationships between seed production and climatic factors in related pine species.
Boreal forests play an important role in global environment systems. Understanding boreal forest ecosystem structure and function requires accurate monitoring and estimation of forest canopy and biomass. We used partial least squares regression (PLSR) models to relate forest parameters, i.e. canopy closure density and above-ground tree biomass, to Landsat ETM+ data. The established models were optimized according to the variable importance for projection (VIP) criterion and the bootstrap method, and their performance was compared using several statistical indices. All variables selected by the VIP criterion passed the bootstrap test (p < 0.05). The simplified models without insignificant variables (VIP < 1) performed as well as the full model but with less computation time. The relative root mean square error (RMSE%) was 29% for canopy closure density and 58% for above-ground tree biomass. We conclude that PLSR can be an effective method for estimating canopy closure density and above-ground biomass.
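The VIP criterion mentioned above can be illustrated with a small sketch. It assumes the commonly used definition for a single response, VIP_j = sqrt(n · Σ_a SS_a · w_ja² / Σ_a SS_a), where w_a is the unit-norm weight vector of component a, SS_a = q_a² · t_aᵀt_a is the response variation that component explains, and n is the number of predictors. The NIPALS-style PLS1 loop, the function names, and the demo data are hypothetical and are not taken from the study.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pls1_vip(x, y, n_components):
    """Fit PLS1 by NIPALS on centered data and return one VIP score per predictor."""
    m, n = len(x), len(x[0])
    col_means = [sum(row[j] for row in x) / m for j in range(n)]
    x = [[row[j] - col_means[j] for j in range(n)] for row in x]
    y_mean = sum(y) / m
    y = [v - y_mean for v in y]
    weights, explained = [], []
    for _ in range(n_components):
        w = [dot([row[j] for row in x], y) for j in range(n)]   # X^T y
        norm = math.sqrt(dot(w, w))
        w = [v / norm for v in w]                               # unit-norm weights
        t = [dot(row, w) for row in x]                          # scores
        tt = dot(t, t)
        p = [dot([row[j] for row in x], t) / tt for j in range(n)]  # loadings
        q = dot(y, t) / tt                                      # y loading
        # Deflate X and y before extracting the next component.
        x = [[row[j] - t[i] * p[j] for j in range(n)] for i, row in enumerate(x)]
        y = [y[i] - q * t[i] for i in range(m)]
        weights.append(w)
        explained.append(q * q * tt)                            # SS_a
    total = sum(explained)
    return [math.sqrt(n * sum(ss * w[j] ** 2
                              for ss, w in zip(explained, weights)) / total)
            for j in range(n)]

# Hypothetical data: the first two predictors track y, the third is an
# alternating nuisance signal.
X_DEMO = [[1.0, 2.0, 0.5], [2.0, 1.0, -0.5], [3.0, 4.0, 0.5],
          [4.0, 3.0, -0.5], [5.0, 6.0, 0.5], [6.0, 5.0, -0.5]]
Y_DEMO = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
vip = pls1_vip(X_DEMO, Y_DEMO, 2)
print(vip)
```

Because the squared VIP scores sum to the number of predictors, their mean square is 1, which is why VIP < 1 is a common cutoff for dropping variables, as in the simplified models described above.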
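Partial least squares (PLS) clustering is a new method for processing aerosol single-particle spectral data. It uses the PLS regression algorithm, with its "self-organizing mechanism", to cluster the data. The method is first applied to a simulated data set to show its general characteristics and effectiveness, and then to aerosol laser time-of-flight mass spectrometry data to demonstrate that PLS clustering is correct and can be applied successfully. Finally, PLS clustering is applied to a mixed data set of single-particle laser-induced breakdown spectra from four aerosols: calcium chloride, magnesium chloride, sodium chloride, and potassium chloride. By analyzing the dendrogram obtained from the clustering and the statistical properties of its nodes, the reasons for correct clustering and for misclassification are examined, indicating the potential of PLS clustering for aerosol single-particle spectral analysis.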
Funding (QPLS regression paper): Project supported by the Fundamental Research Funds for the Central Universities, China (Grant No. 2019XD-A02); the National Natural Science Foundation of China (Grant Nos. U1636106, 61671087, 61170272, and 92046001); the Natural Science Foundation of Beijing Municipality, China (Grant No. 4182006); the Technological Special Project of Guizhou Province, China (Grant No. 20183001); and the Foundation of Guizhou Provincial Key Laboratory of Public Big Data (Grant Nos. 2018BDKFJJ016 and 2018BDKFJJ018).
Funding (Pinus densiflora seed production study): Supported by the National Institute of Forest Science and by the R&D Program for Forest Science Technology (No. 2022458B10-2224-0201) of the Korea Forest Service.
Funding (boreal forest PLSR study): Supported by the 948 Program of the State Forestry Administration (2009-4-43) and the National Natural Science Foundation of China (No. 30870420).