Funding: Supported by the 948 Program of the State Forestry Administration (2009-4-43) and the National Natural Science Foundation of China (No. 30870420).
Abstract: Boreal forests play an important role in global environmental systems. Understanding boreal forest ecosystem structure and function requires accurate monitoring and estimation of forest canopy and biomass. We used partial least squares regression (PLSR) models to relate forest parameters, i.e. canopy closure density and above-ground tree biomass, to Landsat ETM+ data. The established models were optimized according to the variable importance for projection (VIP) criterion and the bootstrap method, and their performance was compared using several statistical indices. All variables selected by the VIP criterion passed the bootstrap test (p < 0.05). The simplified models without insignificant variables (VIP < 1) performed as well as the full model but with less computation time. The relative root mean square error (RMSE%) was 29% for canopy closure density and 58% for above-ground tree biomass. We conclude that PLSR can be an effective method for estimating canopy closure density and above-ground biomass.
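The two quantities this abstract relies on, the VIP cut-off and RMSE%, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. It assumes that RMSE% means the RMSE divided by the mean observed value (the abstract does not give the formula) and that VIP scores have already been computed by a fitted PLSR model. The function names and all numbers are hypothetical.

```python
import math


def relative_rmse_percent(observed, predicted):
    """RMSE as a percentage of the mean observed value (assumed definition)."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    mean_observed = sum(observed) / len(observed)
    return 100.0 * math.sqrt(mse) / mean_observed


def keep_important(vip_scores, threshold=1.0):
    """Indices of variables whose VIP score is not below the threshold."""
    return [i for i, score in enumerate(vip_scores) if score >= threshold]


# Hypothetical values for illustration only; these are not data from the study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
rmse_pct = relative_rmse_percent(observed, predicted)

# Hypothetical VIP scores for four predictor bands.
selected = keep_important([1.4, 0.6, 1.0, 0.9])
```

In this sketch a variable with VIP exactly equal to 1 is kept, since the abstract describes only variables with VIP < 1 as insignificant.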
Funding: Financial support from the National Natural Science Foundation of China (No. 62205172), the Huaneng Group Science and Technology Research Project (No. HNKJ22-H105), the Tsinghua University Initiative Scientific Research Program, and the International Joint Mission on Climate Change and Carbon Neutrality.
Abstract: Laser-induced breakdown spectroscopy (LIBS) has become a widely used atomic spectroscopic technique for rapid coal analysis. However, the vast amount of spectral information in LIBS contains signal uncertainty, which can affect its quantification performance. In this work, we propose a hybrid variable selection method to improve the performance of LIBS quantification. Important variables are first identified using Pearson's correlation coefficient, mutual information, least absolute shrinkage and selection operator (LASSO) and random forest, and then filtered and combined with empirical variables related to fingerprint elements of coal ash content. Subsequently, these variables are fed into a partial least squares regression (PLSR). Additionally, in some models, certain variables unrelated to ash content are removed manually to study the impact of variable deselection on model performance. The proposed hybrid strategy was tested on three LIBS datasets for quantitative analysis of coal ash content and compared with the corresponding data-driven baseline method. It is significantly better than variable selection based only on empirical knowledge and in most cases outperforms the baseline method. On all three datasets, the hybrid strategy combining empirical knowledge and data-driven algorithms achieved the lowest root mean square error of prediction (RMSEP) values of 1.605, 3.478 and 1.647, respectively, significantly lower than the values of 1.959, 3.718 and 2.181 obtained from multiple linear regression using only 12 empirical variables. The LASSO-PLSR model with empirical support and 20 selected variables exhibited significantly improved performance after variable deselection, with RMSEP values dropping from 1.635, 3.962 and 1.647 to 1.483, 3.086 and 1.567, respectively. Such results demonstrate that using empirical knowledge as a support for data-driven variable selection can be a viable approach to improve the accuracy and reliability of LIBS quantification.
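The core idea of the hybrid strategy, combining data-driven variable ranking with empirically chosen variables before regression, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline. It uses only Pearson's correlation as the data-driven ranker (the abstract also names mutual information, LASSO and random forest), takes a plain union of the two variable sets, and stops before the PLSR step. The function names, the `top_k` parameter and the toy data are all hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def hybrid_select(spectra, target, empirical_idx, top_k):
    """Union of the top_k channels ranked by |r| and the empirical channels.

    spectra: list of samples, each a list of channel intensities.
    empirical_idx: channel indices chosen from domain knowledge.
    """
    n_channels = len(spectra[0])
    scores = [
        abs(pearson_r([row[j] for row in spectra], target))
        for j in range(n_channels)
    ]
    ranked = sorted(range(n_channels), key=lambda j: scores[j], reverse=True)
    return sorted(set(ranked[:top_k]) | set(empirical_idx))


# Hypothetical toy data: 5 samples x 4 channels, not measurements from the study.
spectra = [
    [2.0, 7.0, 5.0, 5.0],
    [4.0, 7.0, 1.0, 4.0],
    [6.0, 7.0, 4.0, 3.0],
    [8.0, 7.0, 2.0, 1.0],
    [10.0, 7.0, 3.0, 2.0],
]
ash_content = [1.0, 2.0, 3.0, 4.0, 5.0]

# Channel 2 stands in for an empirically known emission line.
selected = hybrid_select(spectra, ash_content, empirical_idx=[2], top_k=2)
```

Here channel 2 correlates only weakly with the target and would be dropped by the data-driven ranking alone, but the empirical list keeps it. The selected columns would then be passed to a regression model such as PLSR.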
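Abstract: Using soil samples from different land-use types in the lakeside area of the Jianghan Plain as an example, we compared several calibration-set selection methods: the concentration gradient method based on the target soil physicochemical property; an extended composite method based on multiple physicochemical properties (P-KS); the KS method based on spectral information; the reduce nearest neighbor samples (RNNS) method; and the C-KS and C-RNNS methods, which stratify by concentration and then incorporate spectral information. These methods were also combined with stratification by land-use type to construct calibration sets that represent different levels of soil information. Partial least squares regression was then used to build visible/near-infrared spectral inversion models for soil organic matter. The results show that the concentration gradient, KS and RNNS methods, each representative in only a single respect, struggled to produce applicable models. The C-KS method, which is representative of both spectra and physicochemical properties, clearly improved prediction accuracy, with a ratio of performance to standard deviation (RPD) of 1.66. After land-use type was taken into account, the prediction accuracy of the models built with the concentration gradient, RNNS and C-KS methods improved markedly, with RPD values of 1.84, 1.51 and 1.75, respectively, and the models showed good applicability. This indicates that calibration-set construction methods representing multiple levels of soil information help improve the applicability of visible/near-infrared spectral inversion models for soil organic matter.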
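Spectrum-based calibration-set selection and the RPD statistic reported above can be illustrated with a short Python sketch. It is not the authors' implementation. It assumes that "KS" refers to the Kennard-Stone algorithm with Euclidean distance, and that RPD is the sample standard deviation of the observed values divided by the RMSE of prediction; the abstract states neither definition. The function names and toy data are hypothetical.

```python
import math
import statistics


def kennard_stone(samples, n_select):
    """Pick n_select sample indices by the Kennard-Stone rule (assumed KS)."""
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    # Start with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [k for k in range(n) if k not in selected]
    # Repeatedly add the sample whose nearest selected neighbour is farthest.
    while len(selected) < n_select:
        nxt = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected


def rpd(observed, predicted):
    """Sample standard deviation of observations divided by RMSEP (assumed)."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return statistics.stdev(observed) / math.sqrt(mse)


# Hypothetical two-band "spectra" for five samples; not data from the study.
spectra = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [9.0, 0.0], [10.0, 0.0]]
calibration_idx = kennard_stone(spectra, n_select=3)

# Hypothetical observed and predicted organic matter values.
rpd_value = rpd([1.0, 2.0, 3.0, 4.0, 5.0], [1.5, 2.5, 2.5, 4.5, 4.5])
```

A concentration-stratified variant in the spirit of C-KS could apply the same selection separately within each concentration stratum, though the abstract does not describe the exact procedure.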