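Abstract: Semi-supervised learning methods improve learning performance by using a small amount of labeled data together with a large amount of unlabeled data. Tri-training is a classic disagreement-based semi-supervised learning method, but it can introduce label noise during learning. To reduce the prediction bias that label noise in Tri-training causes on unlabeled data, and to learn a better semi-supervised classification model, cross-entropy is used in place of the error rate to better reflect the gap between the model's predictions and the true distribution, and it is combined with convex optimization to reduce label noise and safeguard model performance. On this basis, three algorithms are proposed: a cross-entropy-based Tri-training algorithm, a safe Tri-training algorithm, and a cross-entropy-based safe Tri-training algorithm. The proposed methods are validated on benchmark datasets, including those from the UCI (University of California Irvine) Machine Learning Repository, and significance tests further verify their performance from a statistical perspective. The experimental results show that the proposed semi-supervised learning methods outperform the traditional Tri-training algorithm in classification performance, with the cross-entropy-based safe Tri-training algorithm achieving higher classification performance and generalization ability.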
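The motivation for replacing the error rate with cross-entropy can be illustrated with a small sketch. This is not the authors' algorithm: the function names and the toy probabilities below are hypothetical, chosen only to show that two classifiers with the same error rate can differ sharply in cross-entropy, which is the extra information the abstract refers to.

```python
import math


def error_rate(probs, labels):
    """Fraction of samples whose arg-max class differs from the true label."""
    wrong = 0
    for p, y in zip(probs, labels):
        predicted = max(range(len(p)), key=p.__getitem__)
        if predicted != y:
            wrong += 1
    return wrong / len(labels)


def cross_entropy(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += -math.log(max(p[y], eps))  # eps guards against log(0)
    return total / len(labels)


# Hypothetical toy data: both classifiers misclassify only the last sample.
labels = [0, 1, 0, 1]
mildly_wrong = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2], [0.6, 0.4]]
confidently_wrong = [[0.55, 0.45], [0.45, 0.55], [0.55, 0.45], [0.95, 0.05]]

for name, probs in [("mildly_wrong", mildly_wrong),
                    ("confidently_wrong", confidently_wrong)]:
    print(name, error_rate(probs, labels), round(cross_entropy(probs, labels), 4))
```

Both toy classifiers have an error rate of 0.25, yet the second receives a much larger cross-entropy because it is confident on the sample it gets wrong. A criterion based on the error rate alone cannot tell them apart.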
Funding: Project (2006BAC07B03) supported by the National Key Technology R&D Program of China; Project (2006G040-A) supported by the Foundation of the Science and Technology Section of Ministry of Railway; Project (2008yb044) supported by the Foundation of Excellent Doctoral Dissertation of Central South University
Abstract: To protect trains against strong cross-winds along the Qinghai-Tibet railway, a strong wind speed monitoring and warning system was developed. To obtain high-precision short-term wind speed forecasts so that the system can make more accurate scheduling decisions, two optimization algorithms were proposed. Both were applied to the actual wind speed time series from the 18th meteorological station. The optimization algorithm based on wavelet analysis and an improved time series analysis method attains high-precision multi-step forecasts: the mean relative errors of one-step, three-step, five-step and ten-step forecasting are only 0.30%, 0.75%, 1.15% and 1.65%, respectively. The optimization algorithm based on wavelet analysis and the Kalman time series analysis method obtains high-precision one-step forecasts: the mean relative error of one-step forecasting is reduced by 61.67% to 0.115%. Both algorithms keep the modeling simple and yield explicit prediction equations after the modeling calculation.
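The error figures above can be checked with a short calculation. The sketch below makes two assumptions the abstract does not state: that the mean relative error is the average of |actual - predicted| / |actual| expressed in percent, and that the 61.67% reduction is measured against the 0.30% one-step error of the first algorithm. The sample wind speeds are hypothetical.

```python
def mean_relative_error(actual, predicted):
    """Mean of |actual - predicted| / |actual|, in percent (assumed definition)."""
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical wind speeds in m/s, used only to exercise the function.
actual = [10.0, 20.0]
predicted = [9.0, 22.0]
example_mre = mean_relative_error(actual, predicted)

# Consistency check on the reported figures, assuming the baseline is the
# 0.30% one-step error of the wavelet + improved time series algorithm.
baseline_mre = 0.30       # percent
reduction = 61.67         # percent
reduced_mre = baseline_mre * (1.0 - reduction / 100.0)

print(round(example_mre, 2), round(reduced_mre, 3))
```

Under that assumption the reduced error rounds to 0.115%, which matches the reported value, so the baseline reading appears consistent with the abstract.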