Funding: Supported by the Social Science Foundation of China under Grant No. 17BGL231.
Abstract: With machine learning, suitable algorithms can perform advanced time series analysis. This paper proposes a complex k-nearest neighbor (KNN) model for predicting financial time series. The model uses a complex feature extraction process that integrates a forward rolling empirical mode decomposition (EMD) for financial time series signal analysis with principal component analysis (PCA) for dimension reduction. The information-rich features are extracted and then input to a weighted KNN classifier, in which the features are weighted by their PCA loadings. Finally, the prediction is generated via regression on the selected nearest neighbors. The structure of the model as a whole is original. Test results on real historical data sets confirm the effectiveness of the model for predicting the Chinese stock index, an individual stock, and the EUR/USD exchange rate.
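The last two steps of this pipeline, weighting the features and predicting from the nearest neighbors, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `weighted_knn_regress`, the toy data, and the weight values are hypothetical; the weights only stand in for PCA loadings; the EMD and PCA stages are omitted; and the regression step, which the abstract does not specify, is replaced here by a plain average of the neighbors' targets.

```python
import math

def weighted_knn_regress(train_X, train_y, query, weights, k=3):
    """Predict a value for `query` as the mean target of its k nearest
    training points under a feature-weighted Euclidean distance.

    `weights` is a hypothetical stand-in for PCA loadings: a larger
    weight makes differences in that feature count more.
    """
    def dist(a, b):
        return math.sqrt(sum(w * (ai - bi) ** 2
                             for w, ai, bi in zip(weights, a, b)))

    # Rank training indices by distance to the query (ties keep index order).
    ranked = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], query))
    nearest = ranked[:k]
    # Assumption: a simple average replaces the unspecified regression step.
    return sum(train_y[i] for i in nearest) / len(nearest)

# Toy data (made up for illustration only).
train_X = [[0, 0], [1, 0], [0, 10], [5, 5]]
train_y = [1.0, 2.0, 10.0, 5.0]

# Only the first feature counts, so [0, 0] and [0, 10] are the closest points.
print(weighted_knn_regress(train_X, train_y, [0, 0], [1.0, 0.0], k=2))  # 5.5
# Only the second feature counts, so [0, 0] and [1, 0] are the closest points.
print(weighted_knn_regress(train_X, train_y, [0, 0], [0.0, 1.0], k=2))  # 1.5
```

The two calls differ only in the weight vector, which shows how the weighting changes which neighbors are selected and therefore the prediction.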
Funding: Supported by the National Natural Science Foundation under Grant Nos. 61175055 and 61105059; the research funds of the Sichuan Key Laboratory of Intelligent Network Information Processing under Grant No. SGXZD1002-10; and the Sichuan Key Technology Research and Development Program under Grant Nos. 2012GZ0019 and 2011FZ0051.
Abstract: A semi-supervised classification system must be continually updated as the number of training samples increases. As the sample set grows, the number of unreliable samples grows with it. In this paper, we use fuzzy c-means (FCM) clustering to remove useless samples, and we extract the intersection between the original training set and the clusters produced by FCM. The intersection between each class and its cluster consists of the reliable samples we are looking for. The experimental results demonstrate that the proposed algorithm is markedly superior.
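The filtering idea described above can be sketched as follows. This is a rough illustration under stated assumptions rather than the paper's algorithm: `fcm` is a plain textbook fuzzy c-means with a deterministic initialization, `reliable_samples` is a hypothetical helper, each class is matched to the cluster that holds most of its members (the abstract does not say how classes and clusters are paired), and the data are made up.

```python
import math

def fcm(points, c=2, m=2.0, iters=50):
    """Plain fuzzy c-means. Returns (centers, u), where u[i][j] is the
    membership of point i in cluster j from the last membership update."""
    n, dims = len(points), len(points[0])
    # Deterministic initialization: pick centers spread across the data order.
    centers = [list(points[(i * (n - 1)) // max(c - 1, 1)]) for i in range(c)]
    u = []
    for _ in range(iters):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1))
        u = []
        for p in points:
            d = [max(math.dist(p, ctr), 1e-12) for ctr in centers]
            u.append([1.0 / sum((d[j] / d[k]) ** (2.0 / (m - 1.0))
                                for k in range(c)) for j in range(c)])
        # Center update: mean of the points weighted by u_ij^m
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers[j] = [sum(w[i] * points[i][dim] for i in range(n)) / total
                          for dim in range(dims)]
    return centers, u

def reliable_samples(points, labels, c=2):
    """Keep the indices of samples that lie in the intersection of their
    class and the cluster matched to that class."""
    _, u = fcm(points, c=c)
    hard = [max(range(c), key=lambda j: row[j]) for row in u]
    keep = []
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        # Assumption: a class is matched to the cluster holding most of it.
        counts = [sum(1 for i in members if hard[i] == j) for j in range(c)]
        best = counts.index(max(counts))
        keep.extend(i for i in members if hard[i] == best)
    return sorted(keep)

# Toy 1-D data: the sample at index 2 carries label 1 but sits in the class-0 region.
points = [(0.0,), (0.2,), (0.15,), (0.1,), (5.0,), (5.1,), (4.9,)]
labels = [0, 0, 1, 0, 1, 1, 1]
print(reliable_samples(points, labels))  # [0, 1, 3, 4, 5, 6]
```

The sample at index 2 is dropped because its label disagrees with the cluster it falls into; the remaining samples form the class-cluster intersections.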
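Abstract: To address the problem that the feature attributes of multiple fault types overlap and make faults difficult to identify, a new metric function is proposed that considers the probability of adjacent points becoming nearest neighbors. The proposed Nearby Probability Distance (NPD) is applied to the Locality Preserving Projection (LPP) algorithm and the K-Nearest Neighbor (KNN) classifier, yielding the Nearby Probability Distance Locality Preserving Projection (NPDLPP) algorithm and the Nearby Probability Distance K-Nearest Neighbor (NPDKNN) classifier. First, time-domain and frequency-domain feature extraction methods convert the vibration signals into a high-dimensional feature data set. NPDLPP then reduces this data set to a low-dimensional space. Finally, the resulting low-dimensional sensitive feature set is input to NPDKNN for pattern recognition. Validation on a set of vibration signals from a double-span rotor system shows that the proposed dimensionality reduction algorithm is clearly effective and separates the fault types better. The study shows that the proposed nearby probability distance minimizes within-class scatter and maximizes between-class separation better than the traditional Euclidean distance measure.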
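The first stage of the pipeline above, turning a vibration signal into a feature vector, might look like the sketch below. The abstract does not list which features were used, so the statistics chosen here (mean, RMS, peak, crest factor, kurtosis) are common time-domain indicators picked as an assumption, and `time_domain_features` is a hypothetical helper name. Frequency-domain features, NPDLPP, and NPDKNN are not shown because the abstract does not define them.

```python
import math

def time_domain_features(signal):
    """Compute a few common time-domain statistics of a vibration signal.
    Assumes a non-empty, non-constant signal (otherwise rms or the
    variance would be zero and the ratios below would be undefined)."""
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak = max(abs(x) for x in signal)
    # Second and fourth central moments, used for kurtosis.
    m2 = sum((x - mean) ** 2 for x in signal) / n
    m4 = sum((x - mean) ** 4 for x in signal) / n
    return {
        "mean": mean,
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms,
        "kurtosis": m4 / (m2 * m2),
    }

# Toy square-wave-like signal (made up for illustration).
features = time_domain_features([1.0, -1.0, 1.0, -1.0])
print(features["rms"], features["crest_factor"], features["kurtosis"])  # 1.0 1.0 1.0
```

Stacking such statistics from several signal segments gives one high-dimensional row per sample, which is the kind of input a dimensionality reduction step would then compress.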
Funding: Project supported by the Natural Science Foundation of Jilin Province, China (Grant No. 2020122348JC).
Abstract: Filament-induced breakdown spectroscopy (FIBS) combined with machine learning algorithms was used to identify five aluminum alloys. To study the effect of the distance between the focusing lens and the target surface on the identification accuracy of aluminum alloys, principal component analysis (PCA) combined with support vector machine (SVM) and K-nearest neighbor (KNN) was used. The intensities and intensity ratios of fifteen lines of six elements (Fe, Si, Mg, Cu, Zn, and Mn) in the FIBS spectrum were selected. The distances between the focusing lens and the target surface in the pre-filament, filament, and post-filament were 958 mm, 976 mm, and 1000 mm, respectively. The source data set was fifteen spectral line intensity ratios, and the cumulative interpretation rates of PC1, PC2, and PC3 were 97.22%, 98.17%, and 95.31%, respectively. The first three PCs obtained by PCA were the input variables of SVM and KNN. The identification accuracy at the different positions of the focusing lens and target surface was obtained, and the identification accuracy of SVM and KNN in the filament was 100% and 90%, respectively. The first three PCs of the filament source data set, obtained by PCA, were randomly divided into a training set and a test set for SVM and KNN at a ratio of 3:2. The identification accuracy of SVM and KNN was 97.5% and 92.5%, respectively. The research results can provide a reference for the identification of aluminum alloys by FIBS.
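The 3:2 random split and the KNN classification step mentioned above can be sketched as follows. This is only an illustration under assumptions: `split_3_to_2` and `knn_classify` are hypothetical helper names, the three-component vectors and the labels "A" and "B" are made-up stand-ins for PC scores and alloy classes, and the PCA and SVM stages are not shown.

```python
import math
import random
from collections import Counter

def split_3_to_2(samples, seed=0):
    """Shuffle a copy of `samples` and split it into training and test
    parts at a 3:2 ratio. The seed only makes the sketch repeatable."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 3 // 5
    return shuffled[:cut], shuffled[cut:]

def knn_classify(train, query, k=3):
    """Majority vote among the k training points closest to `query`.
    `train` is a list of (feature_vector, label) pairs."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Made-up three-component vectors standing in for (PC1, PC2, PC3) scores.
train = [
    ((0.0, 0.0, 0.0), "A"), ((0.1, 0.0, 0.0), "A"), ((0.0, 0.1, 0.0), "A"),
    ((5.0, 5.0, 5.0), "B"), ((5.1, 5.0, 5.0), "B"), ((5.0, 5.1, 5.0), "B"),
]
print(knn_classify(train, (0.05, 0.05, 0.0)))  # A
print(knn_classify(train, (4.9, 5.0, 5.0)))    # B

# Ten samples split 3:2 give six training and four test samples.
train_part, test_part = split_3_to_2(list(range(10)))
print(len(train_part), len(test_part))  # 6 4
```

Identification accuracy would then be the fraction of test samples whose predicted label matches the true one.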