Funding: Supported by the Key Discipline Construction Program of Beijing Municipal Commission of Education (XK10008043).
Abstract: Most real-world processes are complex nonlinear systems with incomplete information, so it is difficult to estimate a model under the assumption that a single global model governs the data set. Moreover, data sets collected from real processes usually contain missing values. To overcome the shortcomings of global modeling and missing data, a new modeling method is proposed. First, an incomplete data set is partitioned into several clusters by a K-means with soft constraints (KSC) algorithm, which incorporates soft constraints to enable clustering with missing values. Then a local model is developed for each cluster using a support vector regression (SVR) algorithm that adopts a missing value insensitive (MVI) kernel to address the missing value estimation problem. The valid region of each local model is also obtained. Simulation results demonstrate the effectiveness of the local models and the estimation algorithm.
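As an illustration of clustering data with missing entries, the sketch below uses a partial-distance variant of K-means: distances and centroid updates are computed over observed coordinates only. This is a simpler, swapped-in technique and not the KSC algorithm of the abstract, whose soft constraints are not specified here. The function names, the use of `None` for missing values, and the requirement that initial centers are supplied by the caller are all assumptions made for the example.

```python
def partial_distance(x, center):
    """Squared distance over the observed coordinates of x, rescaled to full dimension."""
    observed = [(xi - ci) ** 2 for xi, ci in zip(x, center) if xi is not None]
    if not observed:
        return 0.0  # a row with no observed values is equally close to every center
    return sum(observed) * len(x) / len(observed)


def kmeans_missing(data, init_centers, iters=50):
    """Partial-distance K-means; missing values are marked with None (assumed convention)."""
    centers = [list(c) for c in init_centers]
    k = len(centers)
    labels = [0] * len(data)
    for _ in range(iters):
        # Assignment step: nearest center under the partial distance.
        labels = [min(range(k), key=lambda j: partial_distance(x, centers[j]))
                  for x in data]
        # Update step: each coordinate is the mean of the observed values in the cluster.
        for j in range(k):
            for d in range(len(centers[j])):
                vals = [x[d] for x, lab in zip(data, labels)
                        if lab == j and x[d] is not None]
                if vals:
                    centers[j][d] = sum(vals) / len(vals)
    return labels, centers
```

A local regression model could then be fitted to the rows of each cluster; that step is omitted because the abstract does not define the MVI kernel.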
Funding: Supported by the National Natural Science Foundation of China (60672061).
Abstract: Blind separation of sparse sources (BSSS) is discussed. The BSSS method based on conventional K-means clustering is very fast and easy to implement, but its accuracy is generally unsatisfactory. It is proved theoretically that observation vectors x(t) with different moduli contribute unequally, and a weighted K-means clustering method is proposed on these grounds. The proposed algorithm is as fast as the conventional K-means clustering method and also achieves considerably more accurate results, as demonstrated by numerical experiments.
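The idea of weighting each observation by its modulus can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the weight is taken to be the Euclidean norm of x(t), each vector is normalized to a unit direction with a non-negative first component, and initial centers are passed in by the caller. The abstract does not give the actual weight function, so all of these choices are hypothetical.

```python
import math


def weighted_kmeans_directions(samples, init_centers, iters=20):
    """Cluster the directions of x(t), weighting each one by its modulus (assumed weight)."""
    points, weights = [], []
    for x in samples:
        norm = math.sqrt(sum(v * v for v in x))
        if norm == 0.0:
            continue  # a zero vector carries no direction information
        sign = -1.0 if x[0] < 0 else 1.0  # assumed sign convention: first component >= 0
        points.append([sign * v / norm for v in x])
        weights.append(norm)

    centers = [list(c) for c in init_centers]
    k = len(centers)
    for _ in range(iters):
        # Assignment step: nearest center in squared Euclidean distance.
        labels = [min(range(k),
                      key=lambda j: sum((p - c) ** 2 for p, c in zip(pt, centers[j])))
                  for pt in points]
        # Update step: weighted mean, so large-modulus samples dominate the estimate.
        for j in range(k):
            w_sum = sum(w for w, lab in zip(weights, labels) if lab == j)
            if w_sum > 0:
                centers[j] = [
                    sum(w * pt[d] for pt, w, lab in zip(points, weights, labels) if lab == j) / w_sum
                    for d in range(len(centers[j]))
                ]
    return centers
```

In a sparse mixing model the resulting centers would serve as estimates of the mixing directions; a small-modulus, noise-dominated sample pulls a weighted center far less than it would pull an unweighted mean.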
Funding: Projects (61362018, 61861019) supported by the National Natural Science Foundation of China; Project (1402041B) supported by the Jiangsu Province Postdoctoral Scientific Research Project, China; Project (16A174) supported by the Scientific Research Fund of Hunan Provincial Education Department, China; Project ([2016]283) supported by the Research Study and Innovative Experiment Project of College Students, China.
Abstract: In this paper, a blind multiband spectrum sensing (BMSS) method based on K-means clustering (KMC) is proposed; it requires no knowledge of the noise power, the primary signal, or the wireless channel. In this approach, the KMC algorithm is used to identify the occupied subband set (OSS) and the idle subband set (ISS), and the locations and number of the occupied channels are then obtained from the elements of the OSS. Compared with the classical BMSS methods based on information theoretic criteria (ITC), the new method performs better, especially at low signal-to-noise ratio (SNR) and with small sample sizes, and its detection performance is more robust under noise uncertainty or unequal noise variance. The new method is also more stable than the ITC-based methods when the number of occupied subbands increases or the primary signals suffer multipath fading. Simulation results verify the effectiveness of the proposed method.
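The split into an occupied and an idle subband set can be illustrated with a one-dimensional two-cluster K-means. The sketch assumes that each subband is summarized by a single energy estimate and that the two centers start at the minimum and maximum energy; neither detail comes from the abstract, and `split_subbands` is a hypothetical name.

```python
def split_subbands(energies, iters=100):
    """Two-cluster K-means on per-subband energies (assumed feature).

    Returns (occupied, idle) as lists of subband indices.
    """
    lo, hi = min(energies), max(energies)  # assumed initialization of the two centers
    occupied, idle = [], list(range(len(energies)))
    for _ in range(iters):
        # Assignment step: a subband is occupied if it is strictly closer to the high center.
        occupied = [i for i, e in enumerate(energies) if abs(e - hi) < abs(e - lo)]
        idle = [i for i in range(len(energies)) if i not in occupied]
        # Update step: each center becomes the mean energy of its set.
        new_hi = sum(energies[i] for i in occupied) / len(occupied) if occupied else hi
        new_lo = sum(energies[i] for i in idle) / len(idle) if idle else lo
        if (new_lo, new_hi) == (lo, hi):
            break  # centers stopped moving
        lo, hi = new_lo, new_hi
    return occupied, idle
```

The indices in `occupied` give the channel locations and `len(occupied)` gives their number. Plain two-cluster K-means always tries to form two groups, so a band with no primary signal at all would need a separate check that this sketch does not include.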
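Abstract: To make full use of real traffic congestion information on expressway sections and to cluster the intrinsic patterns and changing characteristics of congestion more reasonably, a clustering algorithm that adaptively determines the cluster centers C and the number of clusters K (adaptive center and K-means value, ACK-Means) is proposed for clustering congested expressway sections. The ACK-Means algorithm draws on cluster density, inter-cluster distance, and cluster strength, and also accounts for the randomness of data samples by assigning outliers reasonably, so that the cluster centers C and the number of clusters K are determined adaptively. A data set is built from real traffic congestion information, and ACK-Means clustering of congested expressway sections is implemented in Python, which resolves the problem of setting the number of clusters K and the cluster centers C. The clustering results show that ACK-Means achieves unsupervised clustering of congested expressway sections; because the results are based entirely on real expressway congestion information, they are of greater practical value.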
Funding: Project (42077244) supported by the National Natural Science Foundation of China; Project (SDGZK2431) supported by the State Key Laboratory of Intelligent Construction and Healthy Operation and Maintenance of Deep Underground Engineering, Sichuan University, China.
Abstract: The residual elastic energy index is a scientific evaluation index for rockburst proneness. In laboratory tests, it is sometimes difficult to obtain the post-peak curve or to test a rock sample several times, which makes it impossible to calculate the residual elastic energy index accurately. Based on 241 sets of experimental data and four input indexes (density, elastic modulus, peak intensity, and peak input strain energy), this study proposes a machine learning model that combines the k-means clustering algorithm with a random forest regression model: the cluster forest (CF) model. Stratified sampling was applied to the dataset to ensure that the samples were representative and balanced. Grid search and five-fold cross-validation were then used to optimize the model's hyperparameters, with the aim of improving its generalization capability and prediction accuracy. Finally, the performance of the optimal model was evaluated on a test set and compared with five other commonly used models. The results indicate that the CF model outperformed the other models on the test set, with a mean absolute error of 6.6% and an accuracy of 93.9%. Sensitivity analyses reveal the degree of influence of each variable on rockburst proneness and the applicability of the CF model when input parameters are missing. The robustness and generalization ability of the model were verified with experimental data from other studies, and the results confirmed its reliability and applicability. The model therefore not only simplifies the acquisition of the residual elastic energy index, but also shows excellent performance and wide applicability.
Funding: Supported by the National Natural Science Foundation of China for Distinguished Young Scholars (70625005).
Abstract: Intuitionistic fuzzy sets (IFSs) are a useful means of describing and dealing with vague and uncertain data. An intuitionistic fuzzy C-means algorithm for clustering IFSs is developed. In each stage of the intuitionistic fuzzy C-means method, the seeds are modified, and for each IFS a membership degree to each of the clusters is estimated. At the end of the algorithm, all the given IFSs are clustered according to the estimated membership degrees. Furthermore, the algorithm is extended to cluster interval-valued intuitionistic fuzzy sets (IVIFSs). Finally, the developed algorithms are illustrated through experiments on both real-world and simulated data sets.
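The step in which each IFS receives a membership degree to each cluster can be sketched with the membership update of standard fuzzy C-means, u_i = d_i^(-2/(m-1)) / sum_k d_k^(-2/(m-1)). Whether the paper uses exactly this form, and which IFS distance it plugs in, is an assumption here; the function name and the fuzzifier default m = 2 are hypothetical.

```python
def fcm_memberships(distances, m=2.0):
    """Standard fuzzy C-means membership update.

    distances[j][i] is the distance from sample j to cluster seed i;
    m > 1 is the fuzzifier. Returns one membership row per sample.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for row in distances:
        zeros = [i for i, d in enumerate(row) if d == 0.0]
        if zeros:
            # A sample that coincides with one or more seeds belongs fully to them.
            memberships.append([1.0 / len(zeros) if i in zeros else 0.0
                                for i in range(len(row))])
            continue
        inverse = [d ** (-exponent) for d in row]
        total = sum(inverse)
        memberships.append([v / total for v in inverse])
    return memberships
```

Each row sums to 1, and a sample is finally assigned to the cluster with the largest membership degree.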
Funding: Supported by the National Natural Science Foundation of China (70571087) and the National Science Fund for Distinguished Young Scholars of China (70625005).
Abstract: An intuitionistic fuzzy set (IFS) is a set of 2-tuple arguments, each of which is characterized by a membership degree and a nonmembership degree. The generalized form of the IFS is the interval-valued intuitionistic fuzzy set (IVIFS), whose components are intervals rather than exact numbers. IFSs and IVIFSs have been found to be very useful for describing vagueness and uncertainty. However, little attention seems to have been paid to the clustering analysis of IFSs and IVIFSs. An intuitionistic fuzzy hierarchical algorithm is introduced for clustering IFSs; it is based on the traditional hierarchical clustering procedure, the intuitionistic fuzzy aggregation operator, and the basic distance measures between IFSs: the Hamming distance, the normalized Hamming distance, the weighted Hamming distance, the Euclidean distance, the normalized Euclidean distance, and the weighted Euclidean distance. Subsequently, the algorithm is extended to cluster IVIFSs. Finally, the algorithm and its extended form are applied to the classification of building materials and of enterprises, respectively.
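Four of the distance measures listed above can be sketched as follows, using a widely used formulation that includes the hesitancy degree pi = 1 - mu - nu. The abstract does not state which formulation the paper adopts, so the inclusion of the hesitancy term and the factor 1/2 are assumptions, as is the representation of an IFS as a list of (mu, nu) pairs. The weighted variants are omitted.

```python
import math


def _differences(a, b):
    """Yield (d_mu, d_nu, d_pi) for each pair of elements of two IFSs."""
    for (mu_a, nu_a), (mu_b, nu_b) in zip(a, b):
        pi_a = 1.0 - mu_a - nu_a  # hesitancy degree of the element in a
        pi_b = 1.0 - mu_b - nu_b
        yield mu_a - mu_b, nu_a - nu_b, pi_a - pi_b


def hamming(a, b, normalized=False):
    """Hamming distance between two IFSs given as lists of (mu, nu) pairs."""
    total = 0.5 * sum(abs(dm) + abs(dn) + abs(dp) for dm, dn, dp in _differences(a, b))
    return total / len(a) if normalized else total


def euclidean(a, b, normalized=False):
    """Euclidean distance between two IFSs given as lists of (mu, nu) pairs."""
    total = 0.5 * sum(dm * dm + dn * dn + dp * dp for dm, dn, dp in _differences(a, b))
    return math.sqrt(total / len(a)) if normalized else math.sqrt(total)
```

In a hierarchical procedure, one of these distances would be evaluated between every pair of current clusters and the closest pair merged at each step.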
Funding: This project was supported by the National High Tech Foundation of 863 (2001AA115123).
Abstract: Content-based retrieval from large image databases requires an efficient index and retrieval scheme. This paper proposes a clustering-based index algorithm called CMA, which supports fast retrieval from large image databases. CMA combines the advantages of k-means and self-adaptive algorithms. It is simple and works without any user interaction. The algorithm has two main stages. In the first stage, it classifies the images in a database into several clusters and automatically obtains the parameters needed for the next stage, the k-means iteration. The CMA algorithm is tested on a large database of more than ten thousand images and compared with the k-means algorithm. Experimental results show that the algorithm is effective in both precision and retrieval time.
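Abstract: Existing multi-view clustering algorithms suffer from the following problems: 1) when learning low-dimensional representations, they fail to accurately capture, or simply ignore, the high-order and complementary information embedded in multi-view data; 2) they fail to accurately capture local information in the data; 3) their information-capturing methods lack robustness to noisy points. To address these problems, a multi-view clustering algorithm based on adaptive tensor singular value shrinkage (ATSVS) is proposed. ATSVS first introduces a tensor log-determinant function consistent with rank properties to impose a low-rank constraint on the representation tensor; during tensor singular value decomposition (t-SVD), it shrinks each singular value adaptively according to its magnitude, which yields a more accurate estimate of the tensor rank and thus captures the high-order and complementary information of multi-view data accurately from a global perspective. It then uses an l_{1,2} norm, which combines the advantages of sparse representation and manifold regularization, to capture local information in the data, and applies an l_{2,1} norm to impose a sparsity constraint on the noise, which improves robustness to noisy points. Experiments against 11 competing algorithms on 9 data sets show that ATSVS outperforms all of them in clustering performance. ATSVS is therefore an effective algorithm for multi-view clustering tasks.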