Abstract: Measurement error is traditionally assumed to follow the normal distribution. This paper proposes instead that the error in digitized data, a major derived data source in GIS, does not follow the normal distribution but a p-norm distribution with a determinate parameter. Assuming that the errors are random and share the same statistical properties, the probability density functions of the normal, Laplace and p-norm distributions are derived from the arithmetic mean axiom, the median axiom and the p-median axiom, respectively, which shows that the normal distribution is only one member of this family of distributions. Based on this idea, distribution fitness tests such as the skewness and kurtosis coefficient test, the Pearson chi-square (χ²) test and the Kolmogorov test are conducted on digitized data. The results show that the error in map digitization follows the p-norm distribution with a parameter close to 1.60. Least p-norm estimation and least squares estimation of digitized data are further compared, showing that least p-norm adjustment is better than least squares adjustment for digitized data processing in GIS.
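The relationship between the three error distributions named in this abstract can be illustrated with the generalized normal density, a common parameterization of the p-norm distribution. This is a minimal sketch under stated assumptions, not the paper's derivation: the function names, the scale parameter `alpha` and the exact parameterization are chosen here for illustration and may differ from the paper's.

```python
import math


def p_norm_pdf(x, mu=0.0, alpha=1.0, p=2.0):
    """Generalized-normal density: p / (2*alpha*Gamma(1/p)) * exp(-(|x-mu|/alpha)**p).

    Assumed parameterization (not taken from the paper):
    p = 1 gives the Laplace density, p = 2 with alpha = sqrt(2)*sigma the normal.
    """
    norm_const = p / (2.0 * alpha * math.gamma(1.0 / p))
    return norm_const * math.exp(-((abs(x - mu) / alpha) ** p))


def integrate_pdf(p, lo=-20.0, hi=20.0, steps=40000):
    """Trapezoidal check that the density integrates to about 1 for a given p."""
    width = (hi - lo) / steps
    total = 0.5 * (p_norm_pdf(lo, p=p) + p_norm_pdf(hi, p=p))
    total += sum(p_norm_pdf(lo + i * width, p=p) for i in range(1, steps))
    return total * width


if __name__ == "__main__":
    print(p_norm_pdf(0.0, p=1.0))                      # Laplace case
    print(p_norm_pdf(0.0, alpha=math.sqrt(2), p=2.0))  # standard normal case
    print(integrate_pdf(1.6))                          # shape value reported in the abstract
```

With this parameterization the shape value p = 1.60 reported in the abstract sits between the Laplace (p = 1) and normal (p = 2) cases.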
Funding: Projects (LQ16E080012, LY14F030012) supported by the Zhejiang Provincial Natural Science Foundation, China; Project (61573317) supported by the National Natural Science Foundation of China; Project (2015001) supported by the Open Fund for a Key-Key Discipline of Zhejiang University of Technology, China
Abstract: Accurate estimation of road traffic states supports decision making by travelers and traffic managers. In this work, an algorithm based on kernel-k nearest neighbor (KNN) matching of road traffic spatial characteristics is presented to estimate road traffic states. Firstly, representative road traffic state data were extracted to establish the reference sequences of road traffic running characteristics (RSRTRC). Secondly, the spatial road traffic state data sequence was selected and a kernel function was constructed, with which the spatial road traffic data sequence could be mapped into a high-dimensional feature space. Thirdly, the reference and current spatial road traffic data sequences were extracted and the Euclidean distances between them in the feature space were obtained. Finally, the road traffic states were estimated as weighted averages of the k selected road traffic states with the smallest Euclidean distances. Several typical links in Beijing were adopted for case studies. The experimental results show that the accuracy of the algorithm in estimating speed and volume is 95.27% and 91.32%, respectively, which indicates that this estimation approach based on kernel-KNN matching of road traffic spatial characteristics is feasible and can achieve high accuracy.
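The matching steps described in this abstract can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the Gaussian (RBF) kernel, the inverse-distance weighting and all function and parameter names are hypothetical choices, since the abstract does not specify the kernel or the weights.

```python
import math


def rbf_kernel(a, b, gamma=0.5):
    """Gaussian kernel between two sequences (assumed kernel choice)."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq)


def feature_space_distance(a, b, kernel=rbf_kernel):
    """Euclidean distance in the kernel feature space via the kernel trick:
    d^2 = K(a, a) - 2*K(a, b) + K(b, b)."""
    d2 = kernel(a, a) - 2.0 * kernel(a, b) + kernel(b, b)
    return math.sqrt(max(d2, 0.0))


def kernel_knn_estimate(current, references, states, k=3, eps=1e-9):
    """Weighted average of the states attached to the k nearest references.

    Inverse-distance weights are an assumption made for this sketch.
    """
    dists = [feature_space_distance(current, r) for r in references]
    nearest = sorted(range(len(references)), key=lambda i: dists[i])[:k]
    weights = [1.0 / (dists[i] + eps) for i in nearest]
    return sum(w * states[i] for w, i in zip(weights, nearest)) / sum(weights)


if __name__ == "__main__":
    references = [[1.0, 2.0], [5.0, 5.0], [9.0, 9.0]]  # made-up reference sequences
    speeds = [10.0, 50.0, 90.0]                        # made-up state attached to each
    print(kernel_knn_estimate([1.2, 2.1], references, speeds, k=2))
```

The feature-space distance never needs the explicit high-dimensional mapping, only kernel evaluations.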
Funding: Project (61372136) supported by the National Natural Science Foundation of China
Abstract: The design, analysis and parallel implementation of the particle filter (PF) were investigated. Firstly, to tackle the particle degeneracy problem in the PF, an iterated importance density function (IIDF) was proposed, in which a new term associated with the current measurement information (CMI) was introduced into the expression of the sampled particles. Through repeated use of the least squares estimate, the CMI can be integrated into the sampling stage in an iterative manner, which greatly improves the sampling quality. Running the IIDF yields an iterated PF (IPF). Subsequently, a parallel resampling (PR) scheme was proposed for the parallel implementation of the IPF; its main idea is the same as that of systematic resampling (SR), but it is carried out differently. The PR directly uses the integer part of the product of the particle weight and the particle number as the number of times a particle is replicated, and it simultaneously eliminates the particles with the smallest weights; these are the two key differences from SR. Finally, the detailed procedures for implementing the PR-based IPF on a graphics processing unit were presented. The performance of the IPF, the PR and their parallel implementations is illustrated via a one-dimensional numerical simulation and a practical application to passive radar target tracking.
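The replication rule stated in this abstract can be sketched as below. The first function follows the abstract directly; the second is an assumption added here, because the abstract does not say how the particle count is restored after flooring. The function names and the toy weights are hypothetical.

```python
import math


def replication_counts(weights):
    """Copies per particle: integer part of (normalized weight * particle count).

    Each count depends only on that particle's own weight, so the counts can
    be computed independently per particle.
    """
    n = len(weights)
    total = sum(weights)
    return [math.floor(n * w / total) for w in weights]


def fill_shortfall(weights, counts):
    """Assumed completion step (not from the abstract): hand the missing copies
    to the particles with the largest fractional remainders."""
    n = len(weights)
    total = sum(weights)
    fractions = [n * w / total - c for w, c in zip(weights, counts)]
    order = sorted(range(n), key=lambda i: fractions[i], reverse=True)
    filled = list(counts)
    for i in order[: n - sum(counts)]:
        filled[i] += 1
    return filled


def resampled_indices(counts):
    """Expand replication counts into a list of parent particle indices."""
    return [i for i, c in enumerate(counts) for _ in range(c)]


if __name__ == "__main__":
    w = [0.25, 0.25, 0.125, 0.125, 0.125, 0.0625, 0.03125, 0.03125]
    counts = replication_counts(w)
    print(counts)
    print(resampled_indices(fill_shortfall(w, counts)))
```

In this toy example the two lowest-weight particles receive no copies, which matches the elimination behaviour the abstract describes.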
Funding: Project (61101185) supported by the National Natural Science Foundation of China; Project (2011AA1221) supported by the National High Technology Research and Development Program of China
Abstract: To improve the performance of the particle filter (PF) based probability hypothesis density (PHD) algorithm in estimating the number of targets and extracting the states of multiple targets, a new PHD filter algorithm based on marginalized particles and kernel density estimation is proposed, which uses the idea of the marginalized particle filter to enhance the estimation performance of the PHD. The state variables are decomposed into linear and nonlinear parts. The particle filter is adopted to predict and estimate the nonlinear multi-target states after dimensionality reduction, while the Kalman filter is applied to estimate the linear parts under the linear Gaussian condition. Embedding the information of the linear states into the estimated nonlinear states helps to reduce the estimation variance and improve the accuracy of target number estimation. Mean-shift kernel density estimation, which inherently searches for peak values via adaptive gradient ascent iterations, is introduced to cluster particles and extract target states; it is independent of the target number and can converge to the local peak positions of the PHD distribution while avoiding errors caused by inaccurate modeling and parameter estimation. Experiments show that the proposed algorithm achieves higher tracking accuracy with fewer sampling particles and has lower computational complexity than the PF-PHD.
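The mean-shift step mentioned in this abstract can be illustrated in one dimension. This is a minimal sketch assuming a Gaussian kernel, a fixed bandwidth and scalar particle states; the function names, the merge tolerance and the toy particles are hypothetical and not taken from the paper.

```python
import math


def mean_shift_mode(start, particles, weights, bandwidth=0.5, tol=1e-9, max_iter=200):
    """Climb the weighted Gaussian KDE from `start` to a nearby local peak."""
    x = start
    for _ in range(max_iter):
        kernel = [w * math.exp(-0.5 * ((x - p) / bandwidth) ** 2)
                  for p, w in zip(particles, weights)]
        total = sum(kernel)
        if total == 0.0:
            break  # start point is numerically too far from every particle
        new_x = sum(k * p for k, p in zip(kernel, particles)) / total
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


def extract_modes(particles, weights, bandwidth=0.5, merge_tol=0.1):
    """Run mean shift from every particle and merge peaks closer than merge_tol.

    The number of peaks is not supplied in advance; it falls out of the merging.
    """
    modes = []
    for p in particles:
        m = mean_shift_mode(p, particles, weights, bandwidth)
        if all(abs(m - other) > merge_tol for other in modes):
            modes.append(m)
    return sorted(modes)


if __name__ == "__main__":
    pts = [-0.2, 0.0, 0.2, 9.8, 10.0, 10.2]   # made-up particle states
    wts = [1.0 / len(pts)] * len(pts)
    print(extract_modes(pts, wts))
```

Each iteration moves the estimate to the kernel-weighted mean of the particles, which is a gradient ascent step on the density with an automatically chosen step size.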
Funding: Project (2020YFC2008605) supported by the National Key Research and Development Project of China; Project (52072412) supported by the National Natural Science Foundation of China; Project (2021JJ30359) supported by the Natural Science Foundation of Hunan Province, China
Abstract: Urban air pollution causes serious problems for physical and mental health, economic development, environmental protection and other areas. Predicting the changes and trends of air pollution can provide a scientific basis for governance and prevention efforts. In this paper, we propose an interval prediction method that considers the spatio-temporal characteristic information of PM2.5 signals from multiple stations. A K-nearest neighbor (KNN) algorithm interpolates the signals lost during collection, transmission and storage to ensure the continuity of the data. A graph generative network (GGN) is used to process time-series meteorological data with complex structures. The graph U-Nets framework is introduced into the GGN model to enhance its control over the graph generation process, which improves the efficiency and robustness of the model. In addition, sparse Bayesian regression is incorporated to mitigate the curse of dimensionality that affects traditional kernel density estimation (KDE) interval prediction. With the support of the sparse strategy, sparse Bayesian regression kernel density estimation (SBR-KDE) is very efficient in processing high-dimensional, large-scale data. PM2.5 data for spring, summer, autumn and winter from 34 air quality monitoring sites in Beijing verified the accuracy, generalization and superiority of the proposed model in interval prediction.
Funding: Projects (61603393, 61741318) supported in part by the National Natural Science Foundation of China; Project (BK20160275) supported by the Natural Science Foundation of Jiangsu Province, China; Project (2015M581885) supported by the Postdoctoral Science Foundation of China; Project (PAL-N201706) supported by the Open Project Foundation of State Key Laboratory of Synthetical Automation for Process Industries of Northeastern University, China
Abstract: As a production quality index of the hematite grinding process, particle size (PS) is hard to measure in real time. To estimate the PS, this paper proposes a novel data-driven PS model using a stochastic configuration network (SCN) with a robust technique, namely the robust SCN (RSCN). Firstly, the universal approximation property of the RSCN with the weighted least squares technique is proved. Secondly, three robust algorithms are presented that set the penalty weights by M-estimation with the Huber loss function, M-estimation with the interquartile range (IQR) and the nonparametric kernel density estimation (NKDE) function, respectively. Comparison experiments are first carried out on UCI standard data sets to verify the effectiveness of these methods, and then the data-driven PS models based on the robust algorithms are established and verified. Experimental results show that the RSCN performs very well in PS estimation.
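The idea of setting penalty weights by M-estimation with the Huber loss inside a weighted least squares fit can be sketched on a toy problem. This is not the RSCN itself: a straight-line fit stands in for the network's output-weight solution, the threshold `delta` is fixed by hand, and all names and data are hypothetical.

```python
def huber_weight(residual, delta=1.0):
    """Huber penalty weight: 1 inside the threshold, delta/|r| outside it."""
    a = abs(residual)
    return 1.0 if a <= delta else delta / a


def weighted_line_fit(xs, ys, ws):
    """Closed-form weighted least squares for y = intercept + slope * x."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def huber_irls(xs, ys, delta=1.0, iterations=50):
    """Iteratively reweighted least squares with Huber weights."""
    ws = [1.0] * len(xs)
    for _ in range(iterations):
        intercept, slope = weighted_line_fit(xs, ys, ws)
        ws = [huber_weight(y - intercept - slope * x, delta) for x, y in zip(xs, ys)]
    return weighted_line_fit(xs, ys, ws)


if __name__ == "__main__":
    xs = [float(i) for i in range(10)]
    ys = [2.0 * x + 1.0 for x in xs]
    ys[9] = 100.0                                      # one gross outlier
    print(weighted_line_fit(xs, ys, [1.0] * 10))       # ordinary least squares
    print(huber_irls(xs, ys))                          # robust fit
```

The outlier ends up with a small weight, so the robust slope stays close to the slope of the clean data, whereas ordinary least squares is pulled far away from it.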
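Abstract: To address the limitation of a global fixed bandwidth in load extrapolation based on kernel density estimation (KDE), a KDE load extrapolation method is proposed in which the bandwidth is selected with the help of KANN-DBSCAN (K-average nearest neighbor density-based spatial clustering of applications with noise). The load data are grouped by the KANN-DBSCAN clustering algorithm, the optimal bandwidths of the different clusters are obtained by the rule of thumb, kernel density estimation is then performed, and Monte Carlo simulation is used for extrapolation. Measured load data from an electric vehicle on customer roads are used to check the soundness of the extrapolation method. The extrapolation is evaluated with three indicators: statistical parameter tests, goodness-of-fit tests and pseudo-damage tests. The results show that, compared with fixed-bandwidth KDE extrapolation, the loads extrapolated by the KANN-DBSCAN-based KDE method are closer to the measured loads in their statistical parameters, with errors in the mean, standard deviation and maximum of only 1.9%, 4.3% and 1.9%, respectively; the goodness of fit R² of the amplitude cumulative frequency curves is greater than 0.99 in all cases, and the pseudo-damage values are all close to 1. The results verify the effectiveness of the clustering method for KDE load extrapolation, help in compiling load spectra for vehicles on customer roads, and provide a reference for load extrapolation of mechanical components with similar load distribution characteristics.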
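The pipeline in this abstract (a separate rule-of-thumb bandwidth per cluster, then KDE, then Monte Carlo sampling) can be sketched as follows. The clustering itself is not implemented: the clusters are supplied as input. The normal-reference constant 1.06, the Gaussian kernel, the size-proportional cluster choice and all names and data are assumptions of this sketch, since the abstract does not give these details.

```python
import random
import statistics


def rule_of_thumb_bandwidth(samples):
    """Normal-reference rule of thumb: h = 1.06 * std * n**(-1/5)."""
    return 1.06 * statistics.stdev(samples) * len(samples) ** (-0.2)


def extrapolate_loads(clusters, n_draws, rng):
    """Draw synthetic loads from a Gaussian KDE with one bandwidth per cluster."""
    bandwidths = [rule_of_thumb_bandwidth(c) for c in clusters]
    sizes = [len(c) for c in clusters]
    draws = []
    for _ in range(n_draws):
        # Pick a cluster in proportion to its size, then a kernel centre in it.
        k = rng.choices(range(len(clusters)), weights=sizes)[0]
        centre = rng.choice(clusters[k])
        # Sampling a Gaussian KDE = kernel centre plus Gaussian noise of width h.
        draws.append(rng.gauss(centre, bandwidths[k]))
    return draws


if __name__ == "__main__":
    clusters = [[1.0, 1.1, 0.9, 1.05, 0.95],       # made-up low-amplitude loads
                [10.0, 12.0, 8.0, 11.0, 9.0]]      # made-up high-amplitude loads
    rng = random.Random(0)
    loads = extrapolate_loads(clusters, 1000, rng)
    print([round(rule_of_thumb_bandwidth(c), 4) for c in clusters])
    print(len(loads), max(loads))
```

A tight cluster gets a narrow kernel and a widely spread cluster a broad one, which is the behaviour a single global bandwidth cannot provide.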
Funding: supported by the National Natural Science Foundation of China (60736043, 60805012) and the Fundamental Research Funds for the Central Universities (K50510020032)
Abstract: A novel particle filter with bandwidth adaption for the kernel particle filter (BAKPF) is proposed. Selection of the kernel bandwidth is a critical issue in kernel density estimation (KDE). Firstly, the plug-in method is adopted to obtain a global fixed bandwidth by optimizing the asymptotic mean integrated squared error (AMISE). Then, particle-driven bandwidth selection is invoked in the KDE. To allocate the particles more effectively, the KDE with adaptive bandwidth in the BAKPF is used to approximate the posterior probability density function (PDF) by moving particles toward the posterior. A closed-form expression of the true distribution is given. The simulation results show that the proposed BAKPF performs better than the standard particle filter (PF), the unscented particle filter (UPF) and the kernel particle filter (KPF) in both efficiency and estimation precision.
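Abstract: Because the sintering process involves many uncertain factors, mechanism analysis and point prediction results are insufficiently reliable. A random forest-extreme tree-kernel density estimation (RF-ET-KDE) algorithm is therefore proposed for interval prediction of physical indices (particle size and moisture). First, data preprocessing and feature selection are used to screen out the feature variables best suited to modelling. Second, a Stacking-based RF-ET algorithm is used for point prediction of the indices, which gives the model high accuracy and generalization. Then, the KDE algorithm is applied to the prediction errors of the indices to obtain the distribution interval at a given confidence level and the resulting interval prediction. Finally, the model is compared with other combined models. The results show that the RF-ET algorithm achieves good point prediction, and that the KDE algorithm quantifies the index errors well and yields highly reliable interval predictions.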
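The step of turning point-prediction errors into an interval with KDE can be sketched as below. This is an illustration under assumptions, not the paper's code: a Gaussian kernel is used, the bandwidth is passed in by hand, errors are defined as actual minus predicted, and the names and numbers are hypothetical.

```python
import math


def kde_cdf(x, errors, bandwidth):
    """CDF of a Gaussian KDE: the average of normal CDFs centred on each error."""
    scale = bandwidth * math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((x - e) / scale)) for e in errors) / len(errors)


def kde_quantile(prob, errors, bandwidth, tol=1e-9):
    """Invert the KDE CDF by bisection."""
    lo = min(errors) - 10.0 * bandwidth
    hi = max(errors) + 10.0 * bandwidth
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, errors, bandwidth) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def prediction_interval(point_prediction, errors, bandwidth, confidence=0.9):
    """Shift the point prediction by the lower and upper error quantiles.

    Errors are assumed to be (actual - predicted) on held-out data.
    """
    tail = 0.5 * (1.0 - confidence)
    lower = point_prediction + kde_quantile(tail, errors, bandwidth)
    upper = point_prediction + kde_quantile(1.0 - tail, errors, bandwidth)
    return lower, upper


if __name__ == "__main__":
    past_errors = [-2.0, -1.0, 0.0, 1.0, 2.0]   # made-up validation errors
    print(prediction_interval(50.0, past_errors, bandwidth=0.5, confidence=0.9))
```

Because the interval comes from the estimated error density rather than from a normality assumption, skewed error samples produce asymmetric intervals.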
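Abstract: The cross-entropy method can significantly accelerate power grid reliability evaluation, but it usually focuses on independent random variables; extending it to correlated variables can further improve the acceleration. To obtain the importance sampling density function of correlated variables effectively and thus enable their importance sampling, cross-entropy optimization is studied for the kernel density estimation (KDE) model widely used in correlation modelling. Because the KDE model does not belong to the exponential family of distributions, conventional cross-entropy optimization is difficult to apply; a novel direct cross-entropy optimization method is therefore proposed that exploits the characteristics of the composite sampling algorithm, and an analytical expression for the optimal weight parameters of the KDE model is derived. Because the weight parameters are small in magnitude, direct optimization tends to degrade accuracy; an indirect cross-entropy optimization method based on the idea of subset simulation is therefore further proposed, which converts the optimization of small weight parameters into the optimization of larger conditional probabilities and thereby improves optimization accuracy. Evaluation on the MRTS79 and MRTS96 reliability test systems verifies the efficient acceleration of the proposed methods in reliability evaluation of power grids with correlated variables.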