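To address the removal of noise points from point cloud data, a multi-scale point cloud denoising method based on an improved DBSCAN (density-based spatial clustering of applications with noise) algorithm is proposed. Statistical filtering is first applied to pre-screen isolated outliers and remove large-scale noise from the point cloud. The DBSCAN algorithm is then optimized to reduce its time complexity and to adjust its parameters adaptively, so that the point cloud is divided into normal clusters, suspected clusters, and abnormal clusters, and the abnormal clusters are removed immediately. A distance consensus evaluation method then examines the suspected clusters more finely: the distance between each suspected point and the surface fitted to its nearest normal points is computed to decide whether the point is abnormal, which effectively preserves the key features of the data and the sensitivity of the model. The method was applied to denoise the point clouds of two hull blocks and compared with other denoising algorithms. The results show that it has advantages in denoising efficiency and feature preservation and accurately retains the geometric characteristics of the point cloud data.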
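For orientation, the baseline that this abstract builds on can be sketched as follows. This is a minimal textbook DBSCAN in pure Python, not the improved algorithm from the paper: the adaptive parameters, the three-way cluster split, and the distance consensus step are not reproduced. The sample coordinates, `eps`, and `min_pts` are hypothetical values chosen only for illustration.

```python
import math

NOISE = -1  # label used for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Textbook DBSCAN; returns one cluster label per point, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Hypothetical 3-D point cloud: two dense patches and one stray point.
cloud = [
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.1, 0.1, 0.0),
    (5.0, 5.0, 5.0), (5.1, 5.0, 5.0), (5.0, 5.1, 5.0),
    (20.0, 20.0, 20.0),
]
labels = dbscan(cloud, eps=0.2, min_pts=3)
denoised = [p for p, label in zip(cloud, labels) if label != NOISE]
```

The neighbor search here is a brute-force scan, so the sketch runs in quadratic time; a spatial index would be the usual way to reduce that cost on real point clouds.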
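Inland waterway traffic accidents occur from time to time and threaten the safe and efficient development of waterway transport. This study proposes a DBSCAN (Density-Based Spatial Clustering of Applications with Noise) method with adaptive parameters to identify accident black-spot waters on inland waterways. The method supports automatic selection of the neighborhood radius ε and of P_min, the threshold on the number of data objects in a neighborhood, which can improve the accuracy and efficiency of the cluster analysis. A case study was carried out on bulk carrier accident data from the lower reaches of the Yangtze River main line for 2010 to 2019. The accident characteristics and causes of each typical black-spot section were analyzed, and 8 accident black spots were obtained. In addition, Getis-Ord General G clustering was used to identify high-severity accident areas within the black spots, showing that the black spots and high-severity accidents are mainly distributed around mid-channel bars, bridge areas, and port and wharf areas. The results are broadly consistent with the actual situation, which to some extent demonstrates the scientific soundness and practicality of the method for analyzing the distribution characteristics of inland waterway traffic accidents.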
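The abstract does not say how ε and P_min are selected automatically. As a rough illustration of the idea only, the sketch below uses a common stand-in heuristic: sort each point's distance to its k-th nearest neighbor and take ε just below the largest jump in that curve. The function names, the value of `k`, and the accident coordinates are all assumptions made for this example, and P_min selection is not covered.

```python
import math


def k_distances(points, k):
    """Distance from each point to its k-th nearest neighbour, sorted ascending."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)


def suggest_eps(points, k):
    """Pick eps as the k-distance just before the largest jump in the sorted curve."""
    kd = k_distances(points, k)
    jumps = [kd[i + 1] - kd[i] for i in range(len(kd) - 1)]
    knee = max(range(len(jumps)), key=jumps.__getitem__)
    return kd[knee]


# Hypothetical accident positions (arbitrary planar units): two dense
# groups and one isolated accident far from both.
accidents = [
    (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
    (10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0),
    (50.0, 50.0),
]
eps = suggest_eps(accidents, k=2)
```

With this data every clustered point has its second-nearest neighbor at distance 1, while the isolated point is far from everything, so the largest jump separates the two and the suggested ε is 1.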
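To address the low efficiency and misjudgments of ship encounter identification algorithms on large data sets, a density-based spatial clustering of applications with noise (DBSCAN) algorithm that incorporates the International Regulations for Preventing Collisions at Sea (COLREGs) is proposed, and a ship encounter identification model is established. When the DBSCAN algorithm counts the ships in a neighborhood, the distance to closest point of approach (DCPA) and the time to closest point of approach (TCPA) between ships are calculated to preliminarily screen out noise points in the neighborhood. Ship encounter risk is then calculated with a fuzzy comprehensive evaluation model, and the ships in the neighborhood are screened a second time to extract the ship encounter situation. The results show that the improved DBSCAN algorithm filters out the non-encounter situations identified by the traditional DBSCAN algorithm, and that the number of ships in a single encounter situation always stays within 4. The output risk evolution trend of encountering ships is well suited to monitoring high-risk ships in real waters and can effectively assist collision avoidance. The proposed identification model is of significance for ensuring navigation safety and improving the efficiency of maritime supervision.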
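The DCPA and TCPA quantities used for the first screening step have a standard closed form for two ships moving at constant velocity. The sketch below shows that calculation in a flat 2-D frame. The function names, the units (nautical miles and knots), and the screening limits are assumptions for illustration; the abstract does not give the thresholds used in the paper.

```python
import math


def cpa(pos_a, vel_a, pos_b, vel_b):
    """Return (dcpa, tcpa) for two ships with constant planar velocities."""
    # Position and velocity of ship B relative to ship A.
    rx, ry = pos_b[0] - pos_a[0], pos_b[1] - pos_a[1]
    vx, vy = vel_b[0] - vel_a[0], vel_b[1] - vel_a[1]
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        # No relative motion: the separation never changes.
        return math.hypot(rx, ry), 0.0
    tcpa = -(rx * vx + ry * vy) / speed_sq
    dcpa = math.hypot(rx + vx * tcpa, ry + vy * tcpa)
    return dcpa, tcpa


def may_encounter(dcpa, tcpa, dcpa_limit, tcpa_limit):
    """Hypothetical screening rule: closest approach lies ahead, soon, and close."""
    return 0.0 <= tcpa <= tcpa_limit and dcpa <= dcpa_limit


# Ship A heads east at 10 kn from the origin; ship B heads west at 10 kn,
# starting 10 nm east and 1 nm north of A (hypothetical values).
dcpa, tcpa = cpa((0.0, 0.0), (10.0, 0.0), (10.0, 1.0), (-10.0, 0.0))
flagged = may_encounter(dcpa, tcpa, dcpa_limit=2.0, tcpa_limit=1.0)
```

In this head-on example the ships close at 20 kn, so the closest approach occurs after 0.5 h at a separation of 1 nm. A negative TCPA means the closest approach is already in the past, which is why the screening rule rejects it.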
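Because the measured oil quantity in an aero-engine lubricating oil tank is affected by multiple parameters, the oil consumption rate is difficult to calculate and predict. To address this, an improved rule for extracting oil quantity data and a method for predicting the oil consumption rate are proposed. Engine data were cleaned using density-based spatial clustering of applications with noise (DBSCAN) and other methods to obtain oil quantity data under steady flight conditions. The oil quantity was fitted by the least squares method to obtain the oil consumption rate, with an average goodness of fit of 0.86. On this basis, a multilayer perceptron (MLP) was used to model the relationship between the oil consumption rate and flight state parameters; the mean absolute percentage error between predicted and actual values was 1.15%. The proposed method meets practical engineering needs and provides a reliable reference for assessing the health of aero-engine lubrication systems.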
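The two figures quoted in this abstract, goodness of fit and mean absolute percentage error, come from standard formulas. The sketch below shows an ordinary least squares line fit with its R² value and a MAPE helper. The function names and the sample readings are assumptions for illustration and are not data from the paper; the MLP model itself is not sketched.

```python
def fit_line(t, y):
    """Ordinary least squares fit y = intercept + slope * t; returns (slope, intercept, r2)."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    ss_res = sum((yi - (intercept + slope * ti)) ** 2 for ti, yi in zip(t, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot  # goodness of fit
    return slope, intercept, r2


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical steady-flight readings: elapsed hours and oil quantity.
hours = [0.0, 1.0, 2.0, 3.0]
oil_quantity = [10.0, 9.5, 9.0, 8.5]
slope, intercept, r2 = fit_line(hours, oil_quantity)
consumption_rate = -slope  # oil quantity falls over time, so the rate is the negated slope

error = mape([100.0, 200.0], [110.0, 190.0])
```

The sample readings fall exactly on a line, so the fit recovers a consumption rate of 0.5 units per hour with R² equal to 1; real measurements would scatter around the line and give a lower R².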
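The complexity of armored vehicle motion states, the uncertainty of the battlefield situation, and tactical confusion and deception make the trajectories of armored vehicle clusters difficult to predict accurately. To address this, a trajectory prediction method for armored vehicle clusters is proposed that combines density-based spatial clustering of applications with noise (DBSCAN) with a long short-term memory (LSTM) neural network. A kinematic model is established for three driving states of armored vehicles: driving on slopes, turning, and vehicle-to-vehicle interaction. Trajectory feature information, including maneuver features, environmental features, and vehicle-to-vehicle interaction features, is selected, and the trajectory of each individual vehicle is predicted with a two-layer LSTM network. The DBSCAN algorithm is then used to segment the predicted single-vehicle trajectories, compute their similarity, and cluster them, yielding a representative cluster trajectory that serves as the predicted trajectory of the armored vehicle cluster. Simulation results show that the proposed method can effectively predict armored vehicle cluster trajectories, making it possible to anticipate the adversary and plan ahead of it.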
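A series of nanowires (NWs) with spherical hollow structures was constructed, and molecular dynamics (MD) was used to simulate tensile deformation for 300 samples with different initial states per model. The density-based spatial clustering of applications with noise (DBSCAN) machine learning algorithm was used to locate the initial slip planes. Based on large-sample statistics, the correlation between the distribution of initial slip positions and the distribution of fracture positions was analyzed. The results show that when the internal hollow radius is small, the fracture position distribution forms during the plastic deformation stage, and there is no significant correlation between the initial slip distribution and the fracture position distribution. For NWs with a large hollow radius and pronounced brittle character, however, slip planes induced by the high-energy inner surface accumulate rapidly, producing necking and leading to final fracture. Therefore, once the internal hollow structure reaches a certain size, there is a clear causal relationship between the distribution of initial slip positions and the distribution of final fracture positions.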
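To reduce the over-reliance on manual delineation by archaeologists when identifying settlement groups, Neolithic settlement sites in the Zhengzhou-Luoyang (Zheng-Luo) region were taken as an example and spatially clustered with the density-based DBSCAN (density-based spatial clustering of applications with noise) algorithm. Analysis of the distribution of settlement sites across four cultural periods shows that the main settlement group shifted from the area south of Mount Song in the east of the study area to the Yiluo River basin in the center of the region, where it settled for a long period and continued to develop and expand. Large settlement sites are mainly located within the main settlement group, except that some large settlements of the Peiligang culture period are relatively isolated. From the late Yangshao culture to the Longshan culture period, the settlement sites show a master-subordinate ring-shaped distribution pattern. Most settlement groups are oriented along the rivers. The study shows that clustering settlement sites with the DBSCAN algorithm is feasible, and the clustering yields the distribution characteristics of settlement sites in the region for the four Neolithic cultural periods.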
For imbalanced datasets, the focus of classification is to identify samples of the minority class. Current data mining algorithms do not perform well enough on imbalanced datasets. The synthetic minority over-sampling technique (SMOTE) is specifically designed for learning from imbalanced datasets; it generates synthetic minority class examples by interpolating between nearby minority class examples. However, SMOTE suffers from the overgeneralization problem. Density-based spatial clustering of applications with noise (DBSCAN) is not rigorous when dealing with samples near the borderline. We optimize the DBSCAN algorithm for this problem to make clustering more reasonable. This paper integrates the optimized DBSCAN with SMOTE and proposes a density-based synthetic minority over-sampling technique (DSMOTE). First, the optimized DBSCAN is used to divide the samples of the minority class into three groups: core samples, borderline samples, and noise samples. The noise samples of the minority class are then removed so that more effective samples are synthesized. To make full use of the information in core samples and borderline samples, different strategies are used to over-sample each group. Experiments show that DSMOTE achieves better results than SMOTE and Borderline-SMOTE in terms of precision, recall, and F-value.
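The interpolation step that SMOTE relies on can be sketched as below. This shows only the plain SMOTE idea of placing a synthetic sample on the segment between a minority sample and one of its nearest minority neighbors. It is not DSMOTE: the optimized DBSCAN grouping and the separate strategies for core and borderline samples are not described in enough detail in the abstract to reproduce. The function name, `k`, the seed, and the sample data are assumptions for illustration.

```python
import math
import random


def smote_sample(minority, k, rng):
    """Create one synthetic sample between a minority point and one of its k nearest minority neighbours."""
    base = rng.choice(minority)
    # Rank the other minority samples by distance to the chosen base sample.
    others = [p for p in minority if p is not base]
    neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
    mate = rng.choice(neighbours)
    lam = rng.random()  # interpolation weight in [0, 1)
    return tuple(b + lam * (m - b) for b, m in zip(base, mate))


# Hypothetical minority-class samples in a 2-D feature space.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
rng = random.Random(0)  # fixed seed so the sketch is repeatable
synthetic = [smote_sample(minority, k=2, rng=rng) for _ in range(20)]
```

Every synthetic sample lies on a segment between two existing minority samples, which is also why noise samples are harmful in this scheme: interpolating toward a noisy point places new samples in regions where the minority class does not actually live.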
Clustering is an unsupervised learning problem: a procedure that partitions data objects into groups. Many algorithms cannot overcome the problems of morphology, overlapping, and a large number of clusters at the same time. Many scientific communities have adopted clustering from the perspective of density, which is one of the best approaches to clustering. This study proposes automatic fuzzy-DBSCAN (AFD), a density-based spatial clustering of applications with noise (DBSCAN) algorithm that is based on selected high-density areas and works with the initialization of two parameters. By combining fuzzy and DBSCAN features, AFD is modeled on the selection of high-density areas and automatically generates two parameters for merging and separating. The two generated parameters provide a state of sub-cluster rules in the Cartesian coordinate system for the dataset. The model simultaneously overcomes clustering problems such as morphology, overlapping, and the number of clusters in a dataset. In the experiments, all algorithms were run 30 times on eight datasets. Three of the datasets are overlapping real datasets, and the rest are morphologic and synthetic datasets. The results demonstrate that the AFD algorithm outperforms other recently developed clustering algorithms.
Funding (DSMOTE study): supported by the National Key Research and Development Program of China (2018YFB1003700), the Scientific and Technological Support Project (Society) of Jiangsu Province (BE2016776), the "333" project of Jiangsu Province (BRA2017228, BRA2017401), and the Talent Project in Six Fields of Jiangsu Province (2015-JNHB-012).